Benchmarked Kimi K2 and it did well. DeepSeek V3 beats it, but Kimi K2 might be more skilled.
Its performance is very close to Qwen 3 in terms of skills and human alignment, but it has a huge parameter count (1T!).
Full leaderboard https://sheet.zoho.com/sheet/open/mz41j09cc640a29ba47729fed784a263c1d08