On the Humanity’s Last Exam (HLE) benchmark, Kimi K2.5 scored 50.2% (with tools), surpassing OpenAI’s GPT-5.2 (xhigh) and Claude Opus 4.5. It also achieved 76.8% on SWE-bench Verified, cementing its ...
At this year's Build conference, Microsoft unveiled a major expansion of its agent-based AI platform, highlighting new tools to securely build, customize and orchestrate intelligent agents across ...