
Humanity's Last Exam — AI model leaderboard

AI models ranked by their score on Humanity's Last Exam, a third-party benchmark aggregated by Artificial Analysis. Higher is better. Scores are cross-referenced against our first-party meo scores and Effective Value (𝕍).

Ranking 82 models across the full field · as of 2026-06-07.

| # | Model | Lab | Humanity's Last Exam |
|---|-------|-----|----------------------|
| 1 | Anthropic: Claude Opus 4.8 | anthropic | 45.7% |
| 2 | Google: Gemini 3.1 Pro Preview | google | 44.7% |
| 3 | OpenAI: GPT-5.5 | openai | 44.3% |
| 4 | OpenAI: GPT-5.4 | openai | 41.6% |
| 5 | Google: Gemini 3.5 Flash | google | 41.0% |
| 6 | OpenAI: GPT-5.3-Codex | openai | 39.9% |
| 7 | Anthropic: Claude Opus 4.7 | anthropic | 39.6% |
| 8 | Qwen: Qwen3.7 Max | qwen | 38.1% |
| 9 | MiniMax: MiniMax M3 | minimax | 37.1% |
| 10 | DeepSeek: DeepSeek V4 Pro | deepseek | 35.9% |
| 11 | OpenAI: GPT-5.2 Chat | openai | 35.4% |
| 12 | xAI: Grok 4.3 | x-ai | 35.0% |
| 13 | Xiaomi: MiMo-V2.5-Pro | xiaomi | 33.8% |
| 14 | OpenAI: GPT-5.2-Codex | openai | 33.5% |
| 15 | Qwen: Qwen3.7 Plus | qwen | 33.4% |
| 16 | DeepSeek: DeepSeek V4 Flash | deepseek | 32.1% |
| 17 | MiniMax: MiniMax M2.7 | minimax | 28.1% |
| 18 | Z.ai: GLM 5.1 | z-ai | 28.0% |
| 19 | Qwen: Qwen3.5 397B A17B | qwen | 27.3% |
| 20 | OpenAI: GPT-5.4 Mini | openai | 26.6% |
| 21 | OpenAI: GPT-5.1 | openai | 26.5% |
| 22 | OpenAI: GPT-5 | openai | 26.5% |
| 23 | OpenAI: GPT-5.4 Nano | openai | 26.5% |
| 24 | Qwen: Qwen3.6 Plus | qwen | 25.7% |
| 25 | OpenAI: GPT-5 Codex | openai | 25.6% |
| 26 | Tencent: Hy3 preview | tencent | 25.5% |
| 27 | Z.ai: GLM 5 Turbo | z-ai | 25.4% |
| 28 | OpenAI: GPT-5.1-Codex | openai | 23.4% |
| 29 | Qwen: Qwen3.5-122B-A10B | qwen | 23.4% |
| 30 | Google: Gemma 4 31B | google | 22.7% |
| 31 | StepFun: Step 3.5 Flash | stepfun | 22.6% |
| 32 | Qwen: Qwen3.6 27B | qwen | 21.6% |
| 33 | Google: Gemini 2.5 Pro | google | 21.1% |
| 34 | Qwen: Qwen3.6 35B A3B | qwen | 20.2% |
| 35 | OpenAI: o3 | openai | 20.0% |
| 36 | StepFun: Step 3.7 Flash | stepfun | 19.9% |
| 37 | OpenAI: GPT-5 Mini | openai | 19.7% |
| 38 | OpenAI: gpt-oss-120b | openai | 18.5% |
| 39 | inclusionAI: Ring-2.6-1T | inclusionai | 18.3% |
| 40 | Google: Gemma 4 26B A4B (free) | google | 18.3% |
| 41 | OpenAI: o4 Mini | openai | 17.5% |
| 42 | OpenAI: GPT-5.1-Codex-Mini | openai | 16.9% |
| 43 | Google: Gemini 3.1 Flash Lite | google | 16.2% |
| 44 | Kwaipilot: KAT-Coder-Pro V2 | kwaipilot | 16.0% |
| 45 | Z.ai: GLM 5V Turbo | z-ai | 15.8% |
| 46 | Inception: Mercury 2 | inception | 15.5% |
| 47 | Arcee AI: Trinity Large Thinking | arcee-ai | 14.7% |
| 48 | Google: Gemini 3 Flash Preview | google | 14.1% |
| 49 | Qwen: Qwen3.5-9B | qwen | 13.3% |
| 50 | Anthropic: Claude Sonnet 4.6 | anthropic | 13.2% |
| 51 | Mistral: Mistral Medium 3.5 | mistralai | 12.8% |
| 52 | OpenAI: o3 Mini High | openai | 12.3% |
| 53 | Prime Intellect: INTELLECT-3 | prime-intellect | 12.1% |
| 54 | Upstage: Solar Pro 3 | upstage | 10.1% |
| 55 | OpenAI: gpt-oss-20b | openai | 9.8% |
| 56 | Qwen: Qwen3 Coder Next | qwen | 9.3% |
| 57 | OpenAI: o3 Mini | openai | 8.7% |
| 58 | inclusionAI: Ling-2.6-1T | inclusionai | 8.2% |
| 59 | OpenAI: GPT-5 Nano | openai | 8.2% |
| 60 | Xiaomi: MiMo-V2-Flash | xiaomi | 8.0% |
| 61 | OpenAI: o1 | openai | 7.7% |
| 62 | inclusionAI: Ling-2.6-flash | inclusionai | 6.2% |
| 63 | Google: Gemma 3 4B | google | 5.2% |
| 64 | Google: Gemini 2.5 Flash | google | 5.1% |
| 65 | Reka Flash 3 | rekaai | 5.1% |
| 66 | Meta: Llama 4 Maverick | meta-llama | 4.8% |
| 67 | Google: Gemma 3 12B | google | 4.8% |
| 68 | Google: Gemma 3 27B | google | 4.7% |
| 69 | OpenAI: GPT-4.1 | openai | 4.6% |
| 70 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | google | 4.6% |
| 71 | Cohere: Command A | cohere | 4.6% |
| 72 | OpenAI: GPT-4.1 Mini | openai | 4.6% |
| 73 | Meta: Llama 4 Scout | meta-llama | 4.3% |
| 74 | Microsoft: Phi 4 Mini Instruct | microsoft | 4.2% |
| 75 | Microsoft: Phi 4 | microsoft | 4.1% |
| 76 | OpenAI: GPT-4o-mini | openai | 4.0% |
| 77 | OpenAI: GPT-4.1 Nano | openai | 3.9% |
| 78 | IBM: Granite 4.1 8B | ibm-granite | 3.8% |
| 79 | OpenAI: GPT-4o | openai | 3.3% |
| 80 | OpenAI: GPT-4 Turbo | openai | 3.3% |
| 81 | OpenAI: GPT-4o (2024-08-06) | openai | 2.9% |
| 82 | OpenAI: GPT-4o (2024-05-13) | openai | 2.8% |

Source: Artificial Analysis (artificialanalysis.ai). Redistribution requires an Artificial Analysis (AA) commercial license.
