τ²-bench — AI model leaderboard

AI models ranked by τ²-bench, a third-party benchmark aggregated from Artificial Analysis. Higher is better. Scores are cross-referenced against our first-party meo scores and Effective Value (𝕍).

Ranking 78 models across the full field · as of 2026-06-07.

| # | Model | Lab | τ²-bench |
|---|-------|-----|----------|
| 1 | Z.ai: GLM 5 Turbo | z-ai | 98.5% |
| 2 | Z.ai: GLM 5V Turbo | z-ai | 98.5% |
| 3 | StepFun: Step 3.7 Flash | stepfun | 98.5% |
| 4 | xAI: Grok 4.3 | x-ai | 97.7% |
| 5 | Z.ai: GLM 5.1 | z-ai | 97.7% |
| 6 | Qwen: Qwen3.6 Plus | qwen | 97.7% |
| 7 | DeepSeek: DeepSeek V4 Pro | deepseek | 96.2% |
| 8 | Google: Gemini 3.1 Pro Preview | google | 95.6% |
| 9 | Qwen: Qwen3.5 397B A17B | qwen | 95.6% |
| 10 | Google: Gemini 3.5 Flash | google | 95.3% |
| 11 | Qwen: Qwen3.6 35B A3B | qwen | 95.3% |
| 12 | DeepSeek: DeepSeek V4 Flash | deepseek | 95.0% |
| 13 | Qwen: Qwen3.7 Max | qwen | 94.7% |
| 14 | Anthropic: Claude Opus 4.8 | anthropic | 94.4% |
| 15 | Xiaomi: MiMo-V2.5-Pro | xiaomi | 94.2% |
| 16 | Qwen: Qwen3.6 27B | qwen | 94.2% |
| 17 | Mistral: Mistral Medium 3.5 | mistralai | 94.2% |
| 18 | OpenAI: GPT-5.5 | openai | 93.9% |
| 19 | Qwen: Qwen3.5-122B-A10B | qwen | 93.6% |
| 20 | Qwen: Qwen3.7 Plus | qwen | 93.0% |
| 21 | Tencent: Hy3 preview | tencent | 92.7% |
| 22 | inclusionAI: Ring-2.6-1T | inclusionai | 92.4% |
| 23 | OpenAI: GPT-5.2-Codex | openai | 92.1% |
| 24 | Arcee AI: Trinity Large Thinking | arcee-ai | 90.1% |
| 25 | inclusionAI: Ling-2.6-1T | inclusionai | 89.8% |
| 26 | Kwaipilot: KAT-Coder-Pro V2 | kwaipilot | 89.5% |
| 27 | MiniMax: MiniMax M3 | minimax | 88.9% |
| 28 | Anthropic: Claude Opus 4.7 | anthropic | 88.6% |
| 29 | StepFun: Step 3.5 Flash | stepfun | 87.4% |
| 30 | OpenAI: GPT-5.4 | openai | 87.1% |
| 31 | OpenAI: GPT-5 Codex | openai | 86.8% |
| 32 | Qwen: Qwen3.5-9B | qwen | 86.8% |
| 33 | Upstage: Solar Pro 3 | upstage | 86.3% |
| 34 | OpenAI: GPT-5.3-Codex | openai | 86.0% |
| 35 | inclusionAI: Ling-2.6-flash | inclusionai | 86.0% |
| 36 | OpenAI: GPT-5.2 Chat | openai | 84.8% |
| 37 | MiniMax: MiniMax M2.7 | minimax | 84.8% |
| 38 | OpenAI: GPT-5 | openai | 84.8% |
| 39 | Xiaomi: MiMo-V2-Flash | xiaomi | 83.9% |
| 40 | OpenAI: GPT-5.4 Mini | openai | 83.3% |
| 41 | OpenAI: GPT-5.1-Codex | openai | 83.0% |
| 42 | OpenAI: GPT-5.1 | openai | 81.9% |
| 43 | OpenAI: o3 | openai | 80.7% |
| 44 | Anthropic: Claude Sonnet 4.6 | anthropic | 79.5% |
| 45 | Qwen: Qwen3 Coder Next | qwen | 79.5% |
| 46 | OpenAI: GPT-5.4 Nano | openai | 76.0% |
| 47 | Inception: Mercury 2 | inception | 70.8% |
| 48 | OpenAI: GPT-5 Mini | openai | 68.4% |
| 49 | OpenAI: gpt-oss-120b | openai | 65.8% |
| 50 | OpenAI: GPT-5.1-Codex-Mini | openai | 62.9% |
| 51 | OpenAI: o1 | openai | 62.6% |
| 52 | OpenAI: gpt-oss-20b | openai | 60.2% |
| 53 | Google: Gemma 4 31B | google | 59.9% |
| 54 | OpenAI: o4 Mini | openai | 55.6% |
| 55 | Google: Gemini 2.5 Pro | google | 54.1% |
| 56 | OpenAI: GPT-4.1 Mini | openai | 52.9% |
| 57 | OpenAI: GPT-4.1 | openai | 47.1% |
| 58 | Google: Gemma 4 26B A4B (free) | google | 43.6% |
| 59 | Google: Gemini 3 Flash Preview | google | 43.3% |
| 60 | OpenAI: GPT-5 Nano | openai | 36.5% |
| 61 | Google: Gemini 3.1 Flash Lite | google | 31.3% |
| 62 | OpenAI: o3 Mini High | openai | 31.3% |
| 63 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | google | 30.4% |
| 64 | OpenAI: GPT-4o (2024-08-06) | openai | 28.9% |
| 65 | OpenAI: o3 Mini | openai | 28.7% |
| 66 | IBM: Granite 4.1 8B | ibm-granite | 27.8% |
| 67 | Prime Intellect: INTELLECT-3 | prime-intellect | 26.6% |
| 68 | OpenAI: GPT-4o | openai | 25.1% |
| 69 | Meta: Llama 4 Maverick | meta-llama | 17.8% |
| 70 | OpenAI: GPT-4.1 Nano | openai | 17.3% |
| 71 | Meta: Llama 4 Scout | meta-llama | 15.5% |
| 72 | Cohere: Command A | cohere | 15.2% |
| 73 | Google: Gemini 2.5 Flash | google | 14.9% |
| 74 | Google: Gemma 3 12B | google | 10.8% |
| 75 | Google: Gemma 3 27B | google | 10.5% |
| 76 | Microsoft: Phi 4 Mini Instruct | microsoft | 8.2% |
| 77 | Microsoft: Phi 4 | microsoft | 0.0% |
| 78 | Reka Flash 3 | rekaai | 0.0% |

Source: Artificial Analysis (artificialanalysis.ai). Redistribution requires an AA commercial license.
