Rankings
Arena.ai leaderboards: Elo based on blind head-to-head votes. Choose a category.
PDF and document understanding.
May 28, 2026, 19:45
24 models
| # | Model | Elo |
|---|---|---|
| 1 | claude-opus-4-6-thinking Anthropic · Proprietary ±8 · 11.9k votes | 1522 |
| 2 | claude-opus-4-6 Anthropic · Proprietary ±7 · 20.4k votes | 1513 |
| 3 | claude-opus-4-7 Anthropic · Proprietary ±8 · 6.7k votes | 1510 |
| 4 | claude-opus-4-7-thinking Anthropic · Proprietary ±8 · 6.4k votes | 1509 |
| 5 | gpt-5.5-high OpenAI · Proprietary ±9 · 4.6k votes | 1496 |
| 6 | claude-sonnet-4-6 Anthropic · Proprietary ±6 · 31.9k votes | 1495 |
| 7 | gpt-5.5 OpenAI · Proprietary ±9 · 4.7k votes | 1492 |
| 8 | gpt-5.4 OpenAI · Proprietary ±7 · 14.4k votes | 1474 |
| 9 | claude-opus-4-5-20251101 Anthropic · Proprietary ±10 · 8.0k votes | 1466 |
| 10 | kimi-k2.6 Moonshot · Open weights ±10 · 3.8k votes | 1454 |
| 11 | muse-spark Meta · Proprietary ±19 · 868 votes | 1452 |
| 12 | claude-sonnet-4-5-20250929 Anthropic · Proprietary ±7 · 16.7k votes | 1450 |
| 13 | gemini-3.1-pro-preview Google · Proprietary ±6 · 24.9k votes | 1443 |
| 14 | gemini-3-pro Google · Proprietary ±9 · 10.8k votes | 1439 |
| 15 | kimi-k2.5-thinking Moonshot · Open weights ±8 · 10.5k votes | 1437 |
Elo score based on blind head-to-head votes. Higher is better. ± is the 95% confidence interval. Same format as Arena.ai.
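To make the Elo gaps and ± values in the table easier to read, here is a minimal sketch. It assumes the conventional Elo logistic curve (base 10, scale 400); the page does not state how Arena.ai fits its scores, so treat those parameters as an assumption. The function names `expected_win_probability` and `intervals_overlap` are hypothetical helpers, and the overlap check is only a rough heuristic, not a formal significance test.

```python
def expected_win_probability(rating_a: float, rating_b: float,
                             scale: float = 400.0) -> float:
    """Probability that A is preferred over B under the standard Elo curve.

    Assumption: base-10 logistic with scale 400 (conventional Elo),
    which may differ from the fit used by the leaderboard.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def intervals_overlap(rating_a: float, half_width_a: float,
                      rating_b: float, half_width_b: float) -> bool:
    """True if the two rating ± half-width intervals share any point."""
    return abs(rating_a - rating_b) <= half_width_a + half_width_b


# Values taken from the table: rank 1 (1522 ±8) vs rank 5 (1496 ±9).
p = expected_win_probability(1522, 1496)
print(f"Implied preference rate for rank 1 over rank 5: {p:.3f}")

# Rank 1 (1522 ±8) vs rank 2 (1513 ±7): the intervals overlap,
# so the ordering between them is not clearly separated.
print(intervals_overlap(1522, 8, 1513, 7))
print(intervals_overlap(1522, 8, 1496, 9))
```

Under these assumptions, a 26-point gap corresponds to a preference rate of only about 0.54, which is why small differences near the top of the table should be read together with the ± values.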
Source on Arena.ai