Rankings
Arena.ai leaderboards: Elo based on blind head-to-head votes. Choose a category.
Search, grounding & RAG.
28 May 2026, 19:45
29 models
| # | Model | Elo |
|---|---|---|
| 1 | claude-opus-4-6-search Anthropic · Proprietary ±6 · 48.7k votes | 1251 |
| 2 | gpt-5.5-search OpenAI · Proprietary ±8 · 9.5k votes | 1239 |
| 3 | claude-opus-4-7 Anthropic · Proprietary ±8 · 9.7k votes | 1237 |
| 4 | ernie-5.1 Baidu · Proprietary ±12 · 2.3k votes | 1226 |
| 5 | claude-sonnet-4-6-search Anthropic · Proprietary ±6 · 48.1k votes | 1219 |
| 6 | gemini-3.1-pro-grounding Google · Proprietary ±7 · 28.0k votes | 1216 |
| 7 | gpt-5.2-search OpenAI · Proprietary ±6 · 47.1k votes | 1210 |
| 8 | grok-4.20-multi-agent-beta-0309 xAI · Proprietary ±7 · 27.5k votes | 1209 |
| 9 | gemini-3-pro-grounding Google · Proprietary ±5 · 37.3k votes | 1208 |
| 10 | gemini-3-flash-grounding Google · Proprietary ±6 · 62.9k votes | 1206 |
| 11 | gpt-5.1-search OpenAI · Proprietary ±6 · 53.7k votes | 1199 |
| 12 | gpt-5.4-search OpenAI · Proprietary ±7 · 27.9k votes | 1199 |
| 13 | grok-4.20-beta1 xAI · Proprietary ±6 · 49.0k votes | 1193 |
| 14 | grok-4.3 xAI · Proprietary ±8 · 6.9k votes | 1189 |
| 15 | claude-opus-4-5-search Anthropic · Proprietary ±6 · 53.4k votes | 1182 |
Elo score based on blind head-to-head votes. Higher is better. ± is the 95% confidence interval. Same format as Arena.ai.
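
To make the score gaps easier to read, here is a minimal sketch of how an Elo difference maps to an expected head-to-head win rate. It assumes the classic Elo logistic model (base 10, scale 400), which this page does not confirm as Arena.ai's actual method. The function names `expected_win_probability` and `intervals_overlap` are hypothetical helpers, and the overlap check is only a rough heuristic, not a formal significance test.

```python
def expected_win_probability(elo_a: float, elo_b: float, scale: float = 400.0) -> float:
    """Chance that A is preferred over B under the classic Elo logistic model.

    Assumption: base-10 logistic curve with a 400-point scale; the
    leaderboard's real fitting procedure may differ.
    """
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / scale))


def intervals_overlap(elo_a: float, half_width_a: float,
                      elo_b: float, half_width_b: float) -> bool:
    """True if the two ± intervals share at least one point."""
    return abs(elo_a - elo_b) <= half_width_a + half_width_b


# Rank 1 (1251 ±6) versus rank 2 (1239 ±8), values taken from the table.
p_top_two = expected_win_probability(1251, 1239)
top_two_overlap = intervals_overlap(1251, 6, 1239, 8)

# Rank 1 (1251 ±6) versus rank 5 (1219 ±6).
first_vs_fifth_overlap = intervals_overlap(1251, 6, 1219, 6)

print(f"Expected win rate, rank 1 over rank 2: {p_top_two:.3f}")
print(f"Rank 1 and rank 2 intervals overlap: {top_two_overlap}")
print(f"Rank 1 and rank 5 intervals overlap: {first_vs_fifth_overlap}")
```

Under these assumptions a 12-point gap is only slightly better than a coin flip, and the top two intervals overlap, so small rank differences near the top should be read with caution.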
Source on Arena.ai