steel-dev/leaderboard

Browser Agent Leaderboard

This repository presents the current standings of various web agents evaluated on the WebVoyager benchmark (paper). The WebVoyager benchmark comprises 643 tasks across 15 popular websites, assessing agents' abilities to perform diverse web navigation and interaction tasks.


Steel (Steel.dev) is an open-source browser API purpose-built for AI agents and apps.

Leaderboard

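| Rank | Agent | Organization | WebVoyager Score | Source | Open Source | New | SOTA |
|------|-------|--------------|------------------|--------|-------------|-----|------|
| 1 | Magnitude | Magnitude | 93.9% | Source | Yes | Yes | Yes |
| 2 | Browser Use | Browser Use | 89.1% | Source | Yes | Yes | |
| 3 | Operator | OpenAI | 87% | Source | No | Yes | |
| 4 | Kura | Kura | 87% | Source | No | Yes | |
| 5 | Skyvern 2.0 | Skyvern | 85.85% | Source | Yes | Yes | |
| 6 | Project Mariner | Google | 83.5% | Source | No | | |
| 7 | Proxy | Convergence AI | 82% | Source | No | | |
| 8 | Agent-E | Emergence AI | 73.1% | Source | No | | |
| 9 | Runner H 0.1 | H Company | 67% | Source | No | | |
| 10 | WILBUR | Academic Research | 60.6% | Source | No | | |
| 11 | WebVoyager | Academic Research | 59.1% | Source | Yes | | |
| 12 | Computer Use | Anthropic | 52% | Source | No | | |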

Notes:

  • Open Source: Indicates whether the agent's source code is publicly available.
  • New: Denotes recently introduced agents.
  • SOTA: Signifies agents that have achieved state-of-the-art performance.
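For readers who want to work with the standings programmatically, the sketch below models a few leaderboard rows as records and ranks them by score. It is only an illustration: the `Entry` class, the `ranked` and `open_source_only` helpers, and the field names are hypothetical and not part of this repository. The sample values are copied from the leaderboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical structure, not defined by this repo)."""
    agent: str
    organization: str
    score: float       # WebVoyager score in percent
    open_source: bool


# A few rows copied from the leaderboard, deliberately out of order.
ENTRIES = [
    Entry("Browser Use", "Browser Use", 89.1, True),
    Entry("Magnitude", "Magnitude", 93.9, True),
    Entry("Operator", "OpenAI", 87.0, False),
    Entry("Kura", "Kura", 87.0, False),
    Entry("Skyvern 2.0", "Skyvern", 85.85, True),
]


def ranked(entries):
    """Sort by score, highest first. Python's sort is stable, so entries
    with equal scores (Operator and Kura) keep their input order."""
    return sorted(entries, key=lambda e: e.score, reverse=True)


def open_source_only(entries):
    """Keep only the agents whose source code is publicly available."""
    return [e for e in entries if e.open_source]


if __name__ == "__main__":
    for rank, entry in enumerate(ranked(ENTRIES), start=1):
        print(rank, entry.agent, f"{entry.score}%")
```

Because the sort is stable, tied entries are ordered by their position in the input rather than by any tie-breaking rule; a real tool would need to decide how ties are handled.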

Contributing

We encourage contributions to keep this leaderboard up-to-date. If you have information about new agents or updated scores, please submit a pull request or open an issue.
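Before opening a pull request, it can help to check where a new score would land. The following is a minimal sketch, assuming a new entry is placed below any existing entry with the same score; the `insertion_rank` function and that tie rule are assumptions for illustration, not a documented policy of this repository.

```python
# Scores (in percent) copied from the leaderboard, highest first.
CURRENT_SCORES = [93.9, 89.1, 87, 87, 85.85, 83.5, 82, 73.1, 67, 60.6, 59.1, 52]


def insertion_rank(scores, new_score):
    """Return the 1-based rank a new score would take.

    Assumption: on a tie, the new entry goes below the existing ones.
    """
    return 1 + sum(1 for s in scores if s >= new_score)


if __name__ == "__main__":
    print(insertion_rank(CURRENT_SCORES, 90.0))  # between the top two entries
```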

License

This project is licensed under the MIT License.
