Browserbase Products
All Browserbase Release Notes
- Dec 22, 2025
- Parsed from source: Dec 22, 2025
- Detected by Releasebot: Dec 22, 2025
Browserbase enables agentic payments with Coinbase and x402
Browserbase partners with Coinbase to enable frictionless agent-to-browser automation with USDC payments via x402. Agents can spin up real browsers to navigate live sites and perform autonomous tasks without credentials or human approval.
Announcement
We’re excited to announce that Browserbase, in partnership with Coinbase, now enables frictionless agent-to-browser automation, with no API keys, no billing setup, and instant USDC settlement.
As LLMs become the primary way users and developers discover new services, platforms must be legible and usable by agents from the first interaction. Any LLM can now discover Browserbase through the x402 Bazaar, understand our pricing, and pay per browser session, all programmatically.
This allows agents to spin up a real browser, navigate live websites, interact with UI elements, submit forms, and extract data as part of an autonomous workflow, without human approval or credential management.
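For a sense of what that looks like in practice, here is a minimal sketch of the x402 request/pay/retry loop. The endpoint path, response shape, payment header, and wallet helper below are assumptions for illustration, not the documented API; refer to the x402 Endpoint Documentation for the real interface.

```typescript
// Minimal sketch of an x402 purchase flow against a pay-per-session endpoint.
const endpoint = "https://api.browserbase.com/v1/sessions"; // hypothetical x402-enabled endpoint

// Stand-in for a wallet helper that pays the quoted USDC amount and returns an
// encoded payment payload (in practice, use Coinbase's x402 client libraries).
async function settleWithUsdc(requirements: unknown): Promise<string> {
  return "base64-encoded-payment-payload";
}

async function createPaidSession() {
  // 1. Call the endpoint with no API key or credentials.
  const challenge = await fetch(endpoint, { method: "POST" });

  if (challenge.status === 402) {
    // 2. The 402 response describes what to pay: amount, asset (USDC), and pay-to address.
    const requirements = await challenge.json();

    // 3. Settle the quoted amount in USDC and retry with proof of payment attached.
    const payment = await settleWithUsdc(requirements);
    const session = await fetch(endpoint, {
      method: "POST",
      headers: { "X-PAYMENT": payment }, // payment header name assumed from the x402 spec
    });
    return session.json();
  }

  return challenge.json();
}
```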
Leveraging Coinbase Business: accepting USDC via x402 with accounting, tracking, and yield optimization
To receive payment for each session, Browserbase uses Coinbase Business as its settlement layer. This means that every x402 call instantly deposits USDC into our Coinbase account, providing:
- automatic yield on idle balances
- institutional-grade custody
- real-time visibility into usage and revenue
- no reconciliation work, no fraud risk, and no credit cards to manage
This setup allows Browserbase to focus on delivering great browser automation rather than managing a billing system.
Try it out
Developers and agent builders can start experimenting today with the x402 Endpoint Documentation.
Read the full announcement on Coinbase's Developer Platform.
With Browserbase and x402, agents can now use the web the way it actually exists, by navigating real pages, interacting with real interfaces, and paying for compute autonomously.
If you’re building agents, automation systems, or AI-native products that require real browser interaction, this unlocks a new class of workflows that were previously impossible without human setup.
- December 2025
- No date parsed from source.
- Detected by Releasebot: Dec 25, 2025
1.13.1
Patch Changes
- #509 a7d345e Thanks @miguelg719 ! - Bun runs will now throw a more informative error
- December 2025
- No date parsed from source.
- Detected by Releasebot: Dec 25, 2025
- Modified by Releasebot: Dec 30, 2025
1.13.0
Unreleased at time of capture, these notes bundle feature and patch updates focused on observability, browser configuration, and reliability. Highlights include offloading Stagehand method calls to the Stagehand API, LocalBrowserLaunchOptions, iframe inclusion in observe results, a captcha-solving toggle, payload fixes, and accessibility enhancements.
Minor Changes
- #486 33f2b3f Thanks @sameelarif ! - [Unreleased] Parameterized offloading of Stagehand method calls to the Stagehand API. In the future, this will allow for a better observability and debugging experience.
- #494 9ba4b0b Thanks @pkiv ! - Added LocalBrowserLaunchOptions to provide comprehensive configuration options for local browser instances. Deprecated the top-level headless option in favor of localBrowserLaunchOptions.headless (see the sketch after this list)
- #500 a683fab Thanks @miguelg719 ! - Including Iframes in ObserveResults. This appends any iframe(s) found in the page to the end of observe results on any observe call.
- #504 577662e Thanks @sameelarif ! - Enabled support for Browserbase captcha solving after page navigations. This can be enabled with the new constructor parameter: waitForCaptchaSolves.
- #496 28ca9fb Thanks @sameelarif ! - Fixed browserbaseSessionCreateParams not being passed in to the API initialization payload.
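A minimal sketch of the two new constructor options above, assuming the Stagehand 1.x constructor; the values shown are illustrative, not defaults:

```typescript
import { Stagehand } from "@browserbasehq/stagehand";

// Local run: browser flags now live under localBrowserLaunchOptions
// (the top-level `headless` option is deprecated in favor of this).
const local = new Stagehand({
  env: "LOCAL",
  localBrowserLaunchOptions: { headless: true },
});

// Browserbase run: opt in to waiting for Browserbase captcha solves after page navigations.
// Assumes Browserbase credentials are provided via environment variables.
const remote = new Stagehand({
  env: "BROWSERBASE",
  waitForCaptchaSolves: true,
});

await local.init();
await remote.init();
```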
Patch Changes
- #459 62a29ee Thanks @seanmcguire12 ! - create a11y + dom hybrid input for observe
- #463 e40bf6f Thanks @seanmcguire12 ! - include 'Scrollable' annotations in a11y-dom hybrid
- #480 4c07c44 Thanks @miguelg719 ! - Adding a fallback try on actFromObserveResult to use the description from observe and call regular act.
- #487 2c855cf Thanks @seanmcguire12 ! - update refine extraction prompt to ensure correct schema is used
- #497 945ed04 Thanks @kamath ! - add gpt 4o november snapshot
- December 2025
- No date parsed from source.
- Detected by Releasebot: Dec 28, 2025
1.12.0
Observe now returns a suggested Playwright method with the necessary arguments, and a11y tree processing is faster. Minor changes add o3-mini to the available models; patches improve the accessibility tree, element handling, and deprecation guidance.
Minor Changes
- #426 bbbcee7 Thanks @miguelg719 ! - Observe got a major upgrade. Now it will return a suggested Playwright method with any necessary arguments for the generated candidate elements. It also includes a major speedup when using a11y tree processing for context (see the sketch after this list).
- #452 16837ec Thanks @kamath ! - add o3-mini to availablemodel
- #441 1032d7d Thanks @seanmcguire12 ! - allow act to accept observe output
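A minimal sketch of how the upgraded observe output can feed straight into act, assuming the 1.x instance-level API; the instruction string is illustrative:

```typescript
import { Stagehand } from "@browserbasehq/stagehand";

const stagehand = new Stagehand({ env: "BROWSERBASE" });
await stagehand.init();
await stagehand.page.goto("https://example.com");

// observe() now returns candidate elements along with a suggested Playwright
// method and the arguments needed to perform the action.
const [suggestion] = await stagehand.observe({
  instruction: "find the link to the documentation",
});
console.log(suggestion);

// act() can take the observe output directly, avoiding a second round of inference.
await stagehand.act(suggestion);
```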
Patch Changes
- #458 da2e5d1 Thanks @miguelg719 ! - Updated getAccessibilityTree() to make sure it doesn't skip useful nodes. Improved getXPathByResolvedObjectId() to account for text nodes and not skip generation
- #448 b216072 Thanks @seanmcguire12 ! - improve handling of radio button clicks
- #445 5bc514f Thanks @miguelg719 ! - Adding back useAccessibilityTree param to observe with a deprecation warning/error indicating to use onlyVisible instead
- December 2025
- No date parsed from source.
- Detected by Releasebot: Dec 28, 2025
1.11.0
Minor Changes
- #428 5efeb5a Thanks @seanmcguire12 ! - temporarily remove vision
- December 2025
- No date parsed from source.
- Detected by Releasebot: Dec 28, 2025
1.10.1
Patch Changes
- #422 a2878d0 Thanks @miguelg719 ! - Fixing a build type error for async functions being called inside evaluate for observeHandler.
- Nov 24, 2025
- Parsed from source: Nov 24, 2025
- Detected by Releasebot: Nov 25, 2025
Evaluating computer use models with Microsoft
Microsoft unveils Fara-7B, a tiny, fast web-capable VLM that delivers standout WebVoyager results and debuts via Browserbase with open weights on HuggingFace and Azure Foundry. Available today through Stagehand and Azure Foundry for researchers and developers.
The variety of models capable of interacting with the internet to automate real tasks continues to expand. Today, Microsoft is announcing Fara-7B, a small computer-use vision language model (VLM) that achieves new state-of-the-art results for its size.
Training and evaluating browser-based agents at scale requires reliable access to real websites, high concurrency, and consistent execution environments. Meeting these requirements directly impacts both reinforcement learning workflows and the accuracy of model evaluation.
That is why Microsoft has exclusively partnered with Browserbase to train, evaluate, and deploy their next generation of open-source computer-use models. As Corby Rosset, Principal AI Researcher at Microsoft, put it:
“Without Browserbase, it would not have been possible to seamlessly train our model with reliable access to real-world websites.”
Building upon our existing work with DeepMind, we were able to integrate, host, and independently evaluate Fara-7B on Browserbase within a week to provide fair and reliable benchmarking.
The results were surprising… a good surprise.
Introducing Fara-7B - a small but mighty model
Fara-7B delivers unmatched speed and cost efficiency for its size, with meaningful real-world capability. When benchmarked on the publicly available 595-task WebVoyager dataset updated in collaboration with Microsoft and using human-verified scoring, Fara-7B significantly outperformed similar open-source models.
Here are the results (pass@1 with up to 5 retries):
Small computer-use models like Fara-7B meaningfully change the unit economics of browser agents. Sub-second inference and dramatically lower compute costs enable running multiple agents in parallel, unlocking workflows that simply are not feasible with large proprietary models.
Browserbase evaluation criteria
To ensure fairness, we evaluated 7B-size models only, using Microsoft’s updated version of WebVoyager.
We used pass@1 with up to 5 retries, reflecting real-world deployment patterns for small self-hosted models. All evaluations ran on Browserbase’s deterministic browser infrastructure, ensuring consistent environments without variability from rate limits, random states, or anti-bot checks. Every task was human-verified, with evaluators reviewing screenshots, full web trajectories, and the final task state to confirm completion. We measured accuracy, cost, and speed across all 595 tasks.
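For concreteness, here is a small sketch of how that retry-based, human-verified metric could be tallied. The task runner and field names are illustrative stand-ins for the actual evaluation harness:

```typescript
// Illustrative scoring for pass@1 with up to 5 retries over the 595 WebVoyager tasks.
// `runAttempt` stands in for executing one model trajectory on Browserbase and having
// a human evaluator confirm the final task state from screenshots and logs.
type TaskRunner = () => Promise<{ humanVerifiedSuccess: boolean }>;

async function taskPasses(runAttempt: TaskRunner, maxAttempts = 5): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await runAttempt();
    if (result.humanVerifiedSuccess) return true; // stop at the first verified success
  }
  return false;
}

async function accuracy(tasks: TaskRunner[]): Promise<number> {
  let passed = 0;
  for (const task of tasks) {
    if (await taskPasses(task)) passed++;
  }
  return passed / tasks.length; // fraction of tasks solved within the retry budget
}
```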
We will continue expanding this evaluation approach, including re-running Gemini, OpenAI CUA, and other frontier models using the same retry-friendly methodology to make comparisons fully consistent.
Why independent evaluation matters
Evaluation of computer-use models across the industry is fundamentally inconsistent, which is why UI-TARS 1.5-7B performed far worse in our evaluation than its published numbers suggest.
Here is what we consistently see:
- LLM judging is unreliable. Different evaluators use different judge models, thresholds, prompts, or exclude failures.
- Websites change. Benchmarks built on live websites break as sites get redesigned, rate-limit requests, or show entirely different flows.
- Strictness varies widely. One judge model passes an action, another flags it as incorrect.
When we evaluated UI-TARS, we found that many tasks reported as “successful” were incomplete or incorrectly executed, including cases where an LLM judge passed trajectories that a human evaluator would not consider correct.
This is exactly the motivation behind the Stagehand Evals CLI and our broader benchmarking work, which allows you to run real models on real websites using reliable browser infrastructure that eliminates site-level noise, provides fully observable sessions with screenshots and logs, and incorporates human verification to catch what automated judges miss.
Standards produce reliable computer-use model evaluation, and reliability builds trust. That trust is why leading research teams like Microsoft and DeepMind partner with us to benchmark their models and make results reproducible for the community.
Open source and available today
Fara-7B is available today, with full open weights on HuggingFace and Azure AI Foundry, and Stagehand has built-in support for it out of the box. You can deploy it via:
- Azure Foundry (hosted directly by Microsoft)
- Fireworks, which made running both models effortless and supported fair inference comparisons
You can try it out on Stagehand by running:
# TypeScript
npx create-browser-app --template microsoft-cua
Huge thanks to the Fireworks team for fast hosting and model support.
Calling all researchers
We are excited about the next generation of small, fast, open models, and equally committed to building the infrastructure required to evaluate them correctly.
This collaboration highlights a broader shift in how computer-use models are being built. The most successful approaches tend to share several traits: training on the real web instead of synthetic environments, grounding models through interaction and feedback, optimizing for practical deployment rather than benchmark-only performance, and maintaining transparency through shared trajectories and verification.
If you are training or fine-tuning a computer-use model, reach out.
We would love to help evaluate your model, validate it with human verification, and share feedback that helps move the ecosystem forward.
Footnotes
- Fara-7B - WebVoyager
- UI-TARS 1.5-7B - WebVoyager
- November 2025
- No date parsed from source.
- Detected by Releasebot: Nov 1, 2025
3.0.0
A 20-40% speed boost across act, extract, and observe, with automatic action caching and multi-browser support. Built-in primitives such as page, locator, frameLocator, and deepLocator; Bun compatibility; and streamlined extract schemas. Targeted extract works across iframes and shadow roots; consult the migration guide.
Major Changes
- Removes internal Playwright dependency
- A generous 20-40% speed increase across act, extract, & observe calls
- Compatibility with Playwright, Puppeteer, and Patchright
- Automatic action caching (agent, stagehand.act). Go from CUA → deterministic scripts w/o inference
- A suite of non-AI primitives (see the sketch after this list):
- page
- locator (built in closed mode shadow root traversal, with xpaths & css selectors)
- frameLocator
- deepLocator (crosses iframes & shadow roots)
- bun compatibility
- Simplified extract schemas
- CSS selector support (id-based support coming soon)
- Targeted extract and observe across iframes & shadow roots
- More intuitive type names (observeResult is now action, act accepts an instruction string instead of an action string, solidified ModelConfiguration)
Check the migration guide for more information
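A rough sketch of what the v3 surface can look like in practice. The selector strings and chained calls below are illustrative assumptions; the migration guide is the authoritative reference:

```typescript
import { Stagehand } from "@browserbasehq/stagehand";

const stagehand = new Stagehand({ env: "BROWSERBASE" });
await stagehand.init();
const page = stagehand.page;

await page.goto("https://example.com");

// AI-driven step: act takes an instruction string directly, and repeated runs
// can be served from the automatic action cache without new inference.
await stagehand.act("open the pricing page");

// Non-AI primitives: locator traverses closed shadow roots with XPath/CSS selectors,
// and deepLocator crosses iframe and shadow-root boundaries.
await page.locator("button.submit").click();
await page.deepLocator("//iframe//button[@id='pay']").click();
```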
- November 2025
- No date parsed from source.
- Detected by Releasebot: Nov 1, 2025
2.5.0
This release adds stagehand.agent support for MCP servers and custom tools, plus patches for WebVoyager evals, local MCP server connections, configurable base URLs for the OpenAI provider and CUA, and GPT-5 support in the operator agent.
Minor Changes
- #981 8244ab2 Thanks @sameelarif ! - Added support for stagehand.agent to interact with MCP servers, as well as custom tools to be passed in. For more information, reference the MCP integrations documentation (a sketch follows below)
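A minimal sketch of the new agent surface under stated assumptions: the option names used here for MCP servers and custom tools (integrations, tools) and the tool shape are guesses from the changelog wording, so defer to the MCP integrations documentation for the actual API.

```typescript
import { Stagehand } from "@browserbasehq/stagehand";

// Sketch only: `integrations` and `tools` option names, and the tool shape, are
// assumptions; check the MCP integrations documentation for the real interface.
const stagehand = new Stagehand({ env: "BROWSERBASE" });
await stagehand.init();

const agent = stagehand.agent({
  // An MCP server the agent can call alongside its browser actions.
  integrations: ["https://mcp.example.com/sse"],
  // A custom tool supplied by the application.
  tools: {
    lookupOrder: async (orderId: string) => ({ orderId, status: "shipped" }),
  },
});

await agent.execute("find the tracking page for order 1234 and summarize its status");
```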
Patch Changes
- #959 09b5e1e Thanks @filip-michalsky ! - add webvoyager evals
- #1049 e3734b9 Thanks @miguelg719 ! - Support local MCP server connections
- #1025 be85b19 Thanks @tkattkat ! - add support for custom baseUrl within openai provider
- #1040 88d1565 Thanks @miguelg719 ! - Allow OpenAI CUA to take in an optional baseURL
- #1046 ab5d6ed Thanks @tkattkat ! - Add support for gpt-5 in operator agent
- November 2025
- No date parsed from source.
- Detected by Releasebot: Nov 1, 2025
2.4.4
Patch Changes
- #1012 9e8c173 Thanks @miguelg719 ! - Fix disabling API validation whenever a custom LLM client is provided