AssemblyAI Release Notes
Last updated: Apr 2, 2026
- Apr 2, 2026
Universal-2 Language Improvements: Hebrew & Swedish
AssemblyAI improves Universal-2 transcription accuracy for Hebrew and Swedish with automatic live updates for all users.
Universal-2 transcription accuracy has improved significantly for Hebrew and Swedish, with word error rates reduced by 37% and 47% respectively. No changes to your integration required — the improvements are live automatically for all users.
AssemblyAI's Universal speech model delivers industry-leading accuracy across dozens of languages, with continuous improvements rolling out automatically.
- Mar 25, 2026
LLM Gateway: Automatic Model Fallbacks
AssemblyAI adds automatic model fallbacks to LLM Gateway, bringing built-in resilience against model failures without changing integrations. The public beta lets users retry with fallback models or the same model after 500ms, with configurable fallback depth and retry behavior.
LLM Gateway now supports automatic model fallbacks, giving your application resilience against model failures without changing your integration. If a model returns a server error, the Gateway will automatically retry with a fallback — or retry the same model after 500ms by default.
This is available now in Public Beta for all LLM Gateway users.
How to use it
Add a "fallbacks" array and an optional "fallback_config" to your request. All fields from the original request are copied over to the fallback automatically; you only need to specify what you want to override.

Simple fallback: fall back to a different model, inheriting all original parameters:

{
  "model": "kimi-k2.5",
  "messages": [
    { "role": "user", "content": "Summarize this meeting: ..." }
  ],
  "temperature": 0.2,
  "fallbacks": [
    { "model": "claude-sonnet-4-6" }
  ]
}

Advanced fallback: override specific parameters when falling back (e.g., a different prompt or temperature for a different model's behavior):

{
  "model": "kimi-k2.5",
  "messages": [
    { "role": "user", "content": "Summarize this meeting: ..." }
  ],
  "temperature": 0.2,
  "fallbacks": [
    {
      "model": "claude-sonnet-4-6",
      "messages": [
        { "role": "user", "content": "Summarize this meeting concisely, key info only: ..." }
      ],
      "temperature": 0.3
    }
  ]
}

Fallback config options:

"fallback_config": { "depth": 1, "retry": true }

By default, if no fallbacks are set, the API will automatically retry a failed request after 500ms. For more control, set fallback_config.retry to false and implement your own exponential backoff.

AssemblyAI's LLM Gateway gives you a single API to access leading models from every major provider, with built-in resilience, load balancing, and cost tracking.
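As a minimal sketch, the request body above can be assembled programmatically. The field names ("fallbacks", "fallback_config") come from this release note; the helper function itself is illustrative and not part of any SDK.

```python
# Sketch: attach automatic fallbacks to an LLM Gateway chat-completions body.
# Field names are from this release note; the helper is our own convenience.

def with_fallbacks(body, fallbacks, depth=1, retry=True):
    """Return a copy of a chat-completions body with fallback models attached."""
    return {
        **body,
        # Each fallback inherits all original fields; list only the overrides.
        "fallbacks": fallbacks,
        "fallback_config": {"depth": depth, "retry": retry},
    }

request = with_fallbacks(
    {
        "model": "kimi-k2.5",
        "messages": [{"role": "user", "content": "Summarize this meeting: ..."}],
        "temperature": 0.2,
    },
    # Override only the model; messages and temperature are inherited.
    fallbacks=[{"model": "claude-sonnet-4-6"}],
)
```

Because untouched fields are inherited, keeping the original body and the fallback overrides separate like this makes it easy to audit exactly what changes on retry.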
- Mar 25, 2026
Introducing Medical Mode: Purpose-built accuracy for medical terminology
AssemblyAI adds Medical Mode for Streaming Speech-to-Text, boosting accuracy for medical terminology like medications, procedures, conditions, and dosages. The new add-on is available now across Universal Streaming models and works with no pipeline changes.
Medical Mode is a new add-on for AssemblyAI's Streaming Speech-to-Text that improves transcription accuracy for medical terminology — including medication names, procedures, conditions, and dosages. Available now on Universal-3 RT Pro, Universal Streaming English, and Universal Streaming Multilingual.
What it does
Medical Mode applies a correction pass optimized for medical entity recognition, targeting terms that general-purpose ASR frequently gets wrong. It works alongside the base model's noise handling, accent robustness, and latency characteristics — no tradeoffs.
Why it exists
General-purpose ASR can achieve strong overall accuracy on clinical audio while still consistently misrecognizing medical terminology. Because most healthcare AI pipelines feed transcripts directly into LLMs for structured output generation — SOAP notes, discharge summaries, referral letters — transcription errors on medical entities propagate rather than attenuate. Medical Mode intercepts those errors before they enter the pipeline.
How to enable it
Set the domain connection parameter to "medical-v1". No other changes to your existing pipeline are required.
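A sketch of what enabling this might look like, assuming the connection parameter is passed as a WebSocket query parameter; the endpoint URL shown here is an assumption, so check the streaming docs for the exact connection details.

```python
# Sketch: enable Medical Mode by setting the "domain" connection parameter.
# The parameter name/value come from this release note; the endpoint URL and
# query-string transport are assumptions.
from urllib.parse import urlencode

params = {
    "sample_rate": 16000,    # audio sample rate in Hz
    "domain": "medical-v1",  # the Medical Mode switch
}
ws_url = "wss://streaming.assemblyai.com/v3/ws?" + urlencode(params)
# Open ws_url with your WebSocket client and stream audio as before;
# the rest of the pipeline is unchanged.
```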
Availability & pricing
- Available now on Universal-3, Universal-3 Pro Streaming, Universal Streaming English, and Universal Streaming Multilingual
- Supports English, Spanish, German, and French
- Billed as a separate add-on; see the pricing page for details
- HIPAA BAA, SOC 2 Type 2, ISO 27001:2022, PCI DSS v4.0 included
Resources
- Medical Mode docs
- Playground — test with your own audio
- Mar 17, 2026
AssemblyAI Skill for AI Coding Agents
AssemblyAI adds a new Skill for AI coding agents, giving Claude Code, Cursor, Codex, and other tools accurate, up-to-date guidance on its APIs, SDKs, and integrations. It helps agents build with current transcription, streaming, LLM Gateway, and voice agent workflows correctly.
The AssemblyAI Skill is now available for AI coding agents — giving Claude Code, Cursor, Codex, and other vibe-coding tools accurate, up-to-date knowledge of AssemblyAI's APIs, SDKs, and integrations out of the box.
LLM training data goes stale fast. Without the skill, coding agents default to deprecated AssemblyAI patterns: the old LeMUR API instead of the LLM Gateway, wrong auth headers, discontinued SDK usage, and no awareness of newer features like Universal-3 Pro Streaming or the voice agent framework integrations. The AssemblyAI Skill corrects all of that — and covers the full current API surface, from pre-recorded transcription to real-time streaming to LLM Gateway workflows.
In evals, agents using the skill scored 17/17 on correctness across transcription, voice agent, and LLM Gateway scenarios. Without it: 7/17. The biggest gains are in voice agent integrations and LLM Gateway usage, where agents otherwise have no training data for framework-specific patterns.
How to use it
- Install via Claude Code: cp -r assemblyai ~/.claude/skills/ for personal use, or cp -r assemblyai .claude/skills/ at the project level
- For Codex, copy the folder and reference assemblyai/SKILL.md in your AGENTS.md
- Cursor and Windsurf: add the assemblyai/ directory as project-level documentation
- Available now — free, open source, no API key required
AssemblyAI is the leading speech AI platform for developers — built for production with best-in-class accuracy, real-time streaming, and a full suite of audio intelligence features. The AssemblyAI Skill makes sure your coding agent builds with all of it correctly, every time.
- Mar 11, 2026
LLM Gateway Now Available in the EU
AssemblyAI announces EU availability of LLM Gateway and Speech Understanding with full data residency. EU customers can run LLM inference while prompts and responses stay in the region, with Claude and Gemini models supported. The unified API for LLM and audio brings enterprise reliability and transparent pricing, and the EU endpoint is available now.
LLM Gateway and Speech Understanding in the EU
LLM Gateway and Speech Understanding are now available in the EU. Customers can now run LLM inference with full EU data residency, opening the door for teams with strict data governance requirements—including those migrating from LeMUR.
EU regional availability means your prompts and responses never leave the European Union. This is especially valuable for healthcare, finance, and enterprise customers with compliance requirements. Currently, Claude and Gemini regional models are supported in the EU.
How to use it:
- Update your request URL to the EU endpoint: https://llm-gateway.eu.assemblyai.com/v1/chat/completions
- Available now for all customers, no beta access required
- See the Cloud Endpoints & Data Residency docs for full details
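Since only the base URL changes, migrating an existing integration can be as small as swapping the endpoint constant. A sketch using only the standard library; the "authorization" header name mirrors AssemblyAI's other APIs and is an assumption here.

```python
# Sketch: point an existing LLM Gateway integration at the EU endpoint.
# The endpoint URL is from this release note; the auth header name is an
# assumption, so verify it against the docs.
import json
import urllib.request

EU_ENDPOINT = "https://llm-gateway.eu.assemblyai.com/v1/chat/completions"

def eu_request(api_key, body):
    """Build (but do not send) a POST request against the EU endpoint."""
    return urllib.request.Request(
        EU_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )

req = eu_request("<YOUR_API_KEY>", {
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello from the EU"}],
})
# urllib.request.urlopen(req) would send it; prompts and responses stay in the EU.
```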
LLM Gateway gives you a single, unified API to run LLM inference and audio intelligence together — with enterprise-grade reliability, transparent pricing, and now the data residency controls your team requires.
Try LLM Gateway in the EU →
- Mar 3, 2026
Universal-3-Pro Now Available for Streaming
Universal-3-Pro is now available for real-time streaming, delivering best-in-class accuracy with real-time speaker labeling and native code switching for multilingual transcription. Build real-time voice apps with LLM-style accuracy and low-latency streaming via AssemblyAI.
Universal-3-Pro streaming
Universal-3-Pro is now available for real-time streaming — bringing our most accurate speech model to live transcription for the first time. Developers building voice agents, live captioning tools, and real-time analytics pipelines can now combine Universal-3-Pro's state-of-the-art accuracy with the low latency of AssemblyAI's streaming API.
Universal-3-Pro streaming delivers three key capabilities that set it apart: best-in-class word error rates across streaming ASR benchmarks, real-time speaker labels to identify who is speaking at each turn, and superior entity detection for names, places, organizations, and specialized terminology — all in real time, not just in batch. And with built-in code switching, Universal-3-Pro handles multilingual audio natively, accurately transcribing speakers who move between languages mid-conversation.
Whether you're building voice agents that need to route conversations by speaker, transcription tools that must catch rare entities accurately, or global applications serving multilingual users, Universal-3-Pro for streaming gives you LLM-style accuracy at real-time speeds.
How to use it:
Set "speech_model": "universal-3-pro" in your WebSocket connection parameters
Code switching is enabled automatically — no additional configuration needed
Available now via the streaming endpoint for all users
Read the full documentation
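As a sketch, selecting the model is a one-line change to your connection parameters. The parameter name and value are from this release note; the endpoint URL and query-string transport are assumptions, so confirm them in the streaming docs.

```python
# Sketch: select Universal-3-Pro in the streaming connection parameters.
# Endpoint URL and query-string transport are assumptions.
from urllib.parse import urlencode

connection_params = {
    "sample_rate": 16000,
    "speech_model": "universal-3-pro",
    # No code-switching flag needed: multilingual handling is automatic.
}
ws_url = "wss://streaming.assemblyai.com/v3/ws?" + urlencode(connection_params)
```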
AssemblyAI's Universal-Streaming API is the fastest way to build real-time voice applications — and with Universal-3-Pro, it's now the most accurate too.
- Feb 26, 2026
Share Your Playground Transcripts
AssemblyAI Playground gains a share button that generates a live 90‑day link to your transcript output for easy collaboration. Share results in Slack or with teammates and clients without copy‑paste or exports. It’s the fastest no‑code way to test transcription and audio intelligence models with instant sharing.
The AssemblyAI Playground now has a share button. One click generates a shareable link to your transcript output that stays live for 90 days.
Whether you're dropping results into a Slack thread, looping in a teammate for a quick review, or showing a client what the output actually looks like before they integrate — you no longer need to copy-paste text or export anything. Just hit share and send the link.
The AssemblyAI Playground is the fastest way to test our transcription and audio intelligence models without writing a single line of code. Try different models, toggle features, and now share what you see instantly.
- Feb 19, 2026
Claude Sonnet 4.6 now supported on LLM Gateway
Claude Sonnet 4.6 release
Claude Sonnet 4.6 is now available through LLM Gateway. Sonnet 4.6 is Anthropic's most capable Sonnet model yet, with frontier performance across coding, agents, and professional work at scale. With this model, every line of code, every agent task, every spreadsheet can be powered by near-Opus intelligence at Sonnet pricing.
To use it, update the model parameter to claude-sonnet-4-6 in your LLM Gateway requests.
For more information, check out our docs here.
- Feb 9, 2026
Claude Opus 4.5 and 4.6 now supported on LLM Gateway
Claude's Opus Models via LLM Gateway
Claude's most capable models are now available through LLM Gateway. Opus 4.5 and Opus 4.6 bring significant improvements in reasoning, coding, and instruction-following.
To use it, update the model parameter to claude-opus-4-5-20250929 or claude-opus-4-6 in your LLM Gateway requests.
For more information, check out our docs here.
- Feb 3, 2026
Universal-3-Pro: Our Promptable Speech-to-Text Model
Universal-3-Pro launches as AssemblyAI's most capable Voice AI model yet, delivering LLM-style control over transcription. Users can steer outputs with natural-language prompts and keyterms across six languages, with verbatim options and non-speech tagging. Available now via the /v2/transcript API.
Universal-3-Pro release
We've released Universal-3-Pro, our most powerful Voice AI model yet—designed to give you LLM-style control over transcription output for the first time.
Unlike traditional ASR models that limit you to basic keyterm prompting or fixed output styles, Universal-3-Pro lets you progressively layer instructions to steer transcription behavior. Need verbatim output with filler words? Medical terminology with accurate dosages? Speaker labels by role? Code-switching between English and Spanish? You can design one robust prompt and apply it consistently across thousands of calls, getting workflow-ready outputs instead of brittle workarounds.
Out of the box, Universal-3-Pro outperforms all ASR models on accuracy, especially for entities and rare words. But the real power is in the prompting: natural language prompts up to 1,500 words for context and style, keyterms prompting for up to 1,000 specialized terms, built-in code switching across 6 languages, verbatim transcription controls for disfluencies and stutters, and audio tags for non-speech events like laughter, music, and beeps.
How to use it:
- Set "speech_models": ["universal-3-pro", "universal"] with "language_detection": true for automatic routing and 99-language coverage
- Use prompt for natural language instructions and keyterms_prompt for boosting rare words (up to 1,000 terms, 6 words each)
- Available now via the /v2/transcript endpoint
- Read the full documentation
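The steps above can be sketched as a single request body. The parameter names (speech_models, language_detection, prompt, keyterms_prompt) are from this release note; the example prompt, keyterms, and audio URL are illustrative only.

```python
# Sketch: a /v2/transcript request body using Universal-3-Pro with prompting.
# Parameter names are from this release note; the values are illustrative.
body = {
    "audio_url": "https://example.com/meeting.mp3",
    # Automatic routing with 99-language coverage:
    "speech_models": ["universal-3-pro", "universal"],
    "language_detection": True,
    # Natural-language steering (up to 1,500 words):
    "prompt": "Verbatim transcript; keep filler words and label dosages exactly.",
    # Boost rare words (up to 1,000 terms, 6 words each):
    "keyterms_prompt": ["metoprolol", "atrial fibrillation"],
}
# POST this body to the /v2/transcript endpoint with your API key, then poll
# the returned transcript id until processing completes.
```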
Universal-3-Pro represents a fundamental shift in what's possible with speech-to-text: true controllability that rivals human transcription quality, with the consistency and scale of an API.
Try Universal-3-Pro →