AssemblyAI Release Notes
Last updated: Feb 20, 2026
- Feb 19, 2026
- Date parsed from source: Feb 19, 2026
- First seen by Releasebot: Feb 20, 2026
Claude Sonnet 4.6 now supported on LLM Gateway
Claude Sonnet 4.6 release
Claude Sonnet 4.6 is now available through LLM Gateway. Sonnet 4.6 is our most capable Sonnet model yet, with frontier performance across coding, agents, and professional work at scale. With this model, every line of code, every agent task, and every spreadsheet can be powered by near-Opus intelligence at Sonnet pricing.
To use it, update the model parameter to claude-sonnet-4-6 in your LLM Gateway requests.
For more information, check out our docs here.
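Switching models is a one-line change to the request body. The sketch below is illustrative: only the model identifier claude-sonnet-4-6 comes from this note, while the chat-style message shape is an assumption about how your existing LLM Gateway requests are structured.

```python
def build_request(prompt: str, model: str = "claude-sonnet-4-6") -> dict:
    """Build a chat-style LLM Gateway request body pinned to Sonnet 4.6.

    Updating an existing integration means changing only the "model" value;
    the rest of the request stays as it was.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


body = build_request("Summarize this call transcript.")
```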
Original source Report a problem
- Feb 9, 2026
- Date parsed from source: Feb 9, 2026
- First seen by Releasebot: Feb 20, 2026
Claude Opus 4.5 and 4.6 now supported on LLM Gateway
Claude's Opus Models via LLM Gateway
Claude's most capable models are now available through LLM Gateway. Opus 4.5 and Opus 4.6 bring significant improvements in reasoning, coding, and instruction-following.
To use them, update the model parameter to claude-opus-4-5-20250929 or claude-opus-4-6 in your LLM Gateway requests. For more information, check out our docs here.
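The two identifiers above differ in pinning: one is a dated snapshot, the other an undated name. A small sketch of selecting between them, assuming the same chat-style request shape as other gateway calls (only the two model strings are taken from this note):

```python
# The two Opus identifiers named in the release note. Whether the gateway
# also accepts other aliases is not stated here, so only these exact
# strings are used.
OPUS_MODELS = {
    "opus-4.5": "claude-opus-4-5-20250929",  # dated snapshot
    "opus-4.6": "claude-opus-4-6",
}


def opus_request(messages: list, version: str = "opus-4.6") -> dict:
    """Build a request body pinned to one of the Opus models."""
    return {"model": OPUS_MODELS[version], "messages": messages}
```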
- Feb 3, 2026
- Date parsed from source: Feb 3, 2026
- First seen by Releasebot: Feb 9, 2026
Universal-3-Pro: Our Promptable Speech-to-Text Model
Universal-3-Pro launches as our most capable Voice AI model yet, delivering LLM-style control over transcription. Users can steer outputs with natural-language prompts and keyterms across six languages, with verbatim options and non-speech tagging. Available now via the /v2/transcript API.
Universal-3-Pro release
We've released Universal-3-Pro, our most powerful Voice AI model yet—designed to give you LLM-style control over transcription output for the first time.
Unlike traditional ASR models that limit you to basic keyterm prompting or fixed output styles, Universal-3-Pro lets you progressively layer instructions to steer transcription behavior. Need verbatim output with filler words? Medical terminology with accurate dosages? Speaker labels by role? Code-switching between English and Spanish? You can design one robust prompt and apply it consistently across thousands of calls, getting workflow-ready outputs instead of brittle workarounds.
Out of the box, Universal-3-Pro outperforms all ASR models on accuracy, especially for entities and rare words. But the real power is in the prompting: natural-language prompts up to 1,500 words for context and style, keyterms prompting for up to 1,000 specialized terms, built-in code switching across 6 languages, verbatim transcription controls for disfluencies and stutters, and audio tags for non-speech events like laughter, music, and beeps.
How to use it:
- Set "speech_models": ["universal-3-pro", "universal"] with "language_detection": true for automatic routing and 99-language coverage
- Use prompt for natural language instructions and keyterms_prompt for boosting rare words (up to 1,000 terms, 6 words each)
- Available now via the /v2/transcript endpoint
- Read the full documentation
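The options above can be combined in one request body. This is a sketch, not a complete reference: the parameters speech_models, language_detection, prompt, and keyterms_prompt come from this note, while the prompt text and keyterm values are placeholder examples.

```python
def build_transcript_request(audio_url: str) -> dict:
    """Assemble a /v2/transcript request body using Universal-3-Pro."""
    return {
        "audio_url": audio_url,
        # Fall back to "universal" for languages universal-3-pro can't serve,
        # with automatic routing via language detection (99-language coverage).
        "speech_models": ["universal-3-pro", "universal"],
        "language_detection": True,
        # Natural-language steering, up to 1,500 words.
        "prompt": "Transcribe verbatim, keep filler words, tag non-speech events.",
        # Boost rare or domain-specific terms: up to 1,000 terms, 6 words each.
        "keyterms_prompt": ["AssemblyAI", "diarization", "cpWER"],
    }


req = build_transcript_request("https://example.com/meeting.mp3")
```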
Universal-3-Pro represents a fundamental shift in what's possible with speech-to-text: true controllability that rivals human transcription quality, with the consistency and scale of an API.
- Jan 28, 2026
- Date parsed from source: Jan 28, 2026
- First seen by Releasebot: Jan 29, 2026
Improved Speaker Diarization for Short Audio
Speaker diarization is now more accurate for audio files under 2 minutes, with a 19% improvement in speaker count prediction and 6% improvement in cpWER. No changes required—this improvement is live for all users automatically.
- Jan 20, 2026
- Date parsed from source: Jan 20, 2026
- First seen by Releasebot: Jan 29, 2026
Global Edge Routing & Data Zone Endpoints for Streaming Speech-to-Text
New streaming endpoints give you control over latency and data residency. Edge Routing cuts latency by routing to the nearest region, while Data Zone Routing keeps audio data inside the US or EU. Update your WebSocket URL to switch; the default endpoint is unchanged.
Streaming endpoints overview
We've launched new streaming endpoints that give you control over latency optimization and data residency. Choose the endpoint that best fits your application's requirements—whether that's achieving the lowest possible latency or ensuring your audio data stays within a specific geographic region.
Edge Routing (streaming.edge.assemblyai.com) automatically routes requests to the nearest available region, minimizing latency for real-time transcription. With infrastructure in Oregon, Virginia, and Ireland, this endpoint delivers our best-in-class streaming performance regardless of where your users are located.
Data Zone Routing (streaming.us.assemblyai.com and streaming.eu.assemblyai.com) guarantees your data never leaves the specified region. This is designed for organizations with strict data residency and governance requirements—your audio and transcription data will remain entirely within the US or EU, respectively.
How to use it:
- wss://streaming.edge.assemblyai.com/v3/ws — Lowest latency
- wss://streaming.us.assemblyai.com/v3/ws — US data residency
- wss://streaming.eu.assemblyai.com/v3/ws — EU data residency
The default endpoint (streaming.assemblyai.com) remains unchanged.
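Endpoint choice reduces to a single URL swap. The URLs below are taken verbatim from the list above; the selection helper itself is an illustrative sketch (it assumes the default endpoint also serves the /v3/ws path, matching the regional endpoints).

```python
STREAMING_ENDPOINTS = {
    "edge": "wss://streaming.edge.assemblyai.com/v3/ws",  # lowest latency
    "us": "wss://streaming.us.assemblyai.com/v3/ws",      # US data residency
    "eu": "wss://streaming.eu.assemblyai.com/v3/ws",      # EU data residency
}


def streaming_url(data_zone=None) -> str:
    """Prefer a data-zone endpoint when residency is required, else edge."""
    if data_zone in ("us", "eu"):
        return STREAMING_ENDPOINTS[data_zone]
    return STREAMING_ENDPOINTS["edge"]
```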
- Jan 2, 2026
- Date parsed from source: Jan 2, 2026
- First seen by Releasebot: Jan 29, 2026
Multichannel Speaker Diarization
Universal adds multichannel speaker diarization with pre-recorded transcription for per-speaker attribution across multiple channels in a single request. Labels like 1A, 1B, 2A simplify complex recordings. Available now for all Universal customers.
Multichannel speaker diarization with pre-recorded transcription
We've added support for multichannel speaker diarization with pre-recorded transcription, allowing you to identify individual speakers across multiple audio channels in a single API request. This unlocks accurate transcription for complex audio scenarios like hybrid meetings, call center recordings with supervisor monitoring, or podcast recordings with multiple mics. Speaker labels are formatted as 1A, 1B, 2A, 2B, where the first digit indicates the channel and the letter identifies unique speakers within that channel. For example, in a meeting where Channel 1 captures an in-room conversation between two people and Channel 2 captures a remote participant, you'll get clear attribution for all three speakers even though Channel 1 contains multiple talkers.
How to use it
- Set both multichannel=true and speaker_labels=true in your transcription request—no other changes needed
- Available now for all Universal customers across all plan tiers
- View documentation
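The steps above amount to two extra flags plus a small amount of label parsing on the response. A sketch, where the multichannel and speaker_labels parameters and the label format (channel digit followed by a per-channel speaker letter) come from this note, and parse_speaker is an illustrative helper:

```python
def build_request(audio_url: str) -> dict:
    """Enable multichannel speaker diarization on a transcription request."""
    return {
        "audio_url": audio_url,
        "multichannel": True,
        "speaker_labels": True,
    }


def parse_speaker(label: str) -> tuple:
    """Split a combined label like '2A' into (channel=2, speaker='A')."""
    return int(label[:-1]), label[-1]
```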
Universal delivers industry-leading accuracy with advanced features like multichannel support and speaker diarization, giving you the precision and flexibility needed to build production-grade voice AI applications.
- Dec 19, 2025
- Date parsed from source: Dec 19, 2025
- First seen by Releasebot: Jan 29, 2026
Improved File Deletion for Enhanced Data Privacy
AssemblyAI now deletes uploaded audio immediately when you delete a transcript, replacing the previous 24 hour delay. This applies to files uploaded via the /upload endpoint and will invalidate shared upload URLs for future requests.
Release notes
We've updated how uploaded audio files are deleted when you delete a transcript, giving you immediate control over your data.
Previously, when you made a DELETE request to remove a transcript, the associated uploaded file would remain in storage for up to 24 hours before automatic deletion. Now, uploaded files are immediately deleted alongside the transcript when you make a DELETE request, ensuring your data is removed from our systems right away.
This change applies specifically to files uploaded via the /upload endpoint. If you're reusing upload URLs across multiple transcription requests, note that deleting one transcript will now immediately invalidate that upload URL for any subsequent requests.
How it works
- When you send a DELETE request to remove a transcript, any file uploaded via /upload and associated with that transcript is now deleted immediately
- This applies to all customers using the /upload endpoint across all plans
- If you need to transcribe the same file multiple times, upload it separately for each request or retain the original file on your end
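The deletion flow above is a single DELETE request that now removes both the transcript and its uploaded file. A sketch of constructing that call, assuming the standard AssemblyAI API host, /v2/transcript path, and authorization header; the transcript ID is a placeholder.

```python
API_HOST = "https://api.assemblyai.com"


def delete_transcript_request(transcript_id: str, api_key: str) -> dict:
    """Describe the DELETE call that removes a transcript.

    Any file uploaded via /upload and associated with this transcript is
    deleted immediately as well, so its upload URL becomes invalid.
    """
    return {
        "method": "DELETE",
        "url": f"{API_HOST}/v2/transcript/{transcript_id}",
        "headers": {"authorization": api_key},
    }
```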
AssemblyAI's APIs are built with security and data privacy as core principles. Our speech-to-text and audio intelligence models process your data with enterprise-grade security, and now with even more granular control over data retention.
Learn more about our data security practices
- Dec 19, 2025
- Date parsed from source: Dec 19, 2025
- First seen by Releasebot: Jan 10, 2026
Gemini 3 Flash Preview now supported on LLM Gateway
Google's newest Gemini 3 Flash Preview model is live in the LLM Gateway.
This model delivers faster inference speeds with improved reasoning capabilities compared to previous Flash versions. Gemini 3 Flash Preview excels at high-throughput applications requiring quick response times—like real-time customer support agents, content moderation, and rapid document processing—while maintaining strong accuracy on complex queries that would have required slower, more expensive models.
For more information, check out our docs here.
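As with the Claude models, adopting Gemini 3 Flash Preview is a model-parameter change. Note this release entry does not give the exact model string, so the identifier below is a hypothetical placeholder; check the docs for the real value.

```python
GEMINI_FLASH_MODEL = "gemini-3-flash-preview"  # placeholder, not confirmed by the note


def build_request(prompt: str) -> dict:
    """Route an LLM Gateway chat request to Gemini 3 Flash Preview."""
    return {
        "model": GEMINI_FLASH_MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
```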