AssemblyAI Release Notes

Last updated: Jan 10, 2026

  • Jan 2, 2026
    • Parsed from source:
      Jan 2, 2026
    • Detected by Releasebot:
      Jan 10, 2026
    AssemblyAI logo

    AssemblyAI

    Multichannel Speaker Diarization

    Universal adds multichannel speaker diarization with pre-recorded transcription, enabling per-speaker attribution across channels in one API call. Labels like 1A/1B/2A map speakers in hybrid meetings and call-center recordings. Available now for all Universal customers; set multichannel and speaker_labels in requests.

    We've added support for multichannel speaker diarization with pre-recorded transcription, allowing you to identify individual speakers across multiple audio channels in a single API request.

    This unlocks accurate transcription for complex audio scenarios like hybrid meetings, call center recordings with supervisor monitoring, or podcast recordings with multiple mics. Speaker labels are formatted as 1A, 1B, 2A, 2B, where the first digit indicates the channel and the letter identifies unique speakers within that channel. For example, in a meeting where Channel 1 captures an in-room conversation between two people and Channel 2 captures a remote participant, you'll get clear attribution for all three speakers even though Channel 1 contains multiple talkers.

    How to use it

    • Set both multichannel=true and speaker_labels=true in your transcription request—no other changes needed
    • Available now for all Universal customers across all plan tiers
    • View documentation
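The two flags above can be sketched as a request body. This is a minimal sketch assuming the standard v2 transcript endpoint; the audio URL is a placeholder:

```python
import json

# Sketch of a pre-recorded transcription request body that combines
# multichannel transcription with speaker diarization. The endpoint path
# follows AssemblyAI's v2 transcript API; the audio URL is a placeholder.
API_ENDPOINT = "https://api.assemblyai.com/v2/transcript"

def build_multichannel_request(audio_url: str) -> dict:
    """Build the JSON body for a multichannel + speaker_labels request."""
    return {
        "audio_url": audio_url,
        "multichannel": True,    # transcribe each channel independently
        "speaker_labels": True,  # diarize speakers within each channel
    }

payload = build_multichannel_request("https://example.com/meeting.wav")
print(json.dumps(payload, indent=2))
```

Posting this body to the transcript endpoint should return utterances whose speaker field carries the channel-plus-letter labels such as 1A and 2A.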

Universal delivers industry-leading accuracy with advanced features like multichannel support and speaker diarization, giving you the precision and flexibility needed to build production-grade voice AI applications.

    Original source Report a problem
  • Dec 19, 2025
    • Parsed from source:
      Dec 19, 2025
    • Detected by Releasebot:
      Jan 10, 2026
    AssemblyAI logo

    AssemblyAI

    Improved File Deletion for Enhanced Data Privacy

    AssemblyAI now deletes uploaded audio immediately when you delete a transcript, eliminating lingering files from the /upload endpoint. This tightens data control across plans and may affect reusing upload URLs for later requests.

    We've updated how uploaded audio files are deleted when you delete a transcript, giving you immediate control over your data.

    Previously, when you made a DELETE request to remove a transcript, the associated uploaded file would remain in storage for up to 24 hours before automatic deletion. Now, uploaded files are immediately deleted alongside the transcript when you make a DELETE request, ensuring your data is removed from our systems right away.

    This change applies specifically to files uploaded via the /upload endpoint. If you're reusing upload URLs across multiple transcription requests, note that deleting one transcript will now immediately invalidate that upload URL for any subsequent requests.

    How it works

    • When you send a DELETE request to remove a transcript, any file uploaded via /upload and associated with that transcript is now deleted immediately
    • This applies to all customers using the /upload endpoint across all plans
    • If you need to transcribe the same file multiple times, upload it separately for each request or retain the original file on your end
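As a minimal sketch of the flow above, assuming the standard /v2/transcript/{id} DELETE route (the transcript ID and API key here are placeholders):

```python
from urllib.request import Request

def build_delete_request(transcript_id: str, api_key: str) -> Request:
    """Prepare a DELETE request that removes a transcript and, with this
    change, immediately deletes the uploaded file associated with it."""
    return Request(
        url=f"https://api.assemblyai.com/v2/transcript/{transcript_id}",
        method="DELETE",
        headers={"authorization": api_key},
    )

req = build_delete_request("abc123", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Sending this request (e.g. with `urllib.request.urlopen`) deletes the transcript, and any upload URL tied to it can no longer be reused in later transcription requests.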

    AssemblyAI's APIs are built with security and data privacy as core principles. Our speech-to-text and audio intelligence models process your data with enterprise-grade security, and now with even more granular control over data retention.

    Learn more about our data security practices

    Original source Report a problem
  • Dec 19, 2025
    • Parsed from source:
      Dec 19, 2025
    • Detected by Releasebot:
      Jan 10, 2026
    AssemblyAI logo

    AssemblyAI

    Gemini 3 Flash Preview now supported on LLM Gateway

    Google's newest Gemini 3 Flash Preview model is live in the LLM Gateway.

    This model delivers faster inference speeds with improved reasoning capabilities compared to previous Flash versions. Gemini 3 Flash Preview excels at high-throughput applications requiring quick response times—like real-time customer support agents, content moderation, and rapid document processing—while maintaining strong accuracy on complex queries that would have required slower, more expensive models.

    For more information, check out our docs here.

    Original source Report a problem
  • Dec 17, 2025
    • Parsed from source:
      Dec 17, 2025
    • Detected by Releasebot:
      Jan 10, 2026
    AssemblyAI logo

    AssemblyAI

    Transcribe public audio URLs directly in the Playground

    Playground: Transcribe audio from public URLs

    Our Playground just got a little more powerful: you can now transcribe audio directly from public URLs.

    No more downloading files just to upload them again. Paste a public audio URL, and you're good to go.

    Try it out in the Playground now.

    Original source Report a problem
  • Dec 16, 2025
    • Parsed from source:
      Dec 16, 2025
    • Detected by Releasebot:
      Jan 10, 2026
    AssemblyAI logo

    AssemblyAI

    GPT-5.1 & 5.2 now supported on LLM Gateway

OpenAI’s newest GPT-5.1 and GPT-5.2 models are live in the LLM Gateway.

These models come with sharp reasoning and instruction-following abilities. GPT-5.2 in particular excels at multi-step legal, finance, and medical tasks where earlier models stalled, letting you ship production features that previously needed heavy post-processing or human review.

    For more information, check out our docs here.

    Original source Report a problem
  • Dec 5, 2025
    • Parsed from source:
      Dec 5, 2025
    • Detected by Releasebot:
      Jan 10, 2026
    AssemblyAI logo

    AssemblyAI

    Keyterm Prompting Now Available for Universal-Streaming Multilingual

Keyterm prompting is now in production for multilingual streaming, letting developers improve accuracy for target words in real time across Universal-Streaming. By listing key terms in your connection parameters, you get better recognition of domain-specific vocabulary.

    Keyterm prompting is now in production for multilingual streaming

    Keyterm prompting is now in production for multilingual streaming, giving developers the ability to improve accuracy for target words in real-time transcription. This enhancement is live for all users across the Universal-Streaming platform.

    Keyterm prompting enables developers to prioritize specific terminology in transcription results, which is particularly valuable for conversational AI and voice agent use cases where domain-specific accuracy matters. By specifying keywords relevant to your application, you'll see improved recognition of critical terms that might otherwise be misheard or misinterpreted.

    To use Keyterm prompting with Universal-Streaming Multilingual, include a list of keyterms in your connection parameters:

import json

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "universal-streaming-multilingual",
    "keyterms_prompt": json.dumps([
        "Keanu Reeves", "AssemblyAI", "Universal-2"
    ]),
}
    
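As an illustrative follow-up, connection parameters like these are typically serialized onto the WebSocket connection URL as a query string. The base URL below is a placeholder; check the Streaming docs for the real endpoint:

```python
import json
from urllib.parse import urlencode

# Repeats the connection parameters above so this sketch runs standalone.
CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "universal-streaming-multilingual",
    "keyterms_prompt": json.dumps(["Keanu Reeves", "AssemblyAI", "Universal-2"]),
}

# Percent-encode the parameters into the WebSocket URL's query string.
url = "wss://streaming.example.invalid/ws?" + urlencode(CONNECTION_PARAMS)
print(url)
```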

    Expanding Keyterm prompting to Universal-Multilingual Streaming

This expansion reinforces our commitment to giving developers precise control over recognition results for specialized vocabularies.

    Learn more in our docs.

    Original source Report a problem
  • Dec 4, 2025
    • Parsed from source:
      Dec 4, 2025
    • Detected by Releasebot:
      Jan 10, 2026
    AssemblyAI logo

    AssemblyAI

    Hallucination Rate Reduced for Multilingual Streaming

    Improvements

    We've improved hallucination detection and reduction across Universal-Multilingual Streaming transcription, resulting in fewer false outputs while maintaining minimal latency impact. This improvement is live for all users.

Lower hallucination rates mean more reliable transcription results out of the box, especially in edge cases where model confidence is uncertain. You'll see more accurate, trustworthy outputs without needing to modify existing implementations.

    This improvement is automatic and applies to all new Streaming sessions.

    Original source Report a problem
  • Dec 3, 2025
    • Parsed from source:
      Dec 3, 2025
    • Detected by Releasebot:
      Jan 10, 2026
    AssemblyAI logo

    AssemblyAI

    Transcription Access Now Scoped to Project Level for Uploaded Files

    Security controls for pre-recorded file transcription

    We've tightened security controls on pre-recorded file transcription by scoping access to uploaded files within the same project that uploaded them.

    Previously, API tokens could transcribe files across projects. Now, tokens must belong to the same project that originally uploaded the file to transcribe it. This strengthens your security posture and prevents unintended cross-project access to sensitive audio files.

    This security enhancement reflects our commitment to protecting your data and giving you granular control over who can access transcriptions within your organization.

    Original source Report a problem
  • Nov 21, 2025
    • Parsed from source:
      Nov 21, 2025
    • Detected by Releasebot:
      Jan 10, 2026
    AssemblyAI logo

    AssemblyAI

    AssemblyAI Streaming Updates: Multi-Region Infrastructure, Session Controls, and Self-Hosted License Management

    Self-Hosted Streaming v0.20 lands with license management for enterprise deployment control, multi-region streaming in us-east-1 for redundancy, and configurable inactivity_timeout to curb idle sessions. These updates boost security, resilience, and cost control for live transcription.

    Self-Hosted Streaming v0.20: License Management Now Available

    Self-Hosted Streaming v0.20 now includes built-in license generation and validation, giving enterprises complete control over deployment security and usage tracking. Organizations can manage their speech AI infrastructure with the same compliance controls they expect from enterprise software.

    The new licensing system enables IT teams to track deployment usage, enforce security policies, and maintain audit trails—critical for regulated industries like healthcare and financial services. License validation happens at startup and can be configured for periodic checks to ensure continuous compliance.

    Available now for all AssemblyAI Self-Hosted Streaming customers.

    Contact your account team to generate licenses for your deployments.

    Multi-Region Streaming: US-East-1 Now Live

AssemblyAI's Streaming API is now available in us-east-1, providing regional redundancy and expanded compute capacity for production workloads. The infrastructure update reduces single-region dependency and prepares the platform for upcoming EU deployment.

    Multi-region availability means contact centers and live captioning applications can maintain service continuity during regional incidents while accessing additional compute capacity for peak usage periods. The architecture changes also enable faster rollout of new regions based on customer demand.

    Available immediately across all AssemblyAI's Streaming API plans.
    Traffic is automatically routed to the optimal region based on latency and capacity.

Try AssemblyAI’s Streaming API now or view regional availability.

    Inactivity Timeout Controls for Streaming Sessions

AssemblyAI’s Streaming API now supports a configurable inactivity_timeout parameter, giving developers precise control over session duration management. Applications can extend timeout periods for long-running sessions or reduce them to optimize connection costs.

    The feature enables voice agents and live transcription systems to automatically close idle connections without manual intervention. Contact centers can reduce costs on silent periods while ensuring active calls stay connected. Voice agent developers can keep sessions open longer during natural conversation pauses without manual keep-alive logic.

    Available now for all AssemblyAI Streaming customers.
    Set the inactivity_timeout parameter (in seconds) when initializing your connection.

    • Set inactivity_timeout in your connection parameters
    • Values range from 5 to 3600 seconds
    • Default timeout remains 30 seconds if not specified
    • Available across all pricing tiers
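The settings above can be sketched as connection parameters. The clamping helper is hypothetical, added here only to illustrate the documented 5–3600 second range; the 120-second value is illustrative:

```python
# The documented inactivity_timeout range is 5-3600 seconds; this
# hypothetical helper clamps a requested value to those bounds before it
# goes into the connection parameters.
def clamp_timeout(seconds: int, lo: int = 5, hi: int = 3600) -> int:
    """Clamp a requested inactivity timeout to the documented range."""
    return max(lo, min(hi, seconds))

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "inactivity_timeout": clamp_timeout(120),  # close after 120 s of silence
}

print(CONNECTION_PARAMS["inactivity_timeout"])
```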

    View our documentation to learn more.

    Original source Report a problem
  • Nov 21, 2025
    • Parsed from source:
      Nov 21, 2025
    • Detected by Releasebot:
      Jan 10, 2026
    AssemblyAI logo

    AssemblyAI

    Streaming Model Update: Enhanced Performance & New Capabilities

    New English Streaming model release delivers big gains in accuracy and speed plus two new features. Expect 88% better accuracy on short utterances, 12% faster emission, and 7% quicker transcripts, plus 4% better keyterm accuracy. Added language detection per utterance and dynamic keyterms prompting.

    We've released a new version of our English Streaming model with significant improvements across the board.

    Performance gains

    • 88% better accuracy on short utterances and repeating commands/numbers
    • 12% faster emission latency
    • 7% faster time to complete transcript
    • 4% better accuracy on prompted keyterms
    • 3% better accuracy on accented speech

    New features

    • Language detection for utterances (Multilingual model only) – Get language output for each utterance to feed downstream processes like LLMs
    • Dynamic keyterms prompting – Update your keyterms list mid-stream to improve accuracy on context you discover during the conversation
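A mid-stream keyterms refresh could look roughly like this on the wire. The message shape below is illustrative, not the documented schema; consult the Streaming docs for the exact format:

```python
import json

def build_keyterms_update(keyterms: list) -> str:
    """Serialize an updated keyterms list to send over the open stream."""
    return json.dumps({"keyterms_prompt": keyterms})

# Mid-conversation, you discover new context and refresh the keyterms list.
msg = build_keyterms_update(["AssemblyAI", "Universal-2", "Keanu Reeves"])
print(msg)
```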
    Original source Report a problem
