AssemblyAI Release Notes

63 release notes curated from 1 source by the Releasebot Team. Last updated: May 28, 2026

  • May 20, 2026
    • Date parsed from source:
      May 20, 2026
    • First seen by Releasebot:
      May 28, 2026
    AssemblyAI

    Gemini 3.5 Flash now supported on LLM Gateway

    AssemblyAI adds Google’s Gemini 3.5 Flash through LLM Gateway for fast, cost-efficient high-throughput workloads.

    Google's Gemini 3.5 Flash is now available through LLM Gateway.

    Flash is Google's fast, cost-efficient model in the Gemini 3 family — built for high-throughput workloads where latency and price-per-token matter as much…

    Original source
  • May 15, 2026
    • Date parsed from source:
      May 15, 2026
    • First seen by Releasebot:
      May 28, 2026
    AssemblyAI

    Streaming PII Redaction

    AssemblyAI adds PII redaction for Streaming Speech-to-Text to automatically remove sensitive information in real time.

    PII Redaction is now available for Streaming Speech-to-Text.

    Set redact_pii: true on a streaming connection to automatically detect and remove sensitive information — names, phone numbers, email addresses, payment…

    Original source
  • May 6, 2026
    • Date parsed from source:
      May 6, 2026
    • First seen by Releasebot:
      May 28, 2026
    AssemblyAI

    Streaming Speaker Diarization: Major Accuracy Upgrade with Per-Word Labels

    AssemblyAI ships a major upgrade to streaming speaker diarization with stronger accuracy and per-word speaker labels.

    We’ve shipped a major upgrade to streaming speaker diarization, with significant accuracy gains and a refined API that delivers per-word speaker labels…

    Original source
  • May 6, 2026
    • Date parsed from source:
      May 6, 2026
    • First seen by Releasebot:
      May 19, 2026
    AssemblyAI

    LLM Gateway: JSON Repair Post-Processing for Structured Output

    AssemblyAI adds post-processing to LLM Gateway completions, starting with json-repair for automatically fixing malformed JSON before it reaches your app. The new pipeline works across every model and is available now with a single request parameter.

    LLM Gateway completions now support a post-processing pipeline, and the first step available is json-repair — an optional pass that automatically fixes malformed JSON returned by a model before it reaches your application. Enable it with a single new parameter on your existing request.

    Anyone working with structured output or tool calling has seen the failure mode: the model returns JSON with a trailing comma, an unescaped quote, a missing brace, or a stray markdown fence — and your downstream parser blows up on a response that was 99% correct. json-repair catches these errors at the Gateway layer and returns clean, parseable JSON to your client, so you don't have to ship your own repair logic, retry the call, or wrap every parse in a try/except.

    The new post_processing_steps field is designed to be extensible — JSON repair is the first transformation we support, with more steps to come. Steps run in order on the model's completion before the response is returned, so you can compose them into a deterministic post-processing pipeline that works the same across every model in the Gateway.

    How to use it

    Add a post_processing_steps array to your LLM Gateway request with {"type": "json-repair"}:

    {
      "model": "gemini-2.5-flash-lite",
      "messages": [
        {
          "role": "user",
          "content": "return exactly with no extra characters, do not fix the json: {\"name\":\"extra comma\",}"
        }
      ],
      "post_processing_steps": [{"type": "json-repair"}]
    }
    
    • Works with every model available through LLM Gateway — no model-specific configuration needed
    • Steps execute in the order they appear in the array, so future steps will compose predictably
    • Available now for all LLM Gateway users in every region
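
    As a rough illustration of the failure mode described above, the Python sketch below shows a strict parser rejecting the trailing-comma payload, and builds the request body with the post_processing_steps field. The helper names build_repair_request and parses are hypothetical, and no Gateway endpoint or client library is assumed; only the payload shape comes from the example above.

```python
import json

# The malformed JSON from the example prompt: note the trailing comma.
MALFORMED = '{"name":"extra comma",}'


def parses(text: str) -> bool:
    """Return True if `text` is valid JSON for Python's strict parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def build_repair_request(model: str, prompt: str) -> dict:
    """Build a request body that opts into the json-repair step.

    Hypothetical helper: only the field names are taken from the release note.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Steps run in array order on the model's completion.
        "post_processing_steps": [{"type": "json-repair"}],
    }


payload = build_repair_request(
    "gemini-2.5-flash-lite",
    "return exactly with no extra characters, do not fix the json: " + MALFORMED,
)
body = json.dumps(payload)  # serialized body, ready to send with any HTTP client
```

    Without the repair step, a client that receives MALFORMED verbatim would hit the JSONDecodeError path in parses.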

    AssemblyAI's LLM Gateway gives you a single API to access 20+ models from Claude, GPT, Gemini, and more — swap models with a single parameter change, with built-in fallbacks, prompt caching, and now post-processing baked in.

    Original source
  • May 6, 2026
    • Date parsed from source:
      May 6, 2026
    • First seen by Releasebot:
      May 19, 2026
    AssemblyAI

    Streaming Speaker Diarization: Major Accuracy Upgrade with Per-Word Labels

    AssemblyAI ships a major upgrade to streaming speaker diarization with sharper accuracy and per-word speaker labels. The update is live in production across US and EU regions for Universal-3 Pro Streaming and Universal-Streaming, improving mid-turn speaker change detection without integration changes.

    We've shipped a major upgrade to streaming speaker diarization, with significant accuracy gains and a refined API that delivers per-word speaker labels. The new model is live now in production across both US and EU regions for Universal-3 Pro Streaming and Universal-Streaming, with no integration changes required to benefit from the accuracy improvements.

    Across our internal benchmarks, the upgrade reduces false-alarm speakers by 66% and phantom turn rate by 60%, while improving cpWER by 12% overall and 24% on 2-speaker conversations. Against the closest competitive alternative (Deepgram Nova-3), the new model delivers 2x better cpWER on 2-speaker telephony, 13% better cpWER on 4-speaker meetings, 43% fewer false-alarm speakers, and 91% fewer phantom turns and words attributed to non-existent speakers.

    Alongside the accuracy gains, each word object within a Turn now carries its own speaker label, enabling much more refined mid-turn speaker change detection. Previously, every word in a Turn inherited the Turn's speaker_label; now, when a different speaker briefly cuts in mid-turn, the individual word objects reflect that change, and words the model can't confidently attribute are tagged UNKNOWN rather than rolled into the dominant speaker. This unlocks accurate attribution in fast back-and-forths, brief interjections, and noisy multi-speaker calls where speakers overlap or trade off mid-sentence.

    How to use it

    • Live now in production across US and EU regions for Universal-3 Pro Streaming and Universal-Streaming — no config changes required to get the accuracy improvements
    • Each word in a Turn message now includes a speaker field alongside start, end, text, confidence, and word_is_final
    • Words the model cannot confidently attribute to a known speaker are labeled UNKNOWN — opt into per-word attribution by reading from words[].speaker
    • The Turn-level speaker_label field is unchanged, so existing integrations continue to work without modification
    • For best-in-class diarization accuracy, we recommend Universal-3 Pro Streaming ("speech_model": "u3-rt-pro")
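
    One way to consume the per-word labels is to group consecutive words by speaker, as in the Python sketch below. The field names (words, speaker, speaker_label, start, end, text) follow the bullets above; the sample Turn message, its timing units, and the speaker_segments helper are hypothetical, and the fallback to the Turn-level label for words without a speaker field is an assumption.

```python
from itertools import groupby


def speaker_segments(turn: dict) -> list[dict]:
    """Split one Turn message into runs of consecutive same-speaker words."""
    # Assumption: if a word lacks `speaker`, fall back to the Turn-level label.
    fallback = turn.get("speaker_label", "UNKNOWN")
    segments = []
    for speaker, run in groupby(
        turn.get("words", []), key=lambda w: w.get("speaker", fallback)
    ):
        words = list(run)
        segments.append(
            {
                "speaker": speaker,
                "start": words[0]["start"],
                "end": words[-1]["end"],
                "text": " ".join(w["text"] for w in words),
            }
        )
    return segments


# Hypothetical Turn message: speaker B briefly cuts in mid-turn.
sample_turn = {
    "speaker_label": "A",
    "words": [
        {"text": "so", "start": 0, "end": 200, "speaker": "A"},
        {"text": "the", "start": 200, "end": 350, "speaker": "A"},
        {"text": "yeah", "start": 350, "end": 600, "speaker": "B"},
        {"text": "plan", "start": 600, "end": 900, "speaker": "A"},
        {"text": "uh", "start": 900, "end": 1000, "speaker": "UNKNOWN"},
    ],
}

segments = speaker_segments(sample_turn)
```

    Keeping UNKNOWN as its own segment, rather than merging it into a neighbor, preserves the distinction the upgrade introduces.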

    AssemblyAI's Universal-Streaming API is the most accurate, lowest-latency way to build real-time voice applications — and with this upgrade, it now delivers the most precise speaker attribution in production speech AI.

    Original source
  • May 4, 2026
    • Date parsed from source:
      May 4, 2026
    • First seen by Releasebot:
      May 28, 2026
    AssemblyAI

    LLM Gateway: JSON Repair Post-Processing for Structured Output

    AssemblyAI adds post-processing for LLM Gateway completions, including optional json-repair to fix malformed JSON before delivery.

    LLM Gateway completions now support a post-processing pipeline, and the first step available is json-repair — an optional pass that automatically fixes malformed JSON returned by a model before it reaches your application…

    Original source
  • Apr 29, 2026
    • Date parsed from source:
      Apr 29, 2026
    • First seen by Releasebot:
      May 28, 2026
    AssemblyAI

    Introducing the Voice Agent API

    AssemblyAI now offers the Voice Agent API, a complete voice agent pipeline delivered through a single WebSocket.

    The Voice Agent API is now available — a complete voice agent pipeline built on AssemblyAI's own models, delivered through a single WebSocket…

    Original source
  • Apr 29, 2026
    • Date parsed from source:
      Apr 29, 2026
    • First seen by Releasebot:
      May 1, 2026
    AssemblyAI

    Introducing the Voice Agent API

    AssemblyAI launches the Voice Agent API, a complete voice agent pipeline over a single WebSocket with built-in speech understanding, LLM reasoning, and voice generation. It adds server-side turn detection, live configuration, tool calling, session resumption, and simple all-in pricing.

    The Voice Agent API is now available — a complete voice agent pipeline built on AssemblyAI's own models, delivered through a single WebSocket. Stream audio in, get audio back, and pay one all-in rate of $4.50/hr that covers speech understanding, LLM reasoning, and voice generation.

    The API runs on Universal-3 Pro Streaming, the same speech model that already powers production voice stacks — accurate on names, account numbers, domain terminology, and accented speech across six languages. Turn detection runs server-side with configurable thresholds, so the agent knows the difference between a thinking pause and an end-of-turn, and interruptions stop the agent immediately. Listening that actually works is the foundation; everything downstream gets better when the transcription and turn-taking are right.

    The developer experience is designed to get out of the way. No SDK to install, no framework to learn — the entire API surface is JSON over WebSocket and most teams ship a working agent the same afternoon they start. Live configuration lets you update system prompts, tools, or turn detection mid-conversation with no reconnect. Tool calling with JSON Schema lets the agent take real actions through your custom functions. Session resumption restores full context if a WebSocket drops within 30 seconds.

    How to use it

    • Open a WebSocket connection to the Voice Agent API endpoint and stream audio in; receive audio and event messages back as JSON
    • Configure agent behavior at session start or mid-conversation — system prompt, tools, turn detection thresholds — via standard JSON message types
    • Register custom functions with JSON Schema for tool calling; reconnect within 30 seconds with session resumption to preserve context on dropped connections
    • Single billing line at $4.50/hr covering STT, LLM, and TTS — measured in audio hours, no separate metering for each pipeline stage
    • Available now to all customers; works end-to-end with Claude Code for scaffolding integrations directly from your terminal when using our AssemblyAI Docs MCP
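
    For budgeting, the all-in rate and the resumption window reduce to simple arithmetic, sketched below in Python. The $4.50/hr rate and the 30-second window come from the notes above; the helper names, the rounding to cents, and the treatment of exactly 30 seconds as still resumable are assumptions, since the source does not state billing granularity or boundary behavior.

```python
RATE_PER_AUDIO_HOUR = 4.50     # all-in rate covering STT, LLM, and TTS
RESUME_WINDOW_SECONDS = 30.0   # session resumption window after a dropped socket


def estimate_cost(audio_seconds: float) -> float:
    """Estimate cost in dollars for a given amount of audio.

    Assumption: billing is proportional to audio time; rounding is illustrative.
    """
    return round(audio_seconds / 3600 * RATE_PER_AUDIO_HOUR, 2)


def can_resume(dropped_at: float, now: float) -> bool:
    """Return True if a reconnect at `now` falls inside the resumption window.

    Both arguments are timestamps in seconds (for example, time.monotonic()).
    """
    return (now - dropped_at) <= RESUME_WINDOW_SECONDS


ninety_minutes = estimate_cost(90 * 60)
```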

    The Voice Agent API is invisible infrastructure for production voice products — accurate listening, natural turn-taking, and a developer surface small enough to read in 10 minutes. Your customers should feel like you built it for them, not like they're using a platform.

    Original source
  • Apr 25, 2026
    • Date parsed from source:
      Apr 25, 2026
    • First seen by Releasebot:
      May 28, 2026
    AssemblyAI

    PII Redaction: Return Unredacted Transcripts in the Same Request

    AssemblyAI adds a new PII redaction flag to return both redacted and unredacted transcripts in one request.

    You can now retrieve both the redacted and unredacted versions of a transcript in a single PII Redaction request. Set the new redact_pii_return_unredacted flag to true in your POST /v2/transcript body, and the response…

    Original source
  • Apr 25, 2026
    • Date parsed from source:
      Apr 25, 2026
    • First seen by Releasebot:
      Apr 27, 2026
    AssemblyAI

    PII Redaction: Return Unredacted Transcripts in the Same Request

    AssemblyAI adds a new PII Redaction option that returns both redacted and unredacted transcript data in one request. The opt-in update includes original text, words, and utterances alongside the redacted output, reducing extra API calls for compliance workflows.

    You can now retrieve both the redacted and unredacted versions of a transcript in a single PII Redaction request. Set the new redact_pii_return_unredacted flag to true in your POST /v2/transcript body, and the response will include the original text, words, and utterances alongside the redacted output — no second API call required.

    The new fields are purely additive. text, words, and utterances stay fully redacted as before, and three new top-level fields — unredacted_text, unredacted_words, and unredacted_utterances — are returned alongside them with the original PII intact. The unredacted word and utterance arrays mirror the exact shape of their redacted counterparts (text, start, end, confidence, speaker, channel).

    This is an opt-in convenience for workflows that need both versions in the same place — for example, a UI that toggles between redacted-first and unredacted views, or a dual-pipeline that stores compliance-grade redacted output for sharing while preserving the original in a trusted environment. It removes the need for previously brittle workarounds like sending two API requests, doing client-side redaction via Entity Detection, or post-hoc LLM-based redaction.

    How to use it

    Add redact_pii_return_unredacted: true alongside the existing PII parameters in your transcription request:

    {
      "audio_url": "YOUR_AUDIO_URL",
      "redact_pii": true,
      "redact_pii_return_unredacted": true,
      "redact_pii_policies": ["person_name", "phone_number", "email_address"],
      "redact_pii_sub": "entity_name"
    }
    
    • Requires redact_pii: true — sending redact_pii_return_unredacted: true on its own returns HTTP 400
    • Defaults to false; when off or absent, responses are unchanged and the three unredacted_* fields are not returned
    • Works with all existing PII params, including redact_pii_policies, redact_pii_sub, and redact_pii_audio
    • Available now on Pre-recorded transcription, with SDK support live across Python and JavaScript
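
    The rules in the bullets above can be mirrored client-side before a request is sent. The Python sketch below builds the request body, rejects the combination that the API is said to answer with HTTP 400, and separates the redacted and unredacted fields of a response. Both helper names are hypothetical, and the sample response is made up for illustration; only the field names come from the note.

```python
def build_pii_request(
    audio_url: str,
    policies: list[str],
    return_unredacted: bool = False,
    redact_pii: bool = True,
    substitution: str = "entity_name",
) -> dict:
    """Build a POST /v2/transcript body with PII redaction options."""
    # Mirrors the documented rule: the flag requires redact_pii to be true.
    if return_unredacted and not redact_pii:
        raise ValueError("redact_pii_return_unredacted requires redact_pii: true")
    body = {"audio_url": audio_url, "redact_pii": redact_pii}
    if redact_pii:
        body["redact_pii_policies"] = list(policies)
        body["redact_pii_sub"] = substitution
    if return_unredacted:
        body["redact_pii_return_unredacted"] = True
    return body


def split_versions(response: dict) -> tuple[dict, dict]:
    """Return (redacted, unredacted) views of a transcript response.

    Unredacted values are None when the flag was off and the fields are absent.
    """
    keys = ("text", "words", "utterances")
    redacted = {k: response.get(k) for k in keys}
    unredacted = {k: response.get("unredacted_" + k) for k in keys}
    return redacted, unredacted


request_body = build_pii_request(
    "YOUR_AUDIO_URL",
    ["person_name", "phone_number", "email_address"],
    return_unredacted=True,
)

# Hypothetical response fragment, for illustration only.
sample_response = {"text": "Hi, I'm [PERSON_NAME].", "unredacted_text": "Hi, I'm Ada."}
redacted, unredacted = split_versions(sample_response)
```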

    AssemblyAI's PII Redaction automatically detects and removes sensitive information from both transcripts and audio — giving you compliant, production-ready output without extra processing steps.

    Original source
  • Apr 19, 2026
    • Date parsed from source:
      Apr 19, 2026
    • First seen by Releasebot:
      Apr 20, 2026
    • Modified by Releasebot:
      May 28, 2026
    AssemblyAI

    Claude Opus 4.7 Now Available on LLM Gateway

    AssemblyAI adds Claude Opus 4.7 through LLM Gateway for stronger reasoning, coding, and multi-step tasks.

    Claude Opus 4.7 is now available through LLM Gateway. Opus 4.7 is Anthropic's most intelligent model yet — the latest in the Claude family, pushing the frontier on reasoning, coding, and complex multi-step tasks…

    Original source
  • Apr 2, 2026
    • Date parsed from source:
      Apr 2, 2026
    • First seen by Releasebot:
      Apr 2, 2026
    • Modified by Releasebot:
      May 28, 2026
    AssemblyAI

    Universal-2 Language Improvements: Hebrew & Swedish

    AssemblyAI improves Universal-2 transcription accuracy for Hebrew and Swedish with major word error rate reductions.

    Universal-2 transcription accuracy has improved significantly for Hebrew and Swedish, with word error rates reduced by 37% and 47% respectively…

    Original source
  • Mar 25, 2026
    • Date parsed from source:
      Mar 25, 2026
    • First seen by Releasebot:
      May 28, 2026
    AssemblyAI

    LLM Gateway: Automatic Model Fallbacks

    AssemblyAI adds automatic model fallbacks in LLM Gateway for more resilient apps without integration changes.

    LLM Gateway now supports automatic model fallbacks, giving your application resilience against model failures without changing your integration…

    Original source
  • Mar 25, 2026
    • Date parsed from source:
      Mar 25, 2026
    • First seen by Releasebot:
      May 28, 2026
    AssemblyAI

    Introducing Medical Mode: Purpose-built accuracy for medical terminology

    AssemblyAI adds Medical Mode to Streaming Speech-to-Text for better transcription of medical terminology.

    Medical Mode is a new add-on for AssemblyAI's Streaming Speech-to-Text that improves transcription accuracy for medical terminology — including medication names, procedures, conditions, and dosages…

    Original source
  • Mar 25, 2026
    • Date parsed from source:
      Mar 25, 2026
    • First seen by Releasebot:
      Apr 2, 2026
    AssemblyAI

    LLM Gateway: Automatic Model Fallbacks

    AssemblyAI adds automatic model fallbacks to LLM Gateway, bringing built-in resilience against model failures without changing integrations. The public beta lets users retry with fallback models or the same model after 500ms, with configurable fallback depth and retry behavior.

    LLM Gateway now supports automatic model fallbacks, giving your application resilience against model failures without changing your integration. If a model returns a server error, the Gateway will automatically retry with a fallback — or retry the same model after 500ms by default.

    This is available now in Public Beta for all LLM Gateway users.

    How to use it

    Add a fallbacks array and optional fallback_config to your request. All fields from the original request are copied over to the fallback automatically — you only need to specify what you want to override.

    Simple fallback — fall back to a different model, inheriting all original parameters:

    {
      "model": "kimi-k2.5",
      "messages": [
        {
          "role": "user",
          "content": "Summarize this meeting: ..."
        }
      ],
      "temperature": 0.2,
      "fallbacks": [
        {
          "model": "claude-sonnet-4-6"
        }
      ]
    }
    

    Advanced fallback — override specific parameters when falling back (e.g., different prompt or temperature for a different model's behavior):

    {
      "model": "kimi-k2.5",
      "messages": [
        {
          "role": "user",
          "content": "Summarize this meeting: ..."
        }
      ],
      "temperature": 0.2,
      "fallbacks": [
        {
          "model": "claude-sonnet-4-6",
          "messages": [
            {
              "role": "user",
              "content": "Summarize this meeting concisely, key info only: ..."
            }
          ],
          "temperature": 0.3
        }
      ]
    }
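
    To make the inheritance rule concrete, the Python sketch below computes what each fallback attempt would look like once the original fields are copied over and the overrides applied. This runs entirely client-side for illustration. The resolve_fallbacks helper is hypothetical, and the shallow, key-by-key merge is an assumption; the source only says that fields are copied and that you specify what to override.

```python
def resolve_fallbacks(request: dict) -> list[dict]:
    """Return the effective request for each fallback, in order.

    Assumption: an override replaces a top-level field wholesale (shallow merge).
    """
    # Fields that configure fallback behavior are not part of the inherited base.
    base = {
        key: value
        for key, value in request.items()
        if key not in ("fallbacks", "fallback_config")
    }
    return [{**base, **override} for override in request.get("fallbacks", [])]


original = {
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Summarize this meeting: ..."}],
    "temperature": 0.2,
    "fallbacks": [
        {"model": "claude-sonnet-4-6"},
        {"model": "claude-sonnet-4-6", "temperature": 0.3},
    ],
}

effective = resolve_fallbacks(original)
```

    Here the first fallback inherits temperature 0.2 and the original messages, while the second changes only the temperature.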
    

    Fallback config options:

    "fallback_config": {
      "depth": 1,
      "retry": true
    }
    

    By default, if no fallbacks are set, the API will automatically retry a failed request after 500ms. For more control, set fallback_config.retry to false and implement your own exponential backoff.
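
    A minimal Python sketch of the client-side alternative follows: it sets fallback_config.retry to false on the outgoing request and retries with exponential backoff. The transport is injected as a send callable, so no particular HTTP client or endpoint is assumed; ServerError is a hypothetical stand-in for however your client surfaces a server error, and the attempt count and delays are illustrative defaults.

```python
import time
from typing import Callable


class ServerError(Exception):
    """Hypothetical stand-in for a server-error response from the Gateway."""


def send_with_backoff(
    send: Callable[[dict], dict],
    payload: dict,
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `send` with automatic retry disabled, backing off exponentially."""
    # Disable the Gateway's built-in retry so this loop is the only retry layer.
    request = {
        **payload,
        "fallback_config": {**payload.get("fallback_config", {}), "retry": False},
    }
    for attempt in range(max_attempts):
        try:
            return send(request)
        except ServerError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise ValueError("max_attempts must be at least 1")
```

    Passing sleep as a parameter keeps the function easy to test without real delays; production code would often add jitter to the delay as well.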

    AssemblyAI's LLM Gateway gives you a single API to access leading models from every major provider — with built-in resilience, load balancing, and cost tracking.

    Original source