Deepgram Release Notes

Last updated: Apr 6, 2026

  • Apr 6, 2026
    • Date parsed from source:
      Apr 6, 2026
    • First seen by Releasebot:
      Apr 6, 2026
    Deepgram logo

    Deepgram

    Two Fresh Deepgram SDKs: JavaScript v5 and Python v6 Are Now Generally Available

    Deepgram releases major JavaScript SDK v5 and Python SDK v6, bringing stable, production-ready updates with generated APIs, better TypeScript and WebSocket support, custom transports, and a new SageMaker transport for the Python SDK.

    These two major releases are now stable and ready for production use. Here's the latest and greatest from each release:

    Deepgram JavaScript SDK v5: A New Architecture, Better TypeScript

    Install:

    npm install @deepgram/sdk
    

    The biggest change in v5? The SDK is now automatically generated directly from our API specs using Fern! Previously, types were maintained by hand... which sounds fine until you've spent 20 minutes chasing a type error on a field the API never actually makes optional. Generated types mean the TypeScript reflects what the API actually does, not what someone thought it did when they last updated the definitions.

    New API features also land in the SDK through automated generation PRs, rather than waiting on a manual type update. Filler words, agent TTS provider fallback, mip_opt_out, ttl_seconds — they're all there. Flux model support is in too.

    Before (v4):

    import { Deepgram } from "@deepgram/sdk";
    const deepgram = new Deepgram(process.env.DEEPGRAM_API_KEY!);
    const { result, error } = await deepgram.transcription.preRecorded({ url: "https://example.com/audio.wav" }, { model: "nova-2" });
    const transcript = result?.results?.channels?.[0]?.alternatives?.[0]?.transcript ?? "";
    

    After (v5):

    import { DeepgramClient } from "@deepgram/sdk";
    import type { ListenV1Response } from "@deepgram/sdk";
    const client = new DeepgramClient();
    const response: ListenV1Response = await client.listen.v1.media.transcribeUrl({ url: "https://example.com/audio.wav" }, { model: "nova-3" });
    const transcript = response.results.channels[0].alternatives[0].transcript;
    

    Migration guide (v4 → v5) · GitHub · npm

    Deepgram Python SDK v6: Generated WebSockets and Custom Transports

    Install:

    pip install deepgram-sdk
    

    v5 generated the REST clients from the spec... but v6 now does the same for WebSockets!

    That's it, that's the post. The live transcription, TTS, and agent WebSocket connections were hand-rolled in v5. This led to inconsistencies that were annoying at best and confusing at worst. In v6 they're generated from the AsyncAPI spec, so they behave consistently and the types actually match what comes back over the wire.

    A few other things that changed:

    • send_media() now takes raw bytes. Control messages have proper named methods — send_keep_alive(), send_finalize(), send_flush() — instead of the old send_control({"type": "..."}) pattern that required you to look up the right structure every time.
    • Types are now imported from their feature namespace (deepgram.listen.v1.types, deepgram.agent.v1.types) rather than a shared barrel. Autocomplete is actually useful now.
    • Custom transport support! You can swap out the built-in WebSocket transport with your own implementation, which is useful for testing, proxied environments, or non-standard protocols.
    • SageMaker transport! There's a first-party deepgram-sagemaker package for running Deepgram models on AWS SageMaker endpoints via HTTP/2 bidirectional streaming, under the same SDK interface.
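
    As a sketch of the first change above: send_media() takes raw bytes, so streaming a file is just a matter of reading and chunking it. The iter_chunks helper and the 8 KiB chunk size here are illustrative, not part of the SDK.

```python
# Chunk raw linear16 audio for connection.send_media(), which in v6
# accepts raw bytes directly (no wrapper object or control dict).
def iter_chunks(data: bytes, size: int = 8192):
    """Yield successive `size`-byte slices of `data`."""
    for i in range(0, len(data), size):
        yield data[i:i + size]

# Inside a live v6 connection (requires an API key) you would then do:
#   with client.listen.v2.connect(model="nova-3", encoding="linear16",
#                                 sample_rate=16000) as connection:
#       for chunk in iter_chunks(audio_bytes):
#           connection.send_media(chunk)
#       connection.send_finalize()

chunks = list(iter_chunks(b"\x00" * 20000))
print([len(c) for c in chunks])  # [8192, 8192, 3616]
```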

    Before (v5):

    from deepgram import DeepgramClient
    client = DeepgramClient()
    connection = client.listen.live(model="nova-2", encoding="linear16", sample_rate=16000)
    connection.on("open", lambda _: print("Connection opened"))
    connection.on("message", lambda msg: print(msg))
    connection.send_control({"type": "KeepAlive"})
    connection.send_control({"type": "Finalize"})
    

    After (v6):

    from deepgram import DeepgramClient
    from deepgram.core.events import EventType
    client = DeepgramClient()
    with client.listen.v2.connect(model="nova-3", encoding="linear16", sample_rate=16000) as connection:
        connection.on(EventType.OPEN, lambda _: print("Connection opened"))
        connection.on(EventType.MESSAGE, lambda msg: print(msg.type))
        connection.send_keep_alive()
        connection.send_finalize()
        connection.start_listening()
    

    Migration guide (v5 → v6) · GitHub · PyPI

    Get Started with Deepgram’s SDKs

    If you're new to Deepgram, grab a free API key and pick your SDK:

    JavaScript / TypeScript:

    npm install @deepgram/sdk
    

    Python:

    pip install deepgram-sdk
    

    These are major releases, and upgrading an existing project will likely involve breaking changes, so make sure to follow our in-depth migration guides. Run into something unexpected? Open an issue on GitHub or come find us in the Deepgram Discord.

    What's Coming for Deepgram?

    Already in the pipeline: SageMaker transport for the JavaScript SDK, on-the-fly configuration updates for both SDKs, Flux multilingual support, and new versions of the Rust and Go SDKs — plus a Java SDK that's currently in the works.

    The Deepgram Discord is the best place to follow along as these land — that's where we share early updates, answer questions from developers actively building, and take feedback that ends up shaping what we prioritise next.

    Original source
  • Apr 3, 2026
    • Date parsed from source:
      Apr 3, 2026
    • First seen by Releasebot:
      Apr 4, 2026

    Deepgram

    April 3, 2026

    Deepgram adds NVIDIA as a supported LLM provider for the Voice Agent API, bringing new Standard-tier model options for high-accuracy reasoning and cost-efficient agentic tasks.

    NVIDIA LLM provider now available

    NVIDIA is now a supported LLM provider for the Voice Agent API. Two models are available in the Standard pricing tier:

    • llama-nemotron-super-49B — Llama Nemotron Super 49B delivers high accuracy for multi-agentic reasoning.
    • nemotron-3-nano-30B-A3B — Nemotron 3 Nano 30B A3B provides cost efficiency with high accuracy for targeted agentic tasks.

    Set the provider type to nvidia in your agent configuration:

    {
      "agent": {
        "think": {
          "provider": {
            "type": "nvidia",
            "model": "llama-nemotron-super-49B",
            "temperature": 0.7
          }
        }
      }
    }
    

    NVIDIA is a managed provider, so the endpoint field is optional. For the full list of supported models and pricing tiers, see the Voice Agent LLM Models documentation.

    Original source

  • Apr 2, 2026
    • Date parsed from source:
      Apr 2, 2026
    • First seen by Releasebot:
      Apr 3, 2026

    Deepgram

    April 2, 2026

    Deepgram releases its April 2026 Self-Hosted update with container image refreshes, a certificate endpoint fix, and model name consistency improvements for /v1/models, plus general software upkeep.

    Deepgram Self-Hosted April 2026 Release (260402)

    Container Images (release 260402)

    quay.io/deepgram/self-hosted-api:release-260402

    Equivalent image to:

    quay.io/deepgram/self-hosted-api:1.181.3

    quay.io/deepgram/self-hosted-engine:release-260402

    Equivalent image to:

    quay.io/deepgram/self-hosted-engine:3.114.5

    Minimum required NVIDIA driver version:

    >=570.172.08

    quay.io/deepgram/self-hosted-license-proxy:release-260402

    Equivalent image to:

    quay.io/deepgram/self-hosted-license-proxy:1.10.1

    quay.io/deepgram/self-hosted-billing:release-260402

    Equivalent image to:

    quay.io/deepgram/self-hosted-billing:1.13.0

    This Release Contains The Following Changes

    • Certificate Endpoint Fix — Engine now responds to /v1/certificates in addition to /certificates, consistent with the other container images. See Certificate Status for details.
    • Model Name Consistency — The /v1/models endpoint now returns a canonical_name field matching the model name used in /v1/listen requests.
    • General Improvements — Keeps our software up-to-date.

    Original source
  • Mar 31, 2026
    • Date parsed from source:
      Mar 31, 2026
    • First seen by Releasebot:
      Apr 1, 2026

    Deepgram

    March 31, 2026

    Deepgram expands Nova-3 with new Mandarin language support for Simplified and Traditional Chinese codes.

    Nova-3 Model Update

    🌏 Nova-3 now supports the following new languages and language codes:

    • Chinese (Mandarin, Simplified): zh, zh-CN, zh-Hans
    • Chinese (Mandarin, Traditional): zh-TW, zh-Hant

    Access these models by setting model="nova-3" and the relevant language code in your request.
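
    As a concrete sketch, a pre-recorded request carries the model and language as query parameters on the standard /v1/listen endpoint; the audio URL and API key below are placeholders.

```python
# Build a /v1/listen request URL for Nova-3 Mandarin transcription.
from urllib.parse import urlencode

def listen_url(model: str, language: str) -> str:
    return "https://api.deepgram.com/v1/listen?" + urlencode(
        {"model": model, "language": language}
    )

print(listen_url("nova-3", "zh-CN"))
# → https://api.deepgram.com/v1/listen?model=nova-3&language=zh-CN
# POST this URL with an "Authorization: Token <DEEPGRAM_API_KEY>" header
# and a JSON body such as {"url": "https://example.com/audio.wav"}.
```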

    Learn more about Nova-3 and supported languages on the Models and Language Overview page.

    Original source
  • Mar 26, 2026
    • Date parsed from source:
      Mar 26, 2026
    • First seen by Releasebot:
      Mar 27, 2026

    Deepgram

    March 26, 2026

    Deepgram adds TTS speed controls and updated Voice Agent LLM models, including Early Access speaking-rate control for Deepgram TTS and new OpenAI models in Standard pricing. It also deprecates gemini-2.0-flash in favor of newer Gemini options.

    TTS speed controls & updated LLM models

    TTS speak speed (Early Access)

    You can now control the speaking rate of Deepgram TTS in the Voice Agent API using the agent.speak.provider.speed parameter. This parameter accepts a float value between 0.7 and 1.5, with 1.0 as the default.
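
    As a sketch, the parameter slots into the agent's speak block like this; the aura-2-thalia-en voice is illustrative, and only the agent.speak.provider.speed field and its 0.7–1.5 range come from this note.

```json
{
  "agent": {
    "speak": {
      "provider": {
        "type": "deepgram",
        "model": "aura-2-thalia-en",
        "speed": 1.2
      }
    }
  }
}
```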

    This feature is in Early Access and is only available for Deepgram TTS. For more details, see TTS voice controls. To request access, contact your Account Executive or reach out to [email protected].

    Updated LLM models

    New OpenAI models — Two new models are now available in the Standard pricing tier:

    • gpt-5.4-nano
    • gpt-5.4-mini

    Gemini 2.0 Flash deprecated — The gemini-2.0-flash model is now deprecated. We recommend migrating to gemini-2.5-flash or a newer Gemini model. See the Google models table for alternatives.

    For the full list of supported models and pricing tiers, see the Voice Agent LLM Models documentation.

    Original source
  • Mar 19, 2026
    • Date parsed from source:
      Mar 19, 2026
    • First seen by Releasebot:
      Mar 20, 2026

    Deepgram

    March 19, 2026

    Deepgram ships its March 2026 self-hosted release with a Flux regression fix, broader Nova-3 language support, new Flux status metrics, a certificate status endpoint, and configurable log formats, while also keeping the software up to date.

    Deepgram Self-Hosted March 2026 Release (260319)

    Container Images (release 260319)

    quay.io/deepgram/self-hosted-api:release-260319

    Equivalent image to:

    quay.io/deepgram/self-hosted-api:1.180.1

    quay.io/deepgram/self-hosted-engine:release-260319

    Equivalent image to:

    quay.io/deepgram/self-hosted-engine:3.114.4

    Minimum required NVIDIA driver version:

    >=570.172.08

    quay.io/deepgram/self-hosted-license-proxy:release-260319

    Equivalent image to:

    quay.io/deepgram/self-hosted-license-proxy:1.10.1

    quay.io/deepgram/self-hosted-billing:release-260319

    Equivalent image to:

    quay.io/deepgram/self-hosted-billing:1.13.0

    This Release Contains The Following Changes

    • Flux Regression Fix — Resolves Flux support regression from the 260305 release. See Deploy Flux Model (STT) for deployment details.
    • Nova-3 Language Expansion — New models: Thai (th, th-TH), Chinese Cantonese Traditional (zh-HK). Improved models: Bengali (bn), Marathi (mr), Tamil (ta), Telugu (te). See the full announcement for details.
    • Flux Status Metrics — Self-hosted status endpoint now includes Flux stream metrics. See Status Endpoint for details.
    • Certificate Status Endpoint — New /v1/certificates endpoint on all container images returns beginning-of-support, end-of-support, and end-of-life dates. See Certificate Status for details.
    • Log Formats — New configurable log output formats: Full, Compact, Pretty, Json. See Log Formats for configuration details.
    • General Improvements — Keeps our software up-to-date.
    Original source
  • Mar 17, 2026
    • Date parsed from source:
      Mar 17, 2026
    • First seen by Releasebot:
      Mar 18, 2026

    Deepgram

    March 17, 2026

    Deepgram releases Nova-3 update adding zh-HK Cantonese and Thai (th, th-TH) plus improved models for bn, mr, ta, te.

    Nova-3 Model Update

    🌏 Nova-3 now supports the following new languages and language codes:

    • Chinese (Cantonese, Traditional): zh-HK
    • Thai: th, th-TH

    🚀 Also releasing improved Nova-3 models for the following languages:

    • Bengali (bn)
    • Marathi (mr)
    • Tamil (ta)
    • Telugu (te)

    Access these models by setting model="nova-3" and the relevant language code in your request.
    

    Learn more about Nova-3 and supported languages on the Models and Language Overview page.

    Original source
  • Mar 16, 2026
    • Date parsed from source:
      Mar 16, 2026
    • First seen by Releasebot:
      Mar 17, 2026
    • Modified by Releasebot:
      Mar 18, 2026

    Deepgram

    March 16, 2026

    Deepgram releases expanded Voice Agent API with new LLM models GPT-5.3, GPT-5.4 and Gemini 3.1 Flash Lite, plus GPT-5.2 Instant pricing fix.

    🤖 New LLM Models Support & Bug Fixes

    We’ve added support for new LLM models in the Voice Agent API:

    • OpenAI GPT-5.3 Instant (gpt-5.3-chat-latest)
    • OpenAI GPT-5.4 (gpt-5.4)
    • Google Gemini 3.1 Flash Lite (gemini-3.1-flash-lite)

    Example:

    {...}
    

    For the full list of supported models and pricing tiers, visit our Voice Agent LLM Models documentation.
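
    As a sketch, selecting one of the new models would mirror the NVIDIA think-provider configuration shown earlier in this feed (April 3 entry); treat the open_ai type string and exact field layout as assumptions, not confirmed syntax.

```json
{
  "agent": {
    "think": {
      "provider": {
        "type": "open_ai",
        "model": "gpt-5.4",
        "temperature": 0.7
      }
    }
  }
}
```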

    Fixes

    Resolves an issue where the GPT-5.2 Instant model used an incorrect model ID and pricing tier. The model now uses the correct ID (gpt-5.2-chat-latest) and is assigned to the Advanced tier.

    Original source
  • Mar 12, 2026
    • Date parsed from source:
      Mar 12, 2026
    • First seen by Releasebot:
      Mar 12, 2026

    Deepgram

    Deepgram STT, Now Native on Together AI

    Deepgram partners with Together AI to bring native STT on the platform, letting teams run STT, LLM, and TTS in one place with Deepgram powering the speech layer. It simplifies stacks, lowers latency, and keeps transcripts and logs centralized for production voice agents.

    Deepgram speech-to-text is now natively available on Together AI, so you can run your STT, LLM, and TTS on a single platform while keeping the speech layer powered by Deepgram.

    By Pippin Bongiovanni, Senior Product Marketing Manager, Partner Marketing

    Updated Mar 12, 2026

    Real-time voice agents only work if they can listen and respond as quickly as a human.

    That comes down to three things working together: fast infrastructure, reliable models, and a simple way to run them in one place. Today, we’re making that last part easier. Deepgram speech-to-text is now natively available on Together AI, so you can run your STT, LLM, and TTS on a single platform while keeping the speech layer powered by Deepgram.

    "Speed and accuracy are non-negotiable for production voice agents. Voice capabilities powered by Deepgram give Together AI developers a reliable speech layer that keeps up with real-time conversation, all within our co-located infrastructure."

    • Arielle Fidel, VP Strategic Partnerships, Together AI

    What’s New

    If you’re already building on Together AI, you don’t need to rethink your stack to get Deepgram. You can now pick Deepgram as the STT engine inside your existing Together AI voice pipelines and keep everything—audio, tokens, and logs—on one platform.

    With this integration, you can:

    • Choose Deepgram as your STT engine inside Together AI’s voice pipelines.
    • Keep STT, LLM, and TTS co-located instead of hopping across multiple vendors.
    • Use one API and one bill, while still getting Deepgram quality on every transcript.
    • Maintain access to the full transcript and response text for logging, QA, and routing.

    No extra glue code. No new vendor to wire in. Just Deepgram inside your existing Together AI setup.

    Why This Matters for Voice Agents

    Teams building voice agents tend to run into the same problems as they move from demo to production: latency creeps up, accuracy drops in real-world environments, and operations get complicated fast.

    This integration is designed to reduce that friction.

    • Faster turn-taking. With Deepgram hosted directly on Together AI, audio doesn’t have to leave the environment just to be transcribed. That keeps end-to-end latency low enough that users can interrupt, clarify, and keep talking without awkward gaps.
    • Better understanding of real calls. Deepgram is tuned for real customer audio—contact centers, financial calls, healthcare workflows, sales conversations—not just clean lab recordings. That means fewer misheard entities, fewer “sorry, can you repeat that?”, and smoother handoffs to humans when needed.
    • A simpler stack to run. You get Together AI’s unified control plane plus Deepgram at the speech layer. Instead of juggling multiple dashboards and support paths, you can focus on how your agent behaves, not how the plumbing is wired together.

    If you already rely on Together AI, this is the most direct way to upgrade your STT without rebuilding your architecture.

    How Deepgram Fits into Your Voice Stack

    Deepgram is the voice layer that sits underneath your agent experience. Our platform covers the full surface area of voice:

    • Speech-to-Text (Nova & Flux) for real-time and batch transcription, tuned for accuracy and low latency.
    • Text-to-Speech (Aura) for natural voices that are built for production, not just demos.
    • Voice Agent API that combines STT, orchestration, and TTS into a single real-time API for teams who want Deepgram to run the full conversational pipeline.

    With Deepgram now hosted on Together AI, you have options for how deep you go:

    • You can stay on Together AI for model hosting and orchestration, and simply choose Deepgram as your embedded STT engine there. This keeps your LLMs, TTS, and infrastructure where they are today, while upgrading the speech recognition layer.
    • Or you can pair this with Deepgram’s own Voice Agent API, dedicated environments, or self-hosted deployments when you need more control over compliance, routing, or multi-region architecture.

    In both cases, Deepgram becomes the speech backbone for voice agents that have to work in production, not just in slideware.

    See It in Action and Start Building

    You don’t have to imagine how this feels: you can call it.

    Call the live demo. Dial (847) 851-4323 to talk to a real-time voice agent running on Together AI’s co-located pipeline. Interrupt it mid-sentence, change topics, and notice how quickly it recovers.

    Use the Together AI docs. Follow their voice quickstart and explore the Together AI voice platform to configure a pipeline, then plug in Deepgram STT as your transcription layer.

    Explore Deepgram. Visit deepgram.com to learn more about our Speech-to-Text, Text-to-Speech, and Voice Agent APIs, and to grab an API key for your own apps.

    We’re excited to see what you build when Together AI’s infrastructure and Deepgram’s voice AI are working side by side: one platform, real-time performance, and speech that just works.

    Original source
  • Mar 10, 2026
    • Date parsed from source:
      Mar 10, 2026
    • First seen by Releasebot:
      Mar 11, 2026
    • Modified by Releasebot:
      Mar 17, 2026

    Deepgram

    March 10, 2026

    Deepgram releases updated Nova-3 Swedish and Dutch models with improved accuracy for streaming and batch transcription.

    Nova-3 Model Update

    🎯 Nova-3 Swedish and Dutch Model Enhancements

    We’ve released updated Nova-3 Swedish and Nova-3 Dutch models, offering improved accuracy for both streaming and batch transcription.

    Access these models by setting model: "nova-3" and the relevant language code:

    • Swedish (sv, sv-SE)
    • Dutch (nl)

    Learn more about Nova-3 on the Models and Language Overview page.

    Original source
