Deepgram Release Notes

74 release notes curated from 128 sources by the Releasebot Team. Last updated: May 12, 2026

  • May 12, 2026
    • Date parsed from source:
      May 12, 2026
    • First seen by Releasebot:
      May 12, 2026
    Deepgram

    May 12, 2026

    Deepgram releases SDK updates across JavaScript, Rust, Python, and Java, adding Flux multilingual support, restoring the Agent interface, fixing WebSocket query handling, and improving reconnect behavior with new transport customization options and breaking API updates.

    SDK releases

    A new round of SDK updates is now available across JavaScript, Rust, Python, and Java. This release brings Flux multilingual support to Rust, restores the Agent interface in JavaScript, ships a Python bugfix for WebSocket query parameters, and delivers a breaking Java release with reconnect improvements.

    JavaScript SDK v5.2.0

    Deepgram JavaScript SDK v5.2.0 is now available. This release restores the Agent interface and adds AgentReference for string-ID flows, aliases AgentV1SettingsAgentListenProvider to AgentContextListenProvider, and preserves AgentV1Settings.Agent sub-types so existing agent code continues to compile.

    For release details, see deepgram-js-sdk v5.2.0.

    Rust SDK 0.10.0

    Deepgram Rust SDK 0.10.0 is now available. This release adds Flux multilingual support with Model::FluxGeneralMulti, OptionsBuilder::language_hint for BCP-47 language hints, and new TurnInfo fields (languages and languages_hinted). It also introduces mid-session reconfiguration via FluxHandle::configure(ConfigureRequest) for adjusting thresholds, keyterms, and language hints without restarting the WebSocket.

    This release includes a breaking change: FluxResponse::TurnInfo is now #[non_exhaustive].

    For release details, see deepgram-rust-sdk 0.10.0.

    Python SDK v7.1.1

    Deepgram Python SDK v7.1.1 is now available. This patch release fixes boolean query parameters on WebSocket connect, which are now lowercased to match what the API expects.
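
    A minimal sketch of the kind of mismatch this fix addresses, not the SDK's actual code: Python renders True as "True" when it is stringified, while the note above says the API expects the lowercase form. The helper name encode_query and the parameter names are illustrative assumptions.

```python
from urllib.parse import urlencode


def encode_query(params: dict) -> str:
    """Serialize query parameters, rendering booleans as lowercase strings."""
    normalized = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    return urlencode(normalized)


# Naive serialization keeps Python's capitalized boolean.
naive = urlencode({"interim_results": True})
# Normalized serialization emits the lowercase form.
fixed = encode_query({"interim_results": True, "model": "nova-3"})
```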

    For release details, see deepgram-python-sdk v7.1.1.

    Java SDK v0.4.0

    Deepgram Java SDK v0.4.0 is now available. This release ships reconnect and listener bug fixes, adds a transport factory policy hook for customizing transport behavior (timeouts, proxies, TLS) without subclassing the client, and incorporates the latest API surface updates.

    This release includes breaking changes. For the full release notes, see deepgram-java-sdk v0.4.0.

    Original source
  • May 12, 2026
    • Date parsed from source:
      May 12, 2026
    • First seen by Releasebot:
      May 12, 2026
    Deepgram

    May 12, 2026

    Deepgram expands Nova-3 multilingual numerals support, converting spoken numbers to digits across more languages.

    Nova-3 Multilingual Model Update

    Numerals Support Expanded for Nova-3 Multilingual

    Numeral formatting is now supported for all Nova-3 multilingual languages — except Hindi and Japanese. This enhancement means Nova-3 multilingual can now convert spoken numbers to digits (e.g., “three hundred” → “300”) for English, Spanish, French, German, Russian, Portuguese, Italian, and Dutch.

    To use this feature, set model="nova-3" and language="multi". Then include the numerals=true parameter in your request.
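
    As a rough sketch, the three settings above can be assembled into a request URL as follows. The base URL is an assumption pieced together from the host in the raw WebSocket endpoint and the /v1/listen path quoted elsewhere in these notes, and numerals_url is a hypothetical helper; check both against the API reference before relying on them.

```python
from urllib.parse import urlencode

# Assumed base URL; confirm the host and path against the API reference.
BASE_URL = "https://api.deepgram.com/v1/listen"


def numerals_url() -> str:
    """Build a request URL carrying the three settings named above."""
    query = urlencode({"model": "nova-3", "language": "multi", "numerals": "true"})
    return f"{BASE_URL}?{query}"
```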

    Learn more about how Numerals works and see supported languages on the Numerals page.

    Original source
  • May 11, 2026
    • Date parsed from source:
      May 11, 2026
    • First seen by Releasebot:
      May 12, 2026
    • Modified by Releasebot:
      May 12, 2026
    Deepgram

    May 11, 2026

    Deepgram releases the Browser Agent SDK, a new set of composable packages that connect web apps to the Voice Agent API, plus a simpler docs structure for Voice Agent. The SDK includes a drop-in widget, React components, a provider and hooks, and a framework-agnostic core.

    Browser Agent SDK

    The Browser Agent SDK is now available — four composable packages that connect any web app to the Voice Agent API:

    @deepgram/agents-widget — drop-in widget with six layouts (sidebar, floating, inline, button, embedded, or orb). No framework required.

    @deepgram/ui — pre-built React components (conversation view, animated orb, mic/speaker controls, waveform visualizer) styled through CSS custom properties.

    @deepgram/react — provider + hooks for state, conversation history, microphone control, audio playback, and client-side function calling.

    @deepgram/agents — the framework-agnostic core: WebSocket client, microphone capture, and player.

    Each layer builds on the one below it, so installing the higher layer pulls in everything beneath. All layers share the same reconnection logic, playback-aware mode tracking, audio buffering, optional Silero VAD, KeepAlive pings, and typed event emitter.

    Install the widget and ship in minutes:

    For the full architecture, package-by-package guides, and live in-page demos, see the Browser Agent SDK overview.

    Voice Agent docs restructure

    The Voice Agent section has been reorganized into five sections — Get Started, Build, Integrate, Reference, and Tips & Migration — to make it easier to find content based on where you are in your build. As part of the same pass, a few closely related reference pages have been merged (for example, prompt-updated, speak-updated, and think-updated are now consolidated into Acknowledgements, and the errors and warning pages are now Errors & Warnings). Redirects are in place, so existing links continue to work.

    Original source
  • May 5, 2026
    • Date parsed from source:
      May 5, 2026
    • First seen by Releasebot:
      May 6, 2026
    Deepgram

    Build Voice Agents in Your AI Coding Tool

    Deepgram ships new voice AI developer tooling with the dg CLI, MCP server, and deepgram/skills repo, making its APIs easier to use in Claude Code, Cursor, Windsurf, Codex, and Aider. It speeds up voice agent setup, testing, and integration from the terminal.

    Build voice agents faster with the dg CLI, MCP server, and the deepgram/skills repo.

    Three agentic engineering tools that make Deepgram a first-class citizen in Claude Code, Cursor, Windsurf, Codex, and Aider.

    Voice AI builders have a new default development environment, and it isn't a browser tab. It's an AI coding tool. Claude Code, Cursor, Windsurf, Codex, Aider. The agent reads your repo, writes the integration, runs the tests, and ships the PR. The bottleneck is no longer typing speed. It's how well your AI coding tool understands the APIs you're trying to use.

    Most voice AI builders hit the same wall. The agent gets to the speech layer and stalls. It guesses at endpoint shapes. It hallucinates parameters. It writes against a curl example from two model versions ago. You end up pasting docs into the prompt, copy-pasting from the dashboard, or writing scaffolding by hand and letting the agent fill in the rest. Every voice agent integration eats more developer time and more agent tokens than it should at the part of the stack that should be the easiest.

    We fixed that.

    Three Tools Shipped Together

    In April we shipped three pieces of agentic engineering tooling that work together as one platform layer for voice AI builders.

    The dg CLI.

    A terminal interface for Deepgram with 25+ commands. Transcribe a file, a URL, a microphone, or a piped audio stream. Generate speech with Aura. Run text intelligence on a transcript. Manage projects, keys, members, and usage. Auto-detects Claude Code, Aider, and Codex and switches to JSON output and stderr-routed status without flags. UNIX-friendly by design, with structured stdout, proper exit codes, and pipe support. MIT license, Python 3.10+. Install at cli.deepgram.com.

    The MCP server.

    A built-in Model Context Protocol proxy that connects your AI coding tool to Deepgram's API. Start it with dg mcp. One tool surface, with the agent able to transcribe audio, generate speech, list models, manage projects, and more. Auth handled locally via dg login. Plugs into Claude Code, Cursor, Windsurf, or any MCP-aware tool.

    The deepgram/skills repo.

    Agent skills are markdown instruction folders that your AI coding tool loads on demand. Six product-level skills cover API reference (api), docs navigation (docs), runnable starter apps (starters), feature-specific recipes (recipes), third-party integrations (examples), and MCP setup (setup-mcp). Per-language SDK skills layer on top for Python, JavaScript/TypeScript, Java, Go, Rust, Swift, Kotlin, .NET, and browser TS. The CLI handles the core install for you, and a single command brings in the rest (more below).

    What You Can Build With Them

    A voice agent prototype before lunch.
    Pull a starter app with a skill. Wire it to your LLM. Pipe a test audio file through the CLI to confirm transcription. Generate speech for the agent's response. Iterate without leaving the terminal or your AI coding tool's context window.

    An integration that doesn't drift.
    Skills update with the product. Your AI coding tool reads the current API surface, not a stale model-trained guess. The recipes it pulls are real recipes, not hallucinations.

    A multi-language stack on one platform.
    The product-contract skills tell the agent what Deepgram does. The SDK skills tell it how to call Deepgram in your language. Same platform, same primitives, different language idioms.

    A faster eval-to-prototype loop.
    Transcribe a sample, generate speech, inspect a project, all from the same shell. The CLI is also useful when you want to test a hypothesis quickly without writing app code.

    How It Works

    Install the CLI:

    curl -fsSL deepgram.com/install.sh | sh
    

    Log in. The CLI detects which AI coding tools you have installed (Claude Code, Codex, Gemini CLI, Cursor, Cline) and offers to install the four core Deepgram skills (api, docs, starters, setup-mcp) into each.

    dg login
    

    For the full skill set, including recipes and integration examples, use the universal installer:

    npx skills add deepgram/skills
    

    Or for Claude Code natively, register the plugin marketplace:

    /plugin marketplace add deepgram/skills
    /plugin install deepgram@deepgram-agent-skills
    

    If you skip the dg login prompt or add a new AI coding tool later, run dg skills install to set up the core skills on demand.

    Transcribe a file:

    dg listen call.mp3 | jq '.results.channels[0].alternatives[0].transcript'
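
    The same extraction can be sketched in Python for cases where jq is not available. The path mirrors the jq filter above; the sample payload is hypothetical and trimmed to the fields that filter reads.

```python
import json


def first_transcript(response_text: str) -> str:
    """Return results.channels[0].alternatives[0].transcript from a JSON response."""
    payload = json.loads(response_text)
    return payload["results"]["channels"][0]["alternatives"][0]["transcript"]


# Hypothetical response, reduced to the fields the jq filter touches.
sample = '{"results": {"channels": [{"alternatives": [{"transcript": "hello world"}]}]}}'
```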
    

    Generate speech to your speaker:

    dg speak "Hello from Deepgram" | ffplay -nodisp -autoexit -
    

    Start the MCP server for your AI coding tool:

    dg mcp
    

    Your AI coding tool now has structured Deepgram knowledge loaded as skills, can pull the right starter for a given use case, and can call the API directly via MCP when you want it to.

    Get Started

    Install the CLI →

    Get the skills →

    CLI README and reference →

    The story for voice AI builders is no longer "Deepgram is an API you integrate." It's "Deepgram is a platform your AI coding tools already understand." That's what changes when the bottleneck moves from typing to context. We're going to keep building toward this. Skills are easy to extend, MCP capabilities are growing, and the CLI is going to get better the more we hear from builders. Tell us what you want next.

    Original source
  • Apr 30, 2026
    • Date parsed from source:
      Apr 30, 2026
    • First seen by Releasebot:
      May 1, 2026
    • Modified by Releasebot:
      May 2, 2026
    Deepgram

    April 30, 2026

    Deepgram releases its April 2026 self-hosted update with Nova-3 Gujarati support, Aura-2 speed and pronunciation controls, and stronger Voice Agent capabilities. It also improves numeral formatting, multilingual tagging, and redaction accuracy.

    Deepgram Self-Hosted April 2026 Release (260430)

    Container Images (release 260430)

    • quay.io/deepgram/self-hosted-api:release-260430
      Equivalent image to: quay.io/deepgram/self-hosted-api:1.185.0-2
    • quay.io/deepgram/self-hosted-engine:release-260430
      Equivalent image to: quay.io/deepgram/self-hosted-engine:3.116.0-1
      Minimum required NVIDIA driver version: >=570.172.08
    • quay.io/deepgram/self-hosted-license-proxy:release-260430
      Equivalent image to: quay.io/deepgram/self-hosted-license-proxy:1.10.1
    • quay.io/deepgram/self-hosted-billing:release-260430
      Equivalent image to: quay.io/deepgram/self-hosted-billing:1.13.0

    Aura-2 Speed and Pronunciation Controls require an updated voice-pack

    The new Aura-2 Speed and Pronunciation Control features in this release are powered by an updated Aura-2 English voice-pack model. If your deployment is using an Aura-2 English voice-pack from before the April 2026 release (e.g., the 2025-04-15.0 version of the voice-pack), requests including the speed or pronounce parameters will return 400 Bad Request.

    To enable these features, contact your Deepgram representative to obtain the latest Aura-2 English voice-pack (2025-04-15.4 or later) and replace the existing voice-pack file in your models directory. The official Deepgram Helm chart and sample values files in deepgram/self-hosted-resources (chart 0.34.0 and later) already point to the correct UUID; you only need to use the latest Deepgram configuration files and update the model file on disk.
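
    A deployment script could guard against the 400 Bad Request case by checking the voice-pack version first. This is a sketch that assumes the version string always has the form YYYY-MM-DD.N and that N orders numerically, which the examples above suggest but do not state; both function names are hypothetical.

```python
def parse_voice_pack_version(version: str) -> tuple:
    """Split 'YYYY-MM-DD.N' into a tuple of integers that compares naturally."""
    date_part, _, revision = version.partition(".")
    year, month, day = (int(piece) for piece in date_part.split("-"))
    return (year, month, day, int(revision or 0))


def supports_voice_controls(version: str, minimum: str = "2025-04-15.4") -> bool:
    """True when the voice-pack version is at or above the stated minimum."""
    return parse_voice_pack_version(version) >= parse_voice_pack_version(minimum)
```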

    This Release Contains The Following Changes

    • Nova-3 Gujarati — Nova-3 now supports Gujarati (gu) for both batch and streaming.
    • Aura-2 Speed and Pronunciation Controls — Aura-2 TTS voices now support runtime speed and pronunciation control. See Voice Controls for details.
    • Improved Aura-2 Pronunciation — Better pronunciation for Spanish dates and the term “Jan” (as a name versus a month) with Aura-2 voices.
    • Nova-3 Multilingual Numeral Formatting — Numeral formatting is now applied when using Nova-3 multilingual models and smart_format or numerals is enabled.
    • Numeral Formatting for Hebrew and Romanian — Numeral formatting is now applied for Nova-3 Hebrew (he) and Romanian (ro) when smart_format or numerals is enabled.
    • Voice Agent: Cartesia Speed Control — The Cartesia speak provider now supports speed control in Voice Agent sessions.
    • Voice Agent: Improved Agent Message Injection — Improved support for injecting agent messages into a live session. See Inject Agent for details.
    • Voice Agent: Multilingual Flux Language Hints — Multilingual Flux now accepts language hints when used as the STT provider in a Voice Agent session.
    • Improved Multilingual Streaming Language Tags — Improves the accuracy of language tag results on /v1/listen streaming requests using multilingual models.
    • Improved Numeral Redaction Accuracy — Improved redaction accuracy when using redact=numbers or redact=aggressive_numbers.
    • General Improvements — Keeps our software up-to-date.
    Original source
  • Apr 30, 2026
    • Date parsed from source:
      Apr 30, 2026
    • First seen by Releasebot:
      May 1, 2026
    Deepgram

    Introducing Flux Multilingual: One Conversational Speech Model for Global Voice Agents

    Deepgram launches Flux Multilingual, a generally available conversational speech recognition model for voice agents in 10 languages. It brings real-time language detection, code-switching, turn detection, and interruption handling to a single API with low-latency streaming and cloud or self-hosted deployment.

    Conversational speech recognition across 10 languages in a single API. Monolingual-grade accuracy without per-language infrastructure or rebuilding your stack.

    Flux is the first conversational speech model built for voice agents in English, unifying turn detection, interruption handling, and transcription in a single real-time architecture.

    Today, it goes global.

    Introducing Flux Multilingual: one conversational STT model for real-time voice agents, now in 10 languages — English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch. Language detection, code-switching, turn detection, and interruption handling all run natively through a single streaming connection.

    Extending voice agents across languages no longer means stitching together monolingual models, routing layers, and detection logic. Developers build against one conversational model and a single API, with monolingual-grade accuracy across languages.

    If you’re already on Flux, it’s a one-line change: swap flux-general-en for flux-general-multi.

    Same API, same streaming semantics, same integration.

    Flux Multilingual is supported through partner integrations with Twilio, Vapi, LiveKit, Pipecat, and Jambonz.

    Start building today →

    Try it in the Playground →

    “Customers told us that Flux transformed what's possible for real-time voice AI agents in English. It stood to reason that Deepgram would solve this globally too. Our customers' teams no longer need to sacrifice accuracy with legacy multilingual systems, nor stitch multiple models with complex routing themselves. With Flux Multilingual, teams take the exact conversational experience they built for English and extend it across languages with a single system.” - Omar Paul, VP of Products, Twilio

    Build Once, Deploy Globally

    Traditional approaches to multilingual voice systems often rely on multiple components to approximate a single conversational experience, such as separate models, language handling, and system logic. Each new language adds additional infrastructure and system overhead, driving up latency and making systems harder to maintain.

    Teams are forced into a tradeoff: systems that prioritize accuracy often introduce latency, while systems optimized for speed struggle to maintain conversational performance in real time.

    Neither approach delivers the responsiveness and conversational performance required for production voice agents.

    Flux Multilingual collapses detection, routing, and per-language models into a single conversational model and a single streaming connection. Instead of managing a system of components, developers build against one model that handles speech recognition, turn detection, language detection, and code-switching in real time.

    There’s no routing logic to maintain, no orchestration layer to debug, and no tradeoff between conversational performance and global coverage.

    In the Voice Agent API, Flux Multilingual pairs with the new auto_language_detection setting: configure your TTS voices by language, and the agent routes to the matching voice based on what Flux detects. That's STT and TTS coordinating through a single API configuration, without a custom language-ID service or routing layer.
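
    Conceptually, the routing described here amounts to a lookup from detected language to configured voice. The sketch below only illustrates that idea and is not how the Voice Agent API implements it; pick_voice, the fallback behavior, and the voice identifiers are all hypothetical placeholders.

```python
def pick_voice(detected_languages: list, voices_by_language: dict, default_voice: str) -> str:
    """Return the voice for the first detected language that has one configured."""
    for language in detected_languages:
        if language in voices_by_language:
            return voices_by_language[language]
    # No configured voice matched, so fall back to the default.
    return default_voice


# Placeholder voice identifiers, not real model names.
voices = {"en": "voice-en", "es": "voice-es"}
```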

    “Flux Multilingual is a big step toward making global voice agents truly accessible to developers. Instead of managing multiple models and infrastructure, teams can extend the same real-time conversational experience across languages within a single system. With voice capabilities powered by Deepgram, that’s exactly the kind of simplicity developers need to scale.” - Vijay Raghunathan, VP of Engineering, Vapi

    “Scaling customer operations across languages in financial services often requires additional systems, routing logic, and operational oversight. In regulated environments, that quickly becomes difficult to manage. Flux Multilingual makes it possible to extend those workflows across markets without re-architecting the stack.” - Danai Antoniou, Chief Scientist and Co-Founder, Gradient Labs

    “For teams building and deploying voice applications, maintaining control without introducing additional complexity has always been a challenge. Multilingual systems often force tradeoffs between latency, accuracy, and operational overhead. Flux Multilingual changes that by enabling conversational speech recognition that works across languages in real time.” - Dave Horton, Founder, Jambonz

    Monolingual-Grade Accuracy with Real-Time Language Control

    Flux Multilingual delivers accuracy that was previously only achievable with dedicated monolingual models.

    To evaluate performance, we benchmarked Flux Multilingual on real-world production audio across all supported languages, measuring word error rate (WER) using each vendor’s default streaming configuration. Each system processed complete, unmodified audio to reflect how these models perform in production environments.

    Across these benchmarks, Flux Multilingual delivers best-in-class WER across the majority of supported languages on real-world production audio, including English, Spanish, German, French, Portuguese, and Hindi. Per-language results are shown below.

    Flux Multilingual gives developers precise control over how language is handled.

    At the center is a single parameter: language_hint

    A language hint tells the model which language to expect, delivering accuracy that was previously only achievable with monolingual models. Providing multiple language hints narrows the search space for multilingual environments, like global contact centers, while still allowing the model to switch between languages in real time.

    This enables four core patterns:

    • Known language → maximize accuracy by hinting a single language
    • Known language set → constrain detection without routing logic
    • Unknown language(s) → let Flux automatically detect
    • Mixed-language conversations → native code-switching

    A caller might say:

    “I need help with my cuenta.” — switching from English to Spanish within the same sentence (“cuenta” meaning “account”)

    With traditional systems, that interaction often leads to errors or degraded performance. Language detection must resolve to a dominant language, and systems may misclassify mixed-language input or fall back, introducing errors, latency, or both.

    With Flux Multilingual, this is handled natively. There’s no model switching, no detection errors, and no need for additional system logic.

    Each turn of the response returns a languages array identifying which languages were detected, providing per-turn granularity that competitors only offer at the utterance level.

    When the language is unknown, Flux Multilingual automatically detects it from audio and continues to adapt in real time, even mid-sentence.

    Language behavior isn’t fixed at connection time.

    Language settings are reconfigurable mid-conversation without reconnecting. This allows systems to detect a caller’s language on the first turn, then optimize for it across the rest of the interaction.
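
    One way to sketch the detect-then-optimize pattern: read the languages array from the first turn and keep only the codes your application supports. The payload shape here is an assumption (only a "languages" key is read), and hints_from_first_turn is a hypothetical helper.

```python
def hints_from_first_turn(turn_info: dict, allowed: set) -> list:
    """Pick language hints from the first turn's detected languages."""
    detected = turn_info.get("languages", [])
    # Keep detection order, dropping languages the application does not serve.
    return [code for code in detected if code in allowed]


# Hypothetical first-turn payload; only the "languages" field is used.
first_turn = {"languages": ["es", "en"]}
hints = hints_from_first_turn(first_turn, allowed={"en", "es", "fr"})
```

    The resulting list could then be sent as updated hints on the open connection, for example through the mid-session configure call that the SDKs expose.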

    “Evaluating multilingual voice agents consistently, especially in code-switching scenarios, has been one of the hardest problems. Flux Multilingual gives us one conversational model, making it easier to test, validate, and trust performance at scale.” - Brooke Hopkins, Founder, Coval

    “The difference between a good assistant and a great one is how well it keeps up with real conversations (not robotic ones). In multilingual settings that usually breaks down, which means missed utterances, language switching, and the natural flow of the conversation dies. Flux Multilingual keeps it going, just like talking to a friend or colleague.” - Lindy Drope, Head of Sales, Lindy AI

    Ultra-Low Latency Conversational Speech Recognition

    Multilingual systems have historically forced tradeoffs between latency, accuracy, and conversational flow. Improving one of these dimensions often comes at the expense of another. Faster systems sacrifice accuracy, while more accurate systems introduce delay. Multilingual systems struggle to maintain both.

    Flux Multilingual eliminates those tradeoffs.

    • Interruption handling is native, preserving natural conversational flow across languages
    • Streaming latency stays consistently low for real-time interaction in multilingual scenarios
    • Accuracy remains on par with dedicated single-language models, across every supported language

    This allows developers to extend conversational systems globally without degrading responsiveness, accuracy, or user experience.

    To evaluate conversational performance, we benchmarked end-of-turn (EoT) accuracy and latency on real-world production audio across all supported languages, measuring F1 score and median latency under each vendor’s default or recommended configurations. This captures how well each system determines when a speaker has finished speaking, a critical requirement for real-time voice agents.

    Across these benchmarks, Flux Multilingual delivers:

    • Highest aggregate EoT F1 across all supported languages
    • Up to 3x lower latency than competing real-time EoT systems

    This advantage comes from how turn detection is handled. Instead of relying on silence thresholds, Flux uses a learned confidence signal that understands conversational context. Because it's a confidence signal, Flux exposes it as a tunable threshold, letting developers dial toward faster responses or more conservative end-of-turn decisions per use case.
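
    The effect of a tunable threshold can be illustrated with a toy example. The confidence values and function names below are hypothetical, and the comparison is a simplification of whatever the model does internally; the point is only that a lower threshold ends the turn earlier and a higher one waits longer.

```python
def is_end_of_turn(confidence: float, threshold: float) -> bool:
    """Declare end of turn once the confidence reaches the threshold."""
    return confidence >= threshold


def first_trigger(values: list, threshold: float):
    """Index of the first frame that would end the turn, or None if none does."""
    for index, value in enumerate(values):
        if is_end_of_turn(value, threshold):
            return index
    return None


# Hypothetical per-frame confidence values rising as the speaker finishes.
confidences = [0.20, 0.55, 0.72, 0.91]
```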

    Competitors relying on silence-based approaches face a tradeoff between speed and precision. Reducing latency increases false triggers, while improving precision introduces delay. Flux avoids this tradeoff, delivering consistently strong performance across all supported languages.

    For voice agents, this is the difference between reacting to silence and understanding conversation.

    The full results are shown below.

    “Handling conversation flow correctly is one of the hardest parts of building voice agents, and it only gets more complex across languages. With real-time voice capabilities powered by Deepgram, extending turn-aware conversational behavior into multilingual interactions gives developers a much stronger foundation to build on.” - Kwindla Hultman Kramer, CEO at Daily

    “In financial services, every conversation carries real consequences, which makes accuracy and consistency non-negotiable. Flux Multilingual gives us more control over language behavior while adapting in real time, which is critical for delivering reliable multilingual experiences at scale.” - German Attanasio, CTO, Moveo.AI

    Deployment: Cloud API or Self-Hosted

    Flux Multilingual is available in two deployment modes, using the same API and integration model:

    Cloud API (Deepgram-hosted)

    • Fastest way to get started
    • Fully managed infrastructure with global and regional endpoints
    • Ideal for most production voice applications

    Self-Hosted (customer-operated)

    • Run Flux Multilingual in your own environment
    • Audio never leaves your infrastructure
    • Designed for strict data residency, privacy, security, or latency requirements

    Both deployment options use the same API, streaming semantics, and SDKs.

    Try Flux Multilingual Today

    Flux Multilingual is now generally available. The model supports English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch. Additional languages will continue to be added in future releases.

    As part of the launch, we’re offering a limited-time promotional rate on streaming speech-to-text, including Flux Multilingual and Nova-3 models.

    Flux Multilingual is supported through partner integrations with Twilio, Vapi, LiveKit, Pipecat, and Jambonz.

    Get started with Flux Multilingual via the Deepgram API for streaming speech-to-text, or through the Voice Agent API for end-to-end voice agents.

    Already using Flux EN? Change your model to flux-general-multi. Same API, same integration.

    Start building today:

    • See the live demo →
    • Try it in the Playground →
    • Get started with the API →
    • Build end-to-end with the Voice Agent API →
    • Sign up for an API key →
    • Explore the documentation →
    • Join our Discord community →

    Conversational STT for voice agents, now in every language your customers speak. Build globally with Flux Multilingual, without rebuilding your stack.

    Original source
  • Apr 29, 2026
    • Date parsed from source:
      Apr 29, 2026
    • First seen by Releasebot:
      May 1, 2026
    Deepgram

    April 29, 2026

    Deepgram adds GPT-5.5 support in Voice Agent API, expands Cartesia TTS speed control with presets and numeric values, and removes the Llama Nemotron Super 49B model from NVIDIA due to poor performance.

    LLM Model Updates & Cartesia Speed Control

    GPT-5.5 LLM Model Support

    OpenAI’s GPT-5.5 model is now available as a managed LLM in the Voice Agent API. GPT-5.5 is an Advanced tier model.

    Set the model in your agent configuration:

    {
      "agent": {
        "think": {
          "provider": {
            "type": "open_ai",
            "model": "gpt-5.5",
            "temperature": 0.7
          }
        }
      }
    }
    

    For the full list of supported models and pricing tiers, see the Voice Agent LLM Models documentation.

    Llama Nemotron Super 49B Removed

    The llama-nemotron-super-49B (Llama Nemotron Super 49B) model has been removed from the NVIDIA provider due to poor performance. The nemotron-3-nano-30B-A3B model remains available. See the NVIDIA models table for current options.

    Cartesia TTS Speed Control

    The agent.speak.provider.speed parameter now supports Cartesia TTS in addition to Deepgram TTS. For Cartesia, the parameter accepts the following preset values:

    • slowest
    • slow
    • normal
    • fast
    • fastest

    You can also pass a numerical value for more granular control. See the Cartesia speed documentation for details.
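
    A client-side check on the speed value might look like the sketch below. The preset names come from the list above; the "cartesia" provider type string is an assumption, other required provider fields are omitted, and the valid numeric range is not stated here, so any number is passed through unchecked.

```python
CARTESIA_SPEED_PRESETS = {"slowest", "slow", "normal", "fast", "fastest"}


def speak_provider(speed) -> dict:
    """Build a speak provider fragment after checking the speed value."""
    is_number = isinstance(speed, (int, float)) and not isinstance(speed, bool)
    if not is_number and speed not in CARTESIA_SPEED_PRESETS:
        raise ValueError(f"unsupported speed: {speed!r}")
    # The "cartesia" type string is assumed; other provider fields are omitted.
    return {"agent": {"speak": {"provider": {"type": "cartesia", "speed": speed}}}}
```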

    For more details, see the TTS Models documentation.

    Original source
  • Apr 29, 2026
    • Date parsed from source:
      Apr 29, 2026
    • First seen by Releasebot:
      May 1, 2026
    Deepgram

    Flux Multilingual Technical Deep Dive: Multilingual Speech-to-Text Without the Routing Mess

    Deepgram introduces Flux Multilingual, a real-time streaming STT model for 10 languages with one WebSocket connection, automatic language detection, native code-switching, and the new language_hint parameter for biased multilingual recognition and per-turn language output.

    How Deepgram Flux Multilingual and its new language_hint parameter collapse multilingual voice infrastructure into a single real-time streaming connection.

    TL;DR: Deepgram Flux Multilingual (flux-general-multi) provides real-time streaming speech-to-text for 10 languages with language_hint biasing, automatic language detection via TurnInfo, and native code-switching. One model, one WebSocket connection.

    If your real-time speech-to-text pipeline handles more than one language, you've probably built this stack: a language detection service, per-language models, and routing logic to connect them. Unlike approaches that require a separate model per language with external routing, Deepgram Flux Multilingual (flux-general-multi) handles all ten languages in a single streaming connection with automatic language detection, native code-switching, and a new language_hint parameter that biases detection toward languages you expect. If you're new to Deepgram, Flux is our real-time conversational STT model built for voice agents. It comes with turn detection, interruption handling, and barge-in awareness out of the box. Grab a free API key and follow along.

    Step 1: Connect to Deepgram Flux Multilingual

    If you're already using the Deepgram Python SDK with Flux, this is a one-line change: swap flux-general-en for flux-general-multi in your model parameter. If you're starting fresh, install the SDK (pip install deepgram-sdk) and connect:

    from deepgram import DeepgramClient
    from deepgram.core.events import EventType
    
    client = DeepgramClient("your-api-key")  # or set DEEPGRAM_API_KEY env var
    
    # Connect to Flux Multilingual — same API, new model
    connection = client.listen.v2.connect(
        model="flux-general-multi",
        encoding="linear16",
        sample_rate=16000,
    )
    

    That's the same client.listen.v2.connect() you already use for Flux. The model name is the only change. Turn detection, interruption handling, barge-in all carry over from Flux.

    If you prefer working with the WebSocket directly (or you're using a language without an official SDK), the raw endpoint works too:

    wss://api.deepgram.com/v2/listen?model=flux-general-multi&encoding=linear16&sample_rate=16000
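
    As a minimal sketch, the raw endpoint can be assembled from its query parameters instead of hand-writing the string. The helper name build_listen_url is hypothetical; the parameter names and hosts are the ones quoted in Deepgram's notes, and urlencode comes from the Python standard library.

```python
from urllib.parse import urlencode

# Hosts as quoted in Deepgram's notes; the helper itself is hypothetical.
DEFAULT_HOST = "api.deepgram.com"
EU_HOST = "api.eu.deepgram.com"


def build_listen_url(model="flux-general-multi", encoding="linear16",
                     sample_rate=16000, host=DEFAULT_HOST):
    """Assemble the /v2/listen WebSocket URL from its query parameters."""
    query = urlencode({
        "model": model,
        "encoding": encoding,
        "sample_rate": sample_rate,
    })
    return f"wss://{host}/v2/listen?{query}"


print(build_listen_url())
```

    Passing host=EU_HOST produces the EU form of the same URL. Authentication is not covered by this sketch.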
    

    All the code samples below use the Python SDK. The JavaScript and Java SDKs support the same multilingual streaming features as the Python SDK—same language_hints configuration and the same TurnInfo.languages / TurnInfo.languages_hinted outputs.

    Step 2: Choose your language_hint strategy

    This is the design decision that matters most in your integration. language_hint is a new optional parameter: an array of BCP-47 language codes that biases the model toward the languages you expect. Think of it as a prior, not a hard constraint. It narrows the model's hypothesis space without locking it in. If someone speaks a language outside your hint set, the model can still detect it.

    The right strategy depends on what your application knows when the connection opens.

    If you know the language: for example, the caller selected it, or this is a single-language queue:

    connection.send_configure(language_hints=["es"])
    

    Single-hint accuracy is close to that of a dedicated monolingual model. This is the highest-accuracy configuration you can set.

    If you know the candidate languages: for example, a support desk handling English, Spanish, and French:

    connection.send_configure(language_hints=["en", "es", "fr"])
    

    The model narrows its search space to your expected set but can switch between them turn-by-turn as the caller does.

    If you don't know: for example, a globally available, unpredictable caller base:

    Automatic language detection works across all ten supported languages. Omit language_hint entirely and Flux Multilingual auto-detects. You trade a small accuracy margin for zero configuration.

    If you expect code-switching: for example, bilingual callers mixing languages in the same sentence:

    connection.send_configure(language_hints=["en", "es"])
    

    This is the scenario that usually breaks per-language routing entirely. A caller says "I need help with my cuenta": two languages in one utterance. Flux Multilingual handles this natively because it's one multilingual model, not separate models stitched together. There's no routing decision to get wrong.

    A good rule of thumb: if your application has signal about likely languages at connection time (user selection, queue config, account locale), hint them. If it doesn't, omit hints and lean on per-turn detection.
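
    The rule of thumb can be captured in one small function. This is a sketch under stated assumptions: the function and argument names are hypothetical, and an empty result means "send no hints and let the model auto-detect".

```python
def choose_language_hints(selected_language=None, candidate_languages=None):
    """Return the hint list to send, or [] to rely on auto-detection.

    selected_language: one BCP-47 code fixed by the caller or the queue.
    candidate_languages: an iterable of codes the application expects.
    """
    if selected_language:
        # Known language: a single hint.
        return [selected_language]
    if candidate_languages:
        # Known candidate set (this also covers code-switching):
        # drop duplicates while keeping the given order.
        return list(dict.fromkeys(candidate_languages))
    # No signal at connection time: omit hints entirely.
    return []
```

    For example, a support desk might call choose_language_hints(candidate_languages=["en", "es", "fr"]), while a queue locked to Spanish would call choose_language_hints("es").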

    Step 3: Read per-turn language detection from TurnInfo

    This is where the architecture really simplifies. Instead of running a separate language detection service before your real-time STT pipeline, Deepgram Flux Multilingual tells you what language it detected on every turn as a first-class field in the transcription output.

    Every TurnInfo event now includes two new fields:

    • languages — BCP-47 codes for all languages detected in that turn, sorted by how much of the turn was in each language (primary language first). When there's no transcript, this is empty.
    • languages_hinted — the hint set that was active when this turn was processed. This is useful for debugging when behavior isn't what you expected.

    Here's what it looks like when you wire it up:

    def on_turn_info(event):
        # Act only on completed turns that produced a transcript
        if event.event == "EndOfTurn" and event.languages:
            primary_lang = event.languages[0]
            print(f"[{primary_lang}] {event.transcript}")
    
            # Use the detected language to drive downstream decisions.
            # switch_tts_voice and update_llm_prompt are application-defined helpers.
            if primary_lang == "es":
                switch_tts_voice("es")
                update_llm_prompt(language="es")
    
    connection.on(EventType.TURN_INFO, on_turn_info)
    

    The raw TurnInfo JSON looks like this if you're working at the WebSocket level:

    {
      "type": "TurnInfo",
      "event": "EndOfTurn",
      "transcript": "I need help with my cuenta",
      "languages_hinted": ["en", "es"],
      "languages": ["en", "es"],
      "words": [
        {"word": "I", "confidence": 0.98},
        {"word": "need", "confidence": 0.96},
        {"word": "help", "confidence": 0.97},
        {"word": "with", "confidence": 0.95},
        {"word": "my", "confidence": 0.97},
        {"word": "cuenta", "confidence": 0.93}
      ],
      "end_of_turn_confidence": 0.86
    }
    

    languages[0] is the primary detected language, which is the one spoken most in that turn. You can use it to switch a TTS voice, update an LLM system prompt, or route to a language-specific agent queue.
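
    For code that reads the WebSocket directly, a minimal sketch of pulling the primary language out of a raw message might look like this. It relies only on the type, event, and languages fields from the sample JSON; the function name is hypothetical, and real messages may carry fields this sketch ignores.

```python
import json


def primary_language(raw_message):
    """Return languages[0] for an EndOfTurn TurnInfo message, else None."""
    message = json.loads(raw_message)
    if message.get("type") != "TurnInfo" or message.get("event") != "EndOfTurn":
        return None
    languages = message.get("languages") or []
    # An empty list means the turn had no transcript.
    return languages[0] if languages else None


sample = json.dumps({
    "type": "TurnInfo",
    "event": "EndOfTurn",
    "transcript": "I need help with my cuenta",
    "languages_hinted": ["en", "es"],
    "languages": ["en", "es"],
})
print(primary_language(sample))  # en
```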

    One thing to watch for in production: short utterances and noisy audio can sometimes produce one-off language detections. If you're making consequential decisions based on language (switching an agent, changing a voice), it's worth requiring the same languages[0] for 2-3 consecutive turns before committing. That gives you stability without losing the ability to react when a caller genuinely switches.
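
    One way to implement that stability check is a small counter that commits to a language only after it has led several consecutive turns. This is a sketch, not part of the SDK: the class name, the required_turns default, and the choice to ignore empty turns are all assumptions to tune for your traffic.

```python
class LanguageStabilizer:
    """Commit to a language only after it leads N consecutive turns."""

    def __init__(self, required_turns=2):
        self.required_turns = required_turns
        self.committed = None    # language the application is acting on
        self._candidate = None   # language currently being counted
        self._streak = 0         # consecutive turns led by the candidate

    def observe(self, languages):
        """Feed one turn's languages list; return the committed language."""
        if not languages:
            # Empty turn: treated as no evidence, so the streak is kept.
            return self.committed
        primary = languages[0]
        if primary == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = primary, 1
        if self._streak >= self.required_turns:
            self.committed = primary
        return self.committed
```

    Call observe(event.languages) on each EndOfTurn and switch voices or queues only when the returned value changes. A single stray detection resets the candidate streak but leaves the committed language untouched.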

    Step 4: Reconfigure language_hint mid-stream

    Language hints aren't a one-time setting. If your application learns something after the stream opens, you can update hints without closing the connection:

    # Caller started in English, now clearly speaking French
    connection.send_configure(language_hints=["fr"])
    

    This is useful when a caller selects a language in an IVR menu, or when the first few turns stabilize on a language you didn't expect. A non-empty array replaces the current hints. An empty array [] clears them and returns to full auto-detect. Omitting the field leaves hints unchanged, which is useful when you're adjusting other streaming parameters (like end-of-turn thresholds) without touching language settings.
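
    The three update rules can be mirrored client-side so your application always knows which hints it expects to be active, which you can then compare against languages_hinted on later turns. The sketch below only models the rules as described; it sends nothing, and the helper and sentinel names are hypothetical.

```python
_UNCHANGED = object()  # sentinel: the hints field was omitted from the update


def apply_hint_update(current_hints, update=_UNCHANGED):
    """Return the hint list expected to be active after a reconfiguration."""
    if update is _UNCHANGED:
        return list(current_hints)  # field omitted: hints stay as they were
    if not update:
        return []                   # empty array: back to full auto-detect
    return list(update)             # non-empty array: replaces the old hints


expected = apply_hint_update(["en", "es"], ["fr"])  # replace
expected = apply_hint_update(expected)              # other settings changed only
print(expected)  # ['fr']
```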

    If a reconfiguration fails, you'll get a ConfigureFailure event. Because it's non-fatal, your stream keeps running. You can log it and decide whether to retry or fall back to a default configuration.

    Step 5: Handle the two language_hint errors that fail silently

    Design for two specific errors, both of which return 400 responses.

    Unsupported language code.

    The initial release supports ten languages. If you pass a code outside that set, you'll get a 400 error.

    language_hint on the wrong model.

    Hints only work with flux-general-multi. If a hint configuration gets applied to a flux-general-en connection, it returns a 400:

    {
      "code": "INVALID_PARAMETER",
      "description": "language_hint is not supported for model flux-general-en"
    }
    

    Both are straightforward to handle. Surface them in your logs and monitoring so they don't silently degrade your multilingual experience.
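
    Because both errors are predictable from the request itself, one option is a pre-flight check before sending hints. The sketch below is an assumption-heavy illustration: the helper name is hypothetical, and the two-letter codes are inferred from the ten launch languages (English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, Dutch), so confirm the exact accepted codes against the API reference.

```python
FLUX_MULTI_MODEL = "flux-general-multi"

# Assumption: two-letter codes inferred from the ten launch languages.
ASSUMED_SUPPORTED = frozenset(
    {"en", "es", "fr", "de", "hi", "ru", "pt", "ja", "it", "nl"}
)


def check_language_hints(model, hints, supported=ASSUMED_SUPPORTED):
    """Return a list of problems that would likely produce a 400 response."""
    problems = []
    if hints and model != FLUX_MULTI_MODEL:
        problems.append(f"language_hint is not supported for model {model}")
    for code in hints:
        if code not in supported:
            problems.append(f"unsupported language code: {code}")
    return problems
```

    An empty result means the hints pass this local check. It does not replace handling the server's 400 response, since the server remains the source of truth for what is supported.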

    What if you're migrating from a multi-model setup?

    If you're running the detection-then-routing architecture described at the top, each piece has a direct replacement. TurnInfo.languages replaces the detection service, flux-general-multi replaces the per-language models, and the routing logic reduces to reading a field from the transcription events you're already processing.

    One thing that doesn't change: flux-general-en is still available. If your application is English-only, keep using it.

    Start building with Deepgram Flux Multilingual

    Sign up for a free Deepgram API key, set your model to flux-general-multi, and start streaming.

    → API reference — language_hint and TurnInfo fields
    → Python SDK | JS SDK | Java SDK

    Original source
  • Apr 29, 2026
    • Date parsed from source:
      Apr 29, 2026
    • First seen by Releasebot:
      Apr 29, 2026

    Deepgram

    Deepgram Launches Flux Multilingual: The World’s First Multilingual Conversational Speech Recognition Model

    Deepgram releases Flux Multilingual, a generally available real-time conversational speech recognition model for voice agents that supports 10 languages, auto-detects and switches languages mid-conversation, and brings low-latency turn handling with monolingual-grade accuracy.

    One model, ten languages, and monolingual-grade accuracy for voice agents worldwide.

    SAN FRANCISCO (April 29, 2026)

    Deepgram, the real-time AI infrastructure company underpinning the Voice AI economy, today announced the general availability (GA) of Flux Multilingual, expanding its conversational speech recognition model beyond English to support 10 languages, with the ability to automatically detect, understand, and switch languages dynamically within a single conversation in real time. Developers, enterprises, and product teams building voice agents now have access to the first real-time conversational speech recognition model, delivering accurate turn-taking, interruption handling, low latency, and natural human-like conversations at global scale.

    Traditional automatic speech recognition (ASR) is designed for transcription. Flux introduced a new approach, conversational speech recognition (CSR), built from the ground up to understand dialogue flow and enable real-time interaction. Flux has rapidly become foundational infrastructure for real-time voice agents, powering production systems that developers trust to deliver fast, natural conversational experiences with best-in-class accuracy in turn detection and speech recognition. Prior to today’s release, extending these experiences across multiple languages required stitching together multilingual transcription models, language detection, and routing logic, introducing latency, complexity, and brittle user experiences. Flux Multilingual replaces that complexity with a single model and API, making it possible to build conversational voice agents across 10 languages without re-architecting systems or sacrificing performance.

    With native support for turn-taking, interruptions, and code-switching within a single interaction, voice applications remain fluid, responsive, and natural regardless of language or region. Flux Multilingual delivers monolingual-grade accuracy across languages. Developers can guide the model with language hints or let it auto-detect, adapting in real time even mid-conversation.

    "Voice AI agents will soon become the default for how global enterprises interact with customers," said Scott Stephenson, CEO and Co-Founder, Deepgram. "Today is a major step forward towards that future. Flux Multilingual gives developers a single perception model to build global voice agents, with the ability to switch language mid-call. Now, enterprises can deliver the same seamless experience to any customer, in any market. Deepgram is the leader in real-time AI infrastructure, and Flux Multilingual is the latest in our suite of capabilities that enables developers to deliver real-time products across the globe."

    "Customers told us that Flux transformed what's possible for real-time voice AI agents in English," said Omar Paul, Vice President of Products, Twilio. "It stood to reason that Deepgram would solve this globally too. Our customers' teams no longer need to sacrifice accuracy with legacy multilingual systems, nor stitch multiple models with complex routing themselves. With Flux Multilingual, teams take the exact conversational experience they built for English and extend it across languages with a single system."

    Flux Multilingual Capabilities

    Supported Languages

    English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch

    Ultra-low latency conversational speech recognition, now global

    Flux Multilingual is built for understanding and interaction, not just transcription. It uses model-based turn detection, not simple silence detection, to deliver accurate end-of-turn decisions in under 400 milliseconds, keeping conversations fluid and responsive across languages.

    Monolingual-grade accuracy with real-time language control

    Flux Multilingual delivers monolingual-grade accuracy across languages, with flexible real-time control through language hints or automatic detection, native code-switching, and dynamic adaptation as conversations evolve.

    Build and scale global voice agents with one model

    Flux Multilingual supports 10 languages in a single conversational model, enabling teams to build and deploy voice agents globally with one integration. One model, ten languages, one API, with no additional infrastructure or model orchestration required.

    Key Features

    • Native turn detection and interruption handling for natural dialogue flow
    • Low-latency streaming transcription for real-time responsiveness
    • Automatic language detection and language hint support for accuracy control
    • Mid-session configurability for dynamic language adaptation
    • Native code-switching within a single conversation
    • Fully compatible with existing Flux API integrations

    Flux Multilingual is now generally available (GA). As part of the launch, Deepgram is offering a limited-time promotional rate on streaming speech-to-text, including Flux Multilingual and Nova-3 models.

    Flux Multilingual is available via Deepgram’s Cloud API or as a self-hosted deployment, with support for EU endpoints, SDKs, and seamless integration into voice agent architectures. Developers can get started today at deepgram.com or try Flux Multilingual directly in the Deepgram Playground.

    About Deepgram

    Deepgram is the real-time AI infrastructure company underpinning the Voice AI economy. Today, more than 200,000 developers and 1,400 organizations are Powered by Deepgram. Its voice AI platform offers speech-to-text (STT), text-to-speech (TTS), and full speech-to-speech (STS) capabilities, all powered by its enterprise-grade runtime. Deepgram’s voice-native foundation models, accessed through cloud APIs or as self-hosted/on-premises APIs, deliver unmatched accuracy, low latency, and competitive pricing. Customers include technology ISVs building voice products or platforms, co-sell partners working with large enterprises, and enterprises solving internal use cases. Having processed over 50,000 years of audio and transcribed over 1 trillion words, there is no organization in the world that understands voice better than Deepgram. To learn more, please visit www.deepgram.com, read its developer docs, or follow @DeepgramAI on X and LinkedIn.

    Original source
  • Apr 23, 2026
    • Date parsed from source:
      Apr 23, 2026
    • First seen by Releasebot:
      Apr 24, 2026

    Deepgram

    April 23, 2026

    Deepgram adds Gujarati support to Nova-3 for speech recognition with gu and gu-IN language codes.

    Nova-3 Model Update

    🌏 Nova-3 now supports Gujarati with the following language codes:

    Gujarati: gu, gu-IN

    Access this model by setting model="nova-3" and the relevant language code in your request.

    Learn more about Nova-3 and supported languages on the Models and Language Overview page.

    Original source
  • Apr 16, 2026
    • Date parsed from source:
      Apr 16, 2026
    • First seen by Releasebot:
      Apr 17, 2026

    Deepgram

    April 16, 2026

    Deepgram adds Flux Multilingual for self-hosted deployments, bringing real-time multilingual conversational STT with code-switching support across 10 languages. The release also keeps the software stack up to date.

    Container Images (release 260416)

    quay.io/deepgram/self-hosted-api:release-260416

    Equivalent image to:

    quay.io/deepgram/self-hosted-api:1.181.7

    quay.io/deepgram/self-hosted-engine:release-260416

    Equivalent image to:

    quay.io/deepgram/self-hosted-engine:3.115.1

    Minimum required NVIDIA driver version:

    >=570.172.08

    quay.io/deepgram/self-hosted-license-proxy:release-260416

    Equivalent image to:

    quay.io/deepgram/self-hosted-license-proxy:1.10.1

    quay.io/deepgram/self-hosted-billing:release-260416

    Equivalent image to:

    quay.io/deepgram/self-hosted-billing:1.13.0

    This Release Contains The Following Changes

    Flux Multilingual — Real-time multilingual conversational STT is now available for self-hosted deployments. Supports 10 languages (English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch) with code-switching support. Deploying Flux Multilingual requires setting model_name = "flux-general-multi" in the [flux] section of engine.toml. See Flux Multilingual for details.

    General Improvements — Keeps our software up-to-date.

    Original source
  • Apr 15, 2026
    • Date parsed from source:
      Apr 15, 2026
    • First seen by Releasebot:
      Apr 17, 2026
    • Modified by Releasebot:
      Apr 24, 2026

    Deepgram

    April 15, 2026

    Deepgram releases its CLI, bringing transcription, speech synthesis, text analysis, account management, and MCP tooling to the terminal through a single dg command. It lets users transcribe audio, generate speech, manage projects and API keys, and start an MCP server.

    Deepgram CLI Is Now Available

    The Deepgram CLI brings transcription, speech synthesis, text analysis, account management, and MCP tooling to your terminal through a single dg command.

    What you can do

    Use the dg CLI to work with Deepgram from your terminal:

    • Transcribe files, URLs, microphone input, and piped audio
    • Generate speech with Deepgram Aura voices
    • Run text analysis workflows such as summarization, sentiment, and topic detection
    • Manage projects, API keys, members, and usage
    • Start an MCP server for AI coding tools

    Launch docs

    Start here:

    • CLI Getting Started
    • CLI Installation
    • CLI Authentication
    • MCP Server

    For the launch site and quick reference, visit cli.deepgram.com.

    Original source
  • Apr 15, 2026
    • Date parsed from source:
      Apr 15, 2026
    • First seen by Releasebot:
      Apr 16, 2026

    Deepgram

    April 15, 2026

    Deepgram adds Flux Multilingual, a general-availability conversational STT model that supports 10 languages with turn-aware intelligence, auto language detection, code-switching, language hints, and mid-stream reconfiguration for real-time voice agents and global contact centers.

    Flux Multilingual: Conversational STT, Now in 10 Languages

    The same model that solved turn detection for English voice agents now works everywhere your customers speak — no language routing, no model-per-language infrastructure, no accuracy tradeoff.

    Deepgram is proud to announce the general availability of Flux Multilingual (flux-general-multi), a single model supporting 10 languages with the same turn-aware, interruption-aware conversational intelligence as flux-general-en.

    Key Features

    • 10 languages, one model — English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch. No language routing or model-per-language infrastructure required.
    • Language prompting — The optional language_hint parameter biases the model toward specific languages, delivering accuracy on par with dedicated monolingual models. Without hints, the model auto-detects the spoken language.
    • Native code-switching — Handles mid-sentence language switches without configuration changes or reconnections.
    • Language detection on every turn — All TurnInfo events include a languages field reporting detected languages sorted by word count, and a languages_hinted field reflecting the active hints.
    • Mid-stream reconfiguration — Update language hints during a stream using the Configure control message without disconnecting. Supports patterns like detect-then-lock for optimal accuracy.
    • Same Flux architecture — All turn detection, eager end-of-turn, and configuration parameters from flux-general-en work identically.

    Use Cases

    Designed for non-English monolingual voice agents, multilingual voice agents, global contact centers, bilingual support lines, and any real-time conversational application where callers may speak different languages. For English-only workloads, continue using flux-general-en.

    Getting Started

    Connect to Flux Multilingual by setting model=flux-general-multi on the /v2/listen endpoint — no new credentials or endpoints required. Pricing is the same as flux-general-en.

    Learn more in the Language Prompting guide, Flux Quickstart, and API Reference.

    Availability

    Flux Multilingual is now available through our API. To access:

    • Connect to wss://api.deepgram.com/v2/listen using model=flux-general-multi
    • EU endpoint available: wss://api.eu.deepgram.com/v2/listen?model=flux-general-multi
    • Real-time streaming only
    • SDK and self-hosted support coming soon
    Original source
  • Apr 10, 2026
    • Date parsed from source:
      Apr 10, 2026
    • First seen by Releasebot:
      Apr 14, 2026

    Deepgram

    We Built a Slack Bot That Customers Can Install Themselves

    Deepgram launches a self-service Slack bot that answers customer questions from docs, transcribes audio, analyzes screenshots, checks service status, and looks up request IDs. It brings faster support with a more natural, contextual Slack experience.

    I'll be honest with you — this started because we were drowning. Not in a dramatic way. Just the slow, persistent kind where the same seven questions show up in your support channels every single day, and you watch your team copy-paste the same doc links over and over until their souls leave their bodies.

    So we built a bot. Not the "hello, I'm a chatbot, please describe your issue" kind. The kind that actually reads your docs, understands the question, searches for the right answer, and replies like someone who's been working at the company for years. Then we made it self-service — any customer can install it into their own Slack workspace and start getting answers immediately.

    The bot sits in Slack. You @mention it with a question about Deepgram — anything from “how do I set up speech-to-text in Node?” to “why is my request returning a 400?” — and it goes off, searches through our documentation, and comes back with a proper answer. Not a link dump. An actual considered response with context.

    It can also handle attachments. Drop an audio file in the thread and it’ll transcribe it using Deepgram’s speech-to-text API. Send a screenshot of an error and it’ll analyse the image and help you debug. It’s genuinely useful in ways that surprised even us.

    The tools it has access to:

    • Doc search — semantic retrieval across all of Deepgram’s documentation
    • Audio transcription — drag and drop audio or video files, get a transcript back
    • Image analysis — screenshots, error messages, architecture diagrams
    • Service status — checks for any ongoing incidents at Deepgram
    • Request lookup — paste a request ID and the bot will pull up the details from your project

    That last one is the one that really changes things. A customer can paste a request ID from their logs, and the bot will look it up against their linked Deepgram project and tell them exactly what happened. No more “can you send me your request ID so I can look it up on my end.”

    The self-service bit

    This is the part I’m most pleased with. We didn’t want to manage installations. We didn’t want a spreadsheet of workspace IDs. We wanted customers to be able to install the bot themselves, authenticate with their Deepgram account, and start using it — all without anyone from our team being involved.

    The flow: click “Add to Slack”, accept the permissions, log in to Deepgram, pick your project, done. One continuous flow — no separate setup page, no configuration. The bot links your Slack workspace to your Deepgram project and it’s ready to go.

    How the AI loop works

    Under the hood, the bot uses Anthropic’s Claude as an orchestrator. This isn’t a simple “send question, get answer” setup. It’s an agentic loop where Claude decides which tools to call based on the question.

    A typical interaction looks something like:

    • User asks “why is my STT request failing?”
    • Claude decides it needs more context — calls the doc search tool
    • Gets back relevant documentation about common error codes
    • Claude decides that’s not quite enough — maybe it should check service status
    • Checks for incidents — all clear
    • Puts it all together and responds with a diagnosis and solution

    The loop runs up to 10 iterations, and the bot updates Slack’s native status indicator as it goes — “is searching docs…”, “is checking service status…” — so the user knows something’s happening. When the response lands, tool calls are listed as context blocks underneath so you can see exactly what the bot did to arrive at its answer.
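
    The loop described here can be sketched in a few lines. This is an illustration of the pattern, not the bot's actual code: decide stands in for the Claude call, the tuple protocol it returns is invented for the example, and the tools are stubs.

```python
def run_agent_loop(question, decide, tools, max_iterations=10):
    """Run a tool-calling loop until the orchestrator answers or the cap hits.

    decide(question, observations) is a stand-in for the LLM call. It returns
    ("call", tool_name, argument) or ("answer", text).
    """
    observations = []  # (tool_name, result) pairs, in call order
    for _ in range(max_iterations):
        step = decide(question, observations)
        if step[0] == "answer":
            return step[1], observations
        _, tool_name, argument = step
        observations.append((tool_name, tools[tool_name](argument)))
    return "Iteration limit reached without a final answer.", observations


def scripted_decide(question, observations):
    """Hypothetical orchestrator: search docs, check status, then answer."""
    if len(observations) == 0:
        return ("call", "doc_search", question)
    if len(observations) == 1:
        return ("call", "service_status", None)
    return ("answer", "No incidents; see the error-code docs.")


stub_tools = {
    "doc_search": lambda query: f"docs about: {query}",
    "service_status": lambda _unused: "all clear",
}

answer, trace = run_agent_loop(
    "why is my STT request failing?", scripted_decide, stub_tools
)
```

    The returned trace is what a bot like this could render as context blocks under the answer, and the iteration cap keeps a confused orchestrator from looping indefinitely.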

    The Slack experience

    We spent a fair bit of time getting the Slack UX right. Some things that matter:

    • Thread participation. You @mention the bot once to start a conversation. After that, it automatically participates in the thread — no need to keep tagging it. If someone @mentions a human in the thread, the bot stays out of the way.
    • Edit and delete handling. If you edit your message while the bot is still thinking, it aborts the in-flight request and re-processes with the updated text. Delete your message? It aborts and cleans up. This sounds minor but it makes the bot feel responsive rather than stupid.
    • Native thinking indicator. Instead of posting a “Thinking…” message and then deleting it (which looks janky), we use Slack’s native status API. You see “Deepgram Devs is searching docs…” in the thread, and it auto-clears when the response arrives.
    • Long responses don’t collapse. Slack has this annoying habit of collapsing long messages behind a “read more” fold. We split responses across multiple Block Kit sections, each under the collapse threshold, so the full answer is always visible. Tool call details and metadata sit in context blocks underneath — they never contribute to the fold.
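
    The response-splitting idea from the last point can be sketched as follows. The section block shape is standard Block Kit, but the function name and the 2000-character default are assumptions: the post does not state the actual collapse threshold, only that each section stays under it.

```python
def split_into_section_blocks(text, max_chars=2000):
    """Split text into Block Kit section blocks of at most max_chars each."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        if not paragraph.strip():
            continue  # skip empty paragraphs
        # Hard-wrap any single paragraph that is itself too long.
        pieces = [paragraph[i:i + max_chars]
                  for i in range(0, len(paragraph), max_chars)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate     # still fits in the current section
            else:
                chunks.append(current)  # close the section, start a new one
                current = piece
    if current:
        chunks.append(current)
    return [{"type": "section", "text": {"type": "mrkdwn", "text": chunk}}
            for chunk in chunks]
```

    Splitting on paragraph boundaries first keeps each section readable. The hard wrap is only a fallback for a single oversized paragraph and may cut mid-word.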

    What’s next

    We’re continuing to iterate on this. The feedback buttons let users tell us when a response was helpful or not, which helps us tune the prompt and understand where the docs have gaps. We’re also looking at expanding the tool set — the model-context-protocol pattern makes it straightforward to add new capabilities without touching the core loop.

    If you’re a Deepgram customer and want to try it, the bot is available for installation in your Slack workspace. If you’re building something similar for your own product — honestly, the architecture is simpler than you’d think. An LLM orchestrator with tool definitions, a few API integrations, and Slack’s Bolt framework gets you surprisingly far.

    Original source
  • Apr 8, 2026
    • Date parsed from source:
      Apr 8, 2026
    • First seen by Releasebot:
      Apr 9, 2026

    Deepgram

    April 8, 2026

    Deepgram adds reusable agent configurations through the API, letting teams store, manage, and reference agent setups by UUID instead of resending full WebSocket configs. It also introduces template variables for reusable runtime values, supporting customization, compliance, A/B testing, and multi-agent workflows.

    Reusable agent configurations

    You can now store and manage agent configurations and template variables through the Deepgram API. Instead of sending a full agent configuration with every WebSocket session, define it once and reference it by UUID.

    Key use cases include:

    • Per-customer configurations — Give each customer a distinct voice, persona, or model without maintaining separate codebases.
    • Regional and regulatory compliance — Maintain separate configurations for different markets to enforce data-handling, language, or disclosure requirements.
    • A/B testing voices or prompts — Run two configurations in parallel and measure conversion, CSAT, or containment rate without a code deploy.
    • Multi-agent architectures — Store and manage all agents used in your multi-agent architecture from a single project.

    Template variables let you define reusable values (such as system prompts or model names) that are automatically interpolated at runtime. Variables follow the DG_ naming format.

    For more details, see Reusable Agent Configurations.

    Original source
Releasebot

Curated by the Releasebot team

Releasebot is an aggregator of official release notes from hundreds of software vendors and thousands of sources.

Our editorial process involves the manual review and audit of release notes procured with the help of automated systems.

Similar to Deepgram with recent updates: