AI Voice and Speech Release Notes

Release notes for AI voice synthesis, text-to-speech and audio generation tools


Latest AI Voice and Speech Updates

  • Mar 5, 2026
    • Date parsed from source: Mar 5, 2026
    • First seen by Releasebot: Mar 6, 2026

    Murf

    March 5, 2026

    TTS updates deprecate multiNativeLocale; switch to locale across all endpoints to keep compatibility.

    Deprecation of multiNativeLocale field

    We have deprecated the multiNativeLocale field across all Text-to-Speech (TTS) endpoints, including:

    • Non-Streaming API
    • Streaming API
    • WebSockets

    The multiNativeLocale field has been replaced by the locale field. We recommend updating your integrations to use the locale field to ensure continued compatibility with future updates.

    What's Changed?

    • Deprecated: multiNativeLocale
    • New Field: locale
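As a migration sketch, existing integrations can swap the deprecated field for the new one before sending a request. The field names multiNativeLocale and locale come from the notes; the voiceId and text fields and their values are illustrative assumptions, not Murf-documented examples.

```python
def migrate_payload(payload: dict) -> dict:
    """Copy a Murf TTS request payload, replacing the deprecated
    multiNativeLocale field with the new locale field."""
    migrated = dict(payload)
    if "multiNativeLocale" in migrated:
        # Keep an explicit locale if the payload already has one.
        migrated.setdefault("locale", migrated["multiNativeLocale"])
        del migrated["multiNativeLocale"]
    return migrated

# Hypothetical request body before and after migration:
old = {"voiceId": "en-US-natalie", "text": "Hello", "multiNativeLocale": "en-US"}
new = migrate_payload(old)
```

The same transformation applies to non-streaming, streaming, and WebSocket payloads, since the deprecation covers all three.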
  • Mar 2, 2026
    • Date parsed from source: Mar 2, 2026
    • First seen by Releasebot: Mar 2, 2026

    Eleven Labs

    March 2, 2026

    ElevenAgents ships a major upgrade: new widget configuration options, a workflow say node, folder-aware tests, SMB tool configuration, summary language, Twilio call recording, a WhatsApp messaging flag, new pronunciation and music features, and voice bookmarking, plus SDK and widget package updates.

    ElevenAgents

    New widget configuration options: Several new settings are now available for the ElevenAgents conversation widget in the agent dashboard:
    Collapsible widget (widget.dismissible): Allow users to minimize or dismiss the widget during a conversation.
    Action indicator (widget.show_agent_status): Display a visual indicator when the agent is actively using tools, with distinct states for working, done, and error. Enabling this setting automatically adds agent_tool_request and agent_tool_response to the agent's client events.
    Conversation ID display (widget.show_conversation_id): Show the conversation ID to users after a call ends. Defaults to true for both new and existing agents.
    Audio tag visibility (widget.strip_audio_tags): Hide audio tags from conversation transcripts. Defaults to true for both new and existing agents.
    Syntax highlighting theme (widget.syntax_highlight_theme): Configure code block highlighting in transcripts. Accepts null (Auto), light, or dark.
    Folder-aware agent testing: The List agent tests endpoint now supports organizing tests into folders. Added parent_folder_id, include_folders, and sort_mode query parameters for filtered listing. Test creation and update endpoints accept parent_folder_id to assign tests to folders, and test summaries now include entity type, parent folder path, and children count.
    Workflow say node: Added a new say node type to agent workflows (WorkflowSayNodeModel). This node supports conversation_config, additional_prompt, and tool and knowledge base overrides. The node's message payload uses a discriminated union with two variants: literal (static text via text) and prompt (LLM-generated via prompt).
    SMB tool configuration: Formalized the SMBToolConfig schema with a required, discriminated params object. Supported operation types are create, list, search, update, and delete, each applicable to SMB entity categories including clients, staff, services, products, and assets.
    Summary language for agents: Added summary_language field to agent configuration to specify the language for post-conversation summaries.
    Twilio call recording: Added optional call_recording_enabled field to the Twilio outbound call request body to enable recording of outbound calls.
    WhatsApp messaging flag: Added enable_messaging flag to WhatsApp account models and update requests to control whether messaging is enabled for a given phone number.
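Taken together, the new widget settings above could be expressed as a configuration object along these lines. This is a sketch: the dotted keys (widget.dismissible and so on) come from the release notes, while the nesting under a top-level "widget" object and the exact value types are assumptions.

```python
# Hypothetical ElevenAgents widget configuration using the new settings.
widget_config = {
    "widget": {
        "dismissible": True,             # allow users to minimize/dismiss the widget
        "show_agent_status": True,       # working / done / error tool-use indicator
        "show_conversation_id": True,    # defaults to true for new and existing agents
        "strip_audio_tags": True,        # hide audio tags in transcripts
        "syntax_highlight_theme": None,  # None = Auto; otherwise "light" or "dark"
    }
}

# Per the notes, syntax_highlight_theme accepts null (Auto), light, or dark.
assert widget_config["widget"]["syntax_highlight_theme"] in (None, "light", "dark")
```

Note that enabling show_agent_status also adds agent_tool_request and agent_tool_response to the agent's client events automatically.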

    Pronunciation Dictionaries

    Set all rules at once: New Set rules endpoint (POST /v1/pronunciation-dictionaries/{pronunciation_dictionary_id}/set-rules) replaces all existing rules in a pronunciation dictionary in a single call, as an alternative to incrementally adding or removing individual rules.
    Rule matching options: Added case_sensitive (boolean) and word_boundaries (boolean) fields to pronunciation dictionary rule definitions for more precise control over how rules match text.
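A request to the new Set rules endpoint might look like the following sketch. The endpoint path, case_sensitive, and word_boundaries come from the notes; the dictionary ID is hypothetical, and the alias-rule shape (type / string_to_replace / alias) is an assumption rather than the documented schema.

```python
# Hypothetical dictionary ID; substitute your own.
dictionary_id = "dict_123"
url = f"https://api.elevenlabs.io/v1/pronunciation-dictionaries/{dictionary_id}/set-rules"

# Replaces ALL existing rules in the dictionary in one call.
body = {
    "rules": [
        {
            "type": "alias",                    # assumed rule shape
            "string_to_replace": "ElevenLabs",
            "alias": "Eleven Labs",
            "case_sensitive": True,   # only match this exact casing
            "word_boundaries": True,  # don't match inside longer words
        }
    ]
}
```

Because set-rules replaces the whole rule set, it suits bulk syncs from a source of truth, whereas the existing add/remove endpoints remain better for incremental edits.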

    Music

    Upload audio: New Upload music endpoint (POST /v1/music/upload) accepts a multipart/form-data request with an audio file. Optionally extracts a composition plan from the uploaded audio and returns a MusicUploadResponse with a song_id.
    Phonetic name support: Added use_phonetic_names parameter to Generate music, Generate music detailed, and Stream music endpoints.
    Song ID in responses: Music generation responses now surface the song_id field, making it easier to reference generated songs in subsequent API calls.

    Audio Native

    Update project content from URL: New Update content endpoint (POST /v1/audio-native/content) allows updating an AudioNative project by providing a URL. The endpoint extracts the content from the URL and queues it for conversion and auto-publishing.

    Voices

    Voice bookmarking: Added bookmarked field to voice update requests and is_bookmarked to voice response objects, enabling users to bookmark voices for easy access.

    SDK Releases

    JavaScript SDK

    v2.37.0 - Added support for surfacing song_id in music generation responses. Updated to include latest API schema changes from Fern regeneration.

    Python SDK

    v2.37.0 - Added support for new music generation parameters and surfacing song_id in music generation responses. Updated to include latest API schema changes from Fern regeneration.

    Packages

    @elevenlabs/[email protected] - Propagated event_id through transcript and streaming callbacks. Refactored tool status tracking from Map-based to inline transcript entries with a display-transcript utility. Added show-conversation-id config option (boolean, defaults to true) to control visibility of conversation ID in disconnection messages.
    @elevenlabs/[email protected] - Propagated event_id through transcript and streaming callbacks. Refactored tool status tracking to inline transcript entries.
    @elevenlabs/[email protected] - Propagated event_id through transcript and streaming callbacks.

  • Feb 23, 2026
    • Date parsed from source: Feb 23, 2026
    • First seen by Releasebot: Feb 25, 2026

    Eleven Labs

    February 23, 2026

    ElevenAgents rolls out broad MCP server auth upgrades with OAuth2 and multiple auth headers, plus a new LLMs discovery endpoint and conversation redaction. It also adds guardrails, search, file uploads, a new embedding model, workspace permissions for image/video generation, and updated SDKs for multiple languages.

    ElevenAgents

    OAuth and advanced authentication for MCP servers: MCP servers now support workspace auth connections, enabling OAuth2 Client Credentials, Basic Auth, Bearer Auth, JWT, and custom header authentication. Select an auth connection when creating or editing an MCP server to automatically handle token refresh and authentication headers. See the MCP server documentation for more details.

    LLM information endpoint: Added a new endpoint to list available LLMs with deprecation status, capabilities, and context limits. This enables clients to programmatically discover which models are available and which are being deprecated. See List Available LLMs for details.

    Conversation history redaction: Added support for redacting sensitive information from conversation transcripts, audio, and analysis before they are stored. Configure which entity types to redact (such as names and email addresses) using the new conversation_history_redaction setting in agent privacy configuration. See the docs page for more details.

    Enhanced guardrails: Introduced two new guardrail types for agents:

    • focus: Helps keep conversations on-topic
    • prompt_injection: Detects and prevents prompt injection attempts
      Note: The alignment guardrail has been removed
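As a sketch, the new guardrail types might appear in an agent configuration like this. The type names focus and prompt_injection (and the removal of alignment) come from the notes; the list structure and key names are assumptions, not the documented GuardrailsV1 schema.

```python
# Hypothetical guardrails section of an agent config.
guardrails = [
    {"type": "focus"},             # keep the conversation on-topic
    {"type": "prompt_injection"},  # detect and block injection attempts
]

# "alignment" has been removed, so reject configs that still reference it:
assert all(g["type"] != "alignment" for g in guardrails)
```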

    Conversation search endpoints: Added two new endpoints for searching conversation messages:

    • Text search - Full-text and fuzzy search over transcript messages
    • Smart search - Semantic search using embeddings

    File uploads in conversations: Added endpoints to upload and manage files within conversations:

    • Upload file - Upload files to a conversation
    • Delete file - Remove uploaded files

    New embedding model: Added support for qwen3_embedding_4b embedding model in knowledge base RAG indexing.

    Workspaces

    Image and video generation permission: Workspace administrators can now control access to image and video generation features. This new permission allows you to restrict which workspace members can use these capabilities. Configure this in your workspace settings under member permissions.

    Music

    Workspace sharing: Songs can now be shared within workspaces. The songs resource type has been added to workspace resource sharing endpoints.

    SDK Releases

    JavaScript SDK

    v2.36.0 - Added overloaded convert method signatures to Speech to Text wrapper for improved type safety and ergonomics. Updated SDK to include latest API schema changes including LLM list endpoint, conversation search, and MCP auth connection support.

    Python SDK

    v2.36.1 - Added missing music generation parameters including seed, loudness, quality, and guidance_scale to ensure full feature parity with the API.
    v2.36.0 - Renamed package references from "Conversational AI" to "ElevenAgents" to reflect the product rebrand. Updated SDK to include latest API schema changes including LLM list endpoint, conversation search, and MCP auth connection support.

  • Feb 16, 2026
    • Date parsed from source: Feb 16, 2026
    • First seen by Releasebot: Feb 16, 2026

    Murf

    How to Translate a Page in Opera GX: Complete Guide for Gamers (2026)

    Opera GX now ships a built-in translator that supports 40 languages, making global gaming content instantly accessible with one click. The feature, AI-powered and integrated, delivers fast, private translations without extensions for millions of GX users.

    Opera GX built-in translator

    Opera GX now comes with a built-in translator designed for gamers, supporting 40+ languages out of the box. Whether you’re browsing Japanese RPG forums, Korean esports sites, or European patch notes, translation happens instantly with a single click. This guide walks you through how to translate a page in Opera GX step by step so you never miss global gaming insights.

    Global gaming communities thrive on content from every corner of the world, from Japanese RPG forums to European esports coverage. However, language barriers often make it difficult to access these insights in real-time. To address this, Opera introduced a major upgrade to its built-in translation feature in July 2025, designed specifically for Opera GX users.

    The new Opera GX Translate supports over 40 languages. It automatically detects foreign-language pages using advanced AI, and then renders them into your preferred language. Unlike third-party extensions, this capability is integrated directly into the browser, ensuring speed, accuracy, and privacy without additional setup.

    In this guide, we’ll walk through how to translate a page in Opera GX step by step, helping you unlock a seamless browsing experience across global gaming content.

    Why Translate Pages in Opera GX?

    Opera GX Translate becomes essential when you realize that 51% of the internet is not in English. For Opera GX's 25+ million monthly active users, mostly Gen-Z gamers, this creates massive barriers to accessing global gaming content.

      1. Gaming Community Access
        The best strategies, rare game guides, and breaking news in esports often drop first on region-specific sites.
        For example, Japanese outlets such as Famitsu publish post-patch developer interviews that explain balance decisions and the best boss strategies.
        Knowing how to translate a website on Opera GX unlocks content from Japanese gaming magazines, European tournament coverage, and Korean pro-gaming insights that never get English translations.
      2. Research and Reviews
        The Translate feature in Opera GX lets you access authentic user reviews on international gaming platforms, compare regional game prices, and discover indie titles that haven't reached Western markets yet.
        For instance, Chinese storefront reviews often flag performance bugs or community mods before English sites mention them.
        With Opera GX’s built-in translator, you can scan those insights in seconds and make smarter buy-or-play decisions without juggling extra tabs.
      3. Global Accessibility
        For non-native English speakers, knowing how to translate on Opera GX makes English-dominant gaming forums, wikis, and streaming platforms far more accessible.
        Studies show that engagement is 88% higher when content appears in a user’s native language. Translation unlocks contributions such as posting strategies, updating walkthroughs, or asking better questions.
        With Opera GX, a one-click translation turns dense threads and patch notes into clear, actionable information. This helps you learn faster and stay active in the communities that matter most.
      4. Content Creation Opportunities
        Streamers and content creators use translation to research international gaming trends, discover viral gaming content from other regions, and create multilingual content.
        This connects to broader content strategies, whether you're creating gaming videos that require a video translator for global reach, or researching how to monetize YouTube through international gaming markets.
        The gaming world is inherently global, with tournaments, communities, and innovations happening across dozens of languages daily.
        The Opera GX translate page functionality removes the barriers that would otherwise limit your gaming knowledge and creative reach.
        Whether you're hunting rare game guides or following international tournaments, it keeps you in the action with nothing lost in translation.

    How to Translate a Page on Opera GX (Built-in Method)

    Opera's built-in translation feature makes translating web pages remarkably simple. With Opera GX Translate supporting 40+ languages powered by Lingvanex AI, you can access global gaming content without any extensions.

    Step-by-Step Setup

    Opera GX Translate comes enabled by default, but here's how to verify:

    • Open Settings: Press Alt+P or click the three-line menu → Settings
    • Navigate to Translation: Go to Advanced → Opera Translate. Or, simply type “translate” in the search bar on the left
    • Enable Translation: Toggle on "Use Opera Translate"
    • Set Default Language: Choose your preferred translation target language in the “Translate into this language” section
      That's it! Now you know how to translate on Opera GX and start using it instantly. If you ever want to disable the feature, simply turn the toggle off in your settings.

    Translating Pages

    Opera GX Translate works in three ways:

    • Automatic Popup: Opera GX can translate entire web pages in an instant. When you visit a web page in a different language, Opera GX detects it and shows a translation prompt. Click “Translate” to convert the entire page instantly.
    • Address Bar Icon: Look for the translation icon (often shown as letters "Aa") in your address bar. Click it to select your target language from the dropdown. This method is particularly useful to translate entire tabs in Opera GX.
    • Right-Click Menu: Right-click anywhere on the page and select "Translate" to convert the entire page. To translate only a portion, highlight the text and use Aria’s Translate prompt (from the selection popup or the right-click menu). You can also ask Aria questions about the page in your preferred language.
      Pro Tip: Set frequently used languages to "Always translate" to streamline future gaming research and avoid repeated prompts.

    How to Change the Language of a Page in Opera GX

    Translating a page in Opera GX doesn't stop at one click; you can switch target languages mid-session with ease.

    1. Address Bar Icon: Click the translation icon (letters "Aa") in the address bar and select a new target language from the dropdown.
    2. Settings Override: Navigate to Settings > Advanced > Opera Translate to change your default translation language for all future pages.
    3. Re-translate: Right-click the page, select "Translate," and choose a different language to instantly re-translate the content.
      Pro Gaming Tip: Add multiple gaming languages (e.g., Japanese for strategy guides, Korean for esports coverage) to switch quickly while researching international communities.

    How to Translate Tabs or Entire Websites

    Opera GX Translate works page-by-page; it doesn’t translate multiple tabs at once. But with a few tweaks, you can still streamline heavy research sessions.

    • Multiple Gaming Forums: Set frequently visited sites (like Reddit gaming communities or Steam forums) to "Always translate" by clicking the translation icon and toggling auto-translation for those domains.
    • Streamlined Workflow: Right-click any tab and select "Translate" for instant conversion. This makes it second nature to follow international esports coverage or browse non-English guides.
      Pro Setup: Pin translated tabs to keep your language preferences across sessions. For example, if you’re creating YouTube content for a non-English audience, keeping translated tabs pinned saves time and keeps your research organized.

    Using AI for Multilingual Gaming Content

    While Opera GX handles text translations well, creators aiming for global reach often need multilingual audio. AI-powered dubbing tools can extend your content strategy beyond the browser, making your videos, tutorials, and streams accessible worldwide.

    Murf AI offers creator-focused translation tools that help streamers, gamers, and content makers adapt their content into 40+ languages and 200+ voices with AI-powered voice translation technology.
    Unlike traditional alternatives, Murf's AI Audio and Video Translators preserve your unique voice and style across languages like Japanese, Spanish, and German.
    Here’s how it works:

    • Upload your audio file, which could be a game guide, a tutorial, or even stream highlights. Choose the source and target languages in Murf’s Audio Translator, and your file is instantly converted.
    • Murf’s AI ensures natural intonation and timing, delivering professional-quality multilingual audio without the need for separate voice actors in each market.
    • You can also translate video to video using Murf’s free AI Video Translator. It offers 10X faster turnaround, instant voice cloning, and precise pronunciation control for every translation. Murf is one of the best AI tools for content creators looking to increase productivity and reach while staying true to their authentic voice.

    How to Add Google Translate to Opera GX

    If you prefer Google’s translation interface, you can add it to Opera GX through the Chrome Web Store:

    1. Open the Chrome Web Store in Opera GX
    2. Search for “Google Translate”
    3. Click "Add to Opera"
      That’s it! Once installed, the Google Translate extension adds a translation pop-up, context menu options, and keyboard shortcuts (such as Ctrl+Alt+Z).
      This gives you an alternative to Opera GX’s built-in translator while keeping the familiar Google Translate experience.

    How to Disable Translation in Opera GX

    Sometimes you may not need translation at all, or you might want it disabled only on specific sites. Opera GX makes this easy to control.

    • Full Disable: Go to Settings → Advanced → Opera Translate and toggle off “Use Opera Translate”
    • Selective Disable: Click the translation icon in the address bar and choose “Never translate this language” or “Never translate this site”
      This way, Opera GX won’t translate pages you don’t need, while still helping you access the languages you care about.

    Conclusion

    Opera GX’s built-in translation feature marks a significant step toward a more connected global gaming community. By removing language barriers, it empowers players, creators, and esports enthusiasts to access strategies, reviews, and conversations that were once out of reach. With support for 40+ languages, instant AI-powered translations, and seamless integration, Opera GX makes global gaming knowledge accessible in just a click, without extensions or added complexity.
    For creators, the opportunities extend even further. While Opera GX handles text-based translation, Murf AI enables multilingual audio content.
    This enables streamers, educators, and gaming influencers to connect with global audiences while preserving their authentic voice. Together, these tools redefine how gamers consume and share content across borders.
    In a world where gaming is inherently global, knowing how to translate a page in Opera GX is more than a convenience; it’s a competitive advantage. Pair it with Murf’s AI-driven voice solutions, and you have everything you need to participate, create, and lead in the international gaming ecosystem.

  • Feb 16, 2026
    • Date parsed from source: Feb 16, 2026
    • First seen by Releasebot: Feb 16, 2026

    Eleven Labs

    February 16, 2026

    ElevenLabs rolls out a sweeping update with a new conversation users endpoint, agent versioning, and a built-in search tool for retrieval-augmented generation. It adds MCP tool support, content guardrails, expressive mode, post-dial digits, new testing types, enhanced pronunciation rules, and updated SDKs and widgets.

    ElevenAgents

    • Conversation users endpoint: Added Get conversation users endpoint (GET /v1/convai/users) to list users who have had conversations with your agents. Supports pagination and filtering by agent, time range, and other criteria.

    • Agent versioning: Fetch specific agent configurations by version using the new version_id and branch_id query parameters on the Get agent endpoint. The Update agent endpoint also now accepts a branch_id parameter for branch-specific updates.

    • Search documentation tool: Added search_documentation as a new built-in system tool for RAG (retrieval-augmented generation). Configure multi-source retrieval with MultiSourceConfigJson, SourceConfigJson, and SourceRetrievalConfig schemas. Supports configurable merging strategies via the MergingStrategy enum.

    • MCP tool support: Added mcp as a new tool type alongside webhook, client, and system tools. Configure MCP tools using the new MCPToolConfig schema in your agent tool definitions.

    • Content guardrails: Added content moderation guardrails to GuardrailsV1 with configurable thresholds for different content categories: sexual, violence, harassment, self-harm, profanity, religion/politics, and medical/legal content. Use the new ContentGuardrail and ContentConfig schemas.

    • Expressive mode: Added expressive_mode field (boolean) to agent configuration schemas to automatically prompt your agent to make the most of the new v3 conversational model.

    • Post-dial digits: Phone number transfers now support post_dial_digits for sending DTMF tones after connection. Configurable as static values or dynamic variables.

    • Agent testing types: Agent testing now supports three distinct test types with dedicated schemas: llm (response evaluation), tool (tool-call verification), and simulation (full conversation simulation). Simulation tests include simulation_scenario and simulation_max_turns configuration fields.
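The three test types above could be declared along these lines. The type names and the simulation_scenario / simulation_max_turns fields come from the notes; the surrounding list-of-dicts shape and the scenario text are assumptions for illustration.

```python
# Hypothetical agent test definitions covering all three test types.
tests = [
    {"type": "llm"},   # response evaluation
    {"type": "tool"},  # tool-call verification
    {
        "type": "simulation",  # full conversation simulation
        "simulation_scenario": "Customer asks to cancel a subscription",
        "simulation_max_turns": 10,
    },
]
```

With the folder-aware testing endpoints from the March 2 release, definitions like these can also be organized into folders via parent_folder_id.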

    Pronunciation Dictionaries

    • Rules in API response: The Get pronunciation dictionary endpoint now returns a rules array containing full rule details. Response uses the new GetPronunciationDictionaryWithRulesResponseModel schema with PronunciationDictionaryAliasRuleResponseModel and PronunciationDictionaryPhonemeRuleResponseModel for rule types.

    Knowledge Base

    • Folder deletion: The Delete knowledge base document endpoint now supports deleting folders in addition to documents. When deleting folders, set force=true to enable recursive deletion of all contained documents and subfolders.

    SDK Releases

    Python SDK

    • v2.36.0 - Added conversation users endpoint, agent versioning, search documentation tool, MCP tool support, content guardrails, and agent testing types

    JavaScript SDK

    • v2.36.0 - Added overloaded convert signatures to speechToText.convert() for improved type inference based on request parameters (access .text, .transcripts, or webhook fields without manual type narrowing), added conversation users endpoint, agent versioning, search documentation tool, MCP tool support, content guardrails, and agent testing types

    Widget Packages

    • @elevenlabs/[email protected] - Added agent tool usage status display and new status badge for long-running tool calls, fixed emotion tag stripping, fixed rating and feedback submission for signed-url widget embedding

    • @elevenlabs/[email protected] - Added agent tool usage status display and new status badge for long-running tool calls, fixed emotion tag stripping

  • Feb 9, 2026
    • Date parsed from source: Feb 9, 2026
    • First seen by Releasebot: Feb 10, 2026

    Eleven Labs

    February 9, 2026

    ElevenLabs shifts to a platform structure with three product families and makes global routing the default, deprecating the old preview URL. It also ships TTS gains, custom guardrails, WhatsApp outbound messaging, Eleven v3 models, and extensive SDK updates across the Python, JavaScript, and React ecosystems.

    ElevenAgents, ElevenCreative and ElevenAPI

    We’re moving from a single-product perception (“ElevenLabs”) to a platform-based structure with clearly defined product families:

    • ElevenAgents (Formerly Agents Platform)
    • ElevenCreative (Formerly Creative Platform)
    • ElevenAPI (New, the Developer Platform)
      You’ll already see this reflected in the docs and SDK readmes.

    Global servers out of beta

    Global routing is now the default rather than opt-in.

    Previously, the default ElevenLabs API server was located in the United States, with an opt-in beta for routing traffic through Netherlands- or Singapore-based servers. Global routing is now the default: the server is chosen automatically based on geographic proximity to optimize latency.

    The opt-in base URL api-global-preview.elevenlabs.io is now deprecated; please use the default api.elevenlabs.io base URL instead. If you’re using the SDKs, this is already the default.
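For code that hard-codes the preview host, the migration is a one-line URL rewrite; both host names are taken from the notes above, and the helper function itself is just an illustrative sketch.

```python
# Deprecated opt-in preview host and the default host that replaces it.
DEPRECATED_BASE = "https://api-global-preview.elevenlabs.io"
DEFAULT_BASE = "https://api.elevenlabs.io"

def migrate_base_url(url: str) -> str:
    """Rewrite any request URL still pointing at the deprecated preview host.

    Global routing now happens automatically on the default host, so no
    other change is needed to keep geographic-proximity routing.
    """
    return url.replace(DEPRECATED_BASE, DEFAULT_BASE)
```

SDK users can skip this entirely, since the SDKs already default to api.elevenlabs.io.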

    Text to Speech

    • TTS Normalizer v3.1: Upgraded the text normalizer to version 3.1, which includes improved accuracy and lower latency for text-to-speech conversion.

    Agents Platform

    • Custom guardrails: Added support for user-defined output guardrails that allow you to create custom content filtering rules beyond standard moderation. Configure guardrails with a name, prompt instruction (up to 10,000 characters), and choice of evaluation model (gemini-2.5-flash-lite or gemini-2.0-flash). When triggered, the guardrail ends the conversation. See the Security documentation for details.
    • WhatsApp outbound messaging: Added Send outbound message endpoint (POST /v1/convai/whatsapp/outbound-message) to initiate conversations via WhatsApp using message templates. Required fields include whatsapp_phone_number_id, whatsapp_user_id, template_name, template_language_code, template_params, and agent_id.
    • Eleven v3 conversational model: Added eleven_v3_conversational to the available TTS models for agents, providing improved voice quality and expressiveness in agent conversations.
    • Suggested audio tags: Added suggested_audio_tags field to TTS configuration for agents using v3 models. Define up to 20 tags (e.g., “happy”, “excited”) to guide expressive speech generation, with optional descriptions for when each tag should be used.
    • Tool error handling: Added tool_error_handling_mode field to webhook tool configurations with options: auto (default, determines handling based on tool type), summarized (sends LLM-generated summary), passthrough (sends raw error), or hide (does not share error with agent).
    • Dynamic variable sanitization: Added sanitize field (boolean, default false) to DynamicVariableAssignment. When enabled, the assignment value is removed from tool responses and transcripts while still being processed for variable assignment.
    • Turn model selection: Added TurnModel enum with turn_v2 and turn_v3 options for selecting the turn detection model version.
    • Workflow node transfers: Added is_workflow_node_transfer field (boolean, default false) to AgentTransfer schema for identifying transfers within workflow nodes.
    • Transfer branch metadata: Added TransferBranchInfoTrafficSplit and TransferBranchInfoDefaultingToMain schemas for tracking branch routing information in agent transfers.
    • Workflow node transition testing: Added workflow_node_transition assertion type for unit tests with UnitTestWorkflowNodeTransitionEvaluationNodeId schema to validate agent workflow transitions.
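Of the additions above, the WhatsApp outbound message request is the most concrete, since the notes list its required fields. A request body might look like the following sketch; the field names and endpoint path are from the notes, while every value (and the shape of template_params) is an invented placeholder.

```python
# Sketch of a Send outbound message request body.
url = "https://api.elevenlabs.io/v1/convai/whatsapp/outbound-message"
body = {
    "whatsapp_phone_number_id": "pn_123",      # sending phone number
    "whatsapp_user_id": "wa_456",              # recipient
    "template_name": "order_update",           # pre-approved message template
    "template_language_code": "en",
    "template_params": ["#1042", "shipped"],   # assumed to be a list of strings
    "agent_id": "agent_789",                   # agent that handles the conversation
}

# All six fields are required per the release notes.
REQUIRED = {"whatsapp_phone_number_id", "whatsapp_user_id", "template_name",
            "template_language_code", "template_params", "agent_id"}
assert REQUIRED <= body.keys()
```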

    Studio

    • Muted tracks endpoint: Added Get project muted tracks endpoint (GET /v1/studio/projects/{project_id}/muted-tracks) that returns a list of chapter IDs with muted tracks in a project.

    Workspaces

    • Lite member seat type: Added workspace_lite_member to the SeatType enum for workspaces with limited access permissions.
    • Content templates resource: Added content_templates to WorkspaceResourceType enum for sharing content templates within workspaces.

    User Interface

    • Voice collection scrolling: Fixed an issue preventing mouse wheel and touch scrolling in the voice collection icon picker on macOS.

    SDK Releases

    Python SDK

    • v2.35.0 - Added custom guardrails, WhatsApp outbound messaging, tool error handling mode, dynamic variable sanitization, turn model selection, and Eleven v3 conversational model support

    JavaScript SDK

    • v2.35.0 - Added custom guardrails, WhatsApp outbound messaging, tool error handling mode, dynamic variable sanitization, turn model selection, and Eleven v3 conversational model support

    React and Client SDKs

    • @elevenlabs/[email protected] - Fixed establishing text-only conversations
    • @elevenlabs/[email protected] - Reduced audio chunk length from 250ms to 100ms for lower latency in agent conversations
    • @elevenlabs/[email protected] - Reduced audio chunk length from 250ms to 100ms for lower latency, normalized textOnly option handling between top-level and overrides object
    • @elevenlabs/[email protected] - Added types for audio alignment data support

  • Feb 2, 2026
    • Date parsed from source: Feb 2, 2026
    • First seen by Releasebot: Feb 3, 2026

    Eleven Labs

    February 2, 2026

    Eleven v3 exits alpha with a more stable, faster Text-to-Dialogue platform. WAV outputs, agent branch rename, alignment safeguards, speculative turn config, procedure refs and webhook filtering expand capabilities along with telemetry, permissions, and SDK updates.

    v3 is out of alpha - it’s more stable, more accurate, and has lower latency. Read more about Eleven v3.

    Text-to-Dialogue

    WAV output formats: Text-to-Dialogue endpoints now support WAV output formats (wav_8000, wav_16000, wav_22050, wav_24000, wav_32000, wav_44100, wav_48000) in addition to existing MP3, PCM, OPUS, and other formats. WAV formats with 44.1kHz sample rate require a Pro tier subscription or above.
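
    As a sketch of the new format gating, a small helper can check whether a requested format is one of the listed WAV options and whether the subscription tier allows it. The tier names other than Pro are assumptions for illustration, not from the release notes.

```python
# The WAV format names below are quoted from the release note above.
# Which tiers count as "Pro or above" is an assumption for illustration.

WAV_FORMATS = {
    "wav_8000", "wav_16000", "wav_22050", "wav_24000",
    "wav_32000", "wav_44100", "wav_48000",
}

def check_output_format(fmt: str, tier: str) -> bool:
    """Return True if `fmt` is a listed WAV format allowed for `tier`."""
    if fmt not in WAV_FORMATS:
        return False
    # 44.1 kHz WAV requires a Pro tier subscription or above.
    if fmt == "wav_44100" and tier not in {"pro", "scale", "business", "enterprise"}:
        return False
    return True
```

    A client might run this check before sending a request, so an unsupported format fails fast locally instead of producing an API error.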

    Agents Platform

    • Agent branch renaming: You can now rename agent branches using the Update branch endpoint. The new optional name field accepts 1-140 characters.
    • Alignment guardrails: Added AlignmentGuardrail type to GuardrailsV1 schema for enhanced conversation safety controls.
    • Speculative turn configuration: Added speculative_turn field to turn configuration for fine-tuning agent turn-taking behavior.
    • Procedure references: Agent patch requests now support procedure_refs for referencing reusable procedures.
    • Webhook response filtering: Added response_filter_mode and response_filters fields to webhook overrides, with new ResponseFilterMode enum for controlling which response data is passed through.
    • Error details in tool events: Added raw_error_message field to API integration webhook, system, and workflow tool event models for improved debugging.
    • Flexible tool call matching: Testing models now include check_any_tool_matches option to relax tool call matching requirements during agent testing.
    • Secrets pagination: Get workspace secrets endpoint now supports pagination with page_size (max 100) and cursor query parameters, returning next_cursor and has_more in responses.
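
    The cursor-based pagination scheme (page_size capped at 100, with next_cursor and has_more in each response) can be sketched as a simple loop. fetch_page below is a stand-in stub for the real HTTP call, not an official client method:

```python
# Sketch of cursor-based pagination as described in the note above.
# fetch_page simulates the endpoint's response shape: a page of items
# plus `next_cursor` and `has_more` fields.

def fetch_page(cursor=None, page_size=100):
    """Stub standing in for the real paginated HTTP request."""
    data = [f"secret_{i}" for i in range(7)]  # fake workspace secrets
    start = int(cursor or 0)
    page = data[start:start + page_size]
    nxt = start + page_size
    return {
        "secrets": page,
        "next_cursor": str(nxt) if nxt < len(data) else None,
        "has_more": nxt < len(data),
    }

def list_all_secrets(page_size=3):
    """Follow next_cursor until has_more is False, collecting every page."""
    secrets, cursor = [], None
    while True:
        resp = fetch_page(cursor=cursor, page_size=page_size)
        secrets.extend(resp["secrets"])
        if not resp["has_more"]:
            return secrets
        cursor = resp["next_cursor"]
```

    The same loop shape applies to any of the cursor-paginated endpoints in this release: pass no cursor on the first request, then feed next_cursor back until has_more is false.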

    Workspaces

    • Permission clarifications: Workspace group and invite endpoints now document specific permission requirements (group_members_manage for group member operations, WORKSPACE_MEMBERS_INVITE for workspace invitations) instead of requiring workspace administrator status.
    • New permission types: Added group_members_manage and terms_of_service_accept to the PermissionType enum.

    User

    • Compliance terms visibility: Added show_compliance_terms field to user response model.

    Metrics

    • ASR provider tracking: Added convai_asr_provider field to metrics for tracking automatic speech recognition provider usage.

    SDK Releases

    Python SDK

    • Added WAV output formats for Text-to-Dialogue, webhook response filtering, agent branch renaming, secrets pagination, and speculative turn configuration
    • Fixed bug with URL streaming in Scribe
    • Fixed bug with agent initialization and user_id handling, added audio alignment callback for agent conversations

    JavaScript SDK

    • Added WAV output formats for Text-to-Dialogue, webhook response filtering, agent branch renaming, secrets pagination, and speculative turn configuration

    React and Client SDKs

    • Reduced audio chunk length from 250ms to 100ms for lower latency in agent conversations
    • Fixed issue where input audio would not re-establish after microphone permission revocation

    Widget Packages

    • Fixed microphone mute state reset when call ends to prevent UI/audio desync on subsequent calls
    • Fixed styling issue in shadow root
    • Updated Tailwind to v4, added optional dismissible widget parameter, and fixed microphone permission handling

    API

    • View API changes

    Original source Report a problem
  • Jan 30, 2026
    • Date parsed from source:
      Jan 30, 2026
    • First seen by Releasebot:
      Feb 11, 2026
    Resemble logo

    Resemble

    We Built a Deepfake Detection Bot for X, Because We All Deserve to Know What’s Real

    Resemble launches @resemble_detect on X, a free public bot that checks if an image or video is AI-generated or manipulated and replies with a visualization plus a confidence score. It aims to help users verify content amid pervasive synthetic media.

    Today we’re releasing @resemble_detect, a free bot that lets anyone on X check whether an image or video has been AI-generated or manipulated.
    We’re doing this because social media platforms are flooded with synthetic media and most people have no way to verify what they’re looking at. That’s a problem we can help solve. Here’s how it works, what this tool is for, and what it isn’t.

    How It Works

    Using @resemble_detect is simple:

    • Find an image or video on X that you want to verify
    • Reply to the post and tag @resemble_detect with the phrase “is this fake?”
    • We’ll analyze the content and reply with a visualization (for images) and a confidence score indicating the likelihood that the content is AI-generated or manipulated

    That’s it. Free, public, no account required beyond your existing X profile.
    The detection is powered by Resemble AI’s DETECT technology, the same models that rank at the top of independent benchmarks and process millions of verifications for enterprises, governments, and media organizations.

    The Problem Is Bigger Than Deepfakes

    The conversation around AI-generated content tends to flatten into a simple binary: real or fake. But the reality is messier.
    Content exists on a spectrum of authenticity. An image might be entirely AI-generated, or it might be a real photograph with minor edits, or something in between. A real person’s face swapped onto another body. Authentic footage with manipulated audio. A genuine video clipped and recontextualized to misrepresent what happened. Text, images, audio, video: all of it can be altered in ways ranging from trivial touch-ups to complete fabrication.

    Misinformation works the same way. Sometimes it’s entirely invented. Sometimes it’s a real story with one detail changed, a date, a name, a number, and that transforms truth into falsehood. The manipulation can be subtle or total, like the recent X posting by the White House.
    Our detection technology handles this spectrum. We can identify fully synthetic content, partially manipulated media, and everything in between. We return a confidence score because the answer often isn’t binary, and unfortunately, this kind of content is picked up at the speed of a repost.
    But there’s one dimension where there is no spectrum: consent.

    Consent Is Binary

    You either have permission to share something or you don’t. There’s no gradient here, no “partially consensual,” no gray area.
    This matters because the worst uses of synthetic media aren’t about political misinformation or celebrity face-swaps. They’re about non-consensual intimate imagery. They’re about harassment, exploitation, and abuse. They’re about using AI to create content that real people never agreed to, then posting it publicly to humiliate, threaten, or harm them.
    We do not and cannot verify consent.
    What we can tell you is whether content appears to be AI-generated or manipulated. What we cannot tell you is whether the subject of that content agreed to its creation or distribution. That determination requires context, investigation, and often legal process that no API can replicate.
    This distinction is critical, and we refuse to blur it.

    What We Will Not Do

    Let us be unambiguous:
    We do not condone the creation or distribution of non-consensual synthetic media. Full stop. Whether the content is AI-generated or authentic, if it’s shared without consent, it’s a violation.
    We do not process explicit content through this public bot. The @resemble_detect bot is not a tool for analyzing explicit imagery. If you’re trying to use it for that purpose, don’t. Our Terms of Service prohibit it. If this is a recurring issue for you personally, please reach out to us at [email protected] and we will determine whether we can process your requested media.
    We find the distribution of CSAM repugnant and will cooperate fully with law enforcement. This should go without saying. We’re saying it anyway.
    We built detection technology because we believe people deserve to know when they’re being deceived. We did not build it to enable new forms of abuse. If you’re planning to use our tools to harm someone, find another platform; you’re not welcome here.

    Why We’re Releasing This Anyway

    We’re aware that any tool capable of detecting synthetic media could theoretically be misused. Someone could use detection to “verify” non-consensual content before sharing it. Someone could use a “real” verdict to add false credibility to manipulated media or claim that if it’s real, it can be posted.
    But we do believe the alternative is worse. Synthetic media is already everywhere on X. Deepfakes are already being posted, shared, and believed. The asymmetry between creation and detection has left most people defenseless, unable to question what they see, unable to verify what they’re told.
    Giving people a free, accessible way to check content doesn’t solve every problem. But it shifts the balance; it creates friction where there was none. It makes “is this real?” a question people can actually answer, instead of something they have to guess at. And just to reiterate, even if something is real or appears authentic, that doesn’t equate to consent.
    We’d rather build tools that help people navigate this landscape than pretend the landscape doesn’t exist.

    We All Deserve to Know What’s Real

    We’re releasing this tool because the current state of synthetic media on social platforms is untenable. People are being deceived at scale, and trust in the media and each other is eroding.
    This is our attempt to put detection capabilities directly in the hands of the people who need it most.
    It won’t solve everything. It won’t stop bad actors. It won’t verify consent or intent or context. But it will let you ask “is this fake?” and get an actual answer.

    Try it now: Tag @resemble_detect on any image or video post with “is this fake?”
    Read our Terms of Service: https://www.resemble.ai/resemble-ai-x-bot-terms-of-service/
    Questions? Contact us at [email protected].

    Original source Report a problem
  • Jan 29, 2026
    • Date parsed from source:
      Jan 29, 2026
    • First seen by Releasebot:
      Jan 30, 2026
    Murf logo

    Murf

    AI Audiobook Narration: The Future Of Storytelling?

    Apple launches AI narrated audiobooks with four voices in Books, delivering studio‑quality, accessible storytelling. This shift cuts production time and costs, reshapes author workflows, and nudges the audiobook market toward broader accessibility and competition.

    The Shift in Audiobook Production

    AI audiobook narration is revolutionizing storytelling with lifelike voices. Apple's AI narration service and tools like Murf make audiobooks more accessible and affordable. As AI evolves, it will expand audiobook markets, enhance accessibility, and transform storytelling.

    Imagine a world where the stories you love come alive with crystal clear, lifelike narration. No more boring, monotonous voices or robotic inflections. Instead, you are transported into the heart of the story, as the characters and their emotions are brought to life with rich, nuanced performances, and the story unfolds in a way that is both captivating and immersive. With AI audiobook narration, this future is now a reality!

    Rapid advancements in AI have opened up the potential for more nuanced and 'human-like' narration. AI voices now sound much more natural than earlier digitally generated voices, leading to fears that they could replace human narrators altogether. That possibility received a major boost in early January, when Apple announced a new AI-powered digital narration service for audiobooks.

    Apple launched four new AI voices: 'Madison' and 'Jackson', optimized for the romance and fiction genres, and 'Helena' and 'Mitchell' for nonfiction, aiming to make audiobook creation more accessible to all. The service is currently available only in English, and users can find the audiobooks listed in the Books app as "Narrated by Apple Books." Apple Books' digital narration leverages advanced speech synthesis technology to produce high-quality audiobooks from an ebook file. The tech giant has long been at the forefront of innovative speech technology and has now adapted it for long-form reading, working alongside publishers, authors, and narrators. Through this new feature, Apple remains committed to celebrating and showcasing the magic of human narration and will continue to grow its AI-narrated audiobook catalog.

    Audiobooks are a lucrative and fast-growing market. Their sales and popularity have skyrocketed in recent years, with technology companies scrambling to gain a foothold. Industry insiders believe the global market will be worth more than $35bn by 2030. A key driver of the shift toward AI narration is the traditionally time-consuming and cost-intensive process of producing audiobooks with human voice actors.

    The current audiobook model involves authors narrating their own books or commissioning professional voice actors to record the audio version of their books. This process can take weeks and cost thousands for a publisher. For independent authors, especially those just starting out, funding such a production can be challenging. AI narration promises to significantly cut these costs and allow smaller publishers and authors to put out an audiobook in the market at a competitive price.

    Today, several text to speech tools on the market help users produce AI-narrated audiobooks using synthetic voices at a fraction of the cost and time it takes to do so manually. Among them is Murf, an AI voice generator that enables authors and publishers to create audiobooks using 200+ natural-sounding AI voices across 20+ languages and multiple accents.

    Creating an audiobook with Murf takes only minutes. Upload your script or the text version of your book to Murf's text editor, choose AI voices for different characters in your story, use Murf's voice customization options such as speed, pitch, and volume to fine-tune the narration, include background music to add more depth to the storytelling, and render. Bingo! Your studio-quality audiobook is ready for rollout in no time. You don't have to invest in costly recording equipment or hire a professional audiobook narrator. Additionally, with Murf's voice cloning service, self-published authors can create an AI voice clone of their own voice and use it to produce their audiobooks.

    There are millions of books out there that aren't available to individuals with disabilities, making it difficult or impossible for them to read ebooks or print books. The addition of easy-to-produce audio versions opens up a wealth of content. Through software like Murf, authors can bring audio to as many books and as many people as possible.

    The Future of Audiobooks

    Apple's approach to digital narration is not the first of its kind. Several other tech companies, including Google and Spotify, have also been investing in making audiobooks a key pillar of their streaming services. However, Apple's latest move has prompted a substantial change in the audiobook industry. With more and more book lovers listening to audiobooks, the demand for AI-generated audiobooks will increase. This, in turn, will lead to more investment in the development of AI narration technology to further improve its quality and make it more widely available. Other likely outcomes of Apple's entry into the market include:

    • Intensified competition, which could lower audiobook prices and make them accessible to a wider range of consumers
    • Increased collaboration between tech companies and traditional audiobook publishers, contributing to new and innovative products that combine the strengths of both industries

    It's undeniable that AI will have a big role in future audiobooks. Instead of taking weeks to record, edit, and produce a book, it can be done in a day. By creating an audiobook version of their titles, authors gain not only a potential income stream but also the opportunity to build their brand and following while the market is still growing. AI narration gives all those new books that aren't licensed for audio due to production costs, overlooked backlists, and books in minority languages a chance to find a voice, literally.

    As AI narration continues to evolve, it will be exciting to see how it will change the way we experience audiobooks and storytelling. Whether it's through the seamless integration of text to speech technology with our devices or the creation of new, more immersive storytelling experiences, the possibilities are endless.

    Original source Report a problem
  • Jan 26, 2026
    • Date parsed from source:
      Jan 26, 2026
    • First seen by Releasebot:
      Jan 26, 2026
    Eleven Labs logo

    Eleven Labs

    January 26, 2026

    Agents Platform introduces full version control for agents with branching, merges and deployments plus draft management, enabling isolated experimentation. WhatsApp account management and enhanced conversation tracing round out the updates. Knowledge Base folders, advanced tool filtering, and broader studio, workspaces, and SDK enhancements ship.

    Agents Platform

    • Agent branching and deployments: Added a complete version control system for agents, enabling teams to create branches, iterate on agent configurations in isolation, and merge changes when ready. New endpoints include POST /v1/convai/agents/{agent_id}/branches for creating branches, POST /v1/convai/agents/{agent_id}/branches/{source_branch_id}/merge for merging, and POST /v1/convai/agents/{agent_id}/deployments for creating deployments. Drafts can also be created and deleted via the new drafts endpoints. See the Branches and Deployments documentation for details.
    • WhatsApp account management: Added PATCH and DELETE endpoints for WhatsApp Business accounts, allowing you to update and remove connected WhatsApp accounts from agents.
    • Conversation agent name: The Get conversation endpoint now returns agent_name in the response for easier identification of which agent handled a conversation.
    • Error type tracking: Added error_type field to conversation event models for improved debugging and error categorization.
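
    A minimal sketch of the branch-merge-deploy flow, using the endpoint paths quoted in the notes above. The helper itself is illustrative, not an official SDK method; it only builds the URLs a client would POST to:

```python
# Endpoint paths are taken verbatim from the release note above.
# This helper assembles them for a given agent; it does not send requests.

BASE = "https://api.elevenlabs.io"

def branch_endpoints(agent_id: str, source_branch_id: str) -> dict:
    """Build the version-control endpoint URLs for an agent."""
    prefix = f"{BASE}/v1/convai/agents/{agent_id}"
    return {
        "create_branch": f"{prefix}/branches",                          # POST
        "merge_branch": f"{prefix}/branches/{source_branch_id}/merge",  # POST
        "create_deployment": f"{prefix}/deployments",                   # POST
    }
```

    In a typical workflow, a team would POST to create_branch to open an isolated branch, iterate on the agent's configuration there, POST to merge_branch when the changes are ready, and then POST to create_deployment to ship the merged configuration.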

    Knowledge Base

    • Folder management: Added support for organizing knowledge base documents into folders. New endpoints include Create folder for creating folders, Move document for moving single documents, and Bulk move for moving multiple documents at once.

    Tools

    • Enhanced tools listing: The Get tools endpoint now supports filtering and pagination with new query parameters including search, page_size, types, sort_by, sort_direction, and cursor. Response now includes next_cursor and has_more fields for pagination.
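
    As an illustration, the new query parameters can be assembled like this. The parameter names come from the note above, but how multiple types values are encoded is an assumption:

```python
from urllib.parse import urlencode

# Illustrative query-string builder for the enhanced tools listing,
# using the parameters named above. Comma-joining multiple `types`
# values is an assumption for the sketch.

def tools_query(search=None, page_size=30, types=None,
                sort_by=None, sort_direction=None, cursor=None):
    """Build a query string, omitting any parameter left unset."""
    params = {
        "search": search,
        "page_size": page_size,
        "types": ",".join(types) if types else None,
        "sort_by": sort_by,
        "sort_direction": sort_direction,
        "cursor": cursor,
    }
    return urlencode({k: v for k, v in params.items() if v is not None})
```

    Dropping unset keys keeps the request minimal, so the endpoint's own defaults apply wherever the caller doesn't specify a value.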

    Music

    • Song metadata enhancements: Added bpm and time_signature fields to song metadata for richer audio analysis information.

    Studio

    • Caption style templates: Added caption_style_template_overrides field to project models, allowing customization of caption styling per template.
    • Video dubbing project type: Added dub_video to the project creation type enum.
    • Publishing metadata: Added last_updated_from_project_unix timestamp to publishing and project metadata.

    Workspaces

    • Seat type management: Introduced new SeatType enum with seat_type and workspace_seat_type fields, deprecating the previous workspace_permission and workspace_role fields.
    • Workspace analytics permission: Added workspace_analytics_full_read to the PermissionType enum for granular analytics access control.

    SDK Releases

    Python SDK

    • v2.32.0 - Added agent branching, deployments, and drafts endpoints, knowledge base folder management, enhanced tools listing with filtering and pagination, and seat type management

    JavaScript SDK

    • v2.33.0 - Added agent branching, deployments, and drafts endpoints, knowledge base folder management, enhanced tools listing with filtering and pagination, and seat type management

    API

    • View API changes
    Original source Report a problem