Inworld Release Notes
Last updated: Jan 21, 2026
Inworld Products
All Inworld Release Notes (22)
- Jan 21, 2026
- Date parsed from source: Jan 21, 2026
- First seen by Releasebot: Jan 21, 2026
Inworld TTS 1.5
Inworld TTS 1.5 launches with two models, latency-optimized and standard, delivering a sub-250ms first audio chunk for Max and sub-160ms for Mini. It’s 20% more expressive with 25% fewer errors, and adds three new languages (Hindi, Arabic, Hebrew) for a total of 15.
Launched Inworld TTS 1.5, our newest generation of realtime TTS models featuring:
- Two New Models: Our flagship model inworld-tts-1.5-max is ideal for most use cases, with the best balance of quality and speed. For use cases where latency is the top priority, we also offer inworld-tts-1.5-mini.
- Latency Improvements: Our new TTS-1.5 models achieve P90 latency for first audio chunk delivery under 250ms for our Max model and under 160ms for our Mini model, a 4x improvement compared to TTS-1.
- More Expressive and More Stable: TTS 1.5 is 20% more expressive than prior generations and demonstrates a 25% reduction in word error rate.
- Additional Languages: We’ve added support for additional languages, including Hindi, Arabic, and Hebrew, bringing total languages supported to 15.
- Jan 21, 2026
- Date parsed from source: Jan 21, 2026
- First seen by Releasebot: Jan 21, 2026
Inworld TTS-1.5: Upgrading the #1 Ranked TTS Model with Production-Grade Latency, Expression and Stability
Inworld unveils TTS-1.5, the fastest realtime voice AI with sub-250ms latency on Max and sub-130ms on Mini, plus 30% more expressiveness and 40% lower word error rates. It adds 15-language multilingual support, on‑prem options, and clear, affordable pricing for global deployment.
Announcing Inworld TTS-1.5. The world's best realtime text-to-speech. <200ms latency. #1 on benchmarks.
We’re releasing Inworld TTS-1.5, the fastest, highest-quality realtime voice AI models available. With time-to-first-audio P90 latency of <250ms for 1.5 Max and <130ms for 1.5 Mini (4x faster than prior generations) and top rankings on independent leaderboards, this release sets a new standard for developers building voice-enabled applications at scale. TTS-1.5 improves on the Inworld models already #1 on leaderboards with 30% greater expressiveness, 40% reduction in word error rates, and enhanced multilingual support. It is also more than 25x lower cost than alternatives. Whether you're powering conversational AI agents, live translation, or interactive media experiences, TTS 1.5 gives you the world’s best text-to-speech without compromise.
Inworld TTS-1.5 Max is recommended for most applications, while TTS-1.5 Mini is optimized for hyper-latency sensitive applications.
Production-grade realtime latency: Professional voice actor quality at human-native speeds
For realtime applications, latency isn't just a metric. It's the difference between a natural conversation and an awkward delay. TTS-1.5 delivers breakthrough speed improvements that unlock new categories of realtime experiences.
Our new TTS-1.5 models achieve time-to-first-audio P90 latency under 250ms for our Max model and under 130ms for our Mini model. This is a 4x improvement from prior generations. The Max model now delivers quality previously only achievable at much higher latencies, running nearly as fast as the Mini model while producing richer, more expressive speech.
Engagement-optimized quality: Upgrade every user experience with leading expression and stability
Speed means nothing if quality suffers. TTS 1.5 delivers both. Our models rank #1 on the Artificial Analysis TTS Leaderboard. What makes this ranking particularly meaningful is that it reflects blind comparisons by thousands of real users evaluating which outputs sound more natural and human. When developers and end users consistently choose Inworld TTS over alternatives, that's validation that matters.
Beyond third-party validation, TTS 1.5 is 30% more expressive than prior generations and demonstrates a 40% reduction in word error rate, reducing hallucinations, cutoffs, and artifacts. The result is speech that's virtually indistinguishable from human speaking: emotionally nuanced, contextually aware, and reliably accurate.
An expanded range of expression means support for many new consumer-facing use cases and applications, where the personality of a voice really matters to engage, retain, and convert every user, ultimately driving better business outcomes across the next wave of AI applications.
Unlocked consumer-scale: Enhanced multilingual support, still 25x lower cost than alternatives
State-of-the-art voice AI should be accessible to every developer, from indie hackers building their first voice app to enterprises scaling to millions of users.
Language support now spans 15 languages, with the addition of Hindi and expanded coverage across major world languages. Combined with on-prem deployment options, TTS 1.5 serves global enterprises with diverse requirements for data residency, compliance, and customization.
Most significantly, TTS 1.5 remains 25x more affordable than the next best model, a gap that's only widened as competitors have raised prices. At $0.005 per minute for 1.5 Mini and $0.01 per minute for 1.5 Max, we're keeping our commitment to radically accessible pricing that doesn't force developers to choose between quality and budget.
Model | Price (per 1M characters) | Price (per minute) | Best for
inworld-tts-1.5-max | $10/M characters | $0.01/min | Most applications. Great balance of optimal quality and latency.
inworld-tts-1.5-mini | $5/M characters | $0.005/min | Extremely latency-sensitive applications.
What Inworld TTS 1.5 unlocks: use-case inspiration
Bible Chat, Particle, Luvu, Talkpal, Astrobeam, and many others are proving what is possible when developers have access to consumer-grade voice AI. The combination of sub-200ms latency, benchmark-leading quality, and accessible pricing opens new possibilities:
- Conversational AI agents: Build voice assistants that respond naturally, without the awkward pauses that break immersion. TTS-1.5's speed makes multi-turn conversations feel genuinely fluid.
- Real-time translation and dubbing: Live interpretation requires voice synthesis that keeps pace with speakers. TTS-1.5 delivers the latency profile that makes real-time language bridging viable at scale.
- Interactive entertainment: From AI companions to narrative experiences, TTS-1.5 enables characters that speak with emotional range and contextual awareness, responding in real-time to user input.
- Accessibility applications: Screen readers, navigation aids, and assistive technologies benefit from natural-sounding speech that doesn't fatigue listeners or create cognitive load.
"We're blown away with Inworld’s latest models which achieve unmatched voice realism at a fraction of the cost. We’re excited to bring these models to Layercode where developers can create and deploy realtime latency, life-like voice agents with them." - Damien Tanner, CEO, Layercode
Enterprise-ready deployment options, now supporting On-Prem
TTS 1.5 supports the deployment flexibility enterprises require:
- Cloud API: Immediate access via our standard API with global availability
- On-prem deployment: Full model hosting on your infrastructure
- Custom solutions: Contact our enterprise team for volume pricing, SLAs, and tailored deployment architectures
For organizations with strict data residency requirements or regulatory constraints, on-premise deployment provides complete control over voice synthesis without sacrificing capability.
TTS 1.5 is also available now via Layercode, LiveKit, NLX, Pipecat, Stream Vision Agents, Ultravox, Vapi, and Voximplant.
"I see an inflection in the not-so-distant future where conversational voice becomes the primary interface. The exciting thing is a lot of the technologies, like Inworld's realtime TTS, that need to come together to make this a reality are already here. And they're only getting better. So it's super exciting to operate in this space with partners like Inworld setting the pace for this innovation." - Andrei Papancea, CEO & Co-Founder, NLX
Get started today
TTS 1.5 is available now:
- Try the TTS Playground: Hear TTS 1.5 in action with your own text or clone a voice from your own sample
- Read the documentation: API reference, SDKs, and integration guides
- Contact enterprise sales: Volume pricing, on-premise options, custom voice development
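To give a flavor of integration before you dive into the docs, here is a minimal Node.js sketch of a synthesis request. The endpoint URL, payload fields, and auth header are illustrative assumptions (only the model IDs come from this post); check the API reference for the real request shape.

```typescript
// Hedged sketch of a TTS-1.5 synthesis call. URL, payload fields,
// and auth header are assumptions -- see the API reference.
import { writeFile } from "node:fs/promises";

const API_KEY = process.env.INWORLD_API_KEY!; // assumed auth scheme

async function synthesize(text: string, modelId: string): Promise<void> {
  const res = await fetch("https://api.inworld.ai/tts/v1/voice", { // hypothetical URL
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`, // assumed header format
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      text,
      modelId, // "inworld-tts-1.5-max" or "inworld-tts-1.5-mini"
      voiceId: "your-voice-id", // hypothetical field
    }),
  });
  if (!res.ok) throw new Error(`TTS request failed: ${res.status}`);
  await writeFile("out.audio", Buffer.from(await res.arrayBuffer()));
}

// Max balances quality and latency; Mini is for latency-critical paths.
await synthesize("Hello from TTS-1.5!", "inworld-tts-1.5-max");
```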
We're just getting started. TTS 1.5 represents our most significant voice AI release yet, and the foundation for what's coming next. We can't wait to see what you build.
Questions? Reach out to our team today.
- Nov 25, 2025
- Date parsed from source: Nov 25, 2025
- First seen by Releasebot: Jan 2, 2026
Runtime v0.8
New runtime launch delivers faster, smarter realtime agents with lower latency and instant streaming, plus smart early stopping for safer, cost-aware interactions. Production-ready templates and quick start onboarding unify setup and observability in one place, speeding adoption.
Build faster, smarter realtime agents - instant streaming, lower latency, and smart interruption handling
1. Runtime v0.8
Built for any use case: companions, language tutors, customer support, fitness trainers, games, and more.
- Lower latency agents: Core runtime optimizations reduce latency significantly, making live multimodal agents feel snappier even under heavy LLM and TTS loads.
- Instant streaming responses: Graph start is now asynchronous, enabling agents to begin streaming tokens or audio as soon as a run kicks off, eliminating awkward silence at the start of each interaction.
- Smart early stopping: Cancel an agent run mid‑response for barge‑in, safety, or cost control so agents stop talking the moment the user or policy requires it.
For new users: Get Started
For returning users: Follow the 0.6 -> 0.8 migration guide
2. Template Library
Production‑ready in one click
- Launch from full example projects: Clone production‑ready multimodal templates directly into your project and go from idea to running agent in minutes.
- Find the right template fast: Filter by input/output modality, SDK, and use case to jump straight to the patterns that match your product.
Start Building with Templates on Portal
Start Building with Templates on Website
Can’t find the template you’re looking for? Talk to our team to request one
3. Overview Page
API keys, onboarding, and application health in one place
- Personalized onboarding by SDK: See a tailored “getting started” flow for your chosen SDK, so you only follow the steps that matter to your stack.
- Everything you need, front and center: Jump straight to your API key, llms.txt, and templates from the home view, cutting setup time from minutes to seconds.
- Holistic observability at a glance: View key traces and logs so you can catch regressions, debug incidents, and keep AI experiences reliable.
Personalized Onboarding
Observability at a Glance
Get Started
Talk to our team
- Nov 20, 2025
- Date parsed from source: Nov 20, 2025
- First seen by Releasebot: Dec 23, 2025
Node.js Runtime v0.8.0
Enhanced performance, execution control, and component access for custom nodes.
- 2x faster performance with optimized addon architecture
- Cancel running executions with abort() on GraphOutputStream
- Call LLMs from custom nodes via getLLMInterface() and getEmbedderInterface()
- Build stateful graph loops with DataStreamWithMetadata
Breaking changes
graph.start() is now async, and stopInworldRuntime() is required.
See the Migration Guide for upgrading from v0.6.
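A minimal sketch of the v0.8 pattern, assuming the method names above (graph.start(), abort(), stopInworldRuntime()); the module path, graph construction, and stream chunk handling are placeholders, so follow the Migration Guide for the real steps.

```typescript
// Hedged v0.8 sketch. Method names are from these release notes;
// the import path and graph object are assumptions.
// import { stopInworldRuntime } from "@inworld/runtime"; // assumed path

async function respond(graph: any, input: string, interrupt: AbortSignal) {
  // Breaking change: graph.start() is now async and must be awaited.
  const stream = await graph.start(input);

  // New: cancel a running execution mid-response for barge-in,
  // safety, or cost control via abort() on GraphOutputStream.
  interrupt.addEventListener("abort", () => stream.abort());

  for await (const chunk of stream) {
    // forward streamed tokens or audio to the client as they arrive
  }
}

// Breaking change: stopInworldRuntime() is now required for a clean
// shutdown when your process exits (call site depends on your app).
```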
- Nov 6, 2025
- Date parsed from source: Nov 6, 2025
- First seen by Releasebot: Dec 23, 2025
Introducing Timestamp Alignment, WebSockets and More for Inworld TTS
Inworld TTS debuts a major release with speed boosts, multilingual expansion to Russian, API voice cloning, custom voice tags and pronunciation controls, plus WebSocket streaming for low latency and precise timestamping for lipsync.
Performance improvements - now #1 on Artificial Analysis TTS Leaderboard
Speed and quality are critical for real-time voice. Inworld TTS is now faster, smoother, and more natural across production workloads. Inworld TTS 1 Max just ranked #1 on the Artificial Analysis Text to Speech Leaderboard, which benchmarks the leading TTS models on realism and performance.
Quality improvements
New TTS models deliver clearer, more consistent, and more human-like speech.
- Clearer articulation: Lower word error rate (WER) and better intelligibility on long or complex sentences.
- Improved voice cloning: Higher speaker-similarity scores; voices retain tone, pacing, and emotion even across languages.
- More accurate multilingual output: Fewer accent mismatches and more natural pronunciation across supported languages.
Latency improvements
We’ve reduced latency across multiple layers of our stack:
- Infrastructure migration: New server placements cut internal round-trip time by ~50 ms, especially benefiting users in the US and Europe.
- Optional text normalization: Disable text normalization in the API to save 30–40 ms for English (up to 300 ms on complex text) and up to 1 sec in other languages.
- WebSocket streaming: Persistent connections reduce handshakes, enabling faster starts and smoother real-time dialogue.
- Faster inference: Inworld TTS Max now runs on an optimized hardware stack, enabling responses that are ~15% faster.
WebSocket support
For real-time conversational applications, our new WebSocket API offers persistent connections with comprehensive streaming controls.
HTTP requests work fine for simple TTS, but they add overhead when you're building voice agents, interactive characters, or phone call agents, as each request requires connection setup.
WebSockets keep a persistent connection open. You can stream text as it arrives from your LLM, maintain conversation context, and handle interruptions gracefully.
Three ways WebSockets give you more control:
- Context management: Run multiple independent audio streams over a single connection. Each context maintains its own voice settings, prosody, and buffer state.
- Smart buffering: Configure when synthesis begins with maxBufferDelayMs and bufferCharThreshold. Start generating audio before complete text arrives, or wait for full sentences (see the sketch below).
- Dynamic control: Update voice parameters mid-stream, flush contexts manually, or handle user interruptions without dropping the connection.
Perfect for:
- Interactive voice agents that require low latency
- Dynamic conversations where barge-in or interruption support is needed
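As a rough sketch of the flow (not the actual wire protocol), a streaming session with buffering controls might look like this in Node.js with the ws package. The URL and message shapes are assumptions; maxBufferDelayMs and bufferCharThreshold are the buffering controls described above.

```typescript
// Hedged WebSocket streaming sketch using the real "ws" package.
// The URL and message shapes below are illustrative assumptions.
import WebSocket from "ws";

const ws = new WebSocket("wss://api.inworld.ai/tts/v1/stream"); // hypothetical URL

ws.on("open", () => {
  // Open an independent context with its own voice and buffer settings.
  ws.send(JSON.stringify({
    type: "context.create",   // hypothetical message type
    contextId: "turn-1",
    voiceId: "your-voice-id", // hypothetical field
    maxBufferDelayMs: 200,    // begin synthesis after at most 200 ms...
    bufferCharThreshold: 40,  // ...or once 40 characters have buffered
  }));

  // Stream LLM tokens as they arrive instead of waiting for full text.
  for (const token of ["Hello", " there", "!"]) {
    ws.send(JSON.stringify({ type: "text", contextId: "turn-1", text: token }));
  }
  ws.send(JSON.stringify({ type: "flush", contextId: "turn-1" })); // hypothetical
});

ws.on("message", (audioChunk) => {
  // play or forward audio chunks as they stream back
});
```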
Timestamp alignment: Sync audio with visuals & actions
Building lipsync for 3D avatars? Highlighting words as they're spoken? Triggering game play actions at specific moments in speech? Handling barge-in and interruptions? You need timestamps.
Timestamp alignment returns precise timing information that matches your generated audio. Choose the granularity that fits your use case:
Use word-level timestamps for:
- Karaoke-style caption highlighting (see the sketch below)
- Triggering character actions when specific words play
- Tracking where users interrupt the AI
- Syncing UI elements with speech
Character-level timestamps are most common for lipsync animation, where they can be converted to phonemes and visemes.
Timestamps currently support English for both streaming and non-streaming, with other languages experimental.
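To make the word-level case concrete, here is a small sketch of caption highlighting driven by timestamps. The {word, startMs, endMs} shape is an assumption; adapt it to the actual timestamp payload.

```typescript
// Hedged sketch: karaoke-style highlighting from word-level timestamps.
// The timestamp shape is an assumption, not the documented payload.
interface WordTimestamp {
  word: string;
  startMs: number;
  endMs: number;
}

function scheduleHighlights(
  words: WordTimestamp[],
  highlight: (word: string, index: number) => void,
): void {
  // Schedule each highlight relative to the start of audio playback.
  for (const [i, w] of words.entries()) {
    setTimeout(() => highlight(w.word, i), w.startMs);
  }
}

// Call when playback of the generated audio begins.
scheduleHighlights(
  [
    { word: "Hello", startMs: 0, endMs: 320 },
    { word: "world", startMs: 340, endMs: 700 },
  ],
  (word) => console.log(`Now speaking: ${word}`),
);
```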
Voice cloning API for programmatic voice creation
Voice cloning is no longer limited to our UI. Now you can create custom voices directly through the API. Available in beta to select customers.
Why this matters:
If you're building a platform where end users need to clone their own voices, you can now integrate that experience directly into your app, without redirecting users to Inworld's interface. You can also create voices in bulk using a simple script.
Use cases:
- Games where players create their own character voices
- Social platforms where users create their own avatars
- Games or call centers where a large number of voices need to be created in bulk from pre-recorded audio samples
Voice cloning APIs enable third-party platforms to offer voice creation as a native feature in their own workflows or create voices in bulk.
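As a hedged illustration of the bulk workflow, a script might loop over pre-recorded samples like this. The endpoint and field names are hypothetical; consult the voice cloning API reference for the real ones.

```typescript
// Hypothetical bulk voice-cloning script (beta API). Endpoint and
// fields are illustrative assumptions, not the documented API.
import { readFile } from "node:fs/promises";

const API_KEY = process.env.INWORLD_API_KEY!;

async function cloneVoice(name: string, samplePath: string) {
  const sample = await readFile(samplePath);
  const res = await fetch("https://api.inworld.ai/tts/v1/voices", { // hypothetical URL
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      displayName: name,                      // hypothetical field
      audioSample: sample.toString("base64"), // hypothetical field
    }),
  });
  if (!res.ok) throw new Error(`Cloning failed for ${name}: ${res.status}`);
  return res.json();
}

// Create many voices from pre-recorded samples, as described above.
for (const name of ["npc-guard", "npc-merchant"]) {
  await cloneVoice(name, `./samples/${name}.wav`);
}
```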
Custom voice tags
When creating a custom voice in the UI or API, we now allow users to apply tags to their voices for grouping and filtering.
Why this matters:
You can now easily manage a large database of voices and filter for the appropriate voice at runtime, which is highly valuable in games and related applications, where characters are often generated on the fly.
Use cases:
- Gaming platforms where characters are generated on the fly and need to be matched to an appropriate voice
- Enterprise apps where the optimal voice is chosen at runtime based on the user profile
- Applications that are still in development, where managing and iterating on a large number of voices is an essential workflow in the design process
Voice tags are the first step toward a larger voice library and management system.
Custom pronunciation: Say it your way
Getting AI voices to pronounce words correctly matters. Brand names, character names, technical terms, and regional dialects are often misspoken by standard TTS models because they aren't represented well in the training data.
We now allow users to manually insert phonetic notation into their text for consistent, accurate pronunciation of key words. Not sure what phonemes to use? Ask ChatGPT or your favorite AI assistant for the IPA transcription, or check a reference like Vocabulary.com's IPA Pronunciation Guide.
Common use cases:
- Brand names that need to sound perfect every time
- Unique names
- Medical, legal, or technical terminology
- Regional pronunciation variations
- Fictional locations and proper nouns
We support International Phonetic Alphabet (IPA) notation.
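One way to keep pronunciations consistent is to apply a pronunciation lexicon before synthesis, sketched below. The inline phoneme tag shown is hypothetical (the docs define the actual notation); only the IPA strings themselves are standard.

```typescript
// Hedged sketch: substitute IPA for tricky words before synthesis.
// The <phoneme> tag format is hypothetical -- use the documented markup.
const lexicon: Record<string, string> = {
  Nginx: "ˈɛndʒɪnˌɛks", // brand name
  Aoife: "ˈiːfə",       // personal name
};

function applyLexicon(text: string): string {
  return text.replace(/\b\w+\b/g, (word) => {
    const ipa = lexicon[word];
    return ipa ? `<phoneme ipa="${ipa}">${word}</phoneme>` : word;
  });
}

console.log(applyLexicon("Aoife restarted Nginx."));
```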
Russian support and multilingual improvements
Inworld TTS now speaks Russian, bringing our total to 12 supported languages: English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Dutch, Polish, and Russian.
Clone a voice and label it as Russian, or choose one of our pre-built Russian voices. As with all languages, voices perform best when synthesizing text in their native language, though cross-language synthesis is possible.
We've also made quality improvements across all non-English languages. Better pronunciation accuracy, more natural intonation, and smoother speech patterns.
For multilingual applications, Inworld TTS Max delivers the strongest results with superior pronunciation and more contextually-aware speech across languages.
Try these features today
All features are available now through our API and TTS Playground, at the same accessible pricing.
Get started:
- Try TTS Playground
- Read the docs
- You can also access Inworld voices and text-to-speech models via LiveKit, NLX, Pipecat, and Vapi.
Frequently asked questions
How do I convert timestamps to visemes for lipsync?
The typical pipeline: character timestamps → phonemes (using tools like PocketSphinx) → visemes (using your game engine's mapping). Our timestamps provide the timing foundation.
How do I gracefully handle interruptions with WebSockets?
The WebSocket endpoint supports multiple independent contexts, enabling seamless barge-in handling. When a user interrupts, you can start a new, independent context and send the post-interruption agent response to it. The old context can be closed when the interruption occurs.
What are some techniques to optimize end-to-end latency?
To reduce latency, consider using the TTS streaming API, keeping a persistent WebSocket connection, and disabling text normalization by instructing your LLM to create speech-ready text via a system prompt.
- Nov 4, 2025
- Date parsed from source: Nov 4, 2025
- First seen by Releasebot: Dec 23, 2025
The 3 Engineering Challenges of Realtime Conversational AI
Inworld launches Runtime, a low-latency AI backend for realtime conversational AI. Build with the SDK, deploy hosted endpoints, and run live A/B experiments with automatic traces to cut development time and improve user experience.
The Vision
Every builder in conversational AI shares a common goal: to create systems that feel natural, responsive, and personalized. But in practice, we spend more time wiring APIs, debugging, and optimizing latency than optimizing user experience.
Inworld Reflection
We started at Inworld by building lifelike AI characters that gamers loved—ones that could remember, converse naturally, and feel real.
As our customer base expanded beyond games, they asked for complex customizations—to plug in their own models, connect to proprietary data, define custom emotions, routing, and more.
With each request, our engineering teams spent less time on shipping user features and more time writing integrations and debugging.
This realization led us to a critical analysis of where our development time was truly going. That analysis revealed three recurring engineering pain points in building realtime conversational AI, and we built Inworld Runtime to solve them.
Inworld Runtime
Inworld Runtime is a low-latency AI backend for realtime conversational AI. You build your conversational AI with Inworld Runtime SDK, launch a hosted endpoint using Inworld CLI, and observe and optimize your conversational AI by running A/B experiments in the Inworld Portal.
The 3 Challenges of Realtime Conversational AI
Problem 1: Latency that breaks the realtime feel
Before: High latency under high loads
- Scaling issues: As apps scaled to thousands of users, latency spiked above one second.
- Blocking operations: Many popular programming languages, while excellent for rapid prototyping, have runtime limitations that prevent true parallel execution, leading to blocked operations when we need to run multiple LLM calls, embeddings, and processing tasks concurrently.
With Runtime: True parallel execution at the C++ core
- Parallel execution: Using Runtime, an agent can embed user input, retrieve knowledge, and do a web search all at once, then proceed to the LLM call, dramatically reducing end-to-end latency.
- Pre-optimized backend: The graph executor automatically identifies nodes without dependencies and schedules them in parallel — no manual threading code required.
- Read Streamlabs case study: Built a realtime multimodal streaming assistant with sub-500 millisecond latency
Example Node.js LLM → TTS pipeline: low latency with the C++ optimized backend
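To illustrate the pattern in plain Node.js (this is not the Runtime API; all four node functions are stand-ins), here is the kind of fan-out the graph executor schedules automatically:

```typescript
// Illustrative stand-ins for graph nodes -- not the Runtime API.
const embed = async (t: string) => [0.12, 0.34];            // embedding node
const retrieveKnowledge = async (t: string) => [`fact about ${t}`];
const webSearch = async (t: string) => [`result for ${t}`];
const callLLM = async (prompt: string) => `answer for: ${prompt}`;

async function respond(userInput: string): Promise<string> {
  // Independent nodes run concurrently instead of one after another...
  const [, knowledge, results] = await Promise.all([
    embed(userInput), // consumed by retrieval in a real graph
    retrieveKnowledge(userInput),
    webSearch(userInput),
  ]);
  // ...then the dependent LLM node runs once its inputs are ready.
  return callLLM([userInput, ...knowledge, ...results].join("\n"));
}
```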
Problem 2: 50% of dev time spent on integration and debugging
Before: Repetitive, time-consuming tasks
- Wrote repetitive integration code: For every new feature that required integrating an AI model, we found ourselves writing similar integration code.
- Reconstructed execution paths by hand: When an agent's behavior was incorrect, our primary tool for analysis was traditional logging. We had to sift through disconnected logs from various parts of the codebase to manually reconstruct the sequence of events.
- Coupled orchestration and business logic: The control flow for handling model responses, error retries, and feature-specific logic—like updating how fallback responses were triggered—was deeply embedded within the business logic, making even minor feature updates risky. Bringing new developers up to speed took weeks instead of days.
With Runtime: Less maintenance, more iteration
- Build fast with pre-optimized nodes: Developers get a full suite of nodes to construct realtime AI pipelines that can scale to millions of users, including nodes for model I/O (STT, LLM, TTS), data engineering (prompt building, chunking), flow logic (keyword matching, safety), and external tool calls (MCP integrations).
- View end-to-end traces and logs automatically: Instead of reconstructing the execution path manually, developers simply go to Inworld Portal to view the end-to-end trace and logs. Every node execution is automatically instrumented with OpenTelemetry spans capturing the node, inputs, outputs, duration, and success/failure.
- Write modular, easy-to-understand code: Developers define each node’s inputs, outputs, and dependencies in a graph, making the execution path explicit and visible — you can see exactly which nodes connect to which others, making onboarding new team members easy. They can contribute to a single node on day one, then gradually understand the broader graph structure.
- Read Wishroll Status Case Study: Went from prototype to production in 19 days with a 20x cost reduction
Automatic Traces for Each Graph Execution
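For a sense of what that automation replaces, here is a sketch of the equivalent manual instrumentation: one OpenTelemetry span per node with inputs, outputs, and outcome. @opentelemetry/api is the real package; the runNode wrapper is a stand-in.

```typescript
// What Runtime automates per node, written out by hand with the
// real @opentelemetry/api package. The node wrapper is a stand-in.
import { trace, SpanStatusCode } from "@opentelemetry/api";

const tracer = trace.getTracer("agent-pipeline");

async function runNode<T>(
  name: string,
  input: string,
  fn: (input: string) => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan(name, async (span) => {
    span.setAttribute("node.input", input);
    try {
      const output = await fn(input);
      span.setAttribute("node.output", String(output));
      span.setStatus({ code: SpanStatusCode.OK });
      return output;
    } catch (err) {
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end(); // duration is captured automatically
    }
  });
}
```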
Problem 3: Slow iteration speed
Before: Customization incurred technical debt
- Bespoke customization: As our customer base grew, so did the need for customization, and the code became brittle and hard to reason about.
- If/else hell: Different clients required slightly different logic, tools, or model choices. In our traditional codebase, this led to a labyrinth of if/else code blocks and feature flags scattered throughout the logic.
With Runtime: Fast user experience iterations
- One-line change for models and prompts: Want to swap an LLM provider or adjust a model parameter? That's a simple configuration change. A/B test variations and deploy customizations without touching production code.
- A/B testing at scale: We define agent behavior declaratively in JSON or through a fluent GraphBuilder API. Different clients get different graph configurations—not different code paths.
Live A/B test: a 50% traffic split across two models to observe what your users prefer
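As a hedged sketch of the idea (the real configuration lives in the graph JSON, the GraphBuilder API, and the Graph Registry; this schema is invented), a 50/50 split with deterministic per-user assignment might look like:

```typescript
// Invented variant schema -- illustrates config-driven A/B splits,
// not the actual Graph Registry format.
const variants = [
  { name: "baseline", weight: 0.5, llm: { provider: "provider-a", model: "model-a" } },
  { name: "candidate", weight: 0.5, llm: { provider: "provider-b", model: "model-b" } },
];

// Deterministic hashing keeps each user in the same variant across turns.
function assignVariant(userId: string) {
  const hash = [...userId].reduce((h, c) => (h * 31 + c.charCodeAt(0)) >>> 0, 0);
  const r = (hash % 1000) / 1000;
  let cumulative = 0;
  for (const v of variants) {
    cumulative += v.weight;
    if (r < cumulative) return v;
  }
  return variants[variants.length - 1];
}

console.log(assignVariant("user-42").name);
```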
Why We're Sharing Inworld Runtime with You
We built Inworld Runtime to solve our own massive challenges in creating production-grade, scalable realtime conversational AI. But in doing so, we created a solution for a problem every AI developer faces: managing the inherent complexity of the "reason—act" agent cycle.
We believe the future of AI is not just about more powerful models, but better orchestration. It's about giving developers the architectural foundation they need to build robust, maintainable, and observable realtime conversational AI, without reinventing the wheel.
If you're tired of wrestling with tangled logic and want to focus on creating, we invite you to build your next experience on Inworld Runtime. Let us handle the complexity of orchestration, so you can focus on bringing your ideas to life.
Get Started with Inworld Runtime
Inworld Runtime is the best way to build and optimize realtime conversational AI and voice agents.
You can build realtime conversational AI that is fast, easy to debug, and easy to optimize via A/B experiments.
Get started now with Inworld CLI:
- Build a production-ready conversational AI or voice agent
- Deploy it to Inworld Cloud as an endpoint so you can easily integrate it into your app
- Monitor dashboards, traces, and logs in the Inworld Portal
- Improve user experience by running live A/B experiments to identify the best model and prompt settings for your users
Talk to our team
- Oct 29, 2025
- Date parsed from source: Oct 29, 2025
- First seen by Releasebot: Jan 2, 2026
Unreal AI Runtime: The first unified interactive AI toolkit for game developers
Inworld launches the Unreal AI Runtime SDK, a unified game AI runtime with STT, TTS, LLMs, a visual graph editor, and pre-built templates. It unifies dozens of providers under one API key, adds observability, and ships alongside early access for the Unity AI Runtime SDK.
Unreal AI Runtime SDK Launch
Today, we’re excited to launch our Unreal AI Runtime SDK, the first unified solution for game developers building realtime interactive AI experiences. No more stitching together multiple AI plugins, managing separate provider APIs, or spending all your time fighting AI instead of building your game.
With the Unreal AI Runtime, you can now:
- Start working immediately with foundational AI building blocks like speech-to-text (STT), text-to-speech (TTS), and LLMs
- Get access to hundreds of models across various model providers with a single API key
- Easily create AI pipelines, such as a speech-to-speech pipeline, with an intuitive visual graph editor
- Leverage pre-built templates for common use cases like AI NPCs and chatbots
- Manage and optimize costs, latency, and quality through built-in observability and experimentation tools
The Unreal AI Runtime SDK is live now and available for download. We are also launching our Unity AI Runtime SDK for early access.
Why we built this
Inworld has been building for game developers since day one — starting with lifelike, interactive characters that could form relationships, express emotions, and respond naturally through voice. Along the way, we saw the biggest challenges developers faced when building engaging, realtime AI experiences:
- Keeping up with dozens of models and providers - Delivering a rich, multimodal AI often requires testing dozens of different models. In addition to integrating each new provider, this means juggling separate APIs, rate limits, and billing. The Unreal AI Runtime SDK unifies access to all major model providers under a single API key, so switching models is as easy as selecting from a dropdown.
- Easily customizing without reinventing the wheel - Every game has unique creative and technical needs. With pre-built templates (including our Character Template) and a modular design, you can customize or extend for your specific use case while starting from a production-ready, pre-optimized foundation. Plus, our intuitive visual graph editor makes it easy for anyone—writers, designers, or developers—to customize logic, prompts, and behavior that define your AI interaction.
- Debugging non-deterministic AI responses - AI systems can behave unpredictably, and debugging the root cause is notoriously hard. The Unreal AI Runtime's built-in observability tools, including logs, traces, and dashboards, make it easy to trace every step of your AI pipeline, understand latency, and pinpoint exactly where an issue occurred.
- Managing costs with scale - Scaling AI to millions of players can get expensive. The Unreal AI Runtime helps manage those costs with dashboards that provide a unified view of your cost drivers, experimentation tools that make it easy to test more efficient models or orchestrations, and support for running local models where it makes sense.
The Unreal AI Runtime was built to address these challenges for game developers, so you can spend more time building your game instead of building AI infrastructure.
What you can build
Engaging, conversational NPCs
Developer: Inworld team
SDK: Unreal AI Runtime
Motivation: Building a believable, real-time character is hard. Low-latency turn-taking and natural interruptions are notoriously difficult to get right. We also wanted to support MetaHumans and lipsync with our streamed audio. Then there’s always the question of balancing LLM latency with quality: how do we ensure the dialogue is engaging and relevant without being too slow or expensive? We designed our Character, MetaHuman, and Lipsync templates to handle all of this out of the box, with countless hours spent optimizing for performance, realism, and responsiveness. It’s the fastest way to build a fully interactive, voice-driven character in Unreal.
Generative Survivor’s game
Developer: Brian Cox, Shuang Liang
SDK: Unity AI Runtime
Motivation: I wanted to challenge myself to create a fully generative game. While today’s AI still struggles to build an entire game from scratch, I approached the problem using data-driven development. I created a core game template (in this case, a Survivor-like experience) and set up internal asset libraries covering 3D models, animations, VFX, music, fonts, and more. Each asset is tagged with metadata describing its visual and thematic attributes.
Using the Unity AI Runtime SDK, I built a conversational system where I can speak to an NPC (Merlin) and describe what type of game I want: player character, enemy types, starting weapon, environment, skybox, and so on. This voice input triggers a series of LLM service calls, each using specialized prompts to search the asset libraries and select the most fitting options.
Once the AI has chosen the final assets, the configuration is stored in a ScriptableObject. When the game returns to the main menu, the ScriptableObject is parsed and all required data is extracted to dynamically generate a customized Survivor-like game on the fly.
AI-powered decision making
Developer: Braeden Warnick
SDK: Unreal AI Runtime
Motivation: While other AI-powered NPCs can obey commands based on what they can or can't do (like open locked doors), I wanted to explore making an NPC companion that behaves based on what they should do, given their character info. This also illustrates how you can adapt a custom Graph to leverage the evaluator power of an LLM in place of any scoring function used in any game system (e.g., procedural content generation, NPC decision making, game AI directors).
Get started with Inworld Runtime
To get started building today:
- Download the AI Runtime SDK for Unreal or Unity and read the documentation
- Explore the Character template to add conversational capabilities to your NPCs
- Talk to our team for enterprise support.
We’re excited to see what games and experiences you bring to life!
- Oct 22, 2025
- Date parsed from source: Oct 22, 2025
- First seen by Releasebot: Dec 23, 2025
Introducing Inworld CLI
Inworld launches the Inworld CLI, a unified toolkit to build, deploy, and optimize realtime conversational AI. Expect faster performance, easier debugging, and live A/B testing from the command line with integrated telemetry and production endpoints.
Challenge
Until now, building realtime conversational AI meant facing:
- Performance Bottlenecks: Unpredictable latency from third-party APIs creates a jarring user experience. This is compounded by core language limitations, like Python's GIL, that block parallel execution and stall critical operations.
- High Development Overhead: Engineering resources are drained by maintenance. Teams spend more time debugging provider failures and integrating a complex patchwork of models than building new features, causing product velocity to stagnate.
- Slow Iteration Speed: Scattered conditional logic for different models and clients makes the entire system fragile. This fragility makes every change high-risk, paralyzing rapid A/B testing and stalling product improvements.
Inworld faced these very challenges as our customer base grew beyond games into mobile apps, voice agents, AI companions, and more, so we built Inworld Runtime to solve them.
Inworld Runtime
Inworld Runtime is the AI backend for realtime conversational AI. You build your conversational AI with the Inworld Runtime SDK, launch a hosted endpoint using the Inworld CLI, and observe and optimize your conversational AI by running A/B experiments in the Inworld Portal.
Today, building with Inworld Runtime just became easier with the launch of Inworld CLI.
Inworld CLI
With Inworld CLI, developers can now build realtime conversational AI that is fast, easy to debug, and easy to optimize via A/B experiments.
- Build realtime experiences
- npm install -g @inworld/cli to install the Inworld CLI
- inworld login to log in and generate API keys automatically
- inworld init to initialize conversational AI pipelines, such as LLM → TTS, pre-optimized for latency and flexibility
- inworld run to test locally with instant feedback
- inworld deploy to create persistent, production-ready endpoints
- Monitor with clarity
- Integrated telemetry: Each request is automatically logged in dashboards, traces, and logs in Inworld Portal.
- Optimize continuously
- inworld graph variant register to run live A/B tests without client changes
Proven technology
Since launching Inworld Runtime earlier this year, we've seen developers build incredible realtime conversational AI experiences.
- Wishroll went from prototype to 1M users in 19 days with 20x cost reduction.
- Streamlabs built a real-time multimodal streaming assistant with under 500ms latency.
- Bible Chat scaled their AI-native voice features to millions.
Inworld CLI builds on Runtime to help developers build agents more efficiently and reliably.
Get started with Inworld Runtime
Inworld Runtime is the best way to build and optimize realtime conversational AI and voice agents.
Get started now with Inworld CLI:
- Build a production-ready conversational AI or voice agent
- Deploy it to Inworld Cloud as an endpoint so you can easily integrate it into your app
- Monitor dashboards, traces, and logs in the Portal
- Improve user experience by running live A/B experiments to identify the best model and prompt settings for your users
Talk to our team
- Oct 15, 2025
- Date parsed from source: Oct 15, 2025
- First seen by Releasebot: Dec 23, 2025
The new AI infrastructure for scaling games, media, and characters
Inworld launches Runtime, a high‑performance AI pipeline that scales voice‑forward, character‑driven experiences to millions. It connects LLM, STT, TTS with remote config, telemetry and multi‑vendor support; Unreal is available now in early access, Unity coming soon.
Built on gaming and media innovation
We began by pushing the frontier of lifelike, interactive characters for games and entertainment, and this remains a core focus area. Today, Inworld powers real‑time, voice‑forward experiences and provides the infrastructure that lets those experiences scale from a prototype to millions of players without sacrificing quality. Partners across the industry, including Xbox, NVIDIA, Ubisoft, Niantic, NBCUniversal, Streamlabs, Unity, and Epic, have built with Inworld to explore new gameplay and audience experiences.
Long before “chat with anything” became a category, we were shipping playable, character‑centric demos and engine integrations that let teams imagine worlds where characters remember, react, and stay in‑world. That early craft in character design is still our foundation, and it is why leading studios and platforms continue to collaborate with us on the next generation of character‑driven interactive media.
From demos to production: Deeper control through a new AI infrastructure
As partners moved from impressive demos to live titles, we hit the same wall every game team hits: keeping voice, timing, and consistency flawless at scale, which is what players actually feel. Text‑only stacks and one‑off integrations were not built for real‑time, multimodal workloads, and stitching providers together left developers without enough control to maintain user‑facing quality as usage spiked or audiences expanded.
That is why we built Runtime: to put developers in control of the entire pipeline, and to make measurement and experimentation first‑class, so quality can be maintained and extended to new geographies and demographics, with personalization where it matters.
What is Inworld Runtime and how does it help you scale?
Inworld Runtime is a high‑performance, C++ graph engine (with SDKs like Node.js and Unreal) that orchestrates LLMs, STT, TTS, memory or knowledge, and tools in a single pipeline. Build a graph in code, ship it, then iterate with remote configuration, A/B variants (via Graph Registry), and built‑in telemetry without redeploying your game. It is the infrastructure we developed to support experiences with millions of concurrent users, now available to all developers.
Why this gives you more control and keeps quality tangible for users:
- Provider‑agnostic nodes so you can swap models and services without glue‑code churn or lock‑in.
- Remote config and Graph Registry to change prompts, models, and routing live, safely rolled out to cohorts.
- Targeted experiments to validate interaction quality for new geographies and demographics, including voices, timing, interruptions, prompts, and routing, and enabling personalization by segment.
- Observability for player‑perceived quality with traces, dashboards, and logs that expose latency paths, first‑audio timing, and lip‑sync cadence so you fix what users actually feel.
The approach is simple: one runtime for your entire multimodal pipeline (e.g. STT → LLM → TTS → game or media state), with observability and experimentation to optimize quality, latency, and cost for every audience.
Voice that keeps up with gameplay
Many teams start with TTS, then expand into full pipelines as they localize, personalize, and harden for live ops, testing variations for new geographies and demographics and locking in what works.
Inworld TTS delivers expressive, natural‑sounding speech built for real‑time play. You get low‑latency streaming, instant voice cloning, and timestamp alignment for lip‑sync and captions, plus multi‑language coverage and integrations with LiveKit, NLX, Pipecat, and Vapi for end‑to‑end real‑time agents. Pricing starts at $5 per 1M characters, so you can scale voice across large audiences.
Try the TTS Playground or call the API to integrate quickly.
Proven at scale with industry leaders in games and media
- Xbox × Inworld: Multi‑year co‑development to enrich narrative and character creation for game developers.
- Ubisoft (NEO): Prototype showcased real‑time reasoning, perception, and awareness in characters powered by Inworld tech.
- NVIDIA (Covert Protocol): Social simulation and hybrid on‑device or cloud capabilities using NVIDIA ACE with Inworld.
- Niantic: From Wol to WebAR, teams used Inworld to bring AI characters into spatial experiences.
- Streamlabs: Intelligent streaming assistant jointly powered by Streamlabs, NVIDIA ACE, and Inworld generative AI.
- NBCUniversal and other media leaders: Runtime opened to all developers after we built infrastructure to meet their scale and quality bars.
Continuing our character leadership at scale
We pioneered character‑first, real‑time interaction years before today’s wave. That DNA is alive and well, and now it is backed by an infrastructure layer that gives developers more control and a better fit for modern production: Runtime for orchestration and TTS for voice that performs under pressure. If you knew us for our previous character stack, you will find this generation faster to ship, safer to iterate, and easier to scale.
Learn more about how to create characters with Runtime.
How do I get started with Inworld Runtime?
- Explore the Runtime Overview for graphs, experimentation, and observability
- Try our Templates for Node.js CLI, Voice Agent, Language Learning, and Companion apps
- Test TTS capabilities in our TTS Playground
- Check integrations with LiveKit, Pipecat, Vapi, and NLX
- Contact our team if you're scaling voice-first or character-driven experiences
Unreal (Runtime) is available now for early access. Unity is coming soon. If you are scaling a voice‑first or character‑driven experience in games or media, we would love to help you map the pipeline and quality targets that matter for your audience. Start with the Runtime Overview and Templates.
Powering the future of interactive media
We are uniting believable AI characters and worlds with the runtime required to run them at multi‑million‑user scale. Build the worlds you want, with characters that truly come alive and stay alive, at scale.
- Oct 6, 2025
- Date parsed from source: Oct 6, 2025
- First seen by Releasebot: Dec 23, 2025
Inworld CLI - Hosted Endpoint
npm install -g @inworld/cli
Quickstart
- 3-Minute Setup: Single command installation, browser-based login, and instant API key generation.
- Local Development: Test your graphs instantly with inworld serve.
- Instant Deployment: Deploy to the cloud with inworld deploy - no hosting, scaling, or infrastructure management required.