Mistral Release Notes
89 release notes curated from 34 sources by the Releasebot Team. Last updated: Jun 5, 2026
Mistral Products
- Jun 4, 2026
- Date parsed from source:Jun 4, 2026
- First seen by Releasebot:Jun 5, 2026
v1.11.3: Fix continue_final_message, add reasoning format to to_openai
Mistral Common ships 1.11.3 with expanded reasoning format support for OpenAI conversions, preserved zero seed handling, and fixes for tokenizer guidance and tekken normalizers. The release also includes dependency and pre-commit updates.
What's Changed
- Raise multiple format of reasoning for from_openai by @juliendenize in #224
- Preserve zero OpenAI seed in chat request conversion by @pragnyanramtha in #226
- Pin uv required-version and bump pre-commit hook by @juliendenize in #228
- Add to_openai reasoning format for AssistantMessage by @juliendenize in #223
- fix(tokenizer): point users at from_hf_hub on unknown model (#229) by @NishchayMahor in #231
- Fix: forward continue_final_message in tekken normalizers (V7/V15) by @matdou in #233
- Version 1.11.3 by @juliendenize in #239
New Contributors
- @pragnyanramtha made their first contribution in #226
- @NishchayMahor made their first contribution in #231
- @matdou made their first contribution in #233
Full Changelog: v1.11.2...v1.11.3
Original source - Jun 3, 2026
- Date parsed from source:Jun 3, 2026
- First seen by Releasebot:Jun 4, 2026
v1.11.2: Improve from_openai method.
Mistral Common improves from_openai methods and adds tests and docstrings.
What's Changed
- Add test and docstring to get_validator by @juliendenize in #219
- Improve from openai methods by @juliendenize in #221
- Version 1.11.2 by @juliendenize in #222
Full Changelog: v1.11.1...v1.11.2
Original source All of your release notes in one feed
Join Releasebot and get updates from Mistral and hundreds of other software products.
- May 28, 2026
- Date parsed from source:May 28, 2026
- First seen by Releasebot:May 28, 2026
Vibe gets to work.
Mistral launches Vibe as one AI agent for work and code, with Work Mode for long-running tasks, Code Mode for remote coding and pull requests, plus a new VS Code extension and CLI updates for deeper project-wide automation.
Highlights
- Le Chat is now Vibe—one agent and one licence across work and code, with every conversation, setting, and plan carried over.
- Work Mode, available on web and mobile, is a powerful agent for long-range tasks that picks the right tools, streams progress, and completes complex work to finish.
- Code Mode launches remote coding agents from a dedicated web surface.
- A new Mistral Vibe extension for VS Code; the coding agent working across your whole project, inside your IDE.
Vibe for work
In Work Mode, Vibe is your AI agent for complex, multi-stage tasks, fluent in your knowledge, apps, and tools. It maps out a plan and gets your sign-off before it starts, then works across your connectors to carry the task through.
For one person, this turns a morning of admin into a single prompt—catch up on what you missed, pull the numbers, draft the update, and have it ready to send on. Whereas, for the organisation, the same agent runs the processes that keep a business moving, grounded in the documents, mailboxes, and systems your teams already use, and governed by the permissions you set at the admin-level.
Here is what Vibe’s Work Mode does today.
- Enterprise knowledge search: Vibe deeply grounds its work in your context, reaching across Google Workspace, Outlook, SharePoint, Slack, GitHub, and any custom connectors or libraries.
- Structured data analysis: connect a database or upload a spreadsheet, and Vibe surfaces the patterns, anomalies, and signals you asked about, rendering charts and dashboards inside the conversation.
- Document and report synthesis: Vibe drafts the deliverable using the Canvas tool, from a one-page brief to a report, an RFP response, or a board deck, ready for you to edit and push to Notion, SharePoint, or your inbox.
- Multi-step task scheduling: set a prompt to run once or on a daily, weekly, or monthly cadence, and a notification lands when a run finishes.
- Reusable skills: extend Vibe with preset or custom skills via open standards to automate your repeatable workflows with consistency and sequential precision.
Every step is visible as it happens, and each tool call and reasoning chain is expandable to its inputs and outputs.
Vibe for code
Code Mode is the new coding surface in the Vibe web app, where you can connect to GitHub, manage your projects, start sessions, and see them through to a pull request. Inside a session, the agent runs in an isolated sandbox, and you manage sensitive actions and inspect diffs as it’s writing code.
Sessions can run in parallel, can persist while your machine is off, and can be triggered from third-party apps, such as Slack (coming in June)—in addition to your code editor or the Vibe CLI.
The new Mistral Vibe extension for VS Code
We’re also releasing a new plugin that brings the Vibe coding agent into VS Code. Vibe works across your whole project in a side panel that reads, edits, and executes commands beside your files. Open files attach automatically, selections can be line-ranged and @ mentions will pull in more context from other directories or files.
The extension runs on the same harness as the CLI, and features are replicated across both surfaces:
- write unit and integration tests that match your existing patterns, and document what it ships, from the README down to inline comments;
- refactor and translate, moving a module to a new pattern, or a legacy file to a more modern language, with behaviour preserved and tests completed;
- connect to your entire stack, pulling context from GitHub, GitLab, Jira, or Linear, so a change arrives with the issue it resolves and the conventions your team follows.
Updates to Vibe CLI
The Vibe CLI is getting several important updates, including more session controls.
- Skills turn repeatable workflows into / commands.
- Custom modes and subagents route specialised work within a session.
- The agent's plan is editable before it runs, and it can ask multiple choice questions mid-run.
- Permissions are session-scoped (always, never, or ask), including overrides for files, commands, and directories.
- /teleport moves a live session between your terminal and the cloud, keeping history and approvals intact.
Get started with Vibe
Vibe is live at chat.mistral.ai. Download the mobile app from the App Store and Google Play. If you were on Le Chat, your plan, history, and settings are already there within the Chat mode. *
- Free: quick answers and simple everyday tasks.
- Pro, $14.99/month: complex tasks, deeper reasoning, and all-day coding.
- Team, $24.99/user/month: a shared workspace with admin controls and more storage.
- Enterprise: custom deployments, model training, and dedicated solutioning.
See Vibe’s plans to learn more.
For code, Vibe is on the web at code.mistral.ai, in VS Code through the new extension, and in your terminal with the CLI:
curl -LsSf https://mistral.ai/vibe/install.sh | bash uv tool install mistral-vibeYou can also build through the API in Mistral Studio, with the free Experiment plan for testing and prototyping.
Original source - May 28, 2026
- Date parsed from source:May 28, 2026
- First seen by Releasebot:May 28, 2026
Introducing Search Toolkit
Mistral releases Search Toolkit in public preview, a composable open source framework for building production search pipelines for AI apps. It unifies ingestion, retrieval, and evaluation with built-in hybrid search and metrics to help teams improve search quality.
Search Toolkit in public preview
Today, we're releasing Search Toolkit in public preview. Search Toolkit is a composable framework for building production search pipelines for AI applications. We built it because teams building search infrastructure still spend too much engineering time on plumbing. Most stitch together separate tools for ingestion, retrieval, and evaluation, each with its own interface and its own assumptions about data. Search Toolkit brings all three into a single framework with a shared interface, so teams spend their time improving search quality instead of maintaining integrations. Search Toolkit is open source and runs wherever your infrastructure does. Cloud, on-premises, edge.
Search infrastructure is still harder than it should be.
Most teams building retrieval systems spend more time assembling infrastructure than improving search quality. Ingestion requires one set of tools. Retrieval requires another. Evaluation, if it happens at all, is bolted on with a separate framework and separate assumptions about data shape. Teams report weeks of integration work before they can run a single query against their own data. Measuring whether the retriever is returning the right results often requires yet another toolchain. For organisations building RAG workflows or internal knowledge systems, that overhead multiplies at every layer.
Where it fits.
Enterprise search.
Most organisations don't have a search problem. They have a dozen search problems. Internal wikis, support ticket systems, document repositories, file storage, codebases. Each source has different structure, different metadata, and needs different processing to index well. Teams typically end up building a separate ingestion pipeline for each one, with its own parsing logic, its own chunking strategy, and its own assumptions about what a "document" looks like. The result is a set of isolated indexes that can't be searched together, or a brittle custom layer that tries to unify them and becomes its own maintenance burden. Search Toolkit provides consistent processing and indexing patterns across source types within a single framework, so teams add new sources without rebuilding the pipeline each time.
RAG and retrieval quality.
When a RAG system returns poor results, the first question is whether the problem is retrieval or generation. In practice, most teams have no clean way to answer that. They tweak prompts, adjust chunking strategies, and swap models without knowing whether the retriever is surfacing the right context in the first place. And even teams that do focus on retrieval often lack the tooling to compare strategies rigorously, on their own data, with their own relevance judgments. The alternative is writing custom evaluation scripts for each experiment. Search Toolkit includes built-in evaluation that measures retriever performance independently, so you can isolate retrieval quality from generation quality and compare configurations as your corpus evolves.
Domain-specific retrieval.
Legal filings, medical records, codebases, financial disclosures. Off-the-shelf retrievers are trained on general-purpose text and tend to struggle with specialised terminology, document structures, and relevance criteria that differ from web search. Teams that need domain-tuned retrieval often end up building custom retrieval infrastructure from scratch, which is expensive to maintain and hard to evaluate.
Search in an agentic world
Agents working on enterprise tasks need access to enterprise context. They make retrieval decisions autonomously and at high volume, so the quality of the search infrastructure underneath them directly affects every downstream step. For searching across large document corpora, agents perform semantic search on an index, which gives them precise results at low latency.
Agents also need live data. With Connectors, they pull directly from source systems like CRMs, code repositories, and productivity tools through MCP integrations. An agent can query an indexed corpus when it needs to search across a large body of content, and pull live data from a source system when it needs the latest state. Search Toolkit gives your agents a high-quality indexed search path to call on alongside live retrieval.
What's inside.
Ingestion.
Index and process data from multiple sources with configurable pipelines. Search Toolkit handles document parsing, chunking, and embedding generation. Custom document formats and preprocessing steps plug in through a standard adapter interface.
Retrieval.
Search Toolkit ships with BM25 sparse retrieval, dense embedding-based retrieval, and hybrid configurations that combine both. Each is configurable to your data and use case.
Evaluation.
Measure search quality with built-in metrics: recall, precision, MRR, and NDCG. Run evaluations against your own test sets, compare retriever configurations side by side, and track quality across releases.
All modules share a common configuration interface. Replace your indexer, swap your retriever, add an evaluator. The rest of the pipeline adapts.
Search Toolkit has been designed for advanced use cases for the enterprise, and battle tested across financial services, manufacturing, public sector, and media & entertainment verticals. CMA CGM uses Search Toolkit alongside Voxtral to help journalists detect fake news. The pipeline processes audio from three distinct data sources and returns alerts within 15 seconds end to end.
Watch the demo
Get started.
The fastest way to try Search Toolkit is with our starter app template.
Prerequisites
Install Docker. You also need uv in the generated project.
Scaffold a new project
uvx copier copy gh:mistralai/search-starter-app my-search-project cd my-search-projectRun it
# Start Vespa locally with Docker make setup-vespa # Index sample data make ingest path=sample_data/hello.txt # Run a query make search query="hello world"The template includes:
- Pre-configured Vespa indexing
- Hybrid retrieval (BM25 + vector)
- Sample data and ingestion pipeline
For full details, see the starter app README.
What’s next
Once you’ve tried the starter app, dive deeper:
- Tune your ingestion pipeline – Configure parsers, chunking strategies, embedding models, and extractors for specific file types to handle your data sources.
- Manage Vespa schema & relevance – Optimize indexing and ranking profiles for your use case.
- Build your dream retrieval – Leverage advanced features like LLM query rewriting, reranking, and hybrid retrieval.
For the full reference, see the Search Toolkit documentation.
Original source - May 27, 2026
- Date parsed from source:May 27, 2026
- First seen by Releasebot:May 28, 2026
Introducing physics AI at Mistral: the foundation for engineering acceleration.
Mistral adds Emmi AI to its enterprise platform, bringing physics AI for industrial engineering. The new capability promises faster simulation, broader design exploration and real-time digital twins for manufacturing, aerospace, energy, semiconductors and other engineering workflows.
Engineering ambition has rarely been greater than it is today. Defense readiness, the energy transition, the push towards sustainable aviation, the need to scale AI data centers, and next-generation chips: every one of these developments depends on engineering teams shipping more capable hardware, faster—with thinner margins for error.
And yet physics analysis remains stuck at the front of the product lifecycle, tied to solver methods that haven't fundamentally changed in decades. Engineers still evaluate a handful of variants when they should be exploring thousands. And once a product is in operation, engineers lose the physics insight they had at design time, because the solvers behind it are too slow to keep up with live data.
We believe physics deserves its own frontier AI models. That's why we've brought Emmi AI into Mistral. In this post, we share what physics AI is, why it matters now, and what it makes possible for our partners like ASML, Airbus, Safran, and Siemens Energy.
We are building out a new foundational capability inside Mistral's enterprise solutions for AI-native industrial engineering—alongside our existing models, our tools for building and operationalizing agentic workflows, and the secure deployment and integration enterprises require.
Together they form a single stack spanning the engineering lifecycle: deployed where the customer needs it, integrated with their environment, fully under their control. Find out more about our AI for manufacturing offering.
The limits of traditional simulation: why engineering is inherently slow
When running these “numerical physics simulations,” engineers use computers to predict how physical systems behave by solving partial differential equations. They are the language of physics: they describe how fluids flow, how structures deform, how heat moves. Rather than building and testing every prototype in the real world, engineers solve the governing physics equations on a computer by dividing an object into millions of tiny pieces and calculating what happens at each one.
A typical CFD or FEM workload looks much the same in 2026 as it did in 2006: prepare CAD geometry, discretize it into a mesh, configure boundary conditions, queue the run on an HPC cluster, wait. The result is a workflow that is slow, taking hours to weeks of compute time per design variant, and expensive: HPC capacity, solver licenses and specialist expertise gate the number of simulations that are being run. True design-space exploration is mathematically possible but economically impossible at this cost and tempo.
The consequence is structural. Engineers iterate on a handful of designs when they should be exploring thousands. Many teams settle for "good enough" because "optimal" is unaffordable in compute and calendar time. Every downstream constraint—manufacturability, certification, cost—compounds in terms of time and cost.
What is physics AI?
Data-driven physics AI is a class of AI models that learn from physics solver outputs and predict physical behavior directly from geometry and boundary conditions, or even measurement data. It maps inputs to full physical fields in a single forward pass, on the order of seconds, on a single GPU.
A few clarifications about what physics AI is not:
- It is not a replacement for first-principles solvers in every regime. It is a step-change in throughput for the vast majority of design-loop iterations, with traditional solvers reserved for verification and edge cases.
- It is not an LLM trained on simulation data. The architectures, training objectives and evaluation regimes are fundamentally different.
- It is not a regression on a single geometry. The point of physics AI is geometric and parametric generalization – one model serving an entire design family, not one model per part.
Now that model architectures allow for industrial scale (see e.g. AB-UPT) and GPUs have become powerful and accessible enough to train and serve physics workloads at production economics, it is the right point to double down from a research and solutions perspective.
What physics AI unlocks
Once inference moves from hours to seconds, both the engineering and operation of products reorganize around what's suddenly possible.
Accelerated product design
This is about the hardware itself—the car body, the wing, the chip package, the motor.
What becomes possible:
- Thousands of design variants explored in the time a single simulation used to take
- AI models that propose design candidates, not just evaluate them
- Simulation earlier in the process and usable beyond specialists
What it delivers:
- Better-performing products at the same development cost
- Shorter time from concept to validated design
- Fewer expensive surprises late in development
Accelerated tooling and process design
This is about how the product is made—the molds, dies, fixtures, and process settings that turn a design into a manufactured part. Tooling geometry, materials, and process parameters together determine quality, cost, and yield.
What becomes possible:
- Thousands of tooling variants explored in the time a single simulation used to take
- Tooling geometry and process parameters optimized together, not in sequence
- Manufacturing defects predicted before any tool is cut
What it delivers:
- Faster tool development
- Higher yield and fewer scrapped parts
- Shorter ramp-up to stable production
Real-time digital twins
A digital twin is a virtual model of a physical asset—a turbine, a power grid, a battery, a chemical reactor—that mirrors its behavior.
What becomes possible:
- Continuous physics predictions on live sensor data
- Models that update in real time as the asset operates
- What-if scenarios on running assets, without taking them offline
What it delivers:
- Predictive maintenance before failures occur
- Higher operational efficiency across the asset’s lifetime
- Extended asset life and deferred capex on replacement
Where physics AI applies
Physics AI is a horizontal capability with vertical impact. This is a non-exhaustive map of where it is creating immediate value:
Aerospace: external aerodynamics, structural analysis, thermal management, propulsion, aeroelasticity.
Automotive: vehicle aerodynamics, crashworthiness, battery thermal management, motor design.
Electronics & semiconductors: chip and package thermal analysis, signal and power integrity, data-center and rack cooling, lithography optics.
Energy & utilities: wind and gas turbine design, grid-equipment optimizations, reactor thermal-hydraulics, subsurface flow and reservoir simulation.
Industrial equipment: heat exchangers, pumps and compressors, electric motors, tooling design.The same model class, retrained or fine-tuned on the relevant physics, transfers across these domains.
Part of an enterprise platform for the AI-native industrial engineering lifecycle
We believe that physics AI is most valuable when it composes with the rest of an engineering organization's AI stack. That is why we ship it as one capability inside Mistral's enterprise platform, alongside:
- Language and multimodal reasoning models
- Model training and customization pipelines
- AI workflow design, orchestration and monitoring tools
- Unified AI productivity and coding agent
- Private AI infrastructure
- And expert services to accelerate your AI-native transformation
We’re building the first fully integrated AI stack that rethinks traditional engineering workflows end-to-end: Engineers define the intent and verify outcomes - the stack executes in between. The result: manufacturers explore orders of magnitude more design candidates, build the next generation of products faster, and maintain continuous performance gains across operational assets at scale.
Get started
If you're building the next generation of aircraft, vehicles, energy systems or electronics—and you're tired of waiting on the solver—we'd like to hear from you.
We also opened new roles to build out our AI 4 Engineering team. Apply here!
Original source - May 22, 2026
- Date parsed from source:May 22, 2026
- First seen by Releasebot:May 28, 2026
Connect the dots: Build with built-in and custom MCPs in Studio
Mistral releases Connectors in Studio, bringing built-in and custom MCP connectors to API and SDK workflows for enterprise AI apps. The update adds direct tool calling and human-in-the-loop approvals, plus reusable connector management across conversations, agents, and workflows.
Today we are releasing Connectors in Studio to unblock developers building highly customised AI applications grounded in enterprise data.
All built-in connectors, as well as custom MCPs, are now available via API/SDK to be used with all model and agent calls.
We are also introducing direct tool calling, giving developers precise control over how and when tools are invoked, without authentication barriers getting in the way of testing and iterating. In addition, you can now implement human-in-the-loop approval flows, allowing secure review and confirmation before tool execution, ensuring both flexibility and governance.
- Programmatic access for creating, modifying, listing and deleting your connectors but also listing their tools and directly running them.
- All connectors are centrally registered making them available across Mistral apps: LeChat and AI Studio (with Vibe coming soon).
- Usage via Conversation API, Completions API, and Agent SDK can now facilitate complex workflows and integration with enterprise systems like CRMs, knowledgebases & productivity tools.
Integrations that live in the platform, not in your code
Building enterprise AI agents is getting easier. The harder part is everything around them: tracking down the right API docs, writing and maintaining tool functions, building integrations, setting up OAuth, handling token refresh, and debugging edge cases like broken pagination.
Because of this, teams keep rebuilding the same integration layer. Even within the same company, similar integrations are often implemented multiple times in arbitrary code, leading to security risks, lack of traffic observability, and duplication of work.
A connector solves this by packaging an integration into a single, reusable entity using the MCP protocol.
Once registered, the custom MCP connector is discoverable, governed & monitored in Studio and becomes a native tool for any conversation, agent, or workflow without rewriting integration logic, without re-implementing auth, without duplicating it across teams. Set up once, run it all the time, everywhere. Attaching a connector to any conversation takes one line:
A runnable golden path
Let’s build an agent for a multi-step workflow based on reasoning across sources given agent’s secure connectivity to GitHub, public repo content & docs, and live data from the web. The agent can understand intent, analyse code, and propose changes alongside other common use cases like generating tests, refactoring, identifying inefficiencies, bugs or vulnerabilities.
Prerequisites
pip install mistralai export MISTRAL_API_KEY="your-api-key" client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])1 - Create a connector for a public remote MCP
To query and explore code bases, we will leverage the DeepWiki remote server which provides an MCP interface to API/tool endpoint. This way the agent can explore the content and documentation without scraping docs manually or loading whole repos.
Registering the MCP server once allows users to reuse it across conversations, agents, or direct tool calls. This is the entry point for any custom MCP flow. For a comprehensive example of how to manage built-in and custom connectors see cookbook: Connectors Management.
2 - Create agent
The agent should also be able to connect to GitHub and the web; users don’t need to create those connectors as they are already built into Mistral.
Note that a connector can expose dozens of tools. If users want to exclude potentially damaging actions, tool_configuration controls the tool availability without modifying the connector itself. More details can be found in Cookbook: Using Connectors in Conversations.
Direct tool calling
Not every workflow needs the model to decide when and how tools are invoked. For a more deterministic experience, users can now call connectors directly.
This is especially useful for debugging and pipeline-style automation which limits ambiguity. For the full pattern, see cookbook: Connector tool calling.
When a human needs to be in the loop
Some actions should not execute without explicit approval. requires_confirmation pauses execution and hands control back to your application before the tool runs:
The model proposes, the user application decides whether to proceed. The boundary between AI judgment and human judgment is explicit and written in code. For the full approval flow, including the pending tool call and resume step, see cookbook: Human-in-the-loop Confirmation.
Start building
You can now use Connectors in Studio, in Public Preview. Start building today by visiting the Studio console: https://console.mistral.ai/build/connectors
- Documentation on the release
- Cookbooks on various common usage patterns
- May 27, 2026
- Date parsed from source:May 27, 2026
- First seen by Releasebot:May 28, 2026
May 27
Mistral launches Vibe, a unified agent with Work, Code, and Chat modes for productivity, coding, and legacy chat workflows.
We launched Vibe, our unified agent at chat.mistral.ai, available in three modes:
OTHER
- Work for productivity on web and mobile, with Skills, Workflows, Connectors, Libraries, and scheduled tasks.
- Code for developers, covering the Vibe CLI, the VS Code extension, and Vibe Code Web for remote coding sessions in a managed cloud sandbox.
- Chat preserves the legacy Le Chat experience for existing workflows.
Documentation restructured: previous /le-chat/* and /mistral-vibe/* paths now redirect to the new /vibe/* tree.
OTHER
Original source - May 2026
- No date parsed from source.
- First seen by Releasebot:May 1, 2026
Mistral Medium 3.5
Mistral releases Medium 3.5 and expands Le Chat with new Work mode and remote Vibe coding agents, bringing cloud-based async coding, multi-step task handling, and stronger agentic workflows to the platform.
Introducing Mistral Medium 3.5, remote coding agents in Vibe, plus new Work mode in Le Chat for complex tasks.
Coding agents have mostly lived on your laptop. Today we're moving them to the cloud, where they run on their own, in parallel, and notify you when they're done. You can start them from the Mistral Vibe CLI or directly in Le Chat, offloading a coding task without leaving the conversation.
Powering this is Mistral Medium 3.5 in public preview, our new default model in Mistral Vibe and Le Chat, built to run for long stretches on coding and productivity work. The new Work mode in Le Chat (Preview) extends this with a powerful agent for complex, multi-step tasks like research, analysis, and cross-tool actions.
Highlights.
- Mistral Medium 3.5, a new flagship model that merges instruction-following, reasoning, and coding into a single 128B dense model. Released as open weights, under a modified MIT license.
- Strong real-world performance at a size that runs self-hosted on as few as four GPUs.
- Mistral Vibe remote agents for async coding: sessions run in the cloud, can be spawned from the CLI or Le Chat, and a local CLI session can be teleported up to the cloud.
- Start Mistral Vibe coding tasks in Le Chat. Sessions run on the same remote runtime and keep going while you step away.
- Work mode in Le Chat runs on a new agent, powered by Mistral Medium 3.5, that works through multi-step tasks, calling tools in parallel until the job is done.
Mistral Medium 3.5.
Mistral Medium 3.5 is our first flagship merged model, available in public preview. It is a dense 128B model with a 256k context window, handling instruction-following, reasoning, and coding in a single set of weights. It performs strongly in real-world use, with self-hosting possible on as few as four GPUs. Reasoning effort is now configurable per request, so the same model can answer a quick chat reply or work through a complex agentic run. We trained the vision encoder from scratch to handle variable image sizes and aspect ratios.
Mistral Medium 3.5 scores 77.6% on SWE-Bench Verified, ahead of Devstral 2 and models like Qwen3.5 397B A17B. It also has strong agentic capabilities and scores 91.4 on τ³-Telecom.
The model was built for long-horizon tasks, calling multiple tools reliably, and producing structured output that downstream code can consume. It is the model that made async cloud agents in Vibe practical to ship.
Mistral Medium 3.5 becomes the default model in Le Chat. It also replaces Devstral 2 in our coding agent, Vibe CLI.
Vibe remote agents.
From today, coding sessions can work through long tasks while you’re away. Many can run in parallel, and you stop being the bottleneck on every step the agent takes.
You can start the cloud agents from the Mistral Vibe CLI or from Le Chat. While they run, you can inspect what the agent is doing, with file diffs, tool calls, progress states, and questions surfaced as you go. Ongoing local CLI sessions can be teleported up to the cloud when you want to leave them running, with session history, task state, and approvals carrying across.
Vibe sits between the systems engineering teams already use, with humans in the loop wherever they're needed. It plugs into GitHub for code and pull requests, Linear and Jira for issues, Sentry for incidents, and apps like Slack or Teams for reporting.
Each coding session runs in an isolated sandbox, including broad edits and installs. When the work is done, the agent can open a pull request on GitHub and notify you, so you review the result instead of every keystroke that produced it.
It fits the high-volume, well-defined work that takes a developer's time without taking their judgment: module refactors, test generation, dependency upgrades, CI investigations, as well as bug fixes.
We use Workflows orchestrated in Mistral Studio to bring Mistral Vibe into Le Chat. We originally built this for our own in-house coding environment, then for our enterprise customers. Today the capability opens up to everyone, who can now launch coding tasks from the web. And without being tied to a local terminal, a developer can run several in parallel.
You can start coding sessions directly in Le Chat, so a task described in chat runs on the same remote runtime as the CLI and the web, and comes back later as a finished branch or a draft PR.
New Work mode in Le Chat (Preview).
Work mode is a powerful new agentic mode for complex tasks in Le Chat, powered by a new harness and Mistral Medium 3.5. The agent becomes the execution backend for the assistant itself, so Le Chat can read and write, use several tools at once, and work through multi-step projects until it completes what you’ve asked.
Here’s what Work mode enables you to do today.
- Cross-tool workflows: catch up across email, messages, and calendar in a single run; prepare for a meeting with attendee context, latest news, and talking points pulled from your sources.
- Research and synthesis: dive into a topic across the web, internal docs, and connected tools, then produce a structured brief or report you can edit before exporting or sending.
- Triage your inbox and draft replies; create issues in Jira from your team and customer discussions; send a summary to your team on Slack.
Sessions persist longer than a typical chat reply, so an agent can keep going across many turns, through trial-and-error, and through to completion. In Work mode, connectors are on by default rather than chosen manually, which lets the agent reach into documents, mailboxes, calendars, and other systems for the rich context it needs to take correct action.
Every action the agent takes is visible: you see each tool call and the thinking rationale. Le Chat will ask for explicit approval—based on your permissions—before proceeding with sensitive tasks like sending a message, writing a document, or modifying data.
Get started.
Mistral Medium 3.5 is available today in Mistral Vibe and Le Chat, and powers remote coding agents and Work mode in Le Chat on the Pro, Team, and Enterprise plans.
Through API, it’s priced at $1.5 per million input tokens and $7.5 per million output tokens. Open weights are on Hugging Face under a modified MIT license.
It is also available for prototyping, hosted on NVIDIA GPU-accelerated endpoints on build.nvidia.com and as a scalable containerized inference microservice, NVIDIA NIM.
Build the future of agentic systems with us.
We're hiring across research, engineering, and product to push agentic systems further. See our open roles.
Original source - Apr 27, 2026
- Date parsed from source:Apr 27, 2026
- First seen by Releasebot:May 1, 2026
April 27
Mistral releases Mistral Medium 3.5, a frontier multimodal model for agentic and coding use cases with adjustable reasoning.
We released Mistral Medium 3.5 (mistral-medium-3.5), our frontier-class multimodal model optimized for agentic and coding use cases, with adjustable reasoning via the reasoning_effort parameter. Released as open weights under a Modified MIT license.
MODEL RELEASED
Original source - April 2026
- No date parsed from source.
- First seen by Releasebot:Apr 29, 2026
Workflows for work that runs the business
Mistral releases Workflows in public preview, bringing durable, observable AI orchestration to Studio and Le Chat. It helps teams run production processes in Python with human-in-the-loop approvals, traceable execution, and enterprise deployment flexibility across cloud, on-prem, or hybrid environments.
Workflows is now in public preview
Today, we're releasing Workflows in public preview. Workflows is the orchestration layer for enterprise AI. It brings the durability, observability, and fault tolerance required to move AI-powered processes from proof of concept to production reliably. Organizations like ASML, ABANCA, CMA-CGM, France Travail, La Banque Postale, Moeve, and many more are already running Workflows to automate critical processes.
Enterprise teams today have access to capable models. What they lack is a way to run them reliably in production. We see this across every industry we work with. The failure modes are consistent: pipelines that run in a notebook but fail silently in production with no trace, long-running processes that can't survive a network timeout, multi-step operations that need human approval mid-execution but have no mechanism to pause and resume, and systems that offer no way to verify they're still doing what they're supposed to after deployment.
Building all of the capabilities to address these challenges is months of complex work for enterprises: the orchestration layer has to be stitched together from scratch, and the components it connects, inference, agents, connectors, observability, each come from different tools with their own interfaces and formats.
Workflows is part of Studio, so the orchestration layer and the components it orchestrates are built to work together. Once a business process is identified, developers write the workflow in Python. Every workflow can then be published to Le Chat so anyone in the organisation can trigger it. Every step is tracked and auditable in Studio. By bringing all of this together, Workflows lets your organisation go from identifying a use case to running it in production in days.
Workflows deployed in the real world
As mentioned, Mistral AI customers are already using Workflows to automate business processes and run them in production. The examples below show how durability, observability, and human-in-the-loop approvals work in practice.
Cargo release automation
Global shipping runs on paperwork. A single cargo release can involve customs declarations, dangerous goods classifications, safety inspections, and regulatory checks across multiple jurisdictions. A missed step can result in cargo delays at port and potential compliance breaches.
The operational requirements for a use case like this are: the system must survive intermittent timeouts, pause mid-execution for human review, and produce a precise account of where and why when something fails.
Using Workflows, a customer is able to automate this end to end. The workflow validates every incoming shipping document against customs rules, checks for anomalies, flags anything that needs human sign-off, waits for approval, then releases the cargo. With Workflows, the human approval step is a single line of code: wait_for_input(). The workflow pauses, waits for as long as it takes with no compute consumption, notifies the reviewer, and resumes exactly where it left off. Studio records the full execution history.
Document compliance checking
KYC reviews are manual, repetitive, and time-consuming. A single customer onboarding can require extracting identity documents, verifying them against sanctions lists and PEP databases, cross-referencing regulatory requirements across jurisdictions, and producing a structured risk assessment with supporting evidence. Done manually, this takes hours of analyst time per case.
The operational requirements here are speed and auditability. A system to automate a process like this should be fast and should document the steps and reasoning behind them for meeting regulatory requirements.
With Workflows, the entire review process only takes minutes and Studio surfaces every step as a structured timeline you can drill into at any level of detail, down to specific traces with native support for OpenTelemetry.
Customer support triage
Support teams deal with volume. Refund requests, technical issues, billing disputes, account escalations. Routing them to the right team quickly and consistently is what determines resolution time.
The operational requirement here is correctability. Automated routing will get things wrong. When it does, the team needs to see why a ticket was routed the way it was, and fix it without retraining the model.
With Workflows, incoming tickets are analysed, categorised by intent and urgency, and routed to the right downstream process automatically. Each routing decision is visible and traceable in Studio. When the categorisation is wrong, the team corrects it at the workflow level.
Why Workflows
- Durable execution. Workflows track state at every step. If a process fails, it resumes where it left off. As a result, developers can focus more on writing business logic instead of recovery logic.
- Observability. Every branch, retry, and state change is recorded in Studio. If a decision needs to be investigated months later, the full timeline is there to show how it was reached.
- Human-in-the-loop. A single line of code pauses a workflow for approval. The reviewer responds from Le Chat, a webhook, or any connected surface, and the workflow picks up where it stopped.
- Native to Studio. Workflows use the same agents and connectors as the rest of Studio. There's no separate integration work to wire them in.
- Enterprise readiness. Workspaces within Studio keep teams and projects separated, and role-based access control (RBAC) makes sure those rules are enforced consistently.
- Built for developers and business teams. Engineers write workflows as code. Business teams run them from Le Chat.
- Deployment flexibility. The control plane runs on Mistral. Workers and data processing run in your environment, right where your critical services are hosted: cloud, on-prem, or hybrid.
Under the hood
Workflows is built on Temporal's durable execution engine, the same infrastructure that powers orchestration at Netflix, Stripe, and Salesforce. We extended it for AI-specific workloads by adding streaming, payload handling, multi-tenancy, and observability that the core engine does not provide out of the box.
The deployment model is split between Mistral and your environment, and separates the control plane from the data plane. Mistral hosts the orchestration infrastructure: Temporal cluster, the Workflows API, and Studio. You deploy workers on your own Kubernetes environment using a separate Helm chart, and they connect back to the central cluster via secure credentials. Your data and business logic stay within your perimeter.
The Mistral SDK handles retry policies, tracing, timeouts, rate limiting, and human-in-the-loop through decorators and single-line configuration, so the only thing you write is the business logic itself.
Get started
The Python SDK is how developers write and run workflows. v3.0 is now publicly available and installable with a single command:
Install Workflows
uv add mistralai-workflowsTry Workflows in Studio From scratch or using our demo templates.
Read the docs
Build your first Workflow in Studio
Talk to our team
Original source - Apr 29, 2026
- Date parsed from source:Apr 29, 2026
- First seen by Releasebot:Apr 29, 2026
v1.11.1: Patch for agentic use
Mistral Common patches user-after-tool handling and relaxes from_openai for smoother framework integrations.
What's Changed
This Patch allows usage of user message after tool message. It also makes from_openai less strict to make mistral-common integrations in other frameworks smoother.
- Fix docs by @juliendenize in #216
- Allow user message after tool by @juliendenize in #218
- Make from_openai methods lenient by silently dropping unsupported fields by @juliendenize in #217
- Version 1.11.1 by @juliendenize in #220
Full Changelog: v1.11.0...v1.11.1
Original source - Apr 29, 2026
- Date parsed from source:Apr 29, 2026
- First seen by Releasebot:Apr 29, 2026
v1.11.0: Mistral Guidance
Mistral Common adds Mistral Guidance for valid reasoning traces and better tool choice in 1.11.0.
What's Changed
Mistral Guidance is out !
Make use of lark grammar to guide your model in generating valid reasoning traces with or without tool calls !
- Improve tool choice by @juliendenize in #204
- Add Mistral guidance by @juliendenize in #202
- Simplify AGENTS.md by @juliendenize in #201
- Add version_num property by @juliendenize in #203
- Update version to 1.11.0 by @juliendenize in #206
Full Changelog: v1.10.0...v1.11.0
Original source - May 4, 2026
- Date parsed from source:May 4, 2026
- First seen by Releasebot:Apr 1, 2026
- Modified by Releasebot:May 5, 2026
v1.11.2
Mistral Common adds tag v1.11.2 for a public PyPI release.
Adds tag v1.11.2 for public PyPI release
Original source - March 2026
- No date parsed from source.
- First seen by Releasebot:Mar 27, 2026
Speaking of Voxtral
Mistral releases Voxtral TTS, its first multilingual text-to-speech model for natural, emotionally expressive voice generation. It supports 9 languages, low-latency streaming, custom voices, and testing in Mistral Studio, with API access now available.
Today we’re releasing Voxtral TTS, our first text-to-speech model with state-of-the-art performance in multilingual voice generation. The model is lightweight at 4B parameters, making Voxtral-powered agents natural, reliable, and cost-effective at scale.
Highlights
- Realistic, emotionally expressive speech in 9 popular languages with support for diverse dialects.
- Very low latency for time-to-first-audio.
- Easily adaptable to new voices.
- Available to test out in Mistral Studio.
- Enterprise-grade text-to-speech, powering critical voice agent workflows.
A natural voice generation hinges on the model’s ability to not only recite but interpret a text accurately. Contextual understanding - like neutral, happy, sarcastic, etc. - determines whether the listener considers the generation accurate or robotic. Our model excels at both contextual understanding and speaker modeling: capturing how a specific person naturally speaks. Our voice adaptation goes beyond traditional read-speech by capturing a speaker’s personality, including their natural pauses, rhythm, intonation, and emotional dexterity. With its compact size, low cost and latency, and easy adaptability, Voxtral TTS gives full control and customization for enterprises looking to own their voice AI stack.
Audio is the new UX. Create new interactions for collaboration and understanding only found in speech. Begin now in AI Studio with our Mistral Voices in American, British, and French dialects.
Listen and decide: can you tell the difference?
Our team speaks dozens of languages in multiple dialects, we understand the importance of cultural nuance and built a model that is a reflection of us. Speech generation builds trust via natural-like rhythm, emotion, and even the use of humor. That’s why with voice emulation, we focused on authenticity and emotional expressiveness.
State-of-the-art performance
Automated metrics such as word-error-rate and audio quality scores for multilingual text-to-speech systems are unable to measure naturalness of speech. What makes speech natural is extremely nuanced and requires a deep understanding of cultural differences and typical speaking patterns. Hence, comparative human evaluations performed by native speakers are crucial.
For voice agents, latency and quality are in constant tension. Human evaluations show that Voxtral TTS achieves superior naturalness compared to ElevenLabs Flash v2.5 while maintaining similar Time-to-First-Audio (TTFA). Voxtral also performs at parity with the quality of ElevenLabs v3, successfully supporting emotion-steering for more lifelike interactions.
We conducted a comparative human evaluation of Voxtral TTS and ElevenLabs v2.5 Flash in a zero-shot custom voice context. Using two recognizable voices in their native dialects for each of the 9 supported languages, 3 annotators performed a side-by-side preference test per pair on naturalness, accent adherence, and acoustic similarity to the original reference. Voxtral TTS widens the quality gap to v2.5 Flash in this zero-shot multilingual custom voice setting, highlighting the instant customizability of Voxtral TTS to any voice.
Spoken natively
Trained on a large speech dataset, Voxtral TTS is built for global application. It supports state-of-the-art performance in 9 languages: English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic.
The model was trained to adapt to a custom voice with a reference as little as 3s and capture not just the voice but also nuances like subtle accent, inflections, intonations and even disfluencies similar to those expressed in the reference. We offer some preset voice options in the API but it is simple to extend to your in-house voice library customizing it to the use-case, localize it to the language and accent, keep it neutral or more emotive, casual or formal, more natural and conversational or robotic.
The model also demonstrates zero-shot cross-lingual voice adaptation even though it’s not explicitly trained for it. For example, the model can generate English speech with a French voice prompt and English text. The resulting speech sounds natural while adopting the accent of the provided voice prompt (in this example, the generated speech has a natural French-accented English). This makes the model useful for building cascaded speech-to-speech translation systems.
Built for low-latency streaming
Latency is critical for voice agent applications. Voxtral TTS achieves a model latency of 70ms for a typical input voice sample of 10 seconds and 500 characters, with a real-time factor (RTF) of ≈9.7x. The model natively generates up to two minutes of audio, and our API handles arbitrarily long generations with smart interleaving.
Voxtral TTS architecture
The model is a transformer-based, autoregressive, flow-matching model, built on Ministral 3B. It consists of the following components:
- 3.4B parameters transformer decoder backbone
- 390M flow-matching acoustic transformer
- 300M neural audio codec (symmetric encoder-decoder)
The model takes a voice prompt (5 to 25 seconds) and a text prompt in 9 supported languages. For each audio frame, the transformer backbone predicts a semantic token, then the flow-matching transformer runs 16 function evaluations (NFEs) to produce the acoustic latent.
We developed an in-house codec, which processes audio causally using a semantic VQ (8192 vocabulary) and an acoustic FSQ (36 dim and 21 levels) latent and produces them at 12.5Hz frame rate.
Powering enterprise voice workflows
Voxtral TTS closes the loop on audio intelligence, giving enterprise voice pipelines an output layer that passes the human test. It works alongside Voxtral Transcribe for full speech-to-speech, or integrates into any existing speech-to-text and LLM stack, with cross-lingual support.
Workflows
Customer Support
Voice agents that route and resolve queries across channels with natural, brand-appropriate speech. Place Voxtral TTS into existing contact support call systems for automated spoken responses, with output that integrates into existing workflows.
Test-run the model in Mistral Studio
Experiment with Voxtral TTS directly in the Mistral Studio playground. Select one of the Mistral voices or record your own.
Get started with Voxtral TTS
Voxtral TTS is available now via API at $0.016 per 1k characters.
Try it now in Mistral Studio or in Le Chat.
A model with several reference voices is available as open weights on Hugging Face under CC BY NC 4.0 license.
Explore the model’s documentation.
Sign up for our upcoming webinar to learn more!We’re hiring!
We are building the voice layer for AI, and If this is the kind of problem you want to work on, we'd love to hear from you.
Original source - Mar 22, 2026
- Date parsed from source:Mar 22, 2026
- First seen by Releasebot:Mar 27, 2026
March 22
Mistral releases Voxtral TTS with zero-shot voice cloning, multilingual support, and real-time streaming.
We released Voxtral TTS (voxtral-tts-2603), our state-of-the-art text-to-speech model with zero-shot voice cloning, multilingual support, and real-time streaming.
MODEL RELEASED
Original source
Curated by the Releasebot team
Releasebot is an aggregator of official release notes from hundreds of software vendors and thousands of sources.
Our editorial process involves the manual review and audit of release notes procured with the help of automated systems.
Similar to Mistral with recent updates:
- Anthropic release notes614 release notes · Latest Jun 11, 2026
- Perplexity release notes25 release notes · Latest May 29, 2026
- xAI release notes84 release notes · Latest Jun 11, 2026
- OpenAI release notes743 release notes · Latest Jun 11, 2026
- Cursor release notes97 release notes · Latest Jun 11, 2026
- Windsurf release notes48 release notes · Latest Jun 10, 2026