MiniMax Release Notes

Last updated: Jan 16, 2026

January 2026
- No date parsed from source.
- First seen by Releasebot:
  Jan 16, 2026
MiniMax by MiniMax

MiniMax-M2.1: Polyglot programming mastery, precision code refactoring

MiniMax now integrates with the Anthropic API ecosystem, letting developers plug in with an easy SDK setup and shared prompts. With supported models, streaming options, and clear config steps, you can deploy cross‑ecosystem AI quickly.
Install Anthropic SDK

pip install anthropic

Configure Environment Variables

For international users, use https://api.minimax.io/anthropic; for users in China, use https://api.minimaxi.com/anthropic

export ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic
export ANTHROPIC_API_KEY=${YOUR_API_KEY}
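
The same settings can also be passed to the client in code instead of through environment variables. The following is a minimal sketch, assuming the Anthropic Python SDK's `base_url` and `api_key` constructor arguments; the `base_url_for` and `make_client` helpers and the region labels are hypothetical names introduced for illustration.

```python
import os

# Base URLs quoted in the configuration step above.
BASE_URLS = {
    "international": "https://api.minimax.io/anthropic",
    "china": "https://api.minimaxi.com/anthropic",
}


def base_url_for(region: str) -> str:
    """Return the Anthropic-compatible base URL for a region label (hypothetical helper)."""
    try:
        return BASE_URLS[region]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None


def make_client(region: str = "international"):
    """Build an Anthropic SDK client pointed at the MiniMax endpoint."""
    import anthropic  # imported lazily so base_url_for works without the SDK installed

    return anthropic.Anthropic(
        base_url=base_url_for(region),
        api_key=os.environ["ANTHROPIC_API_KEY"],
    )
```

Calling `make_client("china")` would then return a client that targets the China endpoint, provided `ANTHROPIC_API_KEY` is set.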

Call API

Python example:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="MiniMax-M2.1",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Hi, how are you?"
                }
            ]
        }
    ]
)

for block in message.content:
    if block.type == "thinking":
        print(f"Thinking:\n{block.thinking}\n")
    elif block.type == "text":
        print(f"Text:\n{block.text}\n")

Important Note

In multi-turn function call conversations, the complete model response (i.e., the assistant message) must be appended to the conversation history to maintain the continuity of the reasoning chain.

Append the full response.content list to the message history (this includes all content blocks: thinking/text/tool_use).
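
The rule above can be sketched as follows. This is a minimal illustration, not the official implementation: the `get_weather` tool, the helper names, and the simulated content blocks are hypothetical, and plain dictionaries stand in for the objects returned in `response.content`.

```python
def append_assistant_turn(history, content_blocks):
    """Append the complete assistant message, keeping every content block."""
    history.append({"role": "assistant", "content": list(content_blocks)})
    return history


def append_tool_result(history, tool_use_id, result_text):
    """Append the result of a tool call as the next user message."""
    history.append({
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": result_text,
        }],
    })
    return history


# Simulated first-turn response (stands in for response.content).
simulated_content = [
    {"type": "thinking", "thinking": "The user wants the weather; call the tool."},
    {"type": "text", "text": "Let me check the weather."},
    {"type": "tool_use", "id": "call_1", "name": "get_weather", "input": {"city": "Paris"}},
]

history = [{"role": "user", "content": "What is the weather in Paris?"}]
append_assistant_turn(history, simulated_content)  # keeps thinking + text + tool_use
append_tool_result(history, "call_1", "18 degrees and sunny")

print([block["type"] for block in history[1]["content"]])
```

With the real SDK, `response.content` can be passed as `content_blocks` directly, and `history` is then sent as `messages` in the next `client.messages.create` call.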

Supported Models

When using the Anthropic SDK, the following models are supported:

| Model Name | Description |
| --- | --- |
| MiniMax-M2.1 | Powerful Multi-Language Programming Capabilities with Comprehensively Enhanced Programming Experience (output speed approximately 60 tps) |
| MiniMax-M2.1-lightning | Faster and More Agile (output speed approximately 100 tps) |
| MiniMax-M2 | Agentic capabilities, Advanced reasoning |

Note: The Anthropic API compatibility interface currently supports only the MiniMax-M2.1, MiniMax-M2.1-lightning, and MiniMax-M2 models. For other models, please use the standard MiniMax API interface.

Compatibility

Supported Parameters

When using the Anthropic SDK, we support the following input parameters:

| Parameter | Support Status | Description |
| --- | --- | --- |
| model | Fully supported | Supports the MiniMax-M2.1, MiniMax-M2.1-lightning, and MiniMax-M2 models |
| messages | Partial support | Supports text and tool calls, no image/document input |
| max_tokens | Fully supported | Maximum number of tokens to generate |
| stream | Fully supported | Streaming response |
| system | Fully supported | System prompt |
| temperature | Fully supported | Range (0.0, 1.0], controls output randomness, recommended value: 1 |
| tool_choice | Fully supported | Tool selection strategy |
| tools | Fully supported | Tool definitions |
| top_p | Fully supported | Nucleus sampling parameter |
| metadata | Fully supported | Metadata |
| thinking | Fully supported | Reasoning content |
| top_k | Ignored | This parameter will be ignored |
| stop_sequences | Ignored | This parameter will be ignored |
| service_tier | Ignored | This parameter will be ignored |
| mcp_servers | Ignored | This parameter will be ignored |
| context_management | Ignored | This parameter will be ignored |
| container | Ignored | This parameter will be ignored |
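
As an illustration of the parameter list above, a request can be checked on the client side before it is sent. This is a sketch under stated assumptions: `check_params` is a hypothetical helper, not part of the Anthropic SDK or the MiniMax API, and the model and parameter names are copied from the list above.

```python
SUPPORTED_MODELS = {"MiniMax-M2.1", "MiniMax-M2.1-lightning", "MiniMax-M2"}
IGNORED_PARAMS = {
    "top_k", "stop_sequences", "service_tier",
    "mcp_servers", "context_management", "container",
}


def check_params(params: dict) -> list:
    """Validate request parameters and return the names the server will ignore.

    Raises ValueError for an unsupported model or an out-of-range temperature.
    """
    model = params.get("model")
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")

    temperature = params.get("temperature")
    # The documented range is (0.0, 1.0]: 0.0 is excluded, 1.0 is allowed.
    if temperature is not None and not (0.0 < temperature <= 1.0):
        raise ValueError(f"temperature {temperature} is outside (0.0, 1.0]")

    return sorted(name for name in params if name in IGNORED_PARAMS)


ignored = check_params({
    "model": "MiniMax-M2.1",
    "max_tokens": 1000,
    "temperature": 1.0,
    "top_k": 40,
    "stop_sequences": ["END"],
})
print(ignored)
```

Returning the ignored names rather than raising keeps the check non-blocking, which matches the documented behavior that those parameters are silently ignored.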

Messages Field Support

| Field Type | Support Status | Description |
| --- | --- | --- |
| type="text" | Fully supported | Text messages |
| type="tool_use" | Fully supported | Tool calls |
| type="tool_result" | Fully supported | Tool call results |
| type="thinking" | Fully supported | Reasoning content |
| type="image" | Not supported | Image input not supported yet |
| type="document" | Not supported | Document input not supported yet |
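
Because image and document blocks are not accepted, it can help to scan a message list before sending it. The helper below is a hypothetical sketch based only on the field types listed above; it assumes messages are plain dictionaries in the Anthropic message format.

```python
SUPPORTED_BLOCK_TYPES = {"text", "tool_use", "tool_result", "thinking"}


def unsupported_blocks(messages):
    """Return (message_index, block_type) pairs for blocks the endpoint does not accept."""
    problems = []
    for index, message in enumerate(messages):
        content = message.get("content")
        if isinstance(content, str):
            continue  # a plain string is shorthand for a single text block
        for block in content or []:
            block_type = block.get("type")
            if block_type not in SUPPORTED_BLOCK_TYPES:
                problems.append((index, block_type))
    return problems


messages = [
    {"role": "user", "content": "Describe this picture."},
    {"role": "user", "content": [
        {"type": "text", "text": "Here it is."},
        {"type": "image", "source": {"type": "url", "url": "https://example.com/cat.png"}},
    ]},
]
print(unsupported_blocks(messages))
```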

Examples

Streaming Response

Python example:

import anthropic

client = anthropic.Anthropic()

print("Starting stream response...\n")
print("=" * 60)
print("Thinking Process:")
print("=" * 60)

stream = client.messages.create(
    model="MiniMax-M2.1",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [{"type": "text", "text": "Hi, how are you?"}]
        }
    ],
    stream=True,
)

reasoning_buffer = ""
text_buffer = ""

for chunk in stream:
    if chunk.type == "content_block_start":
        if hasattr(chunk, "content_block") and chunk.content_block:
            if chunk.content_block.type == "text":
                print("\n" + "=" * 60)
                print("Response Content:")
                print("=" * 60)
    elif chunk.type == "content_block_delta":
        if hasattr(chunk, "delta") and chunk.delta:
            if chunk.delta.type == "thinking_delta":
                # Stream output thinking process
                new_thinking = chunk.delta.thinking
                if new_thinking:
                    print(new_thinking, end="", flush=True)
                    reasoning_buffer += new_thinking
            elif chunk.delta.type == "text_delta":
                # Stream output text content
                new_text = chunk.delta.text
                if new_text:
                    print(new_text, end="", flush=True)
                    text_buffer += new_text

print("\n")

Important Notes

The Anthropic API compatibility interface currently supports only the MiniMax-M2.1, MiniMax-M2.1-lightning, and MiniMax-M2 models

The temperature parameter range is (0.0, 1.0]; values outside this range will return an error

Some Anthropic parameters (such as top_k, stop_sequences, service_tier, mcp_servers, context_management, container) will be ignored

Image and document type inputs are not currently supported

December 2025
- No date parsed from source.
- First seen by Releasebot:
  Dec 23, 2025
MiniMax by MiniMax

MiniMax-M2.1

MiniMax-M2.1 launches a polyglot text generation API with tool calls, accessible via HTTP or SDKs. It supports ultra-large context windows of up to 204,800 tokens and emphasizes code understanding and interleaved tool use.

The text generation API uses MiniMax M2.1 to generate conversational content and trigger tool calls based on the provided context.

It can be accessed via HTTP requests, the Anthropic SDK (Recommended), or the OpenAI SDK.

Supported Models

| Model Name | Context Window (total input + output per request) |
| --- | --- |
| MiniMax-M2.1, MiniMax-M2.1-lightning | 204,800 |
| MiniMax-M2 | 204,800 |

Please note: The maximum token count refers to the total number of input and output tokens.
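
Since the window covers input and output together, the room left for output shrinks as the prompt grows. The following is a small worked example, assuming the 204,800-token figure listed above; `max_output_budget` is a hypothetical helper, and the input token count is taken as given rather than measured.

```python
CONTEXT_WINDOW = 204_800  # total input + output tokens per request


def max_output_budget(input_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many output tokens still fit after the input is counted."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return max(context_window - input_tokens, 0)


print(max_output_budget(200_000))  # 4800
print(max_output_budget(250_000))  # 0
```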

Recommended Reading

Compatible Anthropic API (Recommended): Use Anthropic SDK with MiniMax models

Compatible OpenAI API: Use OpenAI SDK with MiniMax models

M2.1 for AI Coding Tools: MiniMax-M2.1 excels at code understanding, dialogue, and reasoning.

M2.1 Tool Use & Interleaved Thinking: AI models can call external functions to extend their capabilities.

December 2025
- No date parsed from source.
- First seen by Releasebot:
  Dec 23, 2025
MiniMax by MiniMax

MiniMax-M2.1: Polyglot programming mastery, precision code refactoring

The video generation API creates videos from text or images, with new Hailuo models boosting realism and speed. The release outlines an asynchronous API flow to create, track, and download videos via task and file IDs.

This API supports generating videos from user-provided text or images (including first frame, last frame, or reference images).

Supported Models

MiniMax-Hailuo-2.3: New video generation model, breakthroughs in body movement, facial expressions, physical realism, and prompt adherence.

MiniMax-Hailuo-2.3-Fast: New Image-to-video model, for value and efficiency.

MiniMax-Hailuo-02: Video generation model supporting higher resolution (1080P), longer duration (10s), and stronger adherence to prompts.

API Usage Guide

Video generation is asynchronous and consists of three APIs: Create Video Generation Task, Query Video Generation Task Status, and File Management. The steps are as follows:

1. Use the Create Video Generation Task API (Text to Video, Image to Video, Start / End to Video, Subject Reference to Video) to start a task. On success, it returns a task_id.

2. Use the Query Video Generation Task Status API with the task_id to check progress. When the status is success, a file ID (file_id) is returned.

3. Use the Download the Video File API with the file_id from step 2 to view and download the generated video.
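
The three steps above can be sketched as a polling loop. This is an illustration only: the text here does not give endpoint paths or response shapes, so the HTTP calls are abstracted behind caller-supplied functions, and the `status` and `file_id` keys, the `"fail"` status value, and all helper names are assumptions.

```python
import time


def wait_for_file_id(query_status, task_id, poll_interval=5.0, max_polls=120, sleep=time.sleep):
    """Poll a task until it succeeds and return its file_id.

    query_status is a caller-supplied function that takes a task_id and returns
    a dict such as {"status": "success", "file_id": "..."} (assumed shape).
    """
    for _ in range(max_polls):
        result = query_status(task_id)
        status = result.get("status")
        if status == "success":
            return result["file_id"]
        if status == "fail":  # assumed name for a terminal failure state
            raise RuntimeError(f"task {task_id} failed")
        sleep(poll_interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")


def generate_video(create_task, query_status, download_file, prompt):
    """Run the create -> query -> download flow with injected API functions."""
    task_id = create_task(prompt)                      # step 1: returns a task_id
    file_id = wait_for_file_id(query_status, task_id)  # step 2: poll until file_id is available
    return download_file(file_id)                      # step 3: fetch the generated video


# Demonstration with stand-in functions instead of real HTTP calls.
_responses = iter([
    {"status": "processing"},
    {"status": "success", "file_id": "file-123"},
])
file_id = wait_for_file_id(lambda task_id: next(_responses), "task-1", sleep=lambda seconds: None)
print(file_id)
```

Injecting the three API calls as functions keeps the control flow testable without network access; in real use each one would wrap the corresponding HTTP request.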

Official MCP

Visit the official MCP for more capabilities: https://github.com/MiniMax-AI/MiniMax-MCP
December 2025
- No date parsed from source.
- First seen by Releasebot:
  Dec 23, 2025
MiniMax by MiniMax

MiniMax-M2.1: Polyglot programming mastery, precision code refactoring

New Image Generation service introduces Text-to-Image and Image-to-Image capabilities. Generate images from detailed prompts or from reference images to preserve subject characteristics and maintain visual identity across contexts.

The Image Generation service provides two core capabilities: Text-to-Image and Image-to-Image.

Generate Images from Text

Create images directly from detailed text descriptions (prompts) that specify the desired content.

Generate Images with Reference Images

This feature allows you to supply one or more reference images (including online image URLs) that contain a clear subject. Combined with a text prompt, the service generates a new image that preserves the subject’s key characteristics.
This is particularly useful for scenarios that require consistent visual identity, such as generating images of the same virtual character in different contexts.

December 2025

- No date parsed from source.
- First seen by Releasebot:
  Dec 23, 2025

MiniMax by MiniMax

MiniMax-M2.1: Polyglot programming mastery, precision code refactoring

Music Generation API now lets you generate full songs with vocals from text prompts and lyrics. Define style, mood, tempo, and vocal traits to craft ready-to-use tracks for videos, games, or apps. Aimed at quick, theme-driven music creation.

The Music Generation API

The Music Generation API can create a complete song with vocals based on a text description and lyrics.

Use the prompt parameter to define the music’s style, mood, and scenario, and the lyrics parameter to provide the vocal content.
This feature is ideal for quickly generating unique theme songs for videos, games, or applications.

Example: Text-to-Music Creation

import requests
import os

url = "https://api.minimax.io/v1/music_generation"
api_key = os.environ["MINIMAX_API_KEY"]
headers = {
  "Authorization": f"Bearer {api_key}"
}

payload = {
  "model": "music-2.0",
  "prompt": "This is a contemporary R&B/Pop track with distinct Trap influences, radiating a confident, assertive, and empowered energy. It features a bright, clear, and agile female vocal with a polished and heavily processed modern sound. The singer's rhythmic and confident delivery is defined by the heavy and stylistic use of Auto-Tune, creating its signature character. Extensive backing vocals, including layered harmonies and ad-libs built upon stacked unison vocals, produce a rich and full texture, enhanced by moderate reverb for a spacious feel. Set at a tempo of 80 BPM, the arrangement is driven by a dominant 808 bassline and electronic drums with intricate hi-hat patterns and sharp claps, while atmospheric synth pads and subtle sound effects craft a dynamic backdrop. This track is perfect for clubbing, parties, driving with the windows down, or a workout session, making it an essential addition to any confidence-boosting playlist.",
  "lyrics": "[chorus]\nSummit, i reached the summit\nI'm the peak with the fire, they all want from it\nSpill a bit of my glow, like a comet\nI ain't worried 'bout hills, you just plummet\nSummit, i reached the summit\nObsidian shards 'round my throat, now they run from it\nAin't no wonder why the valleys all run from it\nI'm awake, from the summit\n[verse]\nI know what i hold\nAnd i'm about to erupt, yeah\nA story untold, yeah\nI know you won't interrupt it\nKeep your eyes on the rise, no surprise that i'm bright\nGot one stream for the sea, other stream for the night\nI be flowin', you're erodin'\nSwear you're slowin', i'm explodin'\nPressure's growin', growin', growin'\n[interlude]\nSummit, i reached the summit\nI'm the peak with the fire, they all want from it\nSpill a bit of my glow, like a comet\nI ain't worried 'bout stone\n[verse]\nI ain't worried 'bout nada\nUnless it's new earth, unless it's magma\nUnless it's deep core, a new nirvana\nUnless it's shaping a new savanna\nI wanna feel like i'm mother gaia\nI wanna feel like i'm way up\nRumbling, grumbling 'til the world pay up\nMade another island, no layups\nStay hot every single day i wake up\n[chorus]\nSummit, i reached the summit\nI'm the peak with the fire, they all want from it\nSpill a bit of my glow, like a comet\nI ain't worried 'bout hills, you just plummet\nSummit, i reached the summit\nObsidian shards 'round my throat, now they run from it\nAin't no wonder why the valleys all run from it\nI'm awake, from the summit\n[outro]\nSummit\nRooo-ar",
  "audio_setting": {
    "sample_rate": 44100,
    "bitrate": 256000,
    "format": "mp3"
  }
}

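# Send the request and fail fast on HTTP errors
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
# The audio is read from the response as a hex-encoded string
audio_hex = response.json()["data"]["audio"]

# Decode the hex string and write the raw bytes to an MP3 file
with open("output.mp3", "wb") as f:
    f.write(bytes.fromhex(audio_hex))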


Dec 23, 2025
- Date parsed from source:
  Dec 23, 2025
- First seen by Releasebot:
  Dec 23, 2025
MiniMax by MiniMax

MiniMax M2.1: Significantly Enhanced Multi-Language Programming, Built for Real-World Complex Tasks

MiniMax M2.1 unleashes AI-native development with stronger multi-language coding, improved office task automation, and enhanced mobile Web/App capabilities. It promises faster, cheaper, more capable AI workflows and opens the model to open-source deployment and public tools.
MiniMax M2.1 Release

MiniMax has been continuously transforming itself in a more AI-native way. The core driving forces of this process are models, Agent scaffolding, and organization. Throughout this exploration, we have gained an increasingly deep understanding of these three aspects. Today we are releasing an update to the model component, MiniMax M2.1, hoping to help more enterprises and individuals find more AI-native ways of working (and living) sooner.

In M2, we primarily addressed issues of model cost and model accessibility. In M2.1, we are committed to improving performance in real-world complex tasks: focusing particularly on usability across more programming languages and office scenarios, and achieving the best level in this domain.

Key Highlights of MiniMax M2.1:

Exceptional Multi-Programming Language Capabilities
Many models in the past primarily focused on Python optimization, but real-world systems are often the result of multi-language collaboration.
In M2.1, we have systematically enhanced capabilities in Rust, Java, Golang, C++, Kotlin, Objective-C, TypeScript, JavaScript, and other languages. The overall performance on multi-language tasks has reached industry-leading levels, covering the complete chain from low-level system development to application layer development.

WebDev and AppDev: A Comprehensive Leap in Capability and Aesthetics
Addressing the widely recognized weakness in mobile development across the industry, M2.1 significantly strengthens native Android and iOS development capabilities.
Meanwhile, we have systematically enhanced the model's design comprehension and aesthetic expression in Web and App scenarios, enabling excellent construction of complex interactions, 3D scientific scene simulations, and high-quality visualization, making vibe coding a sustainable and deliverable production practice.

Enhanced Composite Instruction Constraints, Enabling Office Scenarios
As one of the first open-source model series to systematically introduce Interleaved Thinking, M2.1's systematic problem-solving capabilities have been further upgraded. The model not only focuses on code execution correctness but also emphasizes integrated execution of "composite instruction constraints," providing higher usability in real office scenarios.

More Concise and Efficient Responses
Compared to M2, MiniMax-M2.1 delivers more concise model responses and thought chains. In practical programming and interaction experiences, response speed has significantly improved and token consumption has notably decreased, resulting in smoother and more efficient performance in AI Coding and Agent-driven continuous workflows.

Outstanding Agent/Tool Scaffolding Generalization Capabilities
M2.1 demonstrates excellent performance across various programming tools and Agent frameworks. It exhibits consistent and stable results in tools such as Claude Code, Droid (Factory AI), Cline, Kilo Code, Roo Code, and BlackBox, while providing reliable support for Context Management mechanisms including Skill.md, Claude.md/agent.md/cursorrule, and Slash Commands.

High-Quality Dialogue and Writing
M2.1 is no longer just "stronger in coding capabilities." In everyday conversation, technical documentation, and writing scenarios, it also provides more detailed and structured responses.

First Impressions

"We're excited for powerful open-source models like M2.1 that bring frontier performance (and in some cases exceed the frontier) for a wide variety of software development tasks. Developers deserve choice, and M2.1 provides that much needed choice!"

Eno Reyes, Co-Founder, CTO of Factory AI

“MiniMax M2.1 performed exceptionally well across our internal benchmarks, showing strong results in complex instruction following, reranking, and classification, especially within e-commerce tasks. Beyond its general versatility, it has proven to be an excellent model for coding. We are impressed by these results and look forward to a close collaboration with the MiniMax team as we continue to support their latest innovations on the Fireworks platform.”

Benny Chen, Co-founder of Fireworks

“Minimax M2 series has demonstrated powerful code generation capability, and has quickly became one of the most popular model on Cline platform during the past few months. We already see another huge advancement in capability for M2.1 and very excited to continue partner with minimax team to advance AI in coding”

Saoud Rizwan, Founder, CEO of Cline

“We could not be more excited about M2.1! Our users have come to rely on MiniMax for frontier-grade coding assistance at a fraction of the cost, and early testing shows M2.1 excelling at everything from architecture and orchestration to code reviews and deployment. The speed and efficiency are off the charts!”

Scott Breitenother, Co-Founder, CEO of Kilo

"Our users love MiniMax M2 for its strong coding ability and efficiency. The latest M2.1 release builds on that foundation with meaningful improvements in speed and reliability, performing well across a wider range of languages and frameworks. It's a great choice for high-throughput, agentic coding workflows where speed and affordability matter."

Matt Rubens, Co-Founder, CEO of RooCode

“Integrating the MiniMax M2 series into our platform has been a significant win for our users, and M2.1 represents a clear step forward in what a coding-specific model can achieve. We’ve found that M2.1 handles the nuances of complex, multi-step programming tasks with a level of consistency that is rare in this space. By providing high-quality reasoning and context awareness at scale, MiniMax has become a core component of how we help developers solve challenging problems faster. We look forward to seeing how our community continues to leverage these updated capabilities.”

Robert Rizk, Co-Founder, CEO of BlackBox

Benchmarks

MiniMax-M2.1 delivers a significant leap over M2 on core software engineering leaderboards. It shines particularly bright in multilingual scenarios, where it outperforms Claude Sonnet 4.5 and closely approaches Claude Opus 4.5.

We also evaluated MiniMax-M2.1 on SWE-bench Verified across a variety of coding agent frameworks. The results highlight the model's exceptional framework generalization and robust stability.
Furthermore, across specific benchmarks—including test case generation, code performance optimization, code review, and instruction following—MiniMax-M2.1 demonstrates comprehensive improvements over M2. In these specialized domains, it consistently matches or exceeds the performance of Claude Sonnet 4.5.

To evaluate the model's full-stack capability to architect complete, functional applications "from zero to one," we established a novel benchmark: VIBE (Visual & Interactive Benchmark for Execution). This suite encompasses five core subsets: Web, Simulation, Android, iOS, and Backend. Distinguishing itself from traditional benchmarks, VIBE leverages an innovative Agent-as-a-Verifier (AaaV) paradigm to automatically assess the interactive logic and visual aesthetics of generated applications within a real runtime environment.
MiniMax-M2.1 delivers outstanding performance on the VIBE aggregate benchmark, achieving an average score of 88.6—demonstrating robust full-stack development capabilities. It excels particularly in the VIBE-Web (91.5) and VIBE-Android (89.7) subsets.
MiniMax-M2.1 also demonstrates steady improvements over M2 in both long-horizon tool use and comprehensive intelligence metrics.

Showcases

Multilingual Coding

3D Interactive Animation
MiniMax M2.1 built a "3D Dreamy Christmas Tree" based on React Three Fiber and InstancedMesh, successfully rendering over 7,000 instances. It supports gesture interaction and complex particle animation, demonstrating advanced 3D rendering capabilities.
Try it out: https://yuyl27wq92.space.minimax.io/

Avant-Garde Web UI Design
M2.1 generated a minimalist photographer's personal homepage using an asymmetrical layout and a black-white-red contrasting color scheme. By combining immersive imagery with brutalist typography, it achieved a high-impact visual effect.
Try it out: https://m6xkaf07udss.space.minimax.io/

Website - Skincare Brand
M2.1 designed a landing page for a high-end organic skincare brand. Adopting a "Clean & Minimalist" style, it accurately presented the brand's premium identity and international visual appeal.
Try it out: https://2drpfocv00n9.space.minimax.io/

Web 3D Lego Sandbox
M2.1 developed a high-freedom 3D brick building application based on Three.js, implementing precise grid snapping algorithms and collision detection mechanisms. The project perfectly replicates the glossy texture of plastic bricks, supporting multi-angle rotation, drag-and-drop assembly, and instant color switching, providing users with an immersive 3D creative building experience.
Try it out: https://8e6nunemyuzh.space.minimax.io/

Native App Development - Android
M2.1 used Kotlin to develop a native Android gravity sensor simulator. Utilizing the gyroscope for a silky-smooth control experience, it features clever visual easter eggs that elegantly present the "MERRY XMAS MiniMax M2.1" message through natural UI transitions and collision effects.

Native App Development - iOS
M2.1 wrote an interactive iOS Home Screen widget, designing a "Sleeping Santa" click-to-wake mechanism. The logic is complete with native-level animation effects—Santa lives in your widget; tap him ten times to wake him up for a surprise! 🎅🎁

Web Audio Simulation Development
M2.1 developed a 16-step drum machine simulator based on the Web Audio API. It integrates synthesized drum sounds, non-linear rhythm algorithms, and real-time glitch sound effects, providing an avant-garde electronic music experience! (Turn on the sound in the video below to listen!)
Try it out: https://21okxwno2u.space.minimax.io

Rust TUI
M2.1 built a powerful Linux security audit tool with dual CLI + TUI modes using Rust, supporting one-click low-level scanning and intelligent risk rating for critical items such as processes, networks, and SSH.

Python Data Dashboard
M2.1 created a Web3 cryptocurrency price dashboard in the style of The Matrix. It uses Python (backend for real-time price API fetching), HTML (structure), and CSS (Matrix aesthetic: green digital rain on black background, monospaced font, glowing neon green text, terminal-like UI).

C++ Image Rendering
M2.1 utilized C++ and GLSL to implement complex light transport algorithms, accurately rendering the physical refraction of a crystal ball, detailed SDF modeling of a snowman, and shimmering snow effects in a real-time environment.

Java Real-time Danmaku
M2.1 implemented a high-performance real-time Danmaku (bullet chat) system based on Java, with a clean and intuitive user interface and millisecond-level response capabilities.

SVG Generation
M2.1 generated an interactive isometric SVG island map, constructing a detailed miniature world that supports one-click zooming to freely explore four major themed areas.
Try it out: https://08tmc3aada59.space.minimax.io/

Agentic Tool Use
Tool Use Capability: Excel Market Research
M2.1 demonstrated its tool-use capabilities by autonomously invoking Excel and Yahoo Finance to complete an end-to-end task, ranging from market research data cleaning and analysis to chart generation.

Digital Employee
The "Digital Employee" is a key feature of the MiniMax M2.1 model. M2.1 accepts web content presented in text form and controls mouse clicks and keyboard inputs via text-based commands. It can complete end-to-end tasks in daily office scenarios across administration, data science, finance, human resources, and software development. The following demo video is a screen recording of M2.1's behavioral trajectory in the Agent Company Benchmark.

End-to-End Office Automation
Demo 1: Administrative tasks
Task Requirements: Proactively collect employees' equipment requests on communication software, then search for relevant documents on the enterprise's internal server to obtain equipment prices, calculate the total cost and determine whether the department budget is sufficient, and then record equipment changes.

Demo 2: Project management tasks
Task Requirements: Search for blocked or backlogged issues on the project management software, then find relevant employees on the communication software and consult them for solutions, and update the status of the issues based on the employees' feedback.

Demo 3: Software development tasks
Task Requirements: A colleague wants to know which is the most recent Merge Request that modified a certain file. Search for the relevant Merge Request, find its number, and inform the colleague.

How to Use

The MiniMax-M2.1 API is now live on the MiniMax Open Platform: https://platform.minimax.io/docs/guides/text-generation

Our product MiniMax Agent, built on MiniMax-M2.1, is now publicly available: https://agent.minimax.io/

The MiniMax-M2.1 model weights are now open-source, allowing for local deployment and use: https://huggingface.co/MiniMaxAI/MiniMax-M2.1

Local Deployment Guide

Download the model from the Hugging Face repository.
We recommend using the following inference frameworks (listed alphabetically) to serve the model:

SGLang
We recommend using SGLang to serve MiniMax-M2.1. Please refer to our SGLang Deployment Guide.

vLLM
We recommend using vLLM to serve MiniMax-M2.1. Please refer to our vLLM Deployment Guide.

Other Inference Engines

MLX

KTransformers

Inference Parameters

We recommend using the following parameters for best performance:
temperature=1.0, top_p=0.95, top_k=40
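
As a sketch of how these values might be applied, the snippet below builds a request body for a locally served model. It assumes the server exposes an OpenAI-style chat completions endpoint and accepts `top_k` in the request body; the URL, port, served model name, and `build_request` helper are hypothetical, so check the deployment guide of the engine you use.

```python
import json

RECOMMENDED_SAMPLING = {"temperature": 1.0, "top_p": 0.95, "top_k": 40}

# Hypothetical local endpoint; adjust to match your server.
LOCAL_URL = "http://localhost:8000/v1/chat/completions"


def build_request(prompt: str, model: str = "MiniMaxAI/MiniMax-M2.1", **overrides) -> dict:
    """Build a chat request body using the recommended sampling parameters."""
    body = {
        "model": model,  # assumed served model name
        "messages": [{"role": "user", "content": prompt}],
        **RECOMMENDED_SAMPLING,
    }
    body.update(overrides)  # explicit overrides win over the defaults
    return body


body = build_request("Write a binary search in Rust.")
print(json.dumps(body, indent=2))
```

The body can then be posted to `LOCAL_URL` with any HTTP client once the server is running.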

Tool Calling Guide

Please refer to our Tool Calling Guide.

Contact Us

Contact us at [email protected]

Business Cooperation: [email protected]

MiniMax X: https://x.com/MiniMax__AI

MiniMax LinkedIn: https://www.linkedin.com/company/81521159

MiniMax Discord: https://discord.gg/minimax

Oct 31, 2025
- Date parsed from source:
  Oct 31, 2025
- First seen by Releasebot:
  Dec 23, 2025
MiniMax by MiniMax

MiniMax Music 2.0

MiniMax Music 2.0 launches with dynamic vocals, precise instrument control, and professional-grade audio. Create complete songs up to five minutes, perform versatile singing styles and duets, and turn prompts into polished, film-grade soundscapes.

Today, we are officially launching our latest-generation music model—MiniMax Music 2.0. This version represents a true leap forward in the model's understanding and expression of music. It can accurately capture and reproduce everything from the subtle emotions of the human voice to the dynamic tension of musical instruments.

It understands rhythm and emotion, weaving together vocals and instruments to become the ultimate "singing producer."

From now on, expressing yourself through music is no longer a privilege for the few, but a joy accessible to everyone.

Turn your inspiration into flowing melodies. Feel the rhythm, let the music belong to you.

1. Dynamic Vocals with Mastery Over Diverse Singing Styles

You don't need professional vocal training to sing the melody in your heart with the voice, technique, and style you desire.

In terms of vocal texture, Music 2.0 produces a timbre that is incredibly close to the real human voice. The model performs like a seasoned "vocal powerhouse," capable of mastering a wide range of singing techniques and emotional styles. Its nuanced handling of phrasing, rhythm, and breath demonstrates a "musical intuition" comparable to a professional singer.

The model supports precise control over vocal timbre. Using prompts, you can maintain a core vocal identity while switching between different singing styles, allowing one voice to have a thousand variations. The AI can transform into a "versatile vocal artist."

The same female voice can switch effortlessly between Jump Blues, Rock, and Electronic styles.

Beyond popular genres like Pop, Jazz, Blues, Rock, and Folk, the model also supports male-female duets, a cappella, and more.

Achieve a dynamic duet with a conversational feel and varied intensity through seamless transitions between male and female lead vocals.

Create rich melodies even without instrumental accompaniment.

2. Catchy Melodies and Precise Instrument Control

You don't have to be a music arranger to compose your own complete musical piece.

Building on the strengths of its predecessor, Music 2.0 generates structurally complete songs with clear logic, including verses, choruses, and bridges, with a potential length of up to five minutes. Furthermore, the new model creates melodies that are more memorable and instantly captivating.

The hook's melody is easy to remember, mirroring the melodic habits of human composers.

The model can follow specific instructions to independently control and adjust various instruments in the accompaniment, creating layered, rich arrangements with a natural groove across different styles.

Experience a live masterclass in jazz as the saxophone, trombone, trumpet, jazz drums, and piano enter in perfect sequence.

3. Professional-Grade Audio Experience

The new model delivers a comprehensive upgrade in audio quality. Both the texture of the vocal tracks and the spatial presence of the instruments are enhanced, providing you with an immersive listening experience.

Step into a retro disco with vibrant vocal performances and classic 80s instrumentation that will transport you back to the golden age of dance.

One More Thing

While testing Music 2.0, we made a surprising discovery: you can use prompts to describe vocal emotions and soundscapes with precision to generate film-grade monologue soundtracks. The layered emotional progression and musical development create a vivid picture you can "hear."

This exciting capability stems from the model's accurate semantic understanding combined with its precise control over vocal expression—a perfect fusion that gives sound a versatile emotional contour.

Music 2.0 is now live.
Start creating and discover your own sound:
https://www.minimax.io/audio/music

Intelligence with Everyone.
Oct 30, 2025
- Date parsed from source:
  Oct 30, 2025
- First seen by Releasebot:
  Dec 23, 2025
MiniMax by MiniMax

MiniMax Speech 2.6: The Ultimate Voice Agent Has Arrived

MiniMax Speech 2.6 launches with under 250 ms end-to-end latency, smarter handling of non-standard text formats, and Fluent LoRA for natural, multi-language voices. It enables faster, more fluid voice interactions across real-world platforms and devices.
MiniMax Speech 2.6 Release Notes

Today, we’re thrilled to introduce MiniMax Speech 2.6 — our latest speech model, bringing comprehensive upgrades with ultra-low latency, enhanced format handling, and a more natural, human-like voice for Voice Agent scenarios.

Since its launch, MiniMax Speech has become a core piece of infrastructure in the global voice intelligence landscape, known for its outstanding speech technology and exceptional cost-effectiveness.

LiveKit (which powers ChatGPT's advanced voice mode), Pipecat (a popular open-source framework on GitHub), and the YC-incubated voice platform Vapi have all chosen MiniMax Speech as their underlying technology engine. In the smart hardware sector, innovative products like Haivivi Bubble Pal, Fuzozo, and Rokid Glasses are also powered by MiniMax Speech to deliver their natural voice interaction experiences.

MiniMax continues to drive new forms of productivity through technological innovation, breaking down the barriers of language and culture to deliver natural, fluent interactions that connect every voice around the world.

Ultra-Low Latency, More Responsive: For Smoother Overall Interaction

We have completely optimized the audio generation pipeline, achieving an end-to-end latency of under 250 milliseconds—a top-tier industry standard. In scenarios with strict response time requirements, such as real-time conversations, audio generation is no longer the bottleneck, ensuring a smoother overall interaction.

Listen to Speech 2.6 acting as an AI customer service agent:

Seamless Handling of Specialized Formats, Smarter: For More Fluid Information Delivery

Speech 2.6 now directly converts non-standard text formats in multiple languages, including URLs, email addresses, phone numbers, dates, and monetary amounts. Whether you are using it with a large language model or need to process dynamically changing entity information in your business, you no longer need to perform tedious text pre-processing. The input is read correctly from the start, enabling more fluid information delivery.

For example, to correctly read the following passage, traditional TTS would require a series of conversions:

+1 415 415 9921 → “plus one, four one five, four one five, nine nine two one”

$1,234.56 → “one thousand two hundred thirty-four dollars and fifty-six cents”

192.168.1.1 → “one nine two dot one six eight dot one dot one”

2032-5-6 → “May sixth, twenty thirty-two”

support-vip@technet.com → “support dash vip at technet dot com”

Original Text:
"Hello Oliver Smith, I'm your intelligent virtual assistant Max! Thank you for your call. I've found your file. The outstanding balance for the phone number +1 415 415 9921 is $1,234.56. The associated IP address is 192.168.1.1. Your next payment is due on 2032-5-6. If you have any questions, please contact support-vip@technet.com."

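To make the "series of conversions" concrete, here is a minimal Python sketch of the pre-processing a traditional TTS pipeline might need. It is illustrative only and not part of any MiniMax SDK: the helper names (`spell_digits`, `speak_phone`, `speak_ipv4`) are hypothetical, and it handles just the phone number and IP address cases from the list above.

```python
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_digits(digits: str) -> str:
    """Read a run of digits one at a time, e.g. "415" -> "four one five"."""
    return " ".join(DIGIT_WORDS[int(d)] for d in digits)


def speak_phone(number: str) -> str:
    """Expand a space-separated phone number with an optional leading "+"."""
    spoken = []
    for group in number.split():
        if group.startswith("+"):
            # The country code keeps its plus sign as the word "plus".
            spoken.append("plus " + spell_digits(group[1:]))
        else:
            spoken.append(spell_digits(group))
    return ", ".join(spoken)


def speak_ipv4(address: str) -> str:
    """Expand a dotted IPv4 address octet by octet, digit by digit."""
    return " dot ".join(spell_digits(octet) for octet in address.split("."))


print(speak_phone("+1 415 415 9921"))
print(speak_ipv4("192.168.1.1"))
```

Dates, currency amounts, and email addresses would each need their own rules on top of this, which is the pre-processing burden Speech 2.6 is described as removing.
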
Greater Naturalness and Fluent LoRA: For More Fluent Vocal Expression

In addition to further enhancing prosodic naturalness, Speech 2.6 also introduces Fluent LoRA.

Speech 2.5 already offered a convenient, high-fidelity voice cloning feature that allowed users to preserve the unique characteristics of the original voice, such as accents and speech habits. This capability met the diverse voice needs of real-world application scenarios.

Now, you no longer have to worry about imperfect source material when cloning a voice. Even with non-native recordings that may have an accent or be disfluent, Fluent LoRA can perfectly replicate the voice's timbre while generating fluent, natural speech that matches the target text, making your vocal expression more articulate.

Besides the English example shown in the video, this feature enables one-click fluency for voice cloning across the 40+ languages the model supports. Here is an example in a Japanese scenario:

Speech 2.6 is now fully live. Try it out:

MiniMax Open Platform:
https://www.minimax.io/platform_overview

MiniMax Audio:
https://www.minimax.io/audio

Intelligence with Everyone.
Oct 28, 2025
- Date parsed from source:
  Oct 28, 2025
- First seen by Releasebot:
  Feb 3, 2026
MiniMax by MiniMax

Oct. 28, 2025
Video Generation API

Added two new models — MiniMax-Hailuo-2.3 and MiniMax-Hailuo-2.3-Fast

MiniMax-Hailuo-2.3 supports both Text-to-Video (T2V) and Image-to-Video (I2V) generation modes

MiniMax-Hailuo-2.3-Fast supports Image-to-Video (I2V) generation mode

Both models support 768P (6s, 10s) and 1080P (6s) resolutions
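
The support matrix above can be captured as a small lookup for checking a request on the client side before it is submitted. This is a minimal sketch in Python: the names `MODEL_MODES`, `RESOLUTION_DURATIONS`, and `is_supported` are hypothetical, nothing here calls the Video Generation API, and it assumes the listed resolution and duration combinations apply to every mode a model offers.

```python
# Generation modes per model, as listed in the release note.
MODEL_MODES = {
    "MiniMax-Hailuo-2.3": {"T2V", "I2V"},
    "MiniMax-Hailuo-2.3-Fast": {"I2V"},
}

# Clip durations (in seconds) per resolution; shared by both models.
RESOLUTION_DURATIONS = {
    "768P": {6, 10},
    "1080P": {6},
}


def is_supported(model: str, mode: str, resolution: str, duration_s: int) -> bool:
    """Return True if the combination appears in the support matrix."""
    modes = MODEL_MODES.get(model, set())
    durations = RESOLUTION_DURATIONS.get(resolution, set())
    return mode in modes and duration_s in durations


print(is_supported("MiniMax-Hailuo-2.3", "T2V", "768P", 10))
print(is_supported("MiniMax-Hailuo-2.3-Fast", "T2V", "768P", 6))
```

Keeping the matrix as plain data makes it easy to update when new models or resolutions are announced.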

Oct 28, 2025
- Date parsed from source:
  Oct 28, 2025
- First seen by Releasebot:
  Dec 23, 2025
MiniMax by MiniMax

MiniMax Hailuo 2.3: A New Level of Complex Video Performance & Media Agent

Hailuo 2.3, the latest MiniMax video model, launches with improved physics, lifelike expressions, and richer stylization, plus a faster, cheaper Fast variant. The Media Agent now enables one-click multi-modal video creation with optional step-by-step editing, and has launched worldwide.

Hailuo 2.3 MiniMax Video Model Release

Today, we are excited to introduce the MiniMax video model, Hailuo 2.3. Building upon the Hailuo 02 model, it further enhances dynamic expression, resulting in more realistic and stable visuals. The Hailuo 2.3 model achieves significant improvements in the portrayal of physical actions, stylization, and character micro-expressions, while further optimizing its response to motion commands.

First, thanks to the model's enhanced understanding of physics and command following, Hailuo 2.3 can render more complex character body movements with greater fluidity, naturalness, precision, and control. Even with dynamic camera movements, it achieves near-photorealistic visual effects in lighting direction, shadow transitions, and color tones.

In terms of stylization, Hailuo 2.3 offers better support for anime and illustration, as well as special art styles like ink wash painting and game CG. Users who love anime creation adored the "Live" model in Hailuo 01, and now Hailuo 2.3 unlocks an even wider range of art styles, delivering more stable and vivid outputs from the general model.

In Hailuo 2.3, live-action facial performances and micro-expression changes are also more natural. We use subtle expression changes to craft the most captivating character performances.

In addition to improvements in human expressions and actions, Hailuo 2.3 also shows an enhanced response to motion commands for objects. With the "Double 11" shopping festival underway, some creators in our beta test produced e-commerce ads and saw a significant increase in their success rate for generating high-quality content.

Hailuo 2.3 once again sets a new global record for video model cost-efficiency. It boosts performance while maintaining the same pricing as Hailuo 02, offering "more for the same price" to both business and consumer users and providing the best value in the industry for creators worldwide. Furthermore, we are offering the Hailuo 2.3 Fast model, which generates videos faster at a lower price, reducing costs for batch creation by up to 50%.

We have fully rolled out these model updates across the Hailuo AI website, mobile app, and Open Platform API. We are also offering daily free trial credits during the launch period for you to experience. As we continue to iterate on the model's overall capabilities, we will also focus on deep optimization for different AI video application scenarios to solve the real-world problems our users face.

Media Agent Evolution

This summer, we released the Hailuo Video Agent to a positive reception. Usage and feedback from Hailuo creators have convinced us that multi-modal creation is the future. Today, the Hailuo Video Agent officially evolves into the Media Agent, supporting comprehensive multi-modal creation, and it has launched simultaneously worldwide.

Simply input the content you want, and the Media Agent automatically matches the right multi-modal models. With no manual editing required, the "one-click video generation" feature handles everything for you. Professional creators can also use the Media Agent for step-by-step creation, freely uploading images, videos, or audio to customize the final product according to their needs.

For example, we tried designing a 30-second ad for the "Casa Nacho" brand of tortilla chips. We simply input the desired scene, color tone, camera style, and music, and here's the result from the one-click generation feature.

In future updates to the Media Agent, we will be able to adjust the details of any part of the creation pipeline with the Agent on a canvas, truly achieving "creation through conversation" while preserving every idea. We believe that interacting and co-creating with AI using natural language is what the next-generation creative platform should be.

We are entering an era of rapid change, one where AI video is transforming how many people work and create. We hope that Hailuo can be an all-powerful creative assistant and a pioneer of innovation and change, allowing inspiration to take shape—and then transcend all forms.

Experience Hailuo AI at:
https://hailuoai.video/
Experience Media Agent at:
https://hailuoai.video/agent
