- Mar 20, 2025
- Parsed from source: Mar 20, 2025
- Detected by Releasebot: Oct 26, 2025
v1.6.0: Mistral goes Small 3.1 with vision
What's Changed
- Missing new line by @theophilegervet in #234
- Add support to Mistral Small 3.1 by @juliendenize in #239
- Remove file refs by @juliendenize in #240
- Release 1.6.0 by @juliendenize in #241
New Contributors
- @theophilegervet made their first contribution in #234
- @juliendenize made their first contribution in #239
Full Changelog: v1.5.0...v1.6.0
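The headline change above is support for Mistral Small 3.1. As a quick orientation, here is a hedged download sketch patterned on the Pixtral flow later in this feed; the Hugging Face repo id and file list are assumptions, not taken from this release note:
from pathlib import Path
from huggingface_hub import snapshot_download

# Assumed repo id and file names, mirroring the Pixtral example below.
model_path = Path.home().joinpath('mistral_models', 'Mistral-Small-3.1')
model_path.mkdir(parents=True, exist_ok=True)
snapshot_download(
    repo_id="mistralai/Mistral-Small-3.1-24B-Instruct-2503",  # assumed repo id
    allow_patterns=["params.json", "consolidated.safetensors", "tekken.json"],
    local_dir=model_path,
)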
- Sep 13, 2024
- Parsed from source: Sep 13, 2024
- Detected by Releasebot: Oct 26, 2025
- Modified by Releasebot: Dec 28, 2025
v1.4.0: Pixtral 👀
This release enables the Pixtral-12B-2409 model: Mistral models are now accessible with vision support via an upgraded mistral_inference install and a Hugging Face download flow. The guide includes CLI and Python usage plus image-input prompts to demo the multimodal capabilities.
Pixtral
Mistral models can now 👀 !
pip install --upgrade "mistral_inference>=1.4.0"
Download
from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'Pixtral')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Pixtral-12B-2409", allow_patterns=["params.json", "consolidated.safetensors", "tekken.json"], local_dir=mistral_models_path)
CLI example
mistral-chat $HOME/mistral_models/Pixtral --instruct --max_tokens 256 --temperature 0.35
E.g. try out something like:
Text prompt: What can you see on the following picture?
[You can input zero, one or more images now.]
Image path or url [Leave empty and press enter to finish image input]: https://picsum.photos/id/237/200/300
Image path or url [Leave empty and press enter to finish image input]:
I see a black dog lying on a wooden surface. The dog appears to be looking up, and its eyes are clearly visible.
Python
- Load the model
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage, TextChunk, ImageURLChunk
from mistral_common.protocol.instruct.request import ChatCompletionRequest

tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tekken.json")
model = Transformer.from_folder(mistral_models_path)
- Run:
url = "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png" prompt = "Describe the image." completion_request = ChatCompletionRequest(messages = [UserMessage(content = [ImageURLChunk(image_url = url), TextChunk(text = prompt)])]) encoded = tokenizer.encode_chat_completion(completion_request) images = encoded.images tokens = encoded.tokens out_tokens, _ = generate([tokens], model, images = [images], max_tokens = 256, temperature = 0.35, eos_id = tokenizer.instruct_tokenizer.tokenizer.eos_id) result = tokenizer.decode(out_tokens[0]) print(result)Assets
Assets
- Source code (zip) Mar 20
- Source code (tar.gz) Mar 20
- Jul 18, 2024
- Parsed from source: Jul 18, 2024
- Detected by Releasebot: Oct 28, 2025
- Modified by Releasebot: Dec 17, 2025
v1.3.0 Mistral-Nemo
Mistral-Nemo-Instruct-2407 launches as an instruct model trained jointly by Mistral AI and NVIDIA. It features a 128k context window, 40 layers, and multilingual and code training, serves as a drop-in replacement for Mistral 7B, and ships under the Apache 2.0 license with ready pipelines.
Welcome
Welcome Mistral-Nemo from Mistral 🤝 NVIDIA
Read more about Mistral-Nemo here.
Install
pip install "mistral-inference>=1.3.0"
Download
export NEMO_MODEL=$HOME/12B_NEMO_MODEL
wget https://models.mistralcdn.com/mistral-nemo-2407/mistral-nemo-instruct-2407.tar
mkdir -p $NEMO_MODEL
tar -xf mistral-nemo-instruct-2407.tar -C $NEMO_MODEL
Chat
mistral-chat $NEMO_MODEL --instruct --max_tokens 1024
or directly in Python:
import os
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

tokenizer = MistralTokenizer.from_model("mistral-nemo")
model = Transformer.from_folder(os.environ.get("NEMO_MODEL"))

prompt = "How expensive would it be to ask a window cleaner to clean all windows in Paris. Make a reasonable guess in US Dollar."
completion_request = ChatCompletionRequest(messages=[UserMessage(content=prompt)])

tokens = tokenizer.encode_chat_completion(completion_request).tokens
out_tokens, _ = generate([tokens], model, max_tokens=1024, temperature=0.35, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.decode(out_tokens[0])
print(result)
Function calling
import os

from mistral_common.protocol.instruct.tool_calls import Function, Tool
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

tokenizer = MistralTokenizer.from_model("mistral-nemo")
model = Transformer.from_folder(os.environ.get("NEMO_MODEL"))

completion_request = ChatCompletionRequest(
    tools=[
        Tool(
            function=Function(
                name="get_current_weather",
                description="Get the current weather",
                parameters={
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "The temperature unit to use. Infer this from the users location.",
                        },
                    },
                    "required": ["location", "format"],
                },
            ),
        ),
    ],
    messages=[UserMessage(content="What's the weather like today in Paris?")],
)

tokens = tokenizer.encode_chat_completion(completion_request).tokens
out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.35, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.decode(out_tokens[0])
print(result)
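When the model decides to call a tool, the decoded completion above carries the call as JSON rather than prose. A minimal post-processing sketch, assuming the decoded string is a JSON list of {"name": ..., "arguments": ...} objects, possibly behind a [TOOL_CALLS] marker depending on tokenizer version; the marker handling and fallback here are illustrative, not part of the documented API:
import json

# Hedged sketch: pull tool calls out of the decoded completion `result`
# from the function-calling example above. The "[TOOL_CALLS]" marker and
# the JSON shape are assumptions about the decoded output format.
raw = result.replace("[TOOL_CALLS]", "").strip()
try:
    for call in json.loads(raw):
        print("tool:", call["name"], "arguments:", call.get("arguments"))
except (json.JSONDecodeError, TypeError, KeyError):
    # The model answered in plain text instead of calling a tool.
    print("plain answer:", result)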
Summary
The Mistral-Nemo-Instruct-2407 Large Language Model (LLM) is an instruct fine-tuned version of Mistral-Nemo-Base-2407. Trained jointly by Mistral AI and NVIDIA, it significantly outperforms existing models smaller or similar in size. For more details about this model, please refer to our release blog post.
Key features
- Released under the Apache 2.0 License
- Pre-trained and instructed versions
- Trained with a 128k context window
- Trained on a large proportion of multilingual and code data
- Drop-in replacement of Mistral 7B
Model Architecture
Mistral Nemo is a transformer model, with the following architecture choices (a back-of-the-envelope parameter count from these numbers follows the list):
- Layers: 40
- Dim: 5,120
- Head dim: 128
- Hidden dim: 14,336
- Activation Function: SwiGLU
- Number of heads: 32
- Number of kv-heads: 8 (GQA)
- Vocabulary size: 2**17 ~= 128k
- Rotary embeddings (theta = 1M)
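As a sanity check, these figures roughly reproduce the model's ~12B parameter count. A back-of-the-envelope sketch (my own arithmetic from the numbers above, assuming untied input and output embeddings and ignoring norm parameters):
# Rough parameter count for Mistral Nemo from the architecture list above.
dim, layers, hidden = 5120, 40, 14336
n_heads, n_kv_heads, head_dim = 32, 8, 128
vocab = 2**17  # 131,072

attn = dim * (n_heads * head_dim)          # Wq
attn += 2 * dim * (n_kv_heads * head_dim)  # Wk and Wv (GQA: 8 kv-heads)
attn += (n_heads * head_dim) * dim         # Wo
mlp = 3 * dim * hidden                     # SwiGLU: gate, up, down projections
embed = 2 * vocab * dim                    # input + output embeddings (assumed untied)

total = layers * (attn + mlp) + embed
print(f"{total / 1e9:.1f}B parameters")    # ~12.2B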
Metrics
Main Benchmarks
Benchmark Score
HellaSwag (0-shot) 83.5%
Winogrande (0-shot) 76.8%
OpenBookQA (0-shot) 60.6%
CommonSenseQA (0-shot) 70.4%
TruthfulQA (0-shot) 50.3%
MMLU (5-shot) 68.0%
TriviaQA (5-shot) 73.8%
NaturalQuestions (5-shot) 31.2%
Multilingual Benchmarks (MMLU)
Language Score
French 62.3%
German 62.7%
Spanish 64.6%
Italian 61.3%
Portuguese 63.3%
Russian 59.2%
Chinese 59.0%
Japanese 59.0%
What's Changed
- Tekken by @patrickvonplaten in #193
Full Changelog: v1.2.0...v1.3.0
- Jul 16, 2024
- Parsed from source: Jul 16, 2024
- Detected by Releasebot: Oct 28, 2025
- Modified by Releasebot: Nov 16, 2025
v1.2.0 Add Mamba
Mistral introduces the Codestral-Mamba and Mathstral 7B models with ready-to-run chat demos and updated setup steps. The release also includes README fixes, typo corrections, and new contributors as it moves from v1.1.0 to v1.2.0.
Welcome 🐍 Codestral-Mamba and 🔢 Mathstral
pip install "mistral-inference>=1.2.0"
Codestral-Mamba
pip install packaging mamba-ssm causal-conv1d transformers
1. Download
export MAMBA_CODE=$HOME/7B_MAMBA_CODE
wget https://models.mistralcdn.com/codestral-mamba-7b-v0-1/codestral-mamba-7B-v0.1.tar
mkdir -p $MAMBA_CODE
tar -xf codestral-mamba-7B-v0.1.tar -C $MAMBA_CODE
2. Chat
mistral-chat $HOME/7B_MAMBA_CODE --instruct --max_tokens 256
Mathstral
1. Download
export MATHSTRAL=$HOME/7B_MATH
wget https://models.mistralcdn.com/mathstral-7b-v0-1/mathstral-7B-v0.1.tar
mkdir -p $MATHSTRAL
tar -xf mathstral-7B-v0.1.tar -C $MATHSTRAL
2. Chat
mistral-chat $HOME/7B_MATH --instruct --max_tokens 256
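Beyond the CLI, the Python flow shown elsewhere in this feed should apply to Mathstral as well. A hedged sketch, assuming the extracted tar contains a tokenizer.model.v3 file next to the weights (the file name mirrors the v1.1.0 LoRA example further down this feed and is an assumption here):
import os
from mistral_inference.transformer import Transformer  # in some 1.x versions: from mistral_inference.model import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# $MATHSTRAL as exported above; the tokenizer file name is assumed.
model_path = os.environ["MATHSTRAL"]
tokenizer = MistralTokenizer.from_file(f"{model_path}/tokenizer.model.v3")
model = Transformer.from_folder(model_path)

completion_request = ChatCompletionRequest(messages=[UserMessage(content="Compute the integral of x**2 from 0 to 1.")])
tokens = tokenizer.encode_chat_completion(completion_request).tokens
out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.35, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
print(tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0]))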
Blogs:
- Blog Codestral Mamba 7B: https://mistral.ai/news/codestral-mamba/
- Blog Mathstral 7B: https://mistral.ai/news/mathstral/
What's Changed
- add a note about GPU requirement by @sophiamyang in #158
- Add codestral by @patrickvonplaten in #164
- Update README.md by @patrickvonplaten in #165
- fixing type in README.md by @didier-durand in #175
- Fix: typo in ModelArgs: "infered" to "inferred" by @CharlesCNorton in #174
- fix: typo in LoRALoaderMixin: correct "multipe" to "multiple" by @CharlesCNorton in #173
- fix: Correct typo in classifier.ipynb from "alborithm" to "algorithm" by @CharlesCNorton in #167
- Fix: typo in error message for state_dict validation by @CharlesCNorton in #172
- fix: Correct misspelling in ModelArgs docstring by @CharlesCNorton in #171
- Update README.md by @patrickvonplaten in #168
- fix: typo in HF_TOKEN environment variable check message by @CharlesCNorton in #179
- Adding Issue/Bug template. by @pandora-s-git in #178
- typo in ModelArgs class docstring. by @CharlesCNorton in #183
- Update README.md by @Simontwice in #184
- Add mamba by @patrickvonplaten in #187
New Contributors
- @didier-durand made their first contribution in #175
- @CharlesCNorton made their first contribution in #174
- @pandora-s-git made their first contribution in #178
- @Simontwice made their first contribution in #184
Full Changelog: v1.1.0...v1.2.0
- May 24, 2024
- Parsed from source: May 24, 2024
- Detected by Releasebot: Oct 28, 2025
- Modified by Releasebot: Dec 11, 2025
v1.1.0 Add LoRA
mistral-inference 1.1.0 adds support for running LoRA models trained with mistral-finetune, enabling users to load LoRA weights on top of a 7B base model and run inference via a streamlined Python API. The note walks through tokenizer and model setup along with a sample generation flow.
mistral-inference==1.1.0 supports running LoRA models that were trained with: https://github.com/mistralai/mistral-finetune
Having trained a 7B base LoRA, you can run mistral-inference as follows:
from mistral_inference.model import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

MODEL_PATH = "path/to/downloaded/7B_base_dir"

tokenizer = MistralTokenizer.from_file(f"{MODEL_PATH}/tokenizer.model.v3")  # change to extracted tokenizer file
model = Transformer.from_folder(MODEL_PATH)  # change to extracted model dir
model.load_lora("/path/to/run_lora_dir/checkpoints/checkpoint_000300/consolidated/lora.safetensors")

completion_request = ChatCompletionRequest(messages=[UserMessage(content="Explain Machine Learning to me in a nutshell.")])
tokens = tokenizer.encode_chat_completion(completion_request).tokens
out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])
print(result)
- May 22, 2024
- Parsed from source: May 22, 2024
- Detected by Releasebot: Nov 4, 2025
- Modified by Releasebot: Dec 28, 2025
v1.0.4 - Mistral-inference
Mistral-inference is the official inference library for the Mistral models 7B, 8x7B, and 8x22B, with simple install and run steps: a ready-to-use tool for developers deploying the models in applications.
Mistral-inference is the official inference library for all Mistral models: 7B, 8x7B, 8x22B.
Install with:
pip install mistral-inference
Run with:
from mistral_inference.model import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.protocol.instruct.tool_calls import Function, Tool

# Placeholder paths: point these at a downloaded model directory and its tokenizer file.
MODEL_PATH = "path/to/model_dir"
tokenizer = MistralTokenizer.from_file(f"{MODEL_PATH}/tokenizer.model.v3")
model = Transformer.from_folder(MODEL_PATH)

completion_request = ChatCompletionRequest(
    tools=[Tool(function=Function(name="get_current_weather", description="Get the current weather", parameters={"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "format": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "The temperature unit to use. Infer this from the users location."}}, "required": ["location", "format"]}))],
    messages=[UserMessage(content="What's the weather like today in Paris?")],
)

tokens = tokenizer.encode_chat_completion(completion_request).tokens
out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.decode(out_tokens[0])
print(result)