Cohere Release Notes

Last updated: Mar 20, 2026

  • Dec 11, 2025
    • First seen by Releasebot:
      Mar 20, 2026

    Cohere's Rerank v4.0 Model is Here!

    Cohere releases Rerank 4.0, its newest foundational ranking model with two variants for quality or speed, multilingual and JSON document support, and a 32k token context window for longer, more capable re-ranking.

    We're pleased to announce the release of Rerank 4.0, our newest and most performant foundational model for ranking.

    Technical Details

    • Two model variants available:
      • rerank-v4.0-pro: Optimized for state-of-the-art quality and complex use cases
      • rerank-v4.0-fast: Optimized for low-latency, high-throughput use cases
    • Multilingual support: Re-rank both English and non-English documents
    • Semi-structured data support: Re-rank JSON documents
    • Extended context length: 32k token context window

    Example Query

    import cohere

    co = cohere.ClientV2()

    query = "What is the capital of the United States?"
    docs = [
        "Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274.",
        "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan.",
        "Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas.",
        "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America.",
        "Capital punishment has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states. The federal government (including the United States military) also uses capital punishment.",
    ]

    results = co.rerank(
        model="rerank-v4.0-pro", query=query, documents=docs, top_n=5
    )
    
  • Sep 16, 2025

    Announcing Major Command Deprecations

    Cohere deprecates several legacy models, fine-tuning options, API endpoints, and UI apps, while steering users to newer command-r models and command-a for replacements.

    As part of our ongoing commitment to delivering advanced AI solutions, we are deprecating the following models, features, and API endpoints:

    Deprecated Models

    • command-r-03-2024 (and the alias command-r)
    • command-r-plus-04-2024 (and the alias command-r-plus)
    • command-light
    • command
    • summarize (Refer to the migration guide for alternatives).

    For command model replacements, we recommend you use command-r-08-2024, command-r-plus-08-2024, or command-a-03-2025 (which is the strongest-performing model across domains) instead.

    Retired Fine-Tuning Capabilities

    All fine-tuning options via dashboard and API for models including command-light, command, command-r, classify, and rerank are being retired. Previously fine-tuned models will no longer be accessible.

    Deprecated Features and API Endpoints

    • /v1/connectors (Managed connectors for RAG)
    • /v1/chat parameters: connectors, search_queries_only
    • /v1/generate (Legacy generative endpoint)
    • /v1/summarize (Legacy summarization endpoint)
    • /v1/classify
    • Slack App integration
    • Coral Web UI (chat.cohere.com and coral.cohere.com)

    For questions, reach out to [email protected]

  • Aug 28, 2025

    Announcing Cohere's Command A Translate Model

    Cohere launches Command A Translate, its first machine translation model, bringing accurate, fluent translations across 23 languages with long-context support, strong deployment efficiency, and secure options for sensitive data. It is now available through Cohere’s Chat API and standard endpoints.

    We're excited to announce the release of Command A Translate, Cohere's first machine translation model. It achieves state-of-the-art performance at producing accurate, fluent translations across 23 languages.

    Key Features

    • 23 supported languages: English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Chinese, Arabic, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian
    • 111 billion parameters for superior translation quality
    • 16K token context length (8K input + 8K output) for handling longer texts
    • Optimized for deployment on 1-2 GPUs (A100s/H100s)
    • Secure deployment options for sensitive data translation

    Getting Started

    The model is available immediately through Cohere's Chat API endpoint. You can start translating text with simple prompts or integrate it programmatically into your applications.

    from cohere import ClientV2

    co = ClientV2(api_key="<YOUR API KEY>")

    response = co.chat(
        model="command-a-translate-08-2025",
        messages=[
            {
                "role": "user",
                "content": "Translate this text to Spanish: Hello, how are you?",
            }
        ],
    )
    

    Availability

    Command A Translate (command-a-translate-08-2025) is now available for all Cohere users through our standard API endpoints. For enterprise customers, private deployment options are available to ensure maximum security and control over your translation workflows.

    For more detailed information about Command A Translate, including technical specifications and implementation examples, visit our model documentation.

  • Aug 21, 2025

    Announcing Cohere's Command A Reasoning Model

    Cohere releases Command A Reasoning, a hybrid reasoning model for complex agentic tasks in 23 languages, with 111 billion parameters, a 256K context length, strong tool use, and support through the familiar Command API.

    We’re excited to announce the release of Command A Reasoning, a hybrid reasoning model designed to excel at complex agentic tasks, in English and 22 other languages. With 111 billion parameters and a 256K context length, this model brings advanced reasoning capabilities to your applications through the familiar Command API interface.

    Key Features

    • Tool Use: Provides the strongest tool use performance out of the Command family of models.
    • Agentic Applications: Demonstrates proactive problem-solving, autonomously using tools and resources to complete highly complex tasks.
    • Multilingual: With 23 languages supported, the model solves reasoning and agentic problems in the language your business operates in.

    Technical Specifications

    • Model Name: command-a-reasoning-08-2025
    • Context Length: 256K tokens
    • Maximum Output: 32K tokens
    • API Endpoint: Chat API

    Getting Started

    Integrating Command A Reasoning is straightforward using the Chat API. Here’s a non-streaming example:
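
    A minimal sketch with the Python SDK's ClientV2, matching the pattern used in the other examples in these notes (the prompt and key placeholder are illustrative; see the Reasoning documentation for the `thinking` parameter's supported values):

```python
import cohere

co = cohere.ClientV2(api_key="<YOUR API KEY>")

# Non-streaming call: the full response is returned in one object
response = co.chat(
    model="command-a-reasoning-08-2025",
    messages=[
        {
            "role": "user",
            "content": "A train travels 300 km in 2.5 hours. What is its average speed?",
        }
    ],
)
print(response.message.content)
```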

    Customization Options

    You can enable or disable thinking with the thinking parameter, and steer the model's output with a flexible, user-controlled thinking budget. For more details on token budgets, advanced configurations, and best practices, refer to our dedicated Reasoning documentation.

  • Jul 31, 2025

    Announcing Cohere's Command A Vision Model

    Cohere releases Command A Vision, its first commercial model for understanding images and text together, bringing enterprise-grade multimodal capabilities to the Command API for document analysis, chart interpretation, OCR, and more across multiple languages.

    We're excited to announce the release of Command A Vision, Cohere's first commercial model capable of understanding and interpreting visual data alongside text. This addition to our Command family brings enterprise-grade vision capabilities to your applications with the same familiar Command API interface.

    Key Features

    Multimodal Capabilities

    • Text + Image Processing: Combine text prompts with image inputs
    • Enterprise-Focused Use Cases: Optimized for business applications like document analysis, chart interpretation, and OCR
    • Multiple Languages: Officially supports English, Portuguese, Italian, French, German, and Spanish

    Technical Specifications

    • Model Name: command-a-vision-07-2025
    • Context Length: 128K tokens
    • Maximum Output: 8K tokens
    • Image Support: Up to 20 images per request (or 20MB total)
    • API Endpoint: Chat API

    What You Can Do

    Command A Vision excels in enterprise use cases including:

    • 📊 Chart & Graph Analysis: Extract insights from complex visualizations
    • 📋 Table Understanding: Parse and interpret data tables within images
    • 📄 Document OCR: Optical character recognition with natural language processing
    • 🌐 Image Processing for Multiple Languages: Handle text in images across multiple languages
    • 🔍 Scene Analysis: Identify and describe objects within images

    💻 Getting Started

    The API structure is identical to our existing Command models, making integration straightforward:

    import cohere

    # The messages-array format shown here is the v2 API, so use ClientV2
    co = cohere.ClientV2("your-api-key")

    response = co.chat(
        model="command-a-vision-07-2025",
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Analyze this chart and extract the key data points",
                    },
                    {
                        "type": "image_url",
                        "image_url": {"url": "your-image-url"},
                    },
                ],
            }
        ],
    )
    

    There's much more to be said about working with images, various limitations, and best practices, which you can find in our dedicated Command A Vision and Image Inputs documents.

  • Apr 15, 2025

    Announcing Embed Multimodal v4

    Cohere releases Embed 4, its most performant search model yet, with Matryoshka and unified mixed-modality embeddings, 128k context length, and state-of-the-art retrieval for text, images, and mixed content. It is available on the Cohere Platform, AWS Sagemaker, and Azure AI Foundry.

    We’re thrilled to announce the release of Embed 4, the most recent entrant into the Embed family of enterprise-focused large language models (LLMs).

    Embed v4 is Cohere’s most performant search model to date, and supports the following new features:

    • Matryoshka Embeddings in the following dimensions: [256, 512, 1024, 1536]
    • Unified Embeddings produced from mixed modality input (i.e. a single payload of image(s) and text(s))
    • Context length of 128k
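
    Matryoshka embeddings are trained so that leading prefixes of the full vector are themselves usable embeddings: to use a smaller dimension, you truncate and re-normalize. A minimal sketch of that post-processing step (truncate_embedding is an illustrative helper, not part of the Cohere SDK):

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components of a Matryoshka embedding and
    re-normalize to unit length so cosine similarity remains meaningful."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# A toy 8-dim "embedding" truncated to 4 dims stays unit-length:
full = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
small = truncate_embedding(full, 4)  # → [0.5, 0.5, 0.5, 0.5]
```

    Truncating to a smaller dimension can substantially shrink a vector index at a modest cost in retrieval quality.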

    Embed v4 achieves state-of-the-art performance in the following areas:

    • Text-to-text retrieval
    • Text-to-image retrieval
    • Text-to-mixed modality retrieval (from e.g. PDFs)

    Embed v4 is available today on the Cohere Platform, AWS Sagemaker, and Azure AI Foundry. For more information, check out our dedicated blog post here.

  • Mar 13, 2025

    Announcing Command A

    Cohere releases Command A, its most performant enterprise LLM yet, built for tool use, RAG, agents, and multilingual tasks. The model brings 111B parameters, 256K context, higher inference efficiency, and is available on the Cohere Platform, HuggingFace, and SDK.

    We're thrilled to announce the release of Command A, the most recent entrant into the Command family of enterprise-focused large language models (LLMs).

    Command A is Cohere's most performant model to date, excelling at real-world enterprise tasks including tool use, retrieval-augmented generation (RAG), agents, and multilingual use cases. With 111B parameters and a context length of 256K, Command A boasts a considerable increase in inference-time efficiency, delivering 150% higher throughput than its predecessor Command R+ 08-2024, and requires only two GPUs (A100s / H100s) to run.

    Command A is available today on the Cohere Platform, HuggingFace, or through the SDK with command-a-03-2025. For more information, check out our dedicated blog post.

  • Mar 4, 2025

    Our Groundbreaking Multimodal Model, Aya Vision, is Here!

    Cohere launches Aya Vision, a multilingual multimodal AI model for image captioning, visual Q&A, text generation and translation.

    Today, Cohere Labs, Cohere’s research arm, is proud to announce Aya Vision, a state-of-the-art multimodal large language model excelling across multiple languages and modalities. Aya Vision outperforms the leading open-weight models in critical benchmarks for language, text, and image capabilities.

    Built as a foundation for multilingual and multimodal communication, this groundbreaking AI model supports tasks such as image captioning, visual question answering, text generation, and translation of both text and images into coherent text.

    Find more information about Aya Vision here.

  • Feb 27, 2025

    Cohere Releases Arabic-Optimized Command Model!

    Cohere releases Command R7B Arabic, an open weights 8B model tuned for Arabic and English with 128K context. It brings strong enterprise performance for instruction following, RAG, length control, and reduced code-switching, plus chat and multilingual RAG support.

    Cohere is thrilled to announce the release of Command R7B Arabic (c4ai-command-r7b-12-2024). This is an open-weights release of an advanced, 8-billion-parameter custom model optimized for the Arabic language (MSA dialect) in addition to English. As with Cohere's other Command models, this one comes with a context length of 128,000 tokens; it excels at a number of critical enterprise tasks, including instruction following, length control, retrieval-augmented generation (RAG), and minimizing code-switching, and it demonstrates excellent general-purpose knowledge and understanding of the Arabic language and culture.

    Try Command R7B Arabic

    If you want to try Command R7B Arabic, it's very easy: you can use it through the Cohere playground or in our dedicated Hugging Face Space.

    Alternatively, you can use the model in your own code. To do that, first install the transformers library from its source repository:

    pip install 'git+https://github.com/huggingface/transformers.git'
    

    Then, use this Python snippet to run a simple text-generation task with the model:

    from transformers import AutoTokenizer, AutoModelForCausalLM

    model_id = "CohereForAI/c4ai-command-r7b-12-2024"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Format the message with the c4ai-command-r7b-12-2024 chat template
    # (the prompt means "Hello, how are you?")
    messages = [{"role": "user", "content": "مرحبا، كيف حالك؟"}]
    input_ids = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    )

    gen_tokens = model.generate(
        input_ids,
        max_new_tokens=100,
        do_sample=True,
        temperature=0.3,
    )

    gen_text = tokenizer.decode(gen_tokens[0])
    print(gen_text)
    

    Chat Capabilities

    Command R7B Arabic can be operated in two modes, "conversational" and "instruct" mode:

    Conversational mode conditions the model on interactive behaviour, meaning it is expected to reply in a conversational fashion, provide introductory statements and follow-up questions, and use Markdown as well as LaTeX where appropriate. This mode is optimized for interactive experiences, such as chatbots, where the model engages in dialogue.

    Instruct mode conditions the model to provide concise yet comprehensive responses, and to not use Markdown or LaTeX by default. This mode is designed for non-interactive, task-focused use cases such as extracting information, summarizing text, translation, and categorization.

    Multilingual RAG Capabilities

    Command R7B Arabic has been trained specifically for Arabic and English tasks, such as the generation step of Retrieval Augmented Generation (RAG).

    Command R7B Arabic's RAG functionality is supported through chat templates in Transformers. Using our RAG chat template, the model takes a conversation (with an optional user-supplied system preamble) and a list of document snippets as input. The resulting output contains a response with in-line citations. Here's what that looks like:

    # Define the conversation input
    # (the prompt means "Suggest a dish that blends flavors from several Arab countries")
    conversation = [
        {
            "role": "user",
            "content": "اقترح طبقًا يمزج نكهات من عدة دول عربية",
        }
    ]

    # Define documents for retrieval-based generation
    # (headings mean "Arab cuisine: our traditional dishes" and "Today's recipe: Maqluba")
    documents = [
        {
            "heading": "المطبخ العربي: أطباقنا التقليدية",
            "body": "يشتهر المطبخ العربي بأطباقه الغنية والنكهات الفريدة. في هذا المقال، سنستكشف ...",
        },
        {
            "heading": "وصفة اليوم: مقلوبة",
            "body": "المقلوبة هي طبق فلسطيني تقليدي، يُحضر من الأرز واللحم أو الدجاج والخضروات. في وصفتنا اليوم ...",
        },
    ]

    # Render the RAG prompt as a string from the chat template
    input_prompt = tokenizer.apply_chat_template(
        conversation=conversation,
        documents=documents,
        tokenize=False,
        add_generation_prompt=True,
    )

    # Tokenize the prompt
    input_ids = tokenizer.encode_plus(input_prompt, return_tensors="pt")
    

    You can then generate text from this input as normal.

    Notes on Usage

    We recommend document snippets be short chunks (around 100-400 words per chunk) instead of long documents. They should also be formatted as key-value pairs, where the keys are short descriptive strings and the values are either text or semi-structured data.
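
    As a concrete illustration of this guidance, a long document could be pre-chunked into key-value snippets like so (chunk_document is a hypothetical helper, not part of Transformers or the Cohere SDK):

```python
def chunk_document(title, text, max_words=300):
    """Split a long document into ~max_words-word chunks, formatted as
    key-value snippets with a short descriptive heading per chunk."""
    words = text.split()
    chunks = []
    for i in range(0, len(words), max_words):
        chunks.append({
            "heading": f"{title} (part {i // max_words + 1})",
            "body": " ".join(words[i:i + max_words]),
        })
    return chunks

# A 650-word document becomes three snippets of at most 300 words each
docs = chunk_document("Company handbook", "word " * 650)
```

    The resulting list of dicts can be passed directly as the documents argument of the RAG chat template above.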

    You may find that simply including relevant documents directly in a user message works as well as or better than using the documents parameter to render the special RAG template (though the template is a strong default for those wanting citations). We encourage users to experiment with both approaches, and to evaluate which mode works best for their specific use case.

  • Feb 26, 2025

    Cohere via OpenAI SDK Using Compatibility API

    Cohere releases a Compatibility API for using its models through OpenAI's SDK with chat, function calling, structured outputs, and embeddings.

    Today, we are releasing our Compatibility API, enabling developers to seamlessly use Cohere's models via OpenAI's SDK.

    This API enables you to switch your existing OpenAI-based applications to use Cohere's models without major refactoring.

    It includes comprehensive support for chat completions, such as function calling and structured outputs, as well as support for text embeddings generation.
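
    For illustration, a minimal sketch of pointing the OpenAI Python SDK at Cohere (the base_url shown is our understanding of the compatibility endpoint; confirm it against the Compatibility API documentation):

```python
from openai import OpenAI

# Use your Cohere API key with the OpenAI SDK via the Compatibility API
client = OpenAI(
    api_key="<YOUR COHERE API KEY>",
    base_url="https://api.cohere.ai/compatibility/v1",  # assumed endpoint
)

response = client.chat.completions.create(
    model="command-a-03-2025",
    messages=[{"role": "user", "content": "Hello from the OpenAI SDK!"}],
)
print(response.choices[0].message.content)
```

    Existing OpenAI-based code should only need the api_key, base_url, and model name changed.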

    Check out our documentation on how to get started with the Compatibility API, with examples in Python, TypeScript, and cURL.
