Together AI Release Notes

Last updated: Mar 20, 2026

  • Mar 10, 2026
    • Date parsed from source:
      Mar 10, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    Together AI logo

    Together AI

    Mar 10

    Together AI adds cached input token pricing for MiniMaxAI/MiniMax-M2.5 at $0.06 per 1M tokens.

    Cached Input Token Pricing

    Cached input token pricing is now available:

    • MiniMaxAI/MiniMax-M2.5: $0.06 per 1M cached input tokens (80% off standard input price)
    Original source Report a problem
  • Mar 7, 2026
    • Date parsed from source:
      Mar 7, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    Together AI logo

    Together AI

    Mar 7

    Together AI adds serverless model bring-ups with Qwen/Qwen3.5-9B available.

    Serverless Model Bring Ups

    The following models have been added:

    Qwen/Qwen3.5-9B

    Original source Report a problem
  • All of your release notes in one feed

    Join Releasebot and get updates from Together AI and hundreds of other software products.

  • Mar 6, 2026
    • Date parsed from source:
      Mar 6, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    Together AI logo

    Together AI

    Mar 6

    Together AI deprecates several models and removes them from availability.

    Model Deprecations

    The following models have been deprecated and are no longer available:

    • mixedbread-ai/Mxbai-Rerank-Large-V2
    • moonshotai/Kimi-K2-Thinking
    • meta-llama/Llama-3.2-3B-Instruct-Turbo
    • moonshotai/Kimi-K2-Instruct-0905
    Original source Report a problem
  • Feb 25, 2026
    • Date parsed from source:
      Feb 25, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    Together AI logo

    Together AI

    Feb 25

    Together AI deprecates multiple models, including FLUX, Qwen, Llama, and Nemotron variants.

    Model Deprecations

    The following models have been deprecated and are no longer available:

    • black-forest-labs/FLUX.1-dev
    • black-forest-labs/FLUX.1-dev-lora
    • black-forest-labs/FLUX.1-kontext-dev
    • Qwen/Qwen3-VL-32B-Instruct
    • mistralai/Ministral-3-14B-Instruct-2512
    • Qwen/Qwen3-Next-80B-A3B-Thinking
    • Alibaba-NLP/gte-modernbert-base
    • BAAI/bge-base-en-v1.5
    • meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
    • meta-llama/Llama-Guard-3-11B-Vision-Turbo
    • meta-llama/LlamaGuard-2-8b
    • marin-community/marin-8b-instruct
    • nvidia/NVIDIA-Nemotron-Nano-9B-v2
    Original source Report a problem
  • Feb 16, 2026
    • Date parsed from source:
      Feb 16, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    Together AI logo

    Together AI

    Feb 16

    Together AI adds serverless model bring ups for Qwen/Qwen3.5-397B-A17B.

    Serverless Model Bring Ups

    The following models have been added:

    • Qwen/Qwen3.5-397B-A17B
    Original source Report a problem
  • Feb 15, 2026
    • Date parsed from source:
      Feb 15, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    Together AI logo

    Together AI

    Feb 15

    Together AI adds serverless model bring-up support for MiniMaxAI/MiniMax-M2.5.

    Serverless Model Bring Ups

    The following models have been added:

    • MiniMaxAI/MiniMax-M2.5
    Original source Report a problem
  • Feb 13, 2026
    • Date parsed from source:
      Feb 13, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    Together AI logo

    Together AI

    Feb 13

    Together AI adds zai-org/GLM-5 to Serverless Model Bring Ups.

    Serverless Model Bring Ups

    The following models have been added:

    • zai-org/GLM-5
    Original source Report a problem
  • Feb 12, 2026
    • Date parsed from source:
      Feb 12, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    Together AI logo

    Together AI

    Feb 12

    Together AI launches Dedicated Container Inference to help users containerize, deploy, and scale custom models.

    Dedicated Container Inference Launch

    Together AI has officially launched Dedicated Container Inference (DCI), formerly known as BYOC.

    DCI empowers users to containerize, deploy, and scale custom models on Together AI with ease.

    • Blog post
    • Documentation
    • Getting started
    Original source Report a problem
  • Feb 6, 2026
    • Date parsed from source:
      Feb 6, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    Together AI logo

    Together AI

    Feb 6

    Together AI deprecates several models, including Llama, Qwen, and BGE variants, which are no longer available.

    Model Deprecations

    The following models have been deprecated and are no longer available:

    • togethercomputer/m2-bert-80M-32k-retrieval
    • Salesforce/Llama-Rank-V1
    • togethercomputer/Refuel-Llm-V2
    • togethercomputer/Refuel-Llm-V2-Small
    • Qwen/Qwen3-235B-A22B-fp8-tput
    • qwen-qwen2-5-14b-instruct-lora
    • meta-llama/Llama-4-Scout-17B-16E-Instruct
    • Qwen/Qwen2.5-72B-Instruct-Turbo
    • meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
    • BAAI/bge-large-en-v1.5
    Original source Report a problem
  • Feb 4, 2026
    • Date parsed from source:
      Feb 4, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    Together AI logo

    Together AI

    Feb 4

    Together AI releases Python SDK v2.0 General Availability, bringing a faster, type-safe OpenAPI-driven client that is easier to maintain and aligned with the latest API surface. It also includes beta APIs for Instant Clusters and becomes the new home for future features.

    Python SDK v2.0 General Availability

    Together AI is releasing the Python SDK v2.0 — a new, type-safe, OpenAPI-driven client designed to be faster, easier to maintain, and ready for everything we’re building next.

    • Install:

      pip install together
      

      or

      uv add together
      
    • Migration Guide: A detailed Python SDK Migration Guide covers API-by-API changes, type updates, and troubleshooting tips

    • Code and Docs: Access the Together Python v2 repo and reference docs with code examples

    • Main Goal: Replace the legacy v1 Python SDK with a modern, strongly-typed, OpenAPI-generated client that matches the API surface more closely and stays in lock-step with new features

    • Net New: All new features will be built in version 2 moving forward. This first version already includes beta APIs for our Instant Clusters!

    Original source Report a problem

Related vendors