Together AI Release Notes

Last updated: Apr 29, 2026


Apr 28, 2026
- Date parsed from source:
  Apr 28, 2026
- First seen by Releasebot:
  Apr 29, 2026
Together AI

Model Deprecation: MiniMax M2.5

Together AI deprecates MiniMaxAI/MiniMax-M2.5 on serverless and recommends migrating to MiniMaxAI/MiniMax-M2.7.

MiniMaxAI/MiniMax-M2.5 has been deprecated and is no longer available on serverless. We recommend migrating to MiniMaxAI/MiniMax-M2.7.
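
For clients that still pin the retired model ID, one way to handle the migration is a small alias table consulted before each request. This is only a sketch: `MODEL_REPLACEMENTS` and `resolve_model` are hypothetical names, not part of any Together AI SDK, and the single mapping shown is the one recommended in the note above.

```python
# Hypothetical helper, not part of any Together AI SDK.
# The only mapping here is the one recommended in the deprecation note.
MODEL_REPLACEMENTS = {
    "MiniMaxAI/MiniMax-M2.5": "MiniMaxAI/MiniMax-M2.7",
}


def resolve_model(model_id: str) -> str:
    """Return the recommended replacement for a deprecated ID, else the ID unchanged."""
    return MODEL_REPLACEMENTS.get(model_id, model_id)


print(resolve_model("MiniMaxAI/MiniMax-M2.5"))
```

Keeping the mapping in one place means a later deprecation needs a one-line change rather than a search through every call site.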
Apr 24, 2026
- Date parsed from source:
  Apr 24, 2026
- First seen by Releasebot:
  Apr 25, 2026
Together AI

Serverless Model Bring Up: DeepSeek-V4-Pro

Together AI adds DeepSeek-V4-Pro to serverless with a 512K context window, function calling, structured outputs, and FP4 quantization.

deepseek-ai/DeepSeek-V4-Pro has been added to serverless.

Context length: 512,000

Pricing: $2.10 input / $4.40 output / $0.20 cached input (per 1M tokens)

Quantization: FP4

Function calling and structured outputs supported
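
To make the per-token rates concrete, the following sketch estimates the cost of a single request from the prices listed above. The function name `estimate_cost` is illustrative, and the code assumes that cached input tokens are a subset of the input tokens and are billed at the cached rate instead of the standard input rate; the note does not spell out that billing rule.

```python
# Prices in USD per 1M tokens, copied from the release note.
PRICE_PER_M = {"input": 2.10, "cached_input": 0.20, "output": 4.40}


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate request cost in USD.

    Assumption: cached_input_tokens is the part of input_tokens that is
    billed at the cached rate rather than the standard input rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * PRICE_PER_M["input"]
        + cached_input_tokens * PRICE_PER_M["cached_input"]
        + output_tokens * PRICE_PER_M["output"]
    )
    return total / 1_000_000


# Example: a 200K-token prompt, half of it served from cache, with a 4K-token reply.
print(f"${estimate_cost(200_000, 4_000, cached_input_tokens=100_000):.4f}")
```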

Apr 24, 2026
- Date parsed from source:
  Apr 24, 2026
- First seen by Releasebot:
  Apr 25, 2026
Together AI

Pricing Update: No-Packing Fine-Tuning Jobs

Together AI updates no-packing fine-tuning pricing and lets users control costs with configurable max sequence length.
We rolled out a pricing update for no-packing fine-tuning jobs. When the no-packing option is chosen, the number of training dataset tokens is now calculated as len(dataset) * max_seq_length to account for the compute used by packing-free jobs.

max_seq_length is configurable in both the SDK and UI.

Price prediction reflects these changes, so if no-packing is chosen you can control the cost of the job by adjusting the sequence length.
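
The formula above can be sketched in a few lines to show how the billed token count scales with the sequence length. The function name is illustrative, and `num_examples` stands in for `len(dataset)`. The note gives no per-token training price, so the sketch stops at token counts. It also says nothing about how examples longer than the limit are handled, so check that before lowering the value.

```python
def billed_training_tokens(num_examples: int, max_seq_length: int) -> int:
    """Billed dataset tokens for a no-packing job: len(dataset) * max_seq_length."""
    if num_examples < 0 or max_seq_length <= 0:
        raise ValueError("num_examples must be >= 0 and max_seq_length must be > 0")
    return num_examples * max_seq_length


# The same 10,000-example dataset at two sequence-length settings.
for seq_len in (8192, 2048):
    print(seq_len, billed_training_tokens(10_000, seq_len))
```

Because the count is linear in `max_seq_length`, cutting the limit from 8192 to 2048 cuts the billed tokens by a factor of four, regardless of how long the individual examples actually are.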

Apr 24, 2026
- Date parsed from source:
  Apr 24, 2026
- First seen by Releasebot:
  Apr 16, 2026
- Modified by Releasebot:
  Apr 25, 2026
Together AI

Serverless Model Bring Ups

Together AI adds new models including Cogito, Veo, Vidu, and Wan to expand its AI lineup.
The following models have been added:

deepcogito/cogito-v2-1-671b

google/veo-3.1-test-debug

vidu/vidu-q3

vidu/vidu-q3-turbo

Wan-AI/wan2.7-i2v

Wan-AI/wan2.7-r2v

Apr 22, 2026
- Date parsed from source:
  Apr 22, 2026
- First seen by Releasebot:
  Apr 25, 2026
Together AI

Dynamic Rate Limits & Prepaid Billing

Together AI retires tier labels and moves all users to dynamic rate limits with a fully prepaid billing model.
Build Tiers 1–5, Scale, and Enterprise tier labels have been retired. Dynamic rate limits are now live for all users.

Billing has moved to a fully prepaid model.

Model-specific tier gates have been removed. The platform-wide $5 credit purchase is the only gate.

Apr 22, 2026
- Date parsed from source:
  Apr 22, 2026
- First seen by Releasebot:
  Apr 16, 2026
- Modified by Releasebot:
  Apr 23, 2026
Together AI

Serverless Model Bring Ups

Together AI adds moonshotai/Kimi-K2.6 to its model lineup.
The following models have been added:

moonshotai/Kimi-K2.6

Apr 15, 2026
- Date parsed from source:
  Apr 15, 2026
- First seen by Releasebot:
  Apr 16, 2026
Together AI

Pricing Update

Together AI updates google/gemma-3n-E4B-it pricing for April 15, 2026, raising input and output token rates.

The following model has updated pricing, effective April 15, 2026:

google/gemma-3n-E4B-it: $0.02 → $0.06 (input), $0.04 → $0.12 (output) per 1M tokens
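
As a quick check on what the change means for an existing workload, the sketch below compares old and new costs for the same token volume. The workload figures are made up for illustration, and `workload_cost` is a hypothetical helper; only the four prices come from the note.

```python
# USD per 1M tokens for google/gemma-3n-E4B-it, before and after April 15, 2026.
OLD_PRICES = {"input": 0.02, "output": 0.04}
NEW_PRICES = {"input": 0.06, "output": 0.12}


def workload_cost(prices: dict, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a given token volume at the given per-1M-token prices."""
    return (
        input_tokens * prices["input"] + output_tokens * prices["output"]
    ) / 1_000_000


# Illustrative monthly volume: 50M input tokens and 10M output tokens.
old = workload_cost(OLD_PRICES, 50_000_000, 10_000_000)
new = workload_cost(NEW_PRICES, 50_000_000, 10_000_000)
print(f"old=${old:.2f} new=${new:.2f} ratio={new / old:.1f}x")
```

Both rates rose by the same factor of three, so the total cost triples whatever the mix of input and output tokens.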
Apr 14, 2026
- Date parsed from source:
  Apr 14, 2026
- First seen by Releasebot:
  Apr 15, 2026
- Modified by Releasebot:
  Apr 16, 2026
Together AI

Model Deprecations

Together AI deprecates several models and removes them from availability.
The following models have been deprecated and are no longer available:

Qwen/Qwen3-VL-8B-Instruct

Qwen/Qwen3-235B-A22B-Thinking-2507

mistralai/Mixtral-8x7B-Instruct-v0.1

Apr 11, 2026
- Date parsed from source:
  Apr 11, 2026
- First seen by Releasebot:
  Mar 20, 2026
- Modified by Releasebot:
  Apr 12, 2026
Together AI

Serverless Model Bring Ups

Together AI adds MiniMaxAI/MiniMax-M2.7 to serverless.

The following models have been added:

MiniMaxAI/MiniMax-M2.7

Apr 8, 2026
- Date parsed from source:
  Apr 8, 2026
- First seen by Releasebot:
  Apr 9, 2026
Together AI

Serverless Model Bring Ups

Together AI adds google/gemma-4-31B-it and zai-org/GLM-5.1 to serverless.

The following models have been added:

google/gemma-4-31B-it

zai-org/GLM-5.1

Apr 2, 2026
- Date parsed from source:
  Apr 2, 2026
- First seen by Releasebot:
  Mar 31, 2026
- Modified by Releasebot:
  Apr 16, 2026
Together AI

Model Deprecations

Together AI removes deprecated models from availability, including GLM, Mistral, and Qwen options.
The following models have been deprecated and are no longer available:

zai-org/GLM-4.5-Air-FP8

zai-org/GLM-4.7

mistralai/Mistral-Small-24B-Instruct-2501

Qwen/Qwen3-Next-80B-A3B-Instruct

Mar 31, 2026
- Date parsed from source:
  Mar 31, 2026
- First seen by Releasebot:
  Mar 31, 2026
- Modified by Releasebot:
  Apr 16, 2026
Together AI

Model Deprecation

Together AI deprecates meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 and removes it from availability.

The following model has been deprecated and is no longer available:

meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
Mar 10, 2026
- Date parsed from source:
  Mar 10, 2026
- First seen by Releasebot:
  Mar 20, 2026
Together AI

Cached Input Token Pricing

Together AI adds cached input token pricing for MiniMaxAI/MiniMax-M2.5 at $0.06 per 1M cached input tokens.

Cached input token pricing is now available:

MiniMaxAI/MiniMax-M2.5: $0.06 per 1M cached input tokens (80% off standard input price)
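
The note gives the cached price and the discount but not the standard input price. The sketch below derives it, assuming "80% off" means cached price = standard price × (1 − 0.80). The function name is illustrative, and the result is an inference from the two stated figures, not a published rate.

```python
def implied_standard_price(cached_price: float, discount: float) -> float:
    """Back out the undiscounted price from a discounted price and a fractional discount."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return cached_price / (1 - discount)


# $0.06 per 1M cached input tokens at 80% off.
standard = implied_standard_price(0.06, 0.80)
print(f"implied standard input price: ${standard:.2f} per 1M tokens")
```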

Mar 7, 2026
- Date parsed from source:
  Mar 7, 2026
- First seen by Releasebot:
  Mar 20, 2026
Together AI

Serverless Model Bring Ups

Together AI adds Qwen/Qwen3.5-9B to serverless.

The following models have been added:

Qwen/Qwen3.5-9B
Mar 6, 2026
- Date parsed from source:
  Mar 6, 2026
- First seen by Releasebot:
  Mar 20, 2026
Together AI

Model Deprecations

Together AI deprecates several models and removes them from availability.

The following models have been deprecated and are no longer available:

mixedbread-ai/Mxbai-Rerank-Large-V2

moonshotai/Kimi-K2-Thinking

meta-llama/Llama-3.2-3B-Instruct-Turbo

moonshotai/Kimi-K2-Instruct-0905
