- Apr 28, 2026
- Date parsed from source:Apr 28, 2026
- First seen by Releasebot:Apr 29, 2026
Model Deprecation: MiniMax M2.5
Together AI deprecates MiniMaxAI/MiniMax-M2.5 on serverless and recommends migrating to MiniMaxAI/MiniMax-M2.7.
MiniMaxAI/MiniMax-M2.5 has been deprecated and is no longer available on serverless. We recommend migrating to MiniMaxAI/MiniMax-M2.7.
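One way to handle this migration in client code is to rewrite retired model IDs before a request is sent. The following is a minimal sketch, not Together AI's tooling: the `REPLACEMENTS` table and the `resolve_model` helper are hypothetical names, and only the M2.5 to M2.7 pair comes from the note above.

```python
# Hypothetical lookup table of retired serverless model IDs and their
# recommended replacements. Only this one pair is stated in the release note.
REPLACEMENTS = {
    "MiniMaxAI/MiniMax-M2.5": "MiniMaxAI/MiniMax-M2.7",
}


def resolve_model(model_id: str) -> str:
    """Return the recommended replacement for a retired ID, else the ID unchanged."""
    return REPLACEMENTS.get(model_id, model_id)


if __name__ == "__main__":
    # A retired ID is rewritten; any other ID passes through untouched.
    print(resolve_model("MiniMaxAI/MiniMax-M2.5"))
    print(resolve_model("zai-org/GLM-5.1"))
```

Keeping the mapping in one place means a later deprecation only needs a new table entry rather than changes at every call site.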
- Apr 24, 2026
- Date parsed from source:Apr 24, 2026
- First seen by Releasebot:Apr 25, 2026
Serverless Model Bring Up: DeepSeek-V4-Pro
Together AI adds DeepSeek-V4-Pro to serverless with 512K context, function calling, structured outputs, and FP4 quantization.
deepseek-ai/DeepSeek-V4-Pro has been added to serverless.
- Context length: 512,000
- Pricing: $2.10 input / $4.40 output / $0.20 cached input (per 1M tokens)
- Quantization: FP4
- Function calling and structured outputs supported
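The listed rates can be turned into a rough per-request cost estimate. The sketch below uses the three prices from the note; the `estimate_cost` helper is a hypothetical name, and it assumes that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the standard input rate, which the note does not spell out.

```python
from decimal import Decimal

# DeepSeek-V4-Pro rates in USD per 1M tokens, as listed in the release note.
INPUT_RATE = Decimal("2.10")
OUTPUT_RATE = Decimal("4.40")
CACHED_INPUT_RATE = Decimal("0.20")
PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the cost of one request in USD.

    Assumption: cached_tokens is the part of input_tokens served from cache,
    billed at the cached rate instead of the standard input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / PER_MILLION
```

Under these assumptions, a request with 100,000 input tokens (40,000 of them cached) and 2,000 output tokens comes to $0.1428.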
- Apr 24, 2026
- Date parsed from source:Apr 24, 2026
- First seen by Releasebot:Apr 25, 2026
Pricing Update: No-Packing Fine-Tuning Jobs
Together AI updates no-packing fine-tuning pricing and lets users control costs with configurable max sequence length.
We rolled out a pricing update for no-packing fine-tuning jobs. When the no-packing option is chosen, the number of training dataset tokens is now calculated as len(dataset) * max_seq_length to account for the compute used by packing-free jobs.
- max_seq_length is configurable in both the SDK and UI.
- Price prediction reflects these changes, so if no-packing is chosen you can control the cost of the job by adjusting the sequence length.
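The billing rule above can be sketched as a one-line calculation. The function name `billed_tokens_no_packing` is hypothetical; the formula itself, len(dataset) * max_seq_length, is taken from the note. No per-token price is included because the note does not state one.

```python
def billed_tokens_no_packing(num_examples: int, max_seq_length: int) -> int:
    """Training-token count for a no-packing job: len(dataset) * max_seq_length."""
    if num_examples < 0 or max_seq_length <= 0:
        raise ValueError("num_examples must be >= 0 and max_seq_length must be > 0")
    return num_examples * max_seq_length


if __name__ == "__main__":
    # Illustrative numbers only: the same 10,000-example dataset at two
    # different sequence-length settings.
    for max_seq_length in (8192, 2048):
        tokens = billed_tokens_no_packing(10_000, max_seq_length)
        print(f"max_seq_length={max_seq_length}: {tokens:,} billed tokens")
```

Because the count scales linearly with max_seq_length, dropping the setting from 8192 to 2048 cuts the billed tokens for this illustrative dataset from 81,920,000 to 20,480,000.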
- Apr 24, 2026
- Date parsed from source:Apr 24, 2026
- First seen by Releasebot:Apr 16, 2026
- Modified by Releasebot:Apr 25, 2026
Serverless Model Bring Ups
Together AI adds new models including Cogito, Veo, Vidu, and Wan to expand its AI lineup.
The following models have been added:
- deepcogito/cogito-v2-1-671b
- google/veo-3.1-test-debug
- vidu/vidu-q3
- vidu/vidu-q3-turbo
- Wan-AI/wan2.7-i2v
- Wan-AI/wan2.7-r2v
- Apr 22, 2026
- Date parsed from source:Apr 22, 2026
- First seen by Releasebot:Apr 25, 2026
Dynamic Rate Limits & Prepaid Billing
Together AI retires tier labels and moves all users to dynamic rate limits with a fully prepaid billing model.
Build Tiers 1–5, Scale, and Enterprise tier labels have been retired. Dynamic rate limits are now live for all users.
Billing has moved to a fully prepaid model.
Model-specific tier gates have been removed. The platform-wide $5 credit purchase is the only gate.
- Apr 22, 2026
- Date parsed from source:Apr 22, 2026
- First seen by Releasebot:Apr 16, 2026
- Modified by Releasebot:Apr 23, 2026
Serverless Model Bring Ups
Together AI adds moonshotai/Kimi-K2.6 to its model lineup.
- Apr 15, 2026
- Date parsed from source:Apr 15, 2026
- First seen by Releasebot:Apr 16, 2026
Pricing Update
Together AI updates google/gemma-3n-E4B-it pricing for April 15, 2026, raising input and output token rates.
The following model has updated pricing, effective April 15, 2026:
- google/gemma-3n-E4B-it: $0.02 → $0.06 (input), $0.04 → $0.12 (output) per 1M tokens
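For re-budgeting, the size of the change can be checked with a short calculation. This is only an arithmetic sketch over the four prices quoted above; the `price_multiplier` helper is a hypothetical name.

```python
from fractions import Fraction

# google/gemma-3n-E4B-it rates in USD per 1M tokens, from the release note.
OLD_RATES = {"input": Fraction("0.02"), "output": Fraction("0.04")}
NEW_RATES = {"input": Fraction("0.06"), "output": Fraction("0.12")}


def price_multiplier(kind: str) -> Fraction:
    """Return new rate divided by old rate for 'input' or 'output' tokens."""
    return NEW_RATES[kind] / OLD_RATES[kind]
```

Both rates rise by the same factor of 3, so a workload with an unchanged token mix would cost three times as much under the new pricing.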
- Apr 14, 2026
- Date parsed from source:Apr 14, 2026
- First seen by Releasebot:Apr 15, 2026
- Modified by Releasebot:Apr 16, 2026
Model Deprecations
Together AI deprecates several models and removes them from availability.
The following models have been deprecated and are no longer available:
- Qwen/Qwen3-VL-8B-Instruct
- Qwen/Qwen3-235B-A22B-Thinking-2507
- mistralai/Mixtral-8x7B-Instruct-v0.1
- Apr 11, 2026
- Date parsed from source:Apr 11, 2026
- First seen by Releasebot:Mar 20, 2026
- Modified by Releasebot:Apr 12, 2026
Apr 11
Together AI adds MiniMaxAI/MiniMax-M2.7 to serverless.
Serverless Model Bring Ups
The following models have been added:
- MiniMaxAI/MiniMax-M2.7
- Apr 8, 2026
- Date parsed from source:Apr 8, 2026
- First seen by Releasebot:Apr 9, 2026
Apr 8
Together AI adds google/gemma-4-31B-it and zai-org/GLM-5.1 to serverless.
Serverless Model Bring Ups
The following models have been added:
- google/gemma-4-31B-it
- zai-org/GLM-5.1
- Apr 2, 2026
- Date parsed from source:Apr 2, 2026
- First seen by Releasebot:Mar 31, 2026
- Modified by Releasebot:Apr 16, 2026
Model Deprecations
Together AI removes deprecated models from availability, including GLM, Mistral, and Qwen options.
The following models have been deprecated and are no longer available:
- zai-org/GLM-4.5-Air-FP8
- zai-org/GLM-4.7
- mistralai/Mistral-Small-24B-Instruct-2501
- Qwen/Qwen3-Next-80B-A3B-Instruct
- Mar 31, 2026
- Date parsed from source:Mar 31, 2026
- First seen by Releasebot:Mar 31, 2026
- Modified by Releasebot:Apr 16, 2026
Model Deprecation
Together AI deprecates meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 and removes it from availability.
The following model has been deprecated and is no longer available:
- meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
- Mar 10, 2026
- Date parsed from source:Mar 10, 2026
- First seen by Releasebot:Mar 20, 2026
Mar 10
Together AI adds cached input token pricing for MiniMaxAI/MiniMax-M2.5 at $0.06 per 1M tokens.
Cached Input Token Pricing
Cached input token pricing is now available:
- MiniMaxAI/MiniMax-M2.5: $0.06 per 1M cached input tokens (80% off standard input price)
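The two figures in this entry imply a standard input price, which the short check below derives. The `implied_standard_rate` helper is a hypothetical name, and the result is an inference from the quoted discount, not a price stated in the note.

```python
from decimal import Decimal

CACHED_RATE = Decimal("0.06")  # USD per 1M cached input tokens (from the note)
DISCOUNT = Decimal("0.80")     # "80% off standard input price" (from the note)


def implied_standard_rate(cached_rate: Decimal, discount: Decimal) -> Decimal:
    """Back out the standard rate from a discounted rate: cached = standard * (1 - discount)."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return cached_rate / (1 - discount)
```

With the quoted values this gives an implied standard input price of $0.30 per 1M tokens.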
- Mar 7, 2026
- Date parsed from source:Mar 7, 2026
- First seen by Releasebot:Mar 20, 2026
Mar 7
Together AI adds Qwen/Qwen3.5-9B to serverless.
- Mar 6, 2026
- Date parsed from source:Mar 6, 2026
- First seen by Releasebot:Mar 20, 2026
Mar 6
Together AI deprecates several models and removes them from availability.
Model Deprecations
The following models have been deprecated and are no longer available:
- mixedbread-ai/Mxbai-Rerank-Large-V2
- moonshotai/Kimi-K2-Thinking
- meta-llama/Llama-3.2-3B-Instruct-Turbo
- moonshotai/Kimi-K2-Instruct-0905