Together AI Release Notes

59 release notes curated from 1 source by the Releasebot Team. Last updated: Jun 18, 2026

  • Jun 17, 2026
    • Date parsed from source:
      Jun 17, 2026
    • First seen by Releasebot:
      Jun 18, 2026
    Together AI

    June 17, 2026

    Together AI adds new serverless models, including zai-org/GLM-5.2 with long context, FP4 quantization, and function calling.

    New serverless models

    The following models are now available on serverless:

    • zai-org/GLM-5.2: 262K context length, FP4 quantization. Pricing: $1.40 input / $4.40 output / $0.26 cached input (per 1M tokens). Supports function calling and structured outputs.
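    As a rough illustration of how the three per-token rates combine, the sketch below estimates the cost of a single request. The helper name `estimate_cost` is hypothetical, and the billing rule it encodes (cached prompt tokens are charged at the cached rate instead of the standard input rate) is an assumption, not something the note states.

```python
# Rates for zai-org/GLM-5.2 from the note above, in USD per 1M tokens.
GLM_5_2_RATES = {"input": 1.40, "output": 4.40, "cached_input": 0.26}


def estimate_cost(rates, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens is the part of input_tokens served from the
    prompt cache, billed at the cached rate instead of the standard rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * rates["input"]
        + cached_tokens * rates["cached_input"]
        + output_tokens * rates["output"]
    )
    return total / 1_000_000


# 200K prompt tokens (150K of them cached) and 4K generated tokens.
cost = estimate_cost(GLM_5_2_RATES, 200_000, 4_000, cached_tokens=150_000)
print(f"${cost:.4f}")
```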
  • Jun 16, 2026
    • Date parsed from source:
      Jun 16, 2026
    • First seen by Releasebot:
      Jun 17, 2026
    Together AI

    June 16, 2026

    The Together AI Python SDK now raises an error on duplicate file uploads and reports the existing file's ID so you can reuse it.

    Python SDK: duplicate file uploads now raise an error

    client.files.upload() in the Python SDK now raises a ValueError when the file’s contents already exist on Together AI. The error message includes the ID of the existing file so you can reuse it without re-uploading.

    To replace the file, delete the existing one first with client.files.delete(<file-id>) and retry the upload.
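    One way to take advantage of this is to catch the error and fall back to the existing file. The sketch below is a minimal illustration: `upload_or_reuse` is a hypothetical helper, the `file=` keyword and the `.id` attribute on the upload result are assumptions about the SDK, and the `file-...` ID pattern is a guess, since the note only says the ID appears in the message, not how it is formatted.

```python
import re

# Assumption: file IDs look like "file-<letters, digits, dashes>".
FILE_ID_PATTERN = re.compile(r"file-[A-Za-z0-9-]+")


def upload_or_reuse(client, path):
    """Upload a file, or return the existing file's ID if it is a duplicate.

    `client` is expected to expose `files.upload(...)` as described in the
    note; the keyword argument and return shape are assumed here.
    """
    try:
        uploaded = client.files.upload(file=path)
        return uploaded.id
    except ValueError as err:
        match = FILE_ID_PATTERN.search(str(err))
        if match is None:
            raise  # A different ValueError; do not swallow it.
        return match.group(0)
```

    Parsing an error message is brittle, so if the SDK exposes the existing ID in a structured way, prefer that.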

  • Jun 13, 2026
    • Date parsed from source:
      Jun 13, 2026
    • First seen by Releasebot:
      Jun 16, 2026
    Together AI

    June 13, 2026

    Together AI adds moonshotai/Kimi-K2.7-Code to serverless, with 262K context, FP4 quantization, function calling, and structured outputs.

    New serverless models

    The following models are now available on serverless:

    • moonshotai/Kimi-K2.7-Code: 262,144 context length, FP4 quantization. Pricing: $0.95 input / $4.00 output / $0.19 cached input (per 1M tokens). Supports function calling and structured outputs.
  • Jun 12, 2026
    • Date parsed from source:
      Jun 12, 2026
    • First seen by Releasebot:
      Jun 16, 2026
    Together AI

    June 12, 2026

    Together AI adds MiniMaxAI/MiniMax-M3 to serverless, with 524K context and FP4 quantization.

    New serverless models

    The following models are now available on serverless:

    • MiniMaxAI/MiniMax-M3: 524,288 context length, FP4 quantization. Pricing: $0.30 input / $1.20 output / $0.06 cached input (per 1M tokens).
  • Jun 11, 2026
    • Date parsed from source:
      Jun 11, 2026
    • First seen by Releasebot:
      Jun 12, 2026
    Together AI

    June 11, 2026

    Together AI deprecates mistralai/Voxtral-Mini-3B-2507 on serverless and points users to dedicated endpoints.

    Deprecations

    The following model has been deprecated and is no longer available on serverless:

    mistralai/Voxtral-Mini-3B-2507. It is available as an on-demand dedicated endpoint.

    See Deprecations for migration options.

  • Jun 9, 2026
    • Date parsed from source:
      Jun 9, 2026
    • First seen by Releasebot:
      Jun 10, 2026
    • Modified by Releasebot:
      Jun 16, 2026
    Together AI

    June 9, 2026

    Together AI updates serverless model pricing with new cached input rates and lower DeepSeek-V4-Pro input and output prices.

    Pricing update

    The following changes are effective June 9, 2026:

    New cached input pricing (per 1M tokens):

    • zai-org/GLM-5.1: $0.26 cached input (81% discount from $1.40 standard input).
    • Qwen/Qwen3.5-397B-A17B: $0.35 cached input (42% discount from $0.60 standard input).
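    The discount percentages above follow directly from the two rates. A small check, using a hypothetical helper name, shows the arithmetic:

```python
def cached_discount_percent(standard_input, cached_input):
    """Return the cached-input discount as a percentage of the standard rate."""
    return (1 - cached_input / standard_input) * 100


# (model, standard input rate, cached input rate), USD per 1M tokens.
RATES = [
    ("zai-org/GLM-5.1", 1.40, 0.26),
    ("Qwen/Qwen3.5-397B-A17B", 0.60, 0.35),
]

for model, standard, cached in RATES:
    print(f"{model}: {cached_discount_percent(standard, cached):.0f}% discount")
```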

    Price decrease for deepseek-ai/DeepSeek-V4-Pro (per 1M tokens):

    • Input: $2.10
    • Output: $4.40
    • Cached input: $0.20 (unchanged).

    See Serverless models for the full pricing catalog.

  • Jun 8, 2026
    • Date parsed from source:
      Jun 8, 2026
    • First seen by Releasebot:
      Jun 10, 2026
    Together AI

    June 8, 2026

    Together AI adds server-side validation for fine-tuning datasets, giving uploaded files full schema checks during ingestion and clearer file status, validation reports, and user-facing errors to catch dataset issues before training starts.

    Improvements

    Server-side validation for fine-tuning datasets

    Files uploaded for fine-tuning now go through full server-side schema validation during ingestion, with the result exposed on the file object. Poll the Files API and read processing_status (COMPLETED, INVALID_FORMAT, or FAILED) plus validation_report to detect dataset issues, such as missing role fields or malformed conversation turns, programmatically before launching a job.

    Errors include a user-facing reason, so you can fix the dataset and re-upload without trial-and-error training runs. For example:

    Line 7: messages[1] must contain a role field
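    The polling step might look like the sketch below. It is deliberately SDK-agnostic: `fetch_file` is a hypothetical callable you supply that returns the file object as a dict, and any status other than the three named above is assumed to mean "still processing". The shape of validation_report is not specified in the note, so it is returned as-is.

```python
import time

# Terminal statuses named in the release note.
TERMINAL_STATUSES = {"COMPLETED", "INVALID_FORMAT", "FAILED"}


def wait_for_validation(fetch_file, file_id, poll_seconds=5.0, timeout_seconds=600.0):
    """Poll until the file reaches a terminal processing_status.

    fetch_file(file_id) must return a dict-like file object (assumption).
    Returns (status, validation_report); raises TimeoutError otherwise.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        file_obj = fetch_file(file_id)
        status = file_obj.get("processing_status")
        if status in TERMINAL_STATUSES:
            return status, file_obj.get("validation_report")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"file {file_id} still processing after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

    A caller would typically launch the fine-tuning job only when the returned status is COMPLETED, and otherwise surface the report to whoever owns the dataset.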
    
  • Jun 1, 2026
    • Date parsed from source:
      Jun 1, 2026
    • First seen by Releasebot:
      Jun 1, 2026
    • Modified by Releasebot:
      Jun 2, 2026
    Together AI

    June 1, 2026

    Together AI adds a Fine-tuning job metrics API for programmatic progress tracking, brings Slurm startup scripts to GPU Clusters, and improves Evaluations with a single-pass compare mode. It also updates billing documentation across payment methods, invoices, ACH, auto-recharge, and prepaid access.

    Fine-tuning job metrics API

    A new API endpoint, GET /fine-tunes/{id}/metrics, returns training metrics for a fine-tuning job (e.g. loss curves and other per-step values) so you can monitor progress programmatically without opening the dashboard. See the API reference and Fine tuning training metrics for details.
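    A minimal sketch of calling this endpoint with only the standard library follows. The base URL and the Bearer-token header are assumptions about the API's general conventions, the function names are hypothetical, and the response is returned as parsed JSON without assuming its fields, since the note does not describe the schema.

```python
import json
import urllib.parse
import urllib.request

# Assumed API base URL; check the API reference before relying on it.
API_BASE = "https://api.together.xyz/v1"


def build_metrics_request(job_id, api_key):
    """Build the GET /fine-tunes/{id}/metrics request without sending it."""
    quoted_id = urllib.parse.quote(job_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/fine-tunes/{quoted_id}/metrics",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_metrics(job_id, api_key, timeout=30):
    """Send the request and return the decoded JSON body."""
    request = build_metrics_request(job_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```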

    Slurm startup scripts for GPU Clusters

    GPU clusters now support Slurm startup scripts (lifecycle hook scripts that run at node startup, job allocation, and job completion). Use them to install packages at boot, configure SSH sessions, or run per-job prolog and epilog actions across worker, login, and controller nodes. See Slurm startup scripts for details.

    Evaluations: Single-pass compare mode

    The compare evaluator now accepts a disable_position_bias_correction parameter. By default, the judge runs each comparison twice (A→B then B→A) and reconciles verdicts to cancel position bias. Setting disable_position_bias_correction to true runs a single pass, cutting judge cost and latency in half. See AI evaluations for details.
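    To make the trade-off concrete, here is a toy model of the two modes. It is not the Evaluations implementation: the `judge` callable, its "first"/"second"/"tie" return values, and the reconciliation rule (disagreement between the two passes becomes a tie) are all assumptions made for illustration.

```python
def compare(judge, response_a, response_b, disable_position_bias_correction=False):
    """Return "A", "B", or "tie" for a pairwise comparison.

    judge(x, y) is a hypothetical callable that returns "first", "second",
    or "tie" depending on which positional argument it prefers.
    """
    # Pass 1: A is shown first.
    verdict_ab = {"first": "A", "second": "B", "tie": "tie"}[judge(response_a, response_b)]
    if disable_position_bias_correction:
        return verdict_ab  # Single pass: one judge call.

    # Pass 2: B is shown first, so the position labels map the other way.
    verdict_ba = {"first": "B", "second": "A", "tie": "tie"}[judge(response_b, response_a)]

    # Assumed reconciliation: keep the verdict only if both orders agree.
    return verdict_ab if verdict_ab == verdict_ba else "tie"
```

    With a judge that always favors whichever response it sees first, the two-pass mode yields a tie while the single-pass mode reports "A", which is the kind of bias the default mode is meant to cancel at the price of a second judge call.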

    Billing documentation updates

    Updated billing docs for multiple payment methods, separate invoice addresses, ACH payment behavior, auto-recharge limits with bank transfers, and prepaid-only access (no negative balance limits). See Payment methods & invoices, Credits, and Billing troubleshooting.

  • May 29, 2026
    • Date parsed from source:
      May 29, 2026
    • First seen by Releasebot:
      May 30, 2026
    Together AI

    May 29, 2026

    Together AI updates serverless model pricing for Qwen and Meta Llama models, effective May 29, 2026.

    Pricing update

    The following models have updated pricing, effective May 29, 2026. All usage from that date forward will be billed at the new rates (per 1M tokens):

    Qwen/Qwen3.5-9B: $0.10 10.17 (input), $0.15 10.25 (output).

    meta-llama/Meta-Llama-3-8B-Instruct-Lite: $0.10 10.14 (input), $0.10 10.14 (output).

    meta-llama/Llama-3.3-70B-Instruct-Turbo: $0.88 1.04 (input), $0.88 1.04 (output).

    See Serverless models for the full pricing catalog.

  • May 25, 2026
    • Date parsed from source:
      May 25, 2026
    • First seen by Releasebot:
      May 30, 2026
    Together AI

    May 25, 2026

    Together AI adds new serverless image and video models, expands dedicated endpoint support with Gemma, Llama, Qwen and other variants, and now offers a Seedance 2.0 quickstart for multimodal audio-video generation workflows.

    New serverless models

    The following image and video models are now available on serverless:

    Image

    • ByteDance/Seedream-5.0-lite

    Video

    • alibaba/happyhorse-1.0-i2v (image-to-video)
    • alibaba/happyhorse-1.0-r2v (reference-to-video)
    • google/veo-3.1
    • google/veo-3.1-lite

    New dedicated endpoint models

    The following models are now available for deployment on dedicated endpoints:

    • Gemma 3 (1B, 27B, 27B LoRA)
    • Gemma 4 31B LoRA
    • MedGemma 27B
    • Molmo 7B
    • Llama 3.2 3B Instruct
    • Llama 4 Scout 17B FP8 LoRA
    • Qwen 2/2.5/3 variants (14B, 32B, 235B A22B Instruct 2507 FP8, Qwen2-72B)
    • Arcee Trinity Mini
    • BGE Base EN v1.5
    • MiniMax Speech 2.8 Turbo
    • Rime Mist v3 (text and omni)

    Seedance 2.0 quickstart

    A quickstart is now available for Seedance 2.0, ByteDance’s unified multimodal audio-video generation model. The guide covers text-to-video, image-to-video, video extension, and instruction-based editing.

  • May 27, 2026
    • Date parsed from source:
      May 27, 2026
    • First seen by Releasebot:
      May 27, 2026
    Together AI

    May 27, 2026

    Together AI deprecates black-forest-labs/FLUX.1-krea-dev on serverless and points users to migration options.

    Deprecations

    The following model has been deprecated and is no longer available on serverless:

    black-forest-labs/FLUX.1-krea-dev.

    See Deprecations for migration options.

  • May 22, 2026
    • Date parsed from source:
      May 22, 2026
    • First seen by Releasebot:
      May 22, 2026
    Together AI logo

    Together AI

    May 22, 2026

    Together AI adds external OIDC authentication and RBAC for GPU clusters, letting team members access Kubernetes APIs with their organization’s SSO. It replaces shared kubeconfig credentials with per-user tokens, audit trails, and easier revocation, with support for Kubernetes clusters only.

    GPU Clusters: External OIDC authentication and RBAC

    GPU clusters now support external OpenID Connect (OIDC) authentication, allowing each team member to access the cluster’s Kubernetes API using their organization’s identity provider — Google, Okta, Auth0, Microsoft Entra ID, and others.

    With OIDC enabled, access is managed through standard Kubernetes RBAC: admins bind permissions to individual user identities, and each user authenticates via their browser using SSO. This replaces shared kubeconfig credentials with per-user tokens, per-user audit trails, and clean revocation. Currently this feature is only supported for Kubernetes clusters.

    OIDC must be configured at cluster creation time. See Set up OIDC authentication for the full setup guide.
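    Binding permissions to an individual identity uses ordinary Kubernetes RBAC objects. As a sketch, the snippet below builds a RoleBinding that grants one user the built-in `view` ClusterRole in a single namespace. The user name, namespace, and binding name are placeholders, and the exact user string depends on how the cluster's OIDC username claim and prefix are configured, which the note does not cover.

```python
import json

RBAC_API_GROUP = "rbac.authorization.k8s.io"


def view_role_binding(binding_name, user, namespace):
    """Build a RoleBinding manifest granting read-only access to one user."""
    return {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": binding_name, "namespace": namespace},
        # The subject name must match the identity the API server derives
        # from the OIDC token (placeholder value in the example below).
        "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_API_GROUP}],
        # "view" is a default ClusterRole shipped with Kubernetes.
        "roleRef": {"kind": "ClusterRole", "name": "view", "apiGroup": RBAC_API_GROUP},
    }


manifest = view_role_binding("alice-view", "alice@example.com", "training")
print(json.dumps(manifest, indent=2))
```

    Since kubectl accepts JSON manifests, the printed output can be piped to `kubectl apply -f -` by a cluster admin.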

  • May 22, 2026
    • Date parsed from source:
      May 22, 2026
    • First seen by Releasebot:
      May 22, 2026
    Together AI

    May 22, 2026

    Together AI adds Qwen/Qwen3.7-Max to serverless, priced at $2.50 input / $7.50 output per 1M tokens.

    New serverless models

    The following model has been added to serverless:

    Qwen/Qwen3.7-Max. Pricing: $2.50 input / $7.50 output (per 1M tokens).

    See Serverless models.

  • Jun 4, 2026
    • Date parsed from source:
      Jun 4, 2026
    • First seen by Releasebot:
      May 22, 2026
    • Modified by Releasebot:
      Jun 16, 2026
    Together AI

    June 4, 2026

    Together AI deprecates Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8 on serverless and points users to MiniMaxAI/MiniMax-M2.7.

    Model deprecations

    The following model has been deprecated and is no longer available on serverless:

    Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8. Recommended replacement: MiniMaxAI/MiniMax-M2.7, available as an on-demand dedicated endpoint.

    See Deprecations for migration options.

  • May 21, 2026
    • Date parsed from source:
      May 21, 2026
    • First seen by Releasebot:
      May 22, 2026
    Together AI

    May 21, 2026

    Together AI deprecates moonshotai/Kimi-K2.5 on serverless and points users to migration options.

    Deprecations

    The following model has been deprecated and is no longer available on serverless:

    moonshotai/Kimi-K2.5.

    See Deprecations for migration options.
