Fireworks AI Release Notes
Last updated: Mar 20, 2026
2026-02-05
Fireworks AI adds multimodal video and audio inputs, secure AWS S3 dataset storage for training, and just-in-time SSO user provisioning for Enterprise. It also expands docs, autoscaling guidance, and Responses API and model management support, plus bug fixes and minor improvements.
Video & Audio Input Models
You can now query multimodal models with video and audio inputs for video captioning, scene analysis, and multimodal question answering. Deploy models like Qwen3 Omni and Molmo2 to process video and audio content directly using the Chat Completions API.
See the Video & Audio Inputs guide for deployment instructions and code examples.
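Since the inputs go through the standard Chat Completions API, a request is an ordinary messages payload with a video content part. A hedged sketch follows; the model id and the `video_url` content-part shape are assumptions, so check the Video & Audio Inputs guide for the exact format:

```python
# Hypothetical sketch of an OpenAI-compatible Chat Completions payload with a
# video input. The model id and "video_url" part shape are placeholders taken
# from the common multimodal convention, not confirmed Fireworks specifics.
import json

def build_video_request(model: str, video_url: str, question: str) -> dict:
    """Build a multimodal chat request with one video part and one text part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "video_url", "video_url": {"url": video_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
    }

payload = build_video_request(
    "accounts/fireworks/models/qwen3-omni",  # placeholder model id
    "https://example.com/clip.mp4",
    "Describe the main scene in this video.",
)
print(json.dumps(payload, indent=2))
```

The same two-part structure applies to audio inputs, with the audio part type substituted for the video one.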
AWS S3 Integration for Training Datasets
Training datasets can now be stored in your own AWS S3 buckets using GCP-to-AWS OIDC federation. This Bring Your Own Bucket (BYOB) approach keeps your data private while enabling secure access during Supervised Fine-Tuning and Reinforcement Fine-Tuning jobs—no long-lived credentials required.
See the Secure Training (BYOB) documentation for IAM role setup and usage examples.
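On the AWS side, GCP-to-AWS OIDC federation generally comes down to an IAM role whose trust policy accepts Google-issued identity tokens via `sts:AssumeRoleWithWebIdentity`. A hedged sketch of such a trust policy follows; the audience value is a placeholder, and the exact principal and conditions for Fireworks' identity are in the Secure Training (BYOB) docs:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Federated": "accounts.google.com" },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "accounts.google.com:aud": "<fireworks-service-account-id>"
        }
      }
    }
  ]
}
```

The role then needs a permissions policy granting read access to the specific S3 bucket holding your datasets.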
Just-In-Time (JIT) User Provisioning for SSO (Enterprise)
JIT user provisioning automatically creates user accounts when users sign in through SSO for the first time. Enable this when configuring your identity provider to eliminate manual user creation.
See the SSO documentation for setup instructions.
📚 Documentation Updates
- Video & Audio Inputs: New guide for processing video and audio with Qwen3 Omni and Molmo2 models (Video & Audio Inputs)
- AWS S3 Bucket Integration: BYOB dataset storage for training via OIDC federation (Secure Training)
- Rate Limits Clarification: Expanded documentation on soft limits, dynamic growth mechanics, and monitoring (Rate Limits)
- Anthropic-Compatible Thinking Parameter: Control reasoning with thinking parameter alongside reasoning_effort (Reasoning)
- Scaling from Zero Behavior: Deployments scaled to zero return 503 immediately with retry guidance (Autoscaling)
- Custom Model Configuration: Generation defaults via fireworks.json for LoRA adapters (Uploading Custom Models)
- Service Account Roles: Role assignment and management for service accounts (Service Accounts)
- Free Tuning: Documentation highlighting free fine-tuning options across fine-tuning guides
- Serverless Model Retrieval: Python SDK and curl examples for programmatically listing serverless models (FAQ)
- Function Tools in Responses API: Client-executed function tools following OpenAI-compatible format (Responses API)
- Autoscaling Metric: New prompt_tokens_per_second load target for prefill-heavy workloads (Autoscaling)
- firectl CLI Syntax: All documentation updated to firectl syntax
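On the scaling-from-zero behavior above: since a scaled-to-zero deployment returns 503 immediately while a replica spins up, clients should back off and retry rather than fail. A minimal sketch, with the transport stubbed out as a stand-in for your actual HTTP call:

```python
# Sketch: retrying a request to a deployment that may be scaled to zero.
# Such deployments return 503 immediately during cold start, so the client
# backs off exponentially and retries. send_request is any callable that
# returns a (status_code, body) pair.
import time

def call_with_retry(send_request, max_attempts: int = 5, base_delay: float = 1.0):
    """Retry on 503 with exponential backoff; return the first non-503 response."""
    for attempt in range(max_attempts):
        status, body = send_request()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return status, body

# Fake transport that returns 503 twice (cold start), then 200.
responses = iter([(503, ""), (503, ""), (200, "ok")])
status, body = call_with_retry(lambda: next(responses), base_delay=0.0)
print(status, body)  # 200 ok
```

The delay schedule is illustrative; tune it to your deployment's observed cold-start time.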
Bug Fixes & Minor Improvements
2026-01-20
Fireworks AI adds warm-start training for reinforcement fine-tuning and Azure federated identity support for model uploads, along with documentation updates and minor bug fixes.
Warm-Start Training and Azure Model Uploads
Warm-Start Training for Reinforcement Fine-Tuning
You can now warm-start Reinforcement Fine-Tuning jobs from previously supervised fine-tuned checkpoints using the --warm-start-from flag. This enables a streamlined SFT-to-RFT workflow where you first train a model with supervised fine-tuning, then continue training with reinforcement learning.
See the Warm-Start Training guide for details.
Azure Federated Identity for Model Uploads
Model uploads from Azure Blob Storage now support Azure AD federated identity authentication as an alternative to SAS tokens. This eliminates the need for credential rotation and enables secure, credential-less authentication.
See the Uploading Custom Models documentation for setup instructions.
📚 Documentation Updates
- Warm-Start Training: New guide for SFT-to-RFT workflows (Warm-Start Training)
- Azure Federated Identity: Setup instructions for Azure AD authentication (Uploading Custom Models)
- Preserved Thinking: Multi-turn reasoning with preserved thinking context (Reasoning)
- GLM 4.7: Added to the list of models supporting the reasoning_effort parameter
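On the preserved-thinking item above: the idea is to carry the model's prior reasoning back into the next turn so multi-turn reasoning keeps its context. A hedged sketch, where the `reasoning_content` field name follows the Reasoning docs but the exact multi-turn mechanics may differ:

```python
# Hedged sketch of preserved thinking across turns: the assistant message is
# replayed with its reasoning_content attached, so the next turn can build on
# the earlier chain of thought. The model id is a placeholder.
history = [
    {"role": "user", "content": "Plan a 3-step migration."},
    {
        "role": "assistant",
        "content": "Step 1: snapshot. Step 2: migrate. Step 3: verify.",
        "reasoning_content": "The user needs a safe ordering...",  # preserved
    },
    {"role": "user", "content": "Expand step 2."},
]

next_request = {
    "model": "accounts/fireworks/models/glm-4p7",  # placeholder model id
    "messages": history,
}
print(len(next_request["messages"]))  # 3
```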
Bug Fixes & Minor Improvements
2025-12-22
Fireworks AI adds Playground category tabs, new Contributor and Inference roles, and major fine-tuning upgrades like stop and resume, job cloning, and log and dataset downloads. It also expands the Model Library with Gemma 3 and Qwen3 Omni models.
Playground Categories, New User Roles, Fine-Tuning Improvements, and New Models
Playground Categories
The Playground now features category tabs (LLM, Image, TTS, STT) in the header for easier switching between model types. The Playground automatically detects the appropriate category based on the selected model and provides smart defaults for each category.
User Roles: Contributor and Inference
New user roles provide more granular access control for team collaboration:
- Contributor: Read and write access to resources without administrative privileges
- Inference: Read-only access with the ability to run inference on deployments
Assign these roles when inviting team members to provide appropriate access levels.
Fine-Tuning Improvements
Fine-tuning workflows have been enhanced with several new capabilities:
- Stop and Resume Jobs: Stop running fine-tuning jobs and resume them later from where they left off. Available for Supervised Fine-Tuning and Reinforcement Fine-Tuning jobs.
- Clone Jobs: Quickly create new fine-tuning jobs based on existing job configurations using the Clone action.
- Download Output Datasets: Download output datasets from Reinforcement Fine-Tuning jobs, including individual files or bulk download as a ZIP archive.
- Download Rollout Logs: Download rollout logs from Reinforcement Fine-Tuning jobs for offline analysis.
✨ New Models
- Gemma 3 12B Instruct is now available in the Model Library
- Gemma 3 4B Instruct is now available in the Model Library
- Qwen3 Omni 30B A3B Instruct is now available in the Model Library
📚 Documentation Updates
- Deployment Shapes API: Added List Deployment Shape Versions and Get Deployment Shape endpoints for querying available deployment shapes
- Evaluator APIs: Added Create Evaluator, Update Evaluator, and helper endpoints for evaluator source code, build logs, and upload validation
- Fine-Tuning APIs: Added Resume DPO Job, Resume Reinforcement Fine-Tuning Step, Execute Reinforcement Fine-Tuning Step, and Get Evaluation Job Log Endpoint
- SDK Examples: Added Python SDK example links for direct routing and supervised fine-tuning workflows
Bug Fixes & Minor Improvements
2025-12-15
Fireworks AI releases a major update with a new Reasoning guide, expanded prompt caching guidance, and new model additions in the Model Library. It also refreshes compatibility, eval API, and firectl CLI docs, plus bug fixes and minor improvements.
Reasoning Guide, Prompt Caching Updates, New Models and CLI Updates
Reasoning Guide
A new Reasoning guide is now available in the documentation. This comprehensive guide covers:
- Accessing reasoning_content from thinking/reasoning models
- Controlling reasoning effort with the reasoning_effort parameter
- Streaming with reasoning content
- Interleaved thinking for multi-step tool-calling workflows
The guide provides code examples using the Fireworks Python SDK and explains how to work with models that support extended reasoning capabilities.
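In practice, reading reasoning output means pulling `reasoning_content` off the assistant message alongside the final answer. A minimal sketch against a sample response, assuming the standard OpenAI-compatible Chat Completions response shape; the `reasoning_content` field name comes from the Reasoning guide:

```python
# Sketch: extracting reasoning output from an OpenAI-compatible response.
# sample_response stands in for what the API would return from a
# thinking/reasoning model.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "reasoning_content": "First compare the two options...",
                "content": "Option B is the better choice.",
            }
        }
    ]
}

message = sample_response["choices"][0]["message"]
thinking = message.get("reasoning_content", "")  # model's chain of thought
answer = message["content"]                      # final user-facing answer
print(answer)
```

Using `.get()` for `reasoning_content` keeps the same code working for models that do not emit reasoning.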
Prompt Caching Updates
Prompt caching documentation has been updated with expanded guidance:
- Cached prompt tokens on serverless now cost 50% less than uncached tokens
- Session affinity routing via the user field or x-session-affinity header for improved cache hit rates
- Prompt optimization techniques for maximizing cache efficiency
See the Prompt Caching guide for details.
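The session-affinity routing above can be requested in two ways, per the Prompt Caching guide: the `user` field in the request body, or the `x-session-affinity` header. A sketch with illustrative values (the model id is a placeholder):

```python
# Sketch: two equivalent ways to pin requests from one session to the same
# replica for better prompt-cache hit rates. Values are illustrative.
import json

session_id = "user-1234-conversation-42"

# Option 1: the "user" field in the request body.
body = {
    "model": "accounts/fireworks/models/deepseek-v3p2",  # placeholder id
    "messages": [{"role": "user", "content": "Continue our conversation."}],
    "user": session_id,
}

# Option 2: the x-session-affinity request header.
headers = {
    "Authorization": "Bearer $FIREWORKS_API_KEY",
    "Content-Type": "application/json",
    "x-session-affinity": session_id,
}

print(json.dumps(body)[:40], headers["x-session-affinity"])
```

Either way, the point is that requests sharing a session id reuse the same cached prompt prefix, which is where the 50% cached-token discount comes from.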
✨ New Models
- Devstral Small 2 24B Instruct 2512 is now available in the Model Library
- NVIDIA Nemotron Nano 3 30B A3B is now available in the Model Library
📚 Documentation Updates
- Reasoning Guide: New documentation for working with reasoning models, including reasoning_content, reasoning_effort, streaming, and interleaved thinking (Reasoning)
- Recommended Models: Updated recommendations to include DeepSeek V3.2 for code generation and Kimi K2 Thinking as a GPT-5 alternative (Recommended Models)
- OpenAI Compatibility: Removed stop sequence documentation as Fireworks is now 1:1 compatible with OpenAI’s behavior (OpenAI Compatibility)
- Evaluator APIs: Added REST API documentation for Evaluator and Evaluation Job CRUD operations (Evals API Reference)
- firectl CLI Reference: Updated with new commands including cancel dpo-job, cancel supervised-fine-tuning-job, set-api-key, redeem-credit-code, and evaluator revision management
Bug Fixes & Minor Improvements
2025-12-08
Fireworks AI adds DeepSeek V3.2 on serverless, cached token pricing in the Model Library, and new evaluations dashboard filtering and status tracking. It also expands the Model Library with several new models, updates reranking docs, and includes bug fixes and minor improvements.
DeepSeek V3.2 on Serverless, Cached Token Pricing, and New Models
☁️ Serverless
- DeepSeek V3.2 is now available on serverless
Cached Token Pricing Display
The Model Library and model detail pages now display cached and uncached input token pricing for serverless models that support prompt caching. This gives you better visibility into potential cost savings when using prompt caching with supported models.
Evaluations Dashboard Improvements
The Evaluations dashboard has been enhanced with new filtering and status tracking capabilities:
- Status column showing evaluator build state (Active, Building, Failed)
- Quick filters to filter evaluators and evaluation jobs by status
- Improved table layout with actions integrated into the status column
✨ New Models
- DeepSeek V3.2 is now available in the Model Library
- Ministral 3 14B Instruct 2512 is now available in the Model Library
- Ministral 3 8B Instruct 2512 is now available in the Model Library
- Ministral 3 3B Instruct 2512 is now available in the Model Library
- Mistral Large 3 675B Instruct is now available in the Model Library
- Qwen3-VL-32B-Instruct is now available in the Model Library
- Qwen3-VL-8B-Instruct is now available in the Model Library
📚 Documentation Updates
- Reranking Guide: Added documentation for using the /rerank endpoint and /embeddings endpoint with return_logits for reranking, including parallel batching examples (Querying Embeddings Models)
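A hedged sketch of what a reranking call looks like: the shape below follows the common query-plus-documents convention, but the exact `/rerank` field names should be checked against the Querying Embeddings Models guide, and the model id is a placeholder.

```python
# Hedged sketch of a /rerank request body and of consuming the results.
# fake_scores stands in for the relevance scores the endpoint would return,
# one per document.
payload = {
    "model": "accounts/fireworks/models/my-reranker",  # placeholder id
    "query": "how do I rotate an API key?",
    "documents": [
        "Rotating keys is done from the account settings page.",
        "Deployments can be scaled with firectl.",
        "API keys can be regenerated at any time.",
    ],
}

fake_scores = [0.92, 0.05, 0.71]  # illustrative scores, one per document
# Sort document indices by score, highest first.
ranked = sorted(range(len(fake_scores)), key=lambda i: -fake_scores[i])
top_doc = payload["documents"][ranked[0]]
print(top_doc)
```

The guide also covers running the `/embeddings` endpoint with `return_logits` for the same purpose, including batching the documents in parallel.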
Bug Fixes & Minor Improvements
2025-12-01
Fireworks AI adds audit logs and dataset downloads in the web app, expands Reinforcement Fine-Tuning with weighted training, and adds KAT Coder to the Model Library, alongside bug fixes and minor improvements.
Audit Logs, Dataset Download, Weighted Training for Reinforcement Fine-Tuning, and New Model
Audit Logs in Web App
You can now view and search audit logs directly from the Fireworks web app. The new Audit Logs page provides:
- Search and filter logs by status and timeframe
- Detailed view panel for individual log entries
- Easy navigation from the console sidebar under Account settings
See the Audit Logs documentation for more information.
Dataset Download
You can now download datasets directly from the Fireworks web app. The new download functionality allows you to:
- Download individual files from a dataset
- Download all files at once with “Download All”
- Access downloads from the Datasets table in the dashboard
Weighted Training for Reinforcement Fine-Tuning
Reinforcement Fine-Tuning now supports per-example weighting, giving you more control over which samples have greater influence during training. This feature mirrors the weighted training functionality already available in Supervised Fine-Tuning.
See the Weighted Training documentation for details on the weight field format.
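A hedged sketch of what per-example weighting looks like in a training dataset: the exact JSONL schema and placement of the `weight` field should be confirmed against the Weighted Training documentation; this mirrors the common pattern of a top-level weight next to the messages.

```python
# Hedged sketch: a JSONL training dataset where each example carries a
# "weight" controlling its influence during training. Field placement is an
# assumption; see the Weighted Training docs for the confirmed schema.
import json

examples = [
    {
        "messages": [
            {"role": "user", "content": "What is 2 + 2?"},
            {"role": "assistant", "content": "4"},
        ],
        "weight": 2.0,  # count this sample twice as heavily
    },
    {
        "messages": [
            {"role": "user", "content": "Name a prime number."},
            {"role": "assistant", "content": "7"},
        ],
        "weight": 0.5,  # down-weight this sample
    },
]

jsonl = "\n".join(json.dumps(e) for e in examples)
print(jsonl.splitlines()[0])
```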
✨ New Models
- KAT Coder is now available in the Model Library
Bug Fixes & Minor Improvements
2025-11-24
Fireworks AI improves evaluator creation with GitHub template support and a new sortable table, adds MLOps and observability integrations for W&B and MLflow, expands the Model Library with Kimi K2 Thinking and KAT Dev models, and introduces new REST API endpoints for fine-tuning and deployment management.
Evaluator Improvements, Kimi K2 Thinking on Serverless, and New API Endpoints
Improved Evaluator Creation Experience
The evaluator creation workflow has been significantly enhanced with GitHub template integration. You can now:
- Fork evaluator templates directly from GitHub repositories
- Browse and preview templates before using them
- Create evaluators with a streamlined save dialog
- View evaluators in a new sortable and paginated table
MLOps & Observability Integrations
New documentation for integrating Fireworks with MLOps and observability tools:
- Weights & Biases (W&B) integration for experiment tracking during fine-tuning
- MLflow integration for model management and experiment logging
✨ New Models
- Kimi K2 Thinking is now available in the Model Library
- KAT Dev 32B is now available in the Model Library
- KAT Dev 72B Exp is now available in the Model Library
☁️ Serverless
- Kimi K2 Thinking is now available on serverless
📚 New REST API Endpoints
New REST API endpoints are now available for managing Reinforcement Fine-Tuning Steps and deployments:
- Create Reinforcement Fine-Tuning Step
- List Reinforcement Fine-Tuning Steps
- Get Reinforcement Fine-Tuning Step
- Delete Reinforcement Fine-Tuning Step
- Scale Deployment
- List Deployment Shape Versions
- Get Deployment Shape Version
- Get Dataset Download Endpoint
Bug Fixes & Minor Improvements
2025-11-12
Fireworks AI introduces a new Python SDK for the Build experience, replacing the deprecated Build SDK, and improves RFT with better reliability, multi-turn training, stronger observability, and a more developer-friendly workflow.
☀️ Sunsetting Build SDK
The Build SDK is being deprecated in favor of a new Python SDK generated directly from our REST API. The new SDK is more up-to-date, flexible, and continuously synchronized with our REST API. Please note that the last version of the Build SDK will be 0.19.20, and the new SDK will start at 1.0.0. Python package managers will not automatically update to the new SDK, so you will need to manually update your dependencies and refactor your code.
Existing codebases using the Build SDK will continue to function as before and will not be affected unless you choose to upgrade to the new SDK version.
The new SDK replaces the Build SDK’s LLM and Dataset classes with REST API-aligned methods. If you upgrade to version 1.0.0 or later, you will need to migrate your code.
🚀 Improved RFT Experience
We’ve drastically improved the RFT experience with better reliability, a developer-friendly SDK for hooking up your existing agents, support for multi-turn training, better observability in our Web App, and a better overall developer experience.
See Reinforcement Fine-Tuning for more details.