Replicate Release Notes

Last updated: Apr 22, 2026

  • Apr 21, 2026
    • Date parsed from source:
      Apr 21, 2026
    • First seen by Releasebot:
      Apr 22, 2026
    Replicate

    Agent skills for Replicate

    Replicate now publishes agent skills for coding assistants, adding markdown guidance for model discovery, comparison, API execution, and better image and video prompting. The skills work with Claude Code, OpenCode, OpenAI Codex, and other compatible tools.

    Replicate now publishes agent skills, a collection of markdown instruction files that give coding assistants expert knowledge about working with AI models on Replicate.

    Skills cover model discovery, comparison, and execution via the API, along with detailed prompting techniques for image generation and video generation models. They follow the open Agent Skills spec and work with Claude Code, OpenCode, OpenAI Codex, and other compatible tools.

    Install

    npx skills add replicate/skills
    

    This installs all of Replicate’s skills into your project and configures them for your coding assistant automatically.

    Skills and MCP

    Skills are complementary to Replicate’s MCP server. MCP gives your coding assistant API tools. Skills give it knowledge about how to use those tools well: which models to choose, how to write prompts, and what tradeoffs to consider.

    For more details, see the agent skills reference or the GitHub repository.

  • Mar 2, 2026
    • Date parsed from source:
      Mar 2, 2026
    • First seen by Releasebot:
      Mar 2, 2026

    Fallback model for Nano Banana Pro

    Nano Banana Pro can now fall back to Seedream 5.0 lite when Google's API is rate limited, instead of failing. Set allow_fallback_model to true to enable it; when the fallback is used, the output's resolution field reads "fallback". Limits: 1K requests are generated at 2K and downscaled, and requests for 4K or for the 4:5 and 5:4 aspect ratios do not fall back. You're charged the fallback model's price.

    How it works

    Set allow_fallback_model to true when calling the API. If Nano Banana Pro hits a rate limit, it tries to generate the image with Seedream 5.0 lite instead. For certain inputs, for example if the aspect ratio isn’t supported, the original rate limit error is returned.

    The fallback is off by default. If you don’t set allow_fallback_model, nothing changes — you’ll get a rate limit error when Google’s API is at capacity.

    When the fallback is triggered, your logs still show a prediction to Nano Banana Pro. You can tell the fallback was used by checking the resolution field in your output — it says "fallback" instead of the actual resolution. You’re charged the cost of the fallback model, not Nano Banana Pro.

    Limitations

    Our current fallback model, Seedream 5.0 lite, doesn’t support all the same options as Nano Banana Pro:

    • Seedream 5.0 lite doesn’t support 1K resolution. If you request 1K, the fallback generates at 2K and downscales the result.
    • Seedream 5.0 lite doesn’t support 4K resolution. If you request 4K, the fallback won’t be used and the original rate limit error is returned.
    • Seedream 5.0 lite doesn’t support the 4:5 and 5:4 aspect ratios. Requests with these ratios won’t fall back and will return the original rate limit error.
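
The rules above can be sketched as a small Python helper. This is only an illustration of the documented behavior, not Replicate's implementation: the function name `fallback_plan` and its three return labels are hypothetical, and the option strings are assumed to be spelled as in the text ("1K", "4:5"). The notes mention unsupported aspect ratios only as one example of inputs that skip the fallback, so other cases may exist.

```python
# Options the text says Seedream 5.0 lite cannot serve at all.
UNSUPPORTED_RESOLUTIONS = {"4K"}
UNSUPPORTED_ASPECT_RATIOS = {"4:5", "5:4"}


def fallback_plan(resolution: str, aspect_ratio: str,
                  allow_fallback_model: bool = False) -> str:
    """Describe what happens when Nano Banana Pro is rate limited.

    Returns one of three hypothetical labels:
      "rate_limit_error"    - the original error is returned
      "fallback_downscaled" - generated at 2K, then downscaled to 1K
      "fallback"            - generated directly by the fallback model
    """
    # The fallback is off by default.
    if not allow_fallback_model:
        return "rate_limit_error"
    # Unsupported options are never sent to the fallback model.
    if resolution in UNSUPPORTED_RESOLUTIONS or aspect_ratio in UNSUPPORTED_ASPECT_RATIOS:
        return "rate_limit_error"
    # 1K is not native to the fallback model, so it is produced from 2K.
    if resolution == "1K":
        return "fallback_downscaled"
    return "fallback"
```

A client could use a check like this to decide up front whether setting allow_fallback_model is worthwhile for a given request.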
  • Feb 10, 2026
    • Date parsed from source:
      Feb 10, 2026
    • First seen by Releasebot:
      Feb 11, 2026

    MCP server auto-discovery

    Replicate’s MCP server now supports automatic discovery via the official MCP Registry with a new /.well-known/mcp/server.json endpoint. The Registry holds metadata to guide MCP clients to install servers, enabling built‑in discovery in select clients like VS Code, plus a --tools flag for standard or code mode.

    MCP server discovery

    Replicate’s MCP server can now be discovered automatically through the official MCP Registry.
    We added a /.well-known/mcp/server.json endpoint that publishes metadata about the MCP server. This follows the server.json specification from the Model Context Protocol.

    How discovery works

    The MCP Registry is the official metadata repository for MCP servers, backed by Anthropic, GitHub, and Microsoft. It doesn’t host code—just metadata that describes where to find servers and how to install them.
    When you publish a server.json file at /.well-known/mcp/server.json, the Registry can discover your server automatically. MCP clients then use the Registry to find and install servers.
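
As a rough illustration of the discovery step, the Python sketch below builds the well-known URL for a server and fetches its metadata. The helper names and the `https://mcp.example.com` host are placeholders, not values from these notes, and the shape of the returned JSON is left to the server.json specification.

```python
import json
import urllib.request
from urllib.parse import urljoin

# Path named in the release note.
WELL_KNOWN_PATH = "/.well-known/mcp/server.json"


def server_json_url(base_url: str) -> str:
    """Return the discovery URL for a server hosted at base_url."""
    # A root-relative path replaces any path already present on base_url.
    return urljoin(base_url, WELL_KNOWN_PATH)


def fetch_server_json(base_url: str, timeout: float = 10.0) -> dict:
    """Download and parse the published metadata (performs a network call)."""
    with urllib.request.urlopen(server_json_url(base_url), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder host; substitute the real MCP server origin.
    print(server_json_url("https://mcp.example.com"))
```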

    Clients with built-in discovery

    A few MCP clients have built-in marketplaces or directories:

    • VS Code has the best Registry integration. Enable chat.mcp.gallery.enabled in your settings, then search @mcp in the Extensions view to browse and install MCP servers.
    • Claude Desktop has a curated extensions directory at Settings > Extensions > Browse extensions.

    Other clients like ChatGPT, Cursor, and LM Studio require manual configuration: you add the server URL or edit a config file yourself.

    Code mode option

    The metadata also exposes the --tools flag, which lets you choose between standard tools (all) or code mode (code) when installing.

  • Jan 14, 2026
    • Date parsed from source:
      Jan 14, 2026
    • First seen by Releasebot:
      Jan 14, 2026

    Filter predictions by source

    Filter list predictions API by source

    You can now filter the list predictions API endpoint to show only predictions created through the web interface.

    Use the source query parameter with a value of web:

    curl -s \
    -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
    "https://api.replicate.com/v1/predictions?source=web"
    

    This is useful if you want to see predictions you created using the playground or other parts of the Replicate website, separate from predictions created programmatically via the API.
    Note: When filtering by source=web, results are limited to predictions from the last 14 days.
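
For Python callers, a minimal sketch of the same request using only the standard library might look like this. The function name is hypothetical, and the token is assumed to come from the REPLICATE_API_TOKEN environment variable, as in the cURL example.

```python
import os
import urllib.request
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.replicate.com/v1"


def list_predictions_request(token: str, source: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a GET request for the list predictions endpoint."""
    url = f"{API_BASE}/predictions"
    if source is not None:
        # Only source=web is described in this release note.
        url += "?" + urlencode({"source": source})
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


if __name__ == "__main__":
    req = list_predictions_request(os.environ["REPLICATE_API_TOKEN"], source="web")
    # urlopen performs the actual network call.
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode("utf-8"))
```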

  • Dec 19, 2025
    • Date parsed from source:
      Dec 19, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    The little things, week ending December 19, 2025

    Web

    • Improved the reliability of google/nano-banana and google/nano-banana-pro

    • Improved accessibility when using the search bar across Replicate

    Docs

    • Added automatic llms.txt generation for documentation, making it easier for language models to discover and understand Replicate’s docs

    • Published blog post on how to run Retro Diffusion’s pixel art models on Replicate, including rd-fast, rd-plus, rd-tile, and rd-animation for generating game assets and sprites

  • Dec 5, 2025
    • Date parsed from source:
      Dec 5, 2025
    • First seen by Releasebot:
      Dec 15, 2025

    The little things, week ending December 5, 2025

    FLUX.2 becomes the default in the playground with new FLUX.2 models; pricing messaging is clearer; homepage hero and mobile UX improved; fixups across explore pages and navigation; data retention banner clarified; plus release blog posts on FLUX.2 and Isaac 0.1.

    Web

    • Added created and last updated dates on model pages
    • Improved pricing display on model pages with clearer messaging for bring-your-own-token models and better visual consistency for single and multiple pricing tiers
    • Made FLUX.2 the default model in the playground
    • Added FLUX.2 models (pro, flex, and dev) to the playground
    • Improved homepage hero section with clickable hero images that link to model pages, better mobile responsiveness, and reduced bundle size
    • Improved mobile homepage hero legibility by adjusting gradient overlays
    • Fixed model card text truncation and spacing on the explore page at smaller viewport sizes
    • Updated playground to better adhere to platform rate limits
    • Improved data retention banner messaging to clarify that failed predictions are not retained indefinitely
    • Fixed support form reliability issues
    • Fixed an issue where the navigation may have rendered incorrectly on mobile devices

    Docs

    • Published blog post on how to run FLUX.2 on Replicate, Black Forest Labs’ most advanced image generation model
    • Published blog post on how to run Isaac 0.1 on Replicate, an open-weight vision-language model
  • Nov 21, 2025
    • Date parsed from source:
      Nov 21, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    The little things, week ending November 21, 2025

    Replicate introduces cost displays for predictions, a refreshed homepage, better code snippet rendering, and smarter site search with keyboard shortcuts. Nano Banana Pro is now the default playground model, Cog gets a v0.16.9 bug fix, and a new blog post explains prompting Nano Banana Pro.

    Web

    • Added approximate cost display to predictions and trainings on dashboard and predictions pages, showing how much each run costs
    • Launched an updated homepage hero section and navigation
    • Improved display of code snippets on the Replicate homepage
    • Made Nano Banana Pro the default model in the playground
    • Improved site search sorting, and improved keyboard shortcut handling

    Cog

    • Released Cog v0.16.9 with a fix for x-order bug

    Docs

    • Published blog post on how to prompt Nano Banana Pro with guidance on using its logic, text rendering, character consistency, and world knowledge capabilities
  • Nov 15, 2025
    • Date parsed from source:
      Nov 15, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    Code mode for Replicate's MCP server

    Replicate’s local MCP server adds experimental code mode letting models write and run TypeScript in a sandbox. It includes a docs search tool and a TypeScript executor via the Replicate SDK in a Deno sandbox for complex workflows. Enable with --tools=code; Node.js/Deno needed; remote sandboxing planned.

    Code mode

    Replicate’s local MCP server now supports an experimental “code mode” that allows language models to write and execute TypeScript code directly in a sandboxed environment.

    Instead of exposing individual API operations as separate tools, Code mode provides two tools: one for searching SDK documentation, and another for executing TypeScript code using the Replicate SDK within a Deno sandbox. The model uses a built-in docs search tool to learn how to write code against the SDK. This approach is more efficient for complex workflows that involve multiple API calls, as it reduces context window usage and allows the model to write custom logic that calls multiple methods and returns only the final results.

    To use code mode, start the local MCP server with the --tools=code flag:

    npx -y replicate-mcp@alpha --tools=code
    

    Here’s how to add the local Replicate MCP server in code mode as a tool in Claude Code:

    claude mcp add "replicate-code-mode" --scope user --transport stdio -- npx -y replicate-mcp@alpha --tools=code
    

    Code mode is currently experimental and subject to change. It requires Node.js and Deno to be installed locally. Remote cloud sandboxing support is planned but not yet available.

    To get started with Code mode, see the Code mode documentation or visit the demo GitHub repo.

  • Nov 7, 2025
    • Date parsed from source:
      Nov 7, 2025
    • First seen by Releasebot:
      Dec 20, 2025
    • Modified by Releasebot:
      Jan 14, 2026

    The little things, week ending November 7, 2025

    Replicate rolls out multimedia upgrades with video support in before/after sliders, pixelated image rendering with size controls, improved aspect ratios, and a UI focus overhaul. Deployment search added, input-order bug fixed, plus enterprise deployment monitoring and new docs posts.

    Web

    • Added video support to the before/after slider, allowing side-by-side comparison of video outputs from models like video upscaling and style transfer
    • Added pixelated image rendering with size controls (1x, 2x, fit) in the playground for pixel art models
    • Improved support for aspect ratios in prediction outputs
    • Overhauled focus states across the Replicate UI library
    • Added search to Deployments
    • Fixed a bug that caused model input fields to display in a different order
    • Added FAQs to every collection page

    Platform

    • Launched deployment setup monitoring for enterprise customers with automatic email notifications when deployments fail setup and customizable setup timeouts

    Docs

    • Published blog post on how to run FLUX.2 on Replicate, Black Forest Labs’ most advanced image generation model
    • Published blog post on how to run Isaac 0.1 on Replicate, an open-weight vision-language model
  • Nov 3, 2025
    • Date parsed from source:
      Nov 3, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    Deployment setup monitoring

    Enterprise customers gain deployment controls: automatic email notifications when a deployment fails setup, which can be forwarded to a Slack channel, and a customizable model setup timeout for large downloads or heavy initialization.

    Enterprise deployment controls

    Setup failure notifications

    If a deployment fails during the model’s setup function, we’ll notify you via email. This helps you catch issues with your model earlier, before your users reach out.

    If you use Slack, you can route these emails to a Slack channel using Slack’s feature for sending emails to Slack.

    Custom setup timeouts

    You can customize the timeout for your deployment’s model setup function. The default timeout is 10 minutes, which works for most models. But if your model needs to download large files, load trained weights, or perform other expensive initialization operations, you can give it more time before we mark it as failed.

    You can configure both of these in your deployment settings.

    Read more in our deployment monitoring docs or reach out to [email protected] to learn about our enterprise plans.

  • Oct 24, 2025
    • Date parsed from source:
      Oct 24, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    The little things, week ending October 24, 2025

    Replicate releases a public beta Python SDK for running AI models from Python, with support for every operation in the HTTP API. Web, API, and Docs also get improvements, from output handling and keyboard shortcuts to prediction deadlines, source tracking, and a new deployment monitoring docs section.

    SDK

    • Our Python SDK public beta is out now, making it even easier to run AI models from your Python code with full support for every operation in our HTTP API

    Web

    • Improved playground handling of different output and streaming types
    • Fixed search keyboard shortcut in docs to use Cmd+/ on Mac and Ctrl+/ on Windows for consistency across platforms

    API

    • Launched prediction deadlines so you can automatically cancel predictions that don’t complete within a specified duration
    • Added source field to prediction API responses to indicate whether predictions were created via web or api

    Docs

    • Split deployment monitoring into its own docs section with detailed information about metrics, GPU memory monitoring, and performance tracking
    • Published blog post about how to prompt Veo 3.1 with guidance on reference images, frame control, and image-to-video features
  • Oct 16, 2025
    • Date parsed from source:
      Oct 16, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    Set deadlines for predictions

    Replicate introduces prediction deadlines: a Cancel-After header that automatically cancels predictions that don't finish in time, useful for real-time experiences like virtual try-ons. Combine it with the optional Prefer: wait header to control how long the HTTP request stays open. For public models, you're charged only for predictions that are canceled while running, not for ones aborted before they start.

    How it works

    You can now set a deadline to automatically cancel a prediction if it doesn’t complete within a specified duration. This is useful when you’re building real-time or interactive experiences, like a virtual try-on experience for an online clothing store. In this case, shoppers have usually moved on if an image takes more than 15 seconds to generate.

    Set a deadline by including a Cancel-After header when creating a prediction. See our docs for details on the header format.

    Here’s an example that sets a 1 minute deadline:

    curl -X POST \
    -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
    -H "Cancel-After: 1m" \
    -H "Prefer: wait" \
    -H "Content-Type: application/json" \
    -d $'{
      "input": {
        "prompt": "The sun rises slowly between tall buildings. [Ground-level follow shot] Bicycle tires roll over a dew-covered street at dawn. The cyclist passes through dappled light under a bridge as the entire city gradually wakes up."
      }
    }' https://api.replicate.com/v1/models/bytedance/seedance-1-pro/predictions
    

    What happens when a deadline is reached

    Replicate sets the prediction’s status to aborted if the deadline is reached before it starts running, and canceled if the deadline is reached while it’s running.

    For public models, you’re only charged for predictions with a canceled status, not for aborted ones.
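
The two outcomes can be summarized in a few lines of Python. This is a reading aid rather than anything from Replicate's API: the function name and the `billed_on_public_model` key are hypothetical, and the sketch covers only the deadline case described above.

```python
def deadline_outcome(was_running: bool) -> dict:
    """Model what happens to a prediction when its deadline is reached."""
    # Reached before the prediction started: aborted. Reached mid-run: canceled.
    status = "canceled" if was_running else "aborted"
    return {
        "status": status,
        # Per the note, public models bill canceled predictions but not aborted ones.
        "billed_on_public_model": status == "canceled",
    }
```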

    Deadline vs sync mode wait duration

    Prediction deadlines and sync mode serve different purposes. Use prediction deadline (Cancel-After header) to control when the prediction itself should be canceled. Use sync mode (Prefer: wait header) to control how long the HTTP request stays open waiting for results.

    You can also use both together. In the previous cURL example, Prefer: wait defaults to 1 min and we’ve explicitly set Cancel-After to 1 min. This means that the HTTP request will stay open for 1 minute to wait for results, after which the prediction will be canceled, even if it has not completed.

    Alternatively, setting Cancel-After: 1m and Prefer: wait=10 means that the request returns after 10 seconds. If the prediction is still running, you’ll get an incomplete prediction object, and the prediction will continue to run until it completes or is canceled at the 1-minute deadline.
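
To make the two combinations concrete, here is a small Python sketch that assembles the headers for each case. The helper name is hypothetical; the header values ("1m", "wait", "wait=10") are the ones used in the text, and other Cancel-After formats are left to the docs.

```python
from typing import Optional


def deadline_headers(token: str, cancel_after: str,
                     wait_seconds: Optional[int] = None) -> dict:
    """Build request headers that combine a deadline with sync mode."""
    # A bare "wait" uses the default wait; "wait=N" sets it explicitly.
    prefer = "wait" if wait_seconds is None else f"wait={wait_seconds}"
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Cancel-After": cancel_after,  # when the prediction itself is canceled
        "Prefer": prefer,              # how long the HTTP request stays open
    }


# Request stays open for the default wait; prediction is canceled at 1 minute.
same_window = deadline_headers("token", "1m")
# Request returns after 10 seconds; prediction may run on until the 1-minute deadline.
short_wait = deadline_headers("token", "1m", wait_seconds=10)
```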

    Read more in the docs:

    • Create a prediction: Prediction deadlines
    • Prediction lifecycle: Timeouts
  • Oct 10, 2025
    • Date parsed from source:
      Oct 10, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    The little things, week ending October 10, 2025

    Replicate improves speed and performance, speeds up collection API filtering, and refreshes the dashboard UI. Also new: API updates for model metadata and model sorting, a clickable Official label, invoice downloads, expanded docs, and Cog v0.16.8 with registry migration support and dependency upgrades.

    Web

    • Improved speed and performance across Replicate
    • Updated dashboard UI with new navigation components for improved consistency
    • Improved collection API filtering for better performance
    • Made the Official label on model pages clickable, linking to the official models documentation

    API

    • Launched the ability to update model metadata via API with PATCH requests to update descriptions, README content, and links
    • Added sorting options to the models.list API to sort by model creation date or latest version date

    Docs

    • Expanded docs sidebar by default to make navigation easier
    • Added comprehensive documentation about rate limiting when you have no payment method
    • Published a blog post about IBM Granite 4.0 models now available on Replicate
    • Updated getting started guides

    Platform

    • Added ability to download invoices from billing settings for both monthly billing and credit purchases

    Cog

    • Released Cog v0.16.8 with registry migration support and credential fallback
    • Updated FastAPI requirement to support versions up to 0.119.0
    • Fixed Go build issues with version control information injection
    • Upgraded test dependencies including TensorFlow updates
  • Oct 8, 2025
    • Date parsed from source:
      Oct 8, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    Sort models by creation date via API

    API changes

    The GET /v1/models endpoint now supports sorting with the sort_by and sort_direction query parameters. This makes it easier to fetch the newest models via the API.

    Sorting options

    • model_created_at: Sort by when the model was first created
    • latest_version_created_at: Sort by when the model’s latest version was created (default)

    Sort direction can be asc (ascending, oldest first) or desc (descending, newest first).

    The default behavior remains unchanged: models are sorted by latest_version_created_at in descending order (newest versions first).
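
A short Python sketch of building the sorted URL follows. The function name and the client-side validation are illustrative additions; the parameter names, allowed values, and defaults come from the text above.

```python
from urllib.parse import urlencode

SORT_FIELDS = {"model_created_at", "latest_version_created_at"}
SORT_DIRECTIONS = {"asc", "desc"}


def models_list_url(sort_by: str = "latest_version_created_at",
                    sort_direction: str = "desc") -> str:
    """Return the GET /v1/models URL with explicit sorting parameters."""
    # Reject values the release note does not list.
    if sort_by not in SORT_FIELDS:
        raise ValueError(f"unsupported sort_by: {sort_by!r}")
    if sort_direction not in SORT_DIRECTIONS:
        raise ValueError(f"unsupported sort_direction: {sort_direction!r}")
    query = urlencode({"sort_by": sort_by, "sort_direction": sort_direction})
    return f"https://api.replicate.com/v1/models?{query}"


# Newest models first, by creation date.
newest_models_url = models_list_url("model_created_at", "desc")
```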

  • Oct 6, 2025
    • Date parsed from source:
      Oct 6, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    Update model metadata via API

    API now supports PATCH to update model metadata like description, readme, and URLs for GitHub, paper, weights, and license. This enables direct, in‑app management of model details and documentation for faster integration.

    Update model properties via PATCH

    You can now update model properties using the API with a PATCH request to /v1/models/{owner}/{name}.

    You can update the following properties:

    • description - Model description
    • readme - Model README content
    • github_url - GitHub repository URL
    • paper_url - Research paper URL
    • weights_url - Model weights URL
    • license_url - License URL

    Example cURL request:

    curl -X PATCH \
      https://api.replicate.com/v1/models/your-username/your-model-name \
      -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "description": "Detect hot dogs in images",
        "readme": "# Hot Dog Detector\n\n🌭 Ketchup, mustard, and onions...",
        "github_url": "https://github.com/alice/hot-dog-detector",
        "paper_url": "https://arxiv.org/abs/2504.17639",
        "weights_url": "https://huggingface.co/alice/hot-dog-detector",
        "license_url": "https://choosealicense.com/licenses/mit/"
      }'
    

    See the API reference for full details.
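
The same update can be sketched in Python with the standard library. The helper name and the client-side check on field names are assumptions for illustration; the endpoint, method, and property names are taken from the list above.

```python
import json
import urllib.request

# Properties the release note lists as updatable.
UPDATABLE_FIELDS = {
    "description", "readme", "github_url",
    "paper_url", "weights_url", "license_url",
}


def update_model_request(token: str, owner: str, name: str,
                         **fields: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request for /v1/models/{owner}/{name}."""
    unknown = set(fields) - UPDATABLE_FIELDS
    if unknown:
        raise ValueError(f"not updatable: {sorted(unknown)}")
    return urllib.request.Request(
        f"https://api.replicate.com/v1/models/{owner}/{name}",
        data=json.dumps(fields).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the update; only the fields supplied as keyword arguments are included in the body.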
