Replicate Release Notes

Last updated: Mar 2, 2026

  • Mar 2, 2026
    • Date parsed from source:
      Mar 2, 2026
    • First seen by Releasebot:
      Mar 2, 2026

    Replicate

    Fallback model for Nano Banana Pro

    Nano Banana Pro now falls back to Seedream 5.0 lite when Google's API is rate limited, instead of failing. Enable it with allow_fallback_model; on a rate limit the fallback generates the output and the resolution field is set to "fallback". Note the limits: 4K and the 4:5 and 5:4 aspect ratios aren't supported (those requests return the original rate limit error), 1K requests are generated at 2K and downscaled, and you're charged the fallback model's cost.

    How it works

    Set allow_fallback_model to true when calling the API. If Nano Banana Pro hits a rate limit, it tries to generate the image with Seedream 5.0 lite instead. For certain inputs, for example if the aspect ratio isn’t supported, the original rate limit error is returned.

    The fallback is off by default. If you don’t set allow_fallback_model, nothing changes — you’ll get a rate limit error when Google’s API is at capacity.

    When the fallback is triggered, your logs still show a prediction to Nano Banana Pro. You can tell the fallback was used by checking the resolution field in your output — it says "fallback" instead of the actual resolution. You’re charged the cost of the fallback model, not Nano Banana Pro.
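
    In code, the fallback check described above might look like the sketch below. The nesting of the resolution field is an assumption about the response shape, so adjust it to your model's actual output schema:

```python
# Sketch: detect whether a prediction's output came from the fallback model.
# ASSUMPTION: `prediction` is the deserialized JSON of a predictions API
# response, and `resolution` sits inside its `output` object; where the
# field actually lives may differ by model version.

def used_fallback(prediction: dict) -> bool:
    """True if the output was generated by the fallback model."""
    output = prediction.get("output") or {}
    return output.get("resolution") == "fallback"

sample = {"model": "google/nano-banana-pro", "output": {"resolution": "fallback"}}
print(used_fallback(sample))  # True
print(used_fallback({"output": {"resolution": "2K"}}))  # False
```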

    Limitations

    Our current fallback model, Seedream 5.0 lite, doesn’t support all the same options as Nano Banana Pro:

    • Seedream 5.0 lite doesn’t support 1K resolution. If you request 1K, the fallback generates at 2K and downscales the result.
    • Seedream 5.0 lite doesn’t support 4K resolution. If you request 4K, the fallback won’t be used and the original rate limit error is returned.
    • Seedream 5.0 lite doesn’t support the 4:5 and 5:4 aspect ratios. Requests with these ratios won’t fall back and will return the original rate limit error.
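
    Taken together, these rules amount to a small decision table. A sketch, where the blocked sets mirror the documented limitations and the function, its inputs, and its return values are illustrative, not part of the API:

```python
# Sketch of the fallback rules listed above. The blocked sets mirror the
# documented Seedream 5.0 lite limitations; everything else is illustrative.

BLOCKED_RESOLUTIONS = {"4K"}      # no fallback: original rate limit error
BLOCKED_RATIOS = {"4:5", "5:4"}   # no fallback: original rate limit error

def fallback_plan(resolution, aspect_ratio):
    """What happens on a rate limit when allow_fallback_model is true."""
    if resolution in BLOCKED_RESOLUTIONS or aspect_ratio in BLOCKED_RATIOS:
        return "rate_limit_error"        # fallback not attempted
    if resolution == "1K":
        return "fallback_2K_downscaled"  # generated at 2K, downscaled to 1K
    return "fallback"

print(fallback_plan("1K", "16:9"))  # fallback_2K_downscaled
print(fallback_plan("4K", "16:9"))  # rate_limit_error
print(fallback_plan("2K", "4:5"))   # rate_limit_error
```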
  • Feb 10, 2026
    • Date parsed from source:
      Feb 10, 2026
    • First seen by Releasebot:
      Feb 11, 2026

    MCP server auto-discovery

    Replicate’s MCP server now supports automatic discovery via the official MCP Registry, through a new /.well-known/mcp/server.json endpoint. The Registry holds metadata that tells MCP clients where to find servers and how to install them, enabling built-in discovery in clients like VS Code. The published metadata also exposes a --tools flag for choosing standard tools or code mode.

    MCP server discovery

    Replicate’s MCP server can now be discovered automatically through the official MCP Registry.
    We added a /.well-known/mcp/server.json endpoint that publishes metadata about the MCP server. This follows the server.json specification from the Model Context Protocol.

    How discovery works

    The MCP Registry is the official metadata repository for MCP servers, backed by Anthropic, GitHub, and Microsoft. It doesn’t host code—just metadata that describes where to find servers and how to install them.
    When you publish a server.json file at /.well-known/mcp/server.json, the Registry can discover your server automatically. MCP clients then use the Registry to find and install servers.
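
    A minimal sketch of what a discovering client does: build the well-known URL, fetch it, and read the metadata. The path comes from this announcement; the host and the metadata fields shown ("name", "description") are assumptions based on the server.json specification, not a verbatim copy of Replicate's file:

```python
# Sketch: how a client locates published MCP server metadata.
# ASSUMPTIONS: "example.com" is a placeholder host, and the sample fields
# follow the server.json specification rather than Replicate's actual file.
import json

WELL_KNOWN_PATH = "/.well-known/mcp/server.json"

def discovery_url(host):
    """Where a Registry or MCP client would look for server metadata."""
    return f"https://{host}{WELL_KNOWN_PATH}"

# A client GETs discovery_url(...) and parses the JSON body, e.g.:
sample = json.loads('{"name": "example/replicate", "description": "Run models on Replicate"}')
print(discovery_url("example.com"))  # https://example.com/.well-known/mcp/server.json
print(sample["name"])  # example/replicate
```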

    Clients with built-in discovery

    A few MCP clients have built-in marketplaces or directories:

    • VS Code has the best Registry integration. Enable chat.mcp.gallery.enabled in your settings, then search @mcp in the Extensions view to browse and install MCP servers.
    • Claude Desktop has a curated extensions directory at Settings > Extensions > Browse extensions.

    Other clients like ChatGPT, Cursor, and LM Studio require manual configuration: you add the server URL or edit a config file yourself.

    Code mode option

    The metadata also exposes the --tools flag, which lets you choose between standard tools (all) or code mode (code) when installing.


  • Jan 14, 2026
    • Date parsed from source:
      Jan 14, 2026
    • First seen by Releasebot:
      Jan 14, 2026

    Filter predictions by source

    Filter list predictions API by source

    You can now filter the list predictions API endpoint to show only predictions created through the web interface.

    Use the source query parameter with a value of web:

    curl -s \
    -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
    "https://api.replicate.com/v1/predictions?source=web"
    

    This is useful if you want to see predictions you created using the playground or other parts of the Replicate website, separate from predictions created programmatically via the API.
    Note: When filtering by source=web, results are limited to predictions from the last 14 days.
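
    The same request can be made from Python with only the standard library. The endpoint and the source parameter are as documented above; the helper names are illustrative, not part of any SDK:

```python
# Sketch: the same filtered request using only the Python standard library.
# The endpoint and `source` query parameter come from the announcement above;
# the helper names are illustrative.
import os
import urllib.parse
import urllib.request

API_BASE = "https://api.replicate.com/v1/predictions"

def predictions_url(source=None):
    """Build the list-predictions URL, optionally filtered by source."""
    if source is None:
        return API_BASE
    return API_BASE + "?" + urllib.parse.urlencode({"source": source})

def list_web_predictions():
    """Fetch web-created predictions; needs REPLICATE_API_TOKEN set."""
    req = urllib.request.Request(
        predictions_url("web"),
        headers={"Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}"},
    )
    with urllib.request.urlopen(req) as resp:  # live network call
        return resp.read()

print(predictions_url("web"))  # https://api.replicate.com/v1/predictions?source=web
```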

  • Dec 19, 2025
    • Date parsed from source:
      Dec 19, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    The little things, week ending December 19, 2025

    Web

    • Improved the reliability of google/nano-banana and google/nano-banana-pro

    • Improved accessibility when using the search bar across Replicate

    Docs

    • Added automatic llms.txt generation for documentation, making it easier for language models to discover and understand Replicate’s docs

    • Published blog post on how to run Retro Diffusion’s pixel art models on Replicate, including rd-fast, rd-plus, rd-tile, and rd-animation for generating game assets and sprites

  • Dec 5, 2025
    • Date parsed from source:
      Dec 5, 2025
    • First seen by Releasebot:
      Dec 15, 2025

    The little things, week ending December 5, 2025

    FLUX.2 becomes the default in the playground with new FLUX.2 models; pricing messaging is clearer; homepage hero and mobile UX improved; fixups across explore pages and navigation; data retention banner clarified; plus release blog posts on FLUX.2 and Isaac 0.1.

    Web

    • Added created and last updated dates on model pages
    • Improved pricing display on model pages with clearer messaging for bring-your-own-token models and better visual consistency for single and multiple pricing tiers
    • Made FLUX.2 the default model in the playground
    • Added FLUX.2 models (pro, flex, and dev) to the playground
    • Improved homepage hero section with clickable hero images that link to model pages, better mobile responsiveness, and reduced bundle size
    • Improved mobile homepage hero legibility by adjusting gradient overlays
    • Fixed model card text truncation and spacing on the explore page at smaller viewport sizes
    • Updated playground to better adhere to platform rate limits
    • Improved data retention banner messaging to clarify that failed predictions are not retained indefinitely
    • Fixed support form reliability issues
    • Fixed an issue where the navigation could render incorrectly on mobile devices

    Docs

    • Published blog post on how to run FLUX.2 on Replicate, Black Forest Lab’s most advanced image generation model
    • Published blog post on how to run Isaac 0.1 on Replicate, an open-weight vision-language model
  • Nov 21, 2025
    • Date parsed from source:
      Nov 21, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    The little things, week ending November 21, 2025

    Replicate introduces cost displays for predictions, a refreshed homepage, better code snippet rendering, and smarter site search with keyboard shortcuts. Nano Banana Pro is now the default playground model, Cog gets a v0.16.9 bug fix, and a new blog post explains prompting Nano Banana Pro.

    Web

    • Added approximate cost display to predictions and trainings on dashboard and predictions pages, showing how much each run costs
    • Launched an updated homepage hero section and navigation
    • Improved display of code snippets on the Replicate homepage
    • Made Nano Banana Pro the default model in the playground
    • Improved site search sorting, and improved keyboard shortcut handling

    Cog

    • Released Cog v0.16.9 with a fix for an x-order bug

    Docs

    • Published blog post on how to prompt Nano Banana Pro with guidance on using its logic, text rendering, character consistency, and world knowledge capabilities
  • Nov 15, 2025
    • Date parsed from source:
      Nov 15, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    Code mode for Replicate's MCP server

    Replicate’s local MCP server adds an experimental code mode that lets models write and run TypeScript in a sandbox. It provides a docs search tool and a TypeScript executor that runs code against the Replicate SDK in a Deno sandbox, suited to complex multi-step workflows. Enable it with --tools=code; Node.js and Deno must be installed locally, and remote sandboxing is planned.

    Code mode

    Replicate’s local MCP server now supports an experimental “code mode” that allows language models to write and execute TypeScript code directly in a sandboxed environment.

    Instead of exposing individual API operations as separate tools, Code mode provides two tools: one for searching SDK documentation, and another for executing TypeScript code using the Replicate SDK within a Deno sandbox. The model uses a built-in docs search tool to learn how to write code against the SDK. This approach is more efficient for complex workflows that involve multiple API calls, as it reduces context window usage and allows the model to write custom logic that calls multiple methods and returns only the final results.

    To use code mode, start the local MCP server with the --tools=code flag:

    npx -y replicate-mcp@alpha --tools=code
    

    Here’s how to add the local Replicate MCP server in code mode as a tool in Claude Code:

    claude mcp add "replicate-code-mode" --scope user --transport stdio -- npx -y replicate-mcp@alpha --tools=code
    

    Code mode is currently experimental and subject to change. It requires Node.js and Deno to be installed locally. Remote cloud sandboxing support is planned but not yet available.

    To get started with Code mode, see the Code mode documentation or visit the demo GitHub repo.

  • Nov 7, 2025
    • Date parsed from source:
      Nov 7, 2025
    • First seen by Releasebot:
      Dec 20, 2025
    • Modified by Releasebot:
      Jan 14, 2026

    The little things, week ending November 7, 2025

    Replicate rolls out multimedia upgrades with video support in before/after sliders, pixelated image rendering with size controls, improved aspect ratios, and a UI focus overhaul. Deployment search added, input-order bug fixed, plus enterprise deployment monitoring and new docs posts.

    Web

    • Added video support to the before/after slider, allowing side-by-side comparison of video outputs from models like video upscaling and style transfer
    • Added pixelated image rendering with size controls (1x, 2x, fit) in the playground for pixel art models
    • Improved support for aspect ratios in prediction outputs
    • Overhauled focus states across the Replicate UI library
    • Added search to Deployments
    • Fixed a bug that caused model input fields to display in a different order
    • Added FAQs to every collection page

    Platform

    • Launched deployment setup monitoring for enterprise customers with automatic email notifications when deployments fail setup and customizable setup timeouts

    Docs

    • Published blog post on how to run FLUX.2 on Replicate, Black Forest Lab’s most advanced image generation model
    • Published blog post on how to run Isaac 0.1 on Replicate, an open-weight vision-language model
  • Nov 3, 2025
    • Date parsed from source:
      Nov 3, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    Deployment setup monitoring

    Enterprise customers gain deployment controls with automatic failure notifications and customizable setup timeouts. Get email alerts for deployment setup failures, route them to a Slack channel if you like, and adjust the model setup timeout to fit large downloads or heavy initialization.

    Enterprise deployment controls

    Setup failure notifications

    If a deployment fails during the model’s setup function, we’ll notify you via email. This helps you catch issues with your model earlier, before your users reach out.

    If you use Slack, you can configure these emails to be sent to a Slack channel using their send emails to Slack feature.

    Custom setup timeouts

    You can customize the timeout for your deployment’s model setup function. The default timeout is 10 minutes, which works for most models. But if your model needs to download large files, load trained weights, or perform other expensive initialization operations, you can give it more time before we mark it as failed.

    You can configure both of these in your deployment settings.

    Read more in our deployment monitoring docs or reach out to [email protected] to learn about our enterprise plans.

  • Oct 24, 2025
    • Date parsed from source:
      Oct 24, 2025
    • First seen by Releasebot:
      Dec 20, 2025

    The little things, week ending October 24, 2025

    Replicate’s Python SDK enters public beta, letting you run AI models from Python with full support for the HTTP API. Web, API, and Docs get improvements from output handling and keyboard shortcuts to prediction deadlines, source tracking, and a new deployment monitoring docs section.

    SDK

    • Our Python SDK public beta is out now, making it even easier to run AI models from your Python code with full support for every operation in our HTTP API

    Web

    • Improved playground handling of different output and streaming types
    • Fixed search keyboard shortcut in docs to use Cmd+/ on Mac and Ctrl+/ on Windows for consistency across platforms

    API

    • Launched prediction deadlines so you can automatically cancel predictions that don’t complete within a specified duration
    • Added source field to prediction API responses to indicate whether predictions were created via web or api

    Docs

    • Split deployment monitoring into its own docs section with detailed information about metrics, GPU memory monitoring, and performance tracking
    • Published blog post about how to prompt Veo 3.1 with guidance on reference images, frame control, and image-to-video features
