Runway AI Release Notes

Last updated: Oct 30, 2025

Runway AI Products

All Runway AI Release Notes

  • Oct 24, 2025
    • Parsed from source:
      Oct 24, 2025
    • Detected by Releasebot:
      Oct 30, 2025

    Runway AI

    Paid Plans

    Create your own custom node-based workflows, chaining together multiple models, modalities and intermediary steps for even more control over your generations. Available now.

  • Oct 16, 2025
    • Parsed from source:
      Oct 16, 2025
    • Detected by Releasebot:
      Oct 30, 2025

    Runway API by Runway AI

    ElevenLabs Voice Dubbing

    ElevenLabs Dubbing is available directly via the Runway API. Translate your content into 29 languages with AI-generated speech that maintains the speaker’s original voice characteristics and emotional tone.

  • Oct 16, 2025
    • Parsed from source:
      Oct 16, 2025
    • Detected by Releasebot:
      Oct 30, 2025

    Runway API by Runway AI

    ElevenLabs Clean Audio

    ElevenLabs Voice Isolation is available directly via the Runway API. Strip background noise from any recording and isolate crisp, clear speech—built for film, podcast, and interview workflows.

  • Oct 15, 2025
    • Parsed from source:
      Oct 15, 2025
    • Detected by Releasebot:
      Oct 30, 2025

    Runway API by Runway AI

    Google Veo 3.1

    Google Veo 3.1 text to video and image to video are now available in the Runway API. Generate with even greater fidelity and control, with first- and last-keyframe support, the new Reference to Video feature and full 1080p outputs.

  • Oct 8, 2025
    • Parsed from source:
      Oct 8, 2025
    • Detected by Releasebot:
      Oct 30, 2025

    Runway API by Runway AI

    Flexible Generation Length

    Flexible generation times for Runway video models are now available via the Runway API. Choose any duration from 2-10 seconds using Gen-4 Turbo. Pay only for what you generate.
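
    A rough sketch of what this could look like from code, assuming the runwayml Python SDK and that its image_to_video.create call accepts an integer duration in seconds; the model name, image URL and polling loop below are illustrative, not a definitive reference.

      # Sketch only: assumes the runwayml Python SDK and that
      # image_to_video.create accepts an integer `duration` (seconds).
      import time
      from runwayml import RunwayML

      client = RunwayML()  # reads the API key from RUNWAYML_API_SECRET

      # Request a 7-second clip rather than a fixed 5s or 10s output.
      task = client.image_to_video.create(
          model="gen4_turbo",
          prompt_image="https://example.com/frame.jpg",  # placeholder URL
          prompt_text="Slow dolly-in on the subject at golden hour",
          ratio="1280:720",
          duration=7,  # any value from 2 to 10 seconds
      )

      # Poll the task until it finishes, then inspect the output.
      while True:
          task = client.tasks.retrieve(task.id)
          if task.status in ("SUCCEEDED", "FAILED"):
              break
          time.sleep(5)
      print(task.status, getattr(task, "output", None))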

  • Oct 8, 2025
    • Parsed from source:
      Oct 8, 2025
    • Detected by Releasebot:
      Oct 30, 2025

    Runway API by Runway AI

    ElevenLabs Text to Sound Effects

    ElevenLabs Text to Sound Effects is now available directly via the Runway API. Generate sound effects from text descriptions.

  • October 2025
    • No date parsed from source.
    • Detected by Releasebot:
      Oct 30, 2025

    Runway AI

    Introducing Runway Gen-4

    Runway Gen-4 is a groundbreaking multi-scene AI model that preserves consistent characters, objects, and environments across shots from a single reference. It adds production-ready video, physics-aware world modeling, and fast GVFX, enabling seamless storytelling without extra training.

    Introducing Runway Gen-4

    Our next-generation series of AI models for media generation and world consistency.

    A new generation of consistent and controllable media is here.
    With Gen-4, you are now able to precisely generate consistent characters, locations and objects across scenes. Simply set your look and feel and the model will maintain coherent world environments while preserving the distinctive style, mood and cinematographic elements of each frame. Then, regenerate those elements from multiple perspectives and positions within your scenes.

    Gen-4 can use visual references, combined with instructions, to create new images and videos with consistent styles, subjects, locations and more, giving you unprecedented creative freedom to tell your story.

    All without the need for fine-tuning or additional training.

    RUNWAY GEN-4

    Narrative Capabilities

    A collection of short films and music videos made entirely with Gen-4 to test the model's narrative capabilities.

    One simple interface, endless workflows and capabilities

    WORKFLOW – CONSISTENT CHARACTERS

    Infinite character consistency with a single reference image
    Runway Gen-4 allows you to generate consistent characters across endless lighting conditions, locations and treatments. All with just a single reference image of your characters.

    WORKFLOW – CONSISTENT OBJECTS

    Whatever you want, everywhere you need it
    Place any object or subject in any location or condition you need. Whether you’re crafting scenes for long form narrative content or generating product photography, Runway Gen-4 makes it simple to generate consistently across environments.

    WORKFLOW – COVERAGE

    Get every angle of any scene
    To craft a scene, simply provide reference images of your subjects and describe the composition of your shot. Runway Gen-4 will do the rest.

    CAPABILITIES – PRODUCTION-READY VIDEO

    A new standard for quality and language understanding for video generation
    Gen-4 excels at generating highly dynamic videos with realistic motion and consistent subjects, objects and styles, with superior prompt adherence and best-in-class world understanding.

    CAPABILITIES – PHYSICS

    A step towards Universal Generative Models that understand the world
    Runway Gen-4 represents a significant milestone in the ability of visual generative models to simulate real world physics.

    WORKFLOW – GVFX

    A new kind of visual effects
    Fast, controllable and flexible video generation that can seamlessly sit beside live action, animated and VFX content.

  • Sep 25, 2025
    • Parsed from source:
      Sep 25, 2025
    • Detected by Releasebot:
      Oct 30, 2025

    Runway API by Runway AI

    Text to Speech

    Starting today, ElevenLabs’ Multilingual v2 Text to Speech is available directly via the Runway API. Generate natural, emotionally-aware speech in 29 languages while maintaining consistent voice quality and personality.

  • Sep 24, 2025
    • Parsed from source:
      Sep 24, 2025
    • Detected by Releasebot:
      Oct 30, 2025

    Runway API by Runway AI

    API Playground

    Today we’re launching the Runway API Playground, a new interactive environment that lets developers test and refine their integrations before going to production. The Playground provides a full sandbox environment with all our latest models, so you can build with confidence and ship faster.

  • Sep 24, 2025
    • Parsed from source:
      Sep 24, 2025
    • Detected by Releasebot:
      Oct 30, 2025

    Runway AI

    Autoregressive-to-Diffusion Vision Language Models

    Introducing A2D-VL 7B, a diffusion-based vision language model that enables fast parallel generation by adapting an autoregressive VLM to diffusion decoding. It delivers faster inference with preserved quality, much lower training compute, and KV caching support, outperforming prior diffusion VLMs on VQA benchmarks.

    We adapt powerful pretrained VLMs for parallel diffusion decoding

    We present a novel diffusion VLM, A2D-VL 7B (Autoregressive-to-Diffusion) for parallel generation by finetuning an existing autoregressive VLM, Qwen2.5-VL, on the diffusion language modeling task. In particular, we adopt the masked diffusion framework which "noises" tokens by masking them and "denoises" tokens by predicting the original tokens. We propose novel adaptation techniques (Fig. 2) that gradually increase the task difficulty during finetuning to smoothly transition from sequential to parallel decoding while preserving the base model's capabilities.
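
    As a rough illustration of the masked diffusion framework described above (not the authors' code), the sketch below noises a batch of token sequences by random masking and trains the model to recover the original tokens at the masked positions; the model, optimizer and mask-token id are hypothetical placeholders, and the real objective also reweights the loss by the noise level.

      # Illustrative masked-diffusion training step in PyTorch; `model`,
      # `mask_id` and the input tensors are hypothetical placeholders.
      import torch
      import torch.nn.functional as F

      def masked_diffusion_step(model, optimizer, tokens, mask_id):
          # tokens: (batch, seq_len) integer ids of the clean sequences.
          # Sample a noise level t ~ U(0, 1) per sequence and mask each
          # token independently with probability t ("noising").
          t = torch.rand(tokens.size(0), 1, device=tokens.device)
          is_masked = torch.rand(tokens.shape, device=tokens.device) < t
          noised = torch.where(is_masked, torch.full_like(tokens, mask_id), tokens)

          # "Denoising": predict the original ids at the masked positions only.
          logits = model(noised)                      # (batch, seq_len, vocab)
          loss = F.cross_entropy(logits[is_masked], tokens[is_masked])

          optimizer.zero_grad()
          loss.backward()
          optimizer.step()
          return loss.item()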

    Specifically, we use two adaptation techniques for finetuning autoregressive models into diffusion models while retaining the base model's core capabilities (a minimal sketch of both schedules follows the list):

    • Block size annealing. The block diffusion framework enables interpolation between autoregressive and diffusion modeling: when blocks contain single tokens, we recover sequential decoding. We leverage this by gradually increasing the diffusion prediction window throughout finetuning, starting from smaller blocks and progressing to our target size of 8 tokens. This gradual progression prevents the aggressive parameter updates that would otherwise erase the base model's capabilities.

    • Noise level annealing. Within each token block, we apply position-dependent masking to gradually transition from easier to harder prediction tasks. Early in training, we mask the left-most tokens closest to the context more frequently (since they're easier to predict) and right-most tokens less frequently (since they're harder to predict). As training progresses, masking becomes uniform across positions, enabling any-order parallel generation within each block.
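
    A minimal sketch of both schedules, purely to make them concrete: beyond the final block size of 8, the concrete shapes and rates below are illustrative assumptions, not the values used to train A2D-VL.

      # Hypothetical annealing schedules; only the final block size of 8
      # comes from the text above, the rest is illustrative.
      import torch

      def block_size_at(step, total_steps, final_block=8):
          # Grow the diffusion prediction window from 1 token
          # (near-autoregressive) up to the target block size.
          frac = step / total_steps
          return max(1, round(1 + frac * (final_block - 1)))

      def mask_probs_at(step, total_steps, block_size):
          # Interpolate from a left-biased profile (positions closest to
          # the context masked most often) to a uniform profile.
          anneal = 1.0 - step / total_steps                  # 1 -> 0 over training
          pos = torch.arange(block_size, dtype=torch.float)
          left_biased = 1.0 - pos / max(block_size - 1, 1)   # 1.0 down to 0.0
          uniform = torch.full((block_size,), 0.5)
          return anneal * left_biased + (1.0 - anneal) * uniform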

    We ablate these strategies in Fig. 3, which shows that both are important for preserving the base model's benchmark performance. Concurrent work also adapts Qwen2.5 for block diffusion decoding, but on NLP-only tasks. In contrast, we explore vision language models and propose adaptation techniques that are critical for retaining model capabilities.

    Our design overcomes the limitations of prior diffusion VLMs:

    • Efficient training. By adapting existing VLMs, our approach requires significantly less training than training diffusion VLMs from scratch. While LLaDA-V 8B trains on ≥12M visual QA pairs, our A2D-VL 7B is trained on 400K pairs.

    • Modern architecture. By adapting Qwen2.5-VL, we adopt their modern architectural components, such as support for native visual resolutions and multimodal positional encodings.

    • Improved quality in long-form responses. We use diffusion decoding in blocks of 8 tokens, which enhances both response quality and the model's ability to generate arbitrary-length outputs. Further, our training data contains 100K high-quality reasoning traces distilled from the larger Qwen2.5-VL 72B, whereas prior diffusion VLMs rely on standard instruction-tuning data, with some also distilling from a 7B math/science reasoning model. To enhance response flexibility, we also include 50K samples from MAmmoTH-VL in our data mixture, following prior work.

    • KV caching support. Under the block diffusion training and inference framework, A2D-VL sequentially generates a block of tokens at a time using block-causal attention (attending only to previous blocks and tokens within the current block) rather than fully bidirectional attention. As a result, A2D-VL supports exact KV caching of previously generated blocks instead of relying on approximate methods; the attention pattern behind this is sketched below.
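
    To make that attention pattern concrete, here is a minimal sketch of a block-causal mask; the helper is illustrative and not taken from the A2D-VL implementation.

      # Illustrative block-causal attention mask: entry [i, j] is True
      # when token i may attend to token j.
      import torch

      def block_causal_mask(seq_len: int, block_size: int) -> torch.Tensor:
          blocks = torch.arange(seq_len) // block_size  # block index per token
          # Token i attends to token j iff j's block is not after i's block:
          # full attention within a block, causal attention across blocks.
          return blocks.unsqueeze(1) >= blocks.unsqueeze(0)

      print(block_causal_mask(seq_len=12, block_size=4).int())

    Because no token ever attends into a future block, the key/value cache of completed blocks never changes, which is what makes the caching exact rather than approximate.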

    We strike a new balance between speed and performance

    By adapting pretrained autoregressive VLMs for diffusion, A2D-VL strikes an improved balance between inference speed and downstream performance. We compare the speed-quality trade-off between A2D-VL 7B, Qwen2.5-VL 7B, and the diffusion VLM LLaDA-V 7B. For LLaDA-V, we follow the recommended settings: approximate KV caching with recomputation every 32 steps and "factor"-based confidence thresholding.

    Detailed image captioning

    We generate detailed image captions (≤ 512 tokens) and, similarly to prior work, score them against captions generated by GPT-4o, GPT-4V, and Gemini-1.5-Pro using BERTScore to measure semantic similarity. Captions generated by A2D-VL achieve greater consistency with the reference captions compared to the prior diffusion VLM LLaDA-V.
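
    A small sketch of this scoring setup, assuming the open-source bert-score package; the caption strings are placeholders and the authors' exact evaluation configuration is not reproduced here.

      # Illustrative only: semantic similarity of generated captions
      # against reference captions via BERTScore (pip install bert-score).
      from bert_score import score

      candidate_captions = ["A red bicycle leaning against a brick wall."]
      reference_captions = ["A red bike propped up against an old brick wall."]

      # P, R and F1 are tensors with one entry per caption pair.
      P, R, F1 = score(candidate_captions, reference_captions, lang="en")
      print(f"BERTScore F1: {F1.mean().item():.4f}")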

    Chain-of-thought reasoning

    A2D-VL consistently achieves better MMMU-Pro accuracy with chain-of-thought prompting compared to LLaDA-V. For Qwen2.5 and A2D-VL, we generate up to 16k tokens. For LLaDA-V, we limit the response to 512 tokens as the accuracy degrades at longer output lengths.

    General visual understanding

    A2D-VL outperforms prior diffusion VLMs on 3 out of 5 visual question-answering benchmarks with minimal performance degradation relative to the base Qwen model.

    Conclusion

    We introduce Autoregressive-to-Diffusion (A2D) vision language models for faster, parallel generation by adapting existing autoregressive VLMs to diffusion decoding. A2D-VL outperforms prior diffusion VLMs in visual question-answering while requiring significantly less training compute. Our novel adaptation techniques are critical for retaining model capabilities, finally enabling the conversion of state-of-the-art autoregressive VLMs to diffusion with minimal impact on quality.

