AI Image and Video Release Notes
Release notes for AI image, text-to-video, image editing, and generative media platforms
Latest AI Image and Video Updates
- April 2026
- First seen by Releasebot: Apr 16, 2026
Introducing Stable Virtual Camera: Multi-View Video Generation with 3D Camera Control
Stability AI releases Stable Virtual Camera in research preview, turning 2D images into immersive 3D videos with realistic depth and perspective. It adds precise user-controlled camera paths, flexible inputs, and smooth long-form outputs for research use.
Capabilities
Today, we're releasing Stable Virtual Camera, currently in research preview. This multi-view diffusion model transforms 2D images into immersive 3D videos with realistic depth and perspective—without complex reconstruction or scene-specific optimization. We invite the research community to explore its capabilities and contribute to its development.
A virtual camera is a digital tool used in filmmaking and 3D animation to capture and navigate digital scenes in real-time. Stable Virtual Camera builds upon this concept, combining the familiar control of traditional virtual cameras with the power of generative AI to offer precise, intuitive control over 3D video outputs.
Unlike traditional 3D video models that rely on large sets of input images or complex preprocessing, Stable Virtual Camera generates novel views of a scene from one or more input images at user-specified camera angles. The model produces consistent and smooth 3D video outputs, delivering seamless trajectory videos across dynamic camera paths.
The model is available for research use under a Non-Commercial License. You can read the paper here, download the weights on Hugging Face, and access the code on GitHub.
Stable Virtual Camera offers advanced capabilities for generating 3D videos, including:
- Dynamic Camera Control: Supports user-defined camera trajectories as well as multiple dynamic camera paths, including: 360°, Lemniscate (∞-shaped path), Spiral, Dolly Zoom In, Dolly Zoom Out, Zoom In, Zoom Out, Move Forward, Move Backward, Pan Up, Pan Down, Pan Left, Pan Right, and Roll (see the pose-path sketch after this list).
- Flexible Inputs: Generates 3D videos from just one input image or up to 32.
- Multiple Aspect Ratios: Capable of producing videos in square (1:1), portrait (9:16), landscape (16:9), and other custom aspect ratios without additional training.
- Long Video Generation: Ensures 3D consistency in videos up to 1,000 frames, enabling seamless loops and smooth transitions, even when revisiting the same viewpoints.
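To make the camera-path presets concrete, here is a minimal numpy sketch of how a 360° orbit could be expressed as a sequence of camera-to-world poses. The pose convention, function names, and parameters are illustrative assumptions, not the model's actual input format; the GitHub repo documents the trajectory format Stable Virtual Camera expects.

```python
# Illustrative only: build a "360°" orbit of 4x4 camera-to-world poses around
# the world origin. The look-at convention (+z toward the target) is an
# assumption; check the Stable Virtual Camera repo for the expected format.
import numpy as np

def look_at(eye, target, up):
    """Return a 4x4 camera-to-world matrix looking from `eye` toward `target`."""
    forward = (target - eye) / np.linalg.norm(target - eye)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    pose = np.eye(4)
    pose[:3, 0], pose[:3, 1], pose[:3, 2], pose[:3, 3] = right, true_up, forward, eye
    return pose

def orbit_360(num_frames=80, radius=2.0, height=0.5):
    """Evenly spaced poses on a circle around the origin (the 360° preset)."""
    angles = np.linspace(0.0, 2.0 * np.pi, num_frames, endpoint=False)
    target, up = np.zeros(3), np.array([0.0, 1.0, 0.0])
    return np.stack([
        look_at(np.array([radius * np.cos(a), height, radius * np.sin(a)]), target, up)
        for a in angles
    ])

poses = orbit_360()
print(poses.shape)  # (80, 4, 4): one camera pose per generated frame
```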
Research & model architecture
Stable Virtual Camera achieves state-of-the-art results in novel view synthesis (NVS) benchmarks, outperforming models like ViewCrafter and CAT3D. It excels in both large-viewpoint NVS, which emphasizes generation capacity, and small-viewpoint NVS, which prioritizes temporal smoothness.
Stable Virtual Camera is trained as a multi-view diffusion model with a fixed sequence length, using a set number of input and target views (M-in, N-out). During sampling, it functions as a flexible generative renderer, accommodating variable input and output lengths (P-in, Q-out). This is achieved through a two-pass procedural sampling process—first generating anchor views, then rendering target views in chunks to ensure smooth and consistent results.
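In pseudocode, the two-pass procedure reads roughly as below. `sample_views` is a stub standing in for one fixed-length (M-in, N-out) multi-view diffusion call; it is a placeholder for illustration, not the repo's actual API.

```python
def sample_views(cond_views, cameras):
    # Placeholder for one fixed-length multi-view diffusion call; returns
    # dummy frames here so the control flow below is runnable.
    return [f"frame@{i}" for i, _ in enumerate(cameras)]

def generative_render(input_views, target_cameras, chunk_size=8):
    # Pass 1: generate a sparse set of anchor views along the trajectory.
    anchor_cameras = target_cameras[::chunk_size]
    anchors = sample_views(cond_views=input_views, cameras=anchor_cameras)

    # Pass 2: render the remaining target views chunk by chunk, conditioning
    # on the inputs plus the anchors so the full video stays consistent.
    frames = []
    for start in range(0, len(target_cameras), chunk_size):
        chunk = target_cameras[start:start + chunk_size]
        frames += sample_views(cond_views=input_views + anchors, cameras=chunk)
    return frames

print(len(generative_render(["input.png"], list(range(32)))))  # 32 frames
```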
For a deeper dive into the model’s architecture and performance, you can read the full research paper here.
Model limitations
In its initial version, Stable Virtual Camera may produce lower-quality results in certain scenarios. Input images featuring humans, animals, or dynamic textures like water often lead to degraded outputs. Additionally, highly ambiguous scenes, complex camera paths that intersect objects or surfaces, and irregularly shaped objects can cause flickering artifacts, especially when target viewpoints differ significantly from the input images.
Get started
Stable Virtual Camera is free to use for research purposes under a Non-Commercial License. You can read the paper, download the weights on Hugging Face, and access the code on GitHub.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stable Diffusion Now Optimized for AMD Radeon™ GPUs and Ryzen™ AI APUs
Stability AI releases AMD-optimized Stable Diffusion models on Hugging Face, bringing faster ONNX performance for Radeon GPUs and Ryzen AI APUs. The update includes Stable Diffusion 3.5 and SDXL variants, with accelerated inference and support through Amuse 3.0.
Key Takeaways
We’ve collaborated with AMD to deliver select ONNX-optimized versions of the Stable Diffusion family of models, engineered to run faster and more efficiently on AMD Radeon™ GPUs and Ryzen™ AI APUs.
AMD-optimized versions of Stable Diffusion 3.5 Large, Stable Diffusion 3.5 Large Turbo, Stable Diffusion XL 1.0, and Stable Diffusion XL Turbo are now available on Hugging Face and suffixed with “_amdgpu”. End users can try out the AMD optimized models using Amuse 3.0.
You can learn more about the technical details of these speed upgrades on AMD’s blog post.
We’ve collaborated with AMD to deliver select ONNX-optimized versions of the Stable Diffusion model family, engineered to run faster and more efficiently on AMD Radeon™ GPUs and Ryzen™ AI APUs. This joint engineering effort focused on maximizing inference performance without compromising model output quality or our open licensing. The result is a set of accelerated models that integrate into any ONNX Runtime-supported environment, making it easy to drop them into your existing workflows right out of the box. Whether you’re deploying Stable Diffusion 3.5 (SD3.5) variants, our most advanced image model, or Stable Diffusion XL Turbo (SDXL Turbo), these models are ready to power faster creative applications on AMD hardware.
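As a sketch of what dropping these models into an ONNX Runtime workflow can look like, the snippet below loads an SDXL variant through Hugging Face Optimum's ONNX Runtime pipeline on the DirectML execution provider. The repo id is a guess built from the "_amdgpu" suffix noted above, and it assumes the exported layout is compatible with Optimum's pipeline classes; check the Hugging Face listings and AMD's blog post for the exact setup.

```python
# Hedged sketch: pip install optimum[onnxruntime] onnxruntime-directml
from optimum.onnxruntime import ORTStableDiffusionXLPipeline

pipe = ORTStableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0_amdgpu",  # assumed repo id
    provider="DmlExecutionProvider",  # DirectML targets AMD Radeon GPUs on Windows
)
image = pipe("a photo of a lighthouse at dawn", num_inference_steps=30).images[0]
image.save("lighthouse.png")
```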
As generative visual media adoption accelerates, it’s essential our models are optimized for leading hardware. This collaboration ensures builders and businesses can integrate Stable Diffusion into their production pipelines, making workflows faster, more efficient, and ready to scale.
Available models
AMD has optimized four models across SD3.5 and SDXL for improved performance.
SD3.5 versions:
- Stable Diffusion 3.5 Large
- Stable Diffusion 3.5 Large Turbo

AMD-optimized SD3.5 models deliver up to 2.6x faster inference compared to the base PyTorch models.

SDXL versions:
- Stable Diffusion XL 1.0
- Stable Diffusion XL Turbo

With AMD optimization, SDXL 1.0 and SDXL Turbo achieve up to 3.8x faster inference compared to the base PyTorch models.
Analysis compares AMD-optimized model inference speed to the base PyTorch models. Testing was conducted using Amuse 3.0 RC and AMD Adrenalin 24.30.31.05 KB driver - 25.4.1 preview.
Get started
The AMD-optimized Stable Diffusion models are available now on Hugging Face and suffixed with “_amdgpu”. End users can also try out the AMD optimized models using Amuse 3.0. You can learn more about the technical details of these speed upgrades on AMD’s blog post.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stability AI and Arm Collaborate to Release Stable Audio Open Small, Enabling Real-World Deployment for On-Device Audio Generation
Stability AI releases Stable Audio Open Small, a compact text-to-audio model built to run entirely on Arm CPUs for fast on-device generation. It can create short audio samples on smartphones in under 8 seconds and is now free for commercial and non-commercial use.
Key Takeaways
We’re open-sourcing Stable Audio Open Small, a 341-million-parameter text-to-audio model optimized to run entirely on Arm CPUs. Designed for quickly generating short audio samples, it can produce up to 11 seconds of audio on a smartphone in less than 8 seconds.
This release builds on our collaboration with Arm to bring generative audio creation to smartphones, following our recent announcement at Mobile World Congress.
Developers can explore the new Arm Learning Path, which offers hands-on guidance using Stable Audio Open Small on Arm CPUs.
Stable Audio Open Small is now free for commercial and non-commercial use under the permissive Stability AI Community License. You can read the paper on arXiv, download the model weights on Hugging Face, and access the code on GitHub.
Bringing generative audio creation to mobile phones
We’re open-sourcing Stable Audio Open Small in partnership with Arm, whose technology powers 99% of smartphones globally. Building on the industry-leading text-to-audio model Stable Audio Open, the new compact variant is smaller and faster, while preserving output quality and prompt adherence.
This release follows our previously announced breakthrough that Stable Audio Open is now optimized to run on Arm CPUs, powered by Arm KleidiAI to enable AI-generated audio on a mobile phone. After demonstrating the technology in action at Mobile World Congress, Stability AI and Arm are now making the model weights available for anyone to access and deploy.
Technical advancements
To our knowledge, Stable Audio Open Small is the fastest stereo text-to-audio model on the market. You can read more about the technical advancements of the model in the research paper. Here are a few highlights:
- Lightweight: Stable Audio Open Small has 341M parameters, compared to Stable Audio Open’s 1.1B parameters.
- Fast: Stable Audio Open Small is optimized to generate audio on a mobile phone in less than 8 seconds. It’s faster to generate, and faster to fine-tune.
- Efficient: Leveraging Arm’s KleidiAI libraries, we designed this new model to run even more efficiently at the edge, so users get faster results back while lowering costs for compute time. By running entirely on Arm CPUs, Stable Audio Open Small is also accessible without heavy hardware requirements.
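For a sense of the developer workflow, here is a minimal generation sketch using the open-source stable-audio-tools library. The repo id, step count, and CFG scale are assumptions patterned on the published Stable Audio Open example code; the model card and the Arm Learning Path are the authoritative references for the Small variant.

```python
import torch
import torchaudio
from einops import rearrange
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond

device = "cuda" if torch.cuda.is_available() else "cpu"

# Repo id assumed from the naming above; check Hugging Face for the exact one.
model, config = get_pretrained_model("stabilityai/stable-audio-open-small")
model = model.to(device)

# Conditioning: a text prompt plus the requested clip length in seconds.
conditioning = [{"prompt": "128 BPM tech house drum loop",
                 "seconds_start": 0, "seconds_total": 11}]

# Step count and CFG scale are assumed values; see the model card.
audio = generate_diffusion_cond(model, steps=8, cfg_scale=1.0,
                                conditioning=conditioning,
                                sample_size=config["sample_size"], device=device)

audio = rearrange(audio, "b d n -> d (b n)")            # (channels, samples)
audio = (audio / audio.abs().max()).to(torch.float32)   # peak-normalize
torchaudio.save("drum_loop.wav", audio.cpu(), config["sample_rate"])
```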
When to use the model
Like Stable Audio Open, Stable Audio Open Small is optimized for generating short audio samples, sound effects and production elements using text prompts. It is well suited for creating drum loops, foley, instrument riffs, and ambient textures.
Its compact size and fast inference make it a perfect fit for on-device deployment on Arm-powered smartphones and edge devices, where real-time generation and responsiveness matter.
As AI-driven creative media workloads move to the edge, smaller models help align compute resources with task complexity. By using different model sizes, organizations can allocate workloads to the processors best suited to their use case, like generating short sound effects versus full-length songs.
Getting started
Stable Audio Open Small is now free for commercial and non-commercial use under the permissive Stability AI Community License. You can read the paper on arXiv, download the model weights on Hugging Face, and access the code on GitHub.
Visit the Arm Learning Path to walk through deploying Stable Audio Open Small on Arm hardware as well as the Arm Community Blog for a deep technical dive into how Stable Audio Open Small was optimized for on-device performance.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stable Video 4D 2.0: New Upgrades for High-Fidelity Novel-Views and 4D Generation from a Single Video
Stability AI upgrades Stable Video 4D to 2.0, bringing sharper, more consistent 4D outputs from a single video and stronger real-world video performance. The model now supports commercial and non-commercial use under the Stability AI Community License.
Key Takeaways
We’ve upgraded Stable Video Diffusion 4D (SV4D) to Stable Video 4D 2.0 (SV4D 2.0), delivering higher-quality outputs on real-world video.
Our analysis shows that SV4D 2.0 achieves state-of-the-art results in both 4D generation and novel-view synthesis.
Stable Video 4D 2.0 is now available for both commercial and non-commercial use under the permissive Stability AI Community License.
You can download the multi-view generation models on Hugging Face, find the code on GitHub, and read about the 4D asset reconstruction process on arXiv.
Stable Video 4D 2.0
We’ve upgraded Stable Video Diffusion 4D (SV4D) to Stable Video 4D 2.0 (SV4D 2.0), delivering higher-quality outputs on real-world video. This multi-view video diffusion model is ideal for dynamic 4D asset generation from a single object-centric video. These upgrades make it easier to create dynamic 4D assets for professional production workflows, from generating sprite sheets for in-game characters, to supporting assets for film and virtual worlds.
Multi-view generation remains complex due to the inherent ambiguity of visualizing 3D objects from unseen views. This is especially difficult when subjects are in motion. SV4D 2.0 makes incremental progress toward addressing this challenge by producing consistent, multi-angle outputs without relying on large datasets, multi-camera setups, or preprocessing. While this represents a step forward, occasional artifacts may still appear with dynamic motion.
What’s new
We’ve made multiple upgrades to SV4D 2.0, including:
- Sharper and Coherent 4D Outputs: The model was trained in phases, starting with static 3D assets and then adding motion, resulting in clearer and more consistent 4D results.
- No Reference Views Required: Works directly from a single video, eliminating the need for multi-view reference images.
- Redesigned Network Architecture: Utilizes 3D attention, a mechanism that fuses 3D spatial and temporal features, improving spatio-temporal consistency without relying on reference views (see the conceptual sketch after this list).
- Improved Real-World Generalization: Performs more consistently on real-world videos. While trained on synthetic data, the model retains world knowledge from pre-trained video models.
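As a conceptual illustration of the 3D attention idea (not SV4D 2.0's actual block design), the sketch below fuses view, time, and spatial tokens into a single attention pass, which is what lets information flow jointly across viewpoints and frames:

```python
# Conceptual sketch: one attention pass over all views x frames x spatial
# tokens, rather than separate view-only and time-only attention stages.
import torch
import torch.nn as nn

class Joint3DAttention(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (batch, views, frames, tokens, dim)
        b, v, f, t, d = x.shape
        x = x.reshape(b, v * f * t, d)   # flatten views, time, and space together
        out, _ = self.attn(x, x, x)
        return out.reshape(b, v, f, t, d)

x = torch.randn(1, 4, 5, 16, 64)      # 4 views, 5 frames, 16 spatial tokens
print(Joint3DAttention(64)(x).shape)  # torch.Size([1, 4, 5, 16, 64])
```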
Research and benchmarking
Our analysis shows that SV4D 2.0 achieves state-of-the-art results in 4D generation. It ranks first across all major benchmarks: LPIPS (Image fidelity), FVD-V (Multi-view consistency), FVD-F (Temporal coherence), and FV4D (4D consistency). Compared to DreamGaussian4D, L4GM, and SV4D, this version generates sharper and more consistent 4D outputs.
Our analysis also shows that SV4D 2.0 outperforms Diffusion^2, SV3D, and SV4D on novel-view synthesis. The model significantly improves multi-view consistency (FVD-V) and temporal coherence (FVD-F), maintaining high-quality outputs across both changing viewpoints and time. You can read more about the technical advancements of the model in the research paper.
Getting started
Stable Video 4D 2.0 is now available for both commercial and non-commercial use under the permissive Stability AI Community License.
You can download the multi-view generation models on Hugging Face, find the code on GitHub, and read about the 4D asset reconstruction process on arXiv.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stable Diffusion 3.5 Models Optimized with TensorRT Deliver 2X Faster Performance and 40% Less Memory on NVIDIA RTX GPUs
Stability AI releases NVIDIA TensorRT-optimized Stable Diffusion 3.5 models, bringing faster image generation and lower VRAM use to more RTX GPUs. The update improves access for creative professionals and developers, with commercial and non-commercial use available under the Stability AI Community License.
Key Takeaways
We've collaborated with NVIDIA to deliver NVIDIA TensorRT-optimized versions of Stable Diffusion 3.5 (SD3.5), making enterprise-grade image generation available on a wider range of NVIDIA RTX GPUs.
The SD3.5 TensorRT-optimized models deliver up to 2.3x faster generation on SD3.5 Large and 1.7x faster on SD3.5 Medium, while reducing VRAM requirements by 40%.
The optimized models are now available for commercial and non-commercial use under the permissive Stability AI Community License. You can download the weights on Hugging Face and the code on NVIDIA’s GitHub.
In collaboration with NVIDIA, we've optimized the SD3.5 family of models using TensorRT and FP8, improving generation speed and reducing VRAM requirements on supported RTX GPUs.
SD3.5 was developed to run on consumer hardware out of the box. The NVIDIA optimizations extend that accessibility further for creative professionals and developers working across a variety of hardware setups.
Where the models excel
These performance improvements make SD3.5's core strengths more accessible. SD3.5 excels in the following areas, making it one of the most customizable image models on the market, while maintaining top-tier performance in prompt adherence and image quality:
- Versatile Styles: Capable of generating a wide range of styles and aesthetics like 3D, photography, painting, line art, and virtually any visual style imaginable.
- Diverse Outputs: Creates images representative of the world, not just one type of person, with different skin tones and features, without the need for extensive prompting.
- Prompt Adherence: Our analysis shows that SD3.5 Large leads the market in prompt adherence, allowing the model to closely follow a given text prompt, making it a top choice for efficient, high-quality performance.
Now available across more NVIDIA RTX GPUs
TensorRT optimization reduces model size while maintaining quality by streamlining how models run on NVIDIA hardware. The size reduction comes from FP8 quantization, a technique that makes models more efficient while maintaining high output quality. These improvements mean that five RTX 50 Series systems can now run SD3.5 Large entirely in memory, compared to just one system before optimization.
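As a toy illustration of the idea behind FP8 quantization (TensorRT's actual calibration pipeline is considerably more sophisticated), the snippet below scales a weight tensor into FP8's representable range, stores it at one byte per element, and dequantizes it for use:

```python
import torch

w = torch.randn(4096, 4096, dtype=torch.bfloat16)    # a BF16 weight matrix
scale = w.abs().max().float() / 448.0                 # 448 = max normal value of E4M3
w_fp8 = (w.float() / scale).to(torch.float8_e4m3fn)   # 1 byte/elem vs 2 for BF16
w_deq = w_fp8.to(torch.float32) * scale               # dequantize for computation

print(f"storage: {w_fp8.untyped_storage().nbytes() / 2**20:.0f} MiB "
      f"vs {w.untyped_storage().nbytes() / 2**20:.0f} MiB in BF16")
print(f"max abs error: {(w.float() - w_deq).abs().max().item():.4f}")
```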
Enhanced performance across NVIDIA RTX GPUs
SD3.5 TensorRT-optimized models run more efficiently across NVIDIA GeForce RTX 50 and 40 Series GPUs, as well as NVIDIA Blackwell and Ada Lovelace generation NVIDIA RTX PRO GPUs. They deliver up to 2.3x faster generation on SD3.5 Large and 1.7x faster on SD3.5 Medium, while reducing VRAM requirements by 40%.
FP8 TensorRT boosts SD3.5 Large performance by 2.3x vs. BF16 PyTorch, with 40% less memory use. For SD3.5 Medium, BF16 TensorRT delivers a 1.7x speedup.
SD3.5 Large:
- 2.3x faster image generation compared to the base PyTorch model.
- Memory use reduced by 40%, from 19GB to 11GB, while maintaining professional quality.

SD3.5 Medium:
- 1.7x faster image generation for users prioritizing speed and efficiency.
- Lower memory footprint, ideal for creators working on mid-range RTX hardware.
Getting started
The optimized models are now available for commercial and non-commercial use under the permissive Stability AI Community License. You can download the weights on Hugging Face and the code on NVIDIA’s GitHub.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Introducing Stability AI Solutions: Generative AI Solutions to Accelerate Enterprise Creative Production
Stability AI introduces Stability AI Solutions, an enterprise offering for scaling creative production with generative AI. It adds custom models, flexible deployment, brand safety guardrails, compliance, indemnification, and dedicated support, with initial solutions for marketing, advertising, and design.
Today we’re introducing Stability AI Solutions, a new offering designed to help enterprises scale creative production with generative AI.
Each solution delivers custom models and workflows built with leading media generation and editing tools, along with everything needed to meet the standards of enterprise production: professional services, flexible deployment options, and built-in features such as brand safety guardrails, indemnification, compliance, and dedicated support.
Stability AI Solutions was developed in response to what we hear consistently from the market: increased demand to accelerate creative work without sacrificing quality or brand integrity. While many organizations see the potential of generative AI to address this challenge, they often struggle to turn that potential into real-world results.
We created the solutions offering to bridge this gap by partnering with enterprises to provide both the technology and the expertise needed to drive real business outcomes with generative AI.
Prem Akkaraju, Chief Executive Officer of Stability AI, said:
“We believe real transformation happens when technology is built around the needs of creatives, not the other way around. With the launch of Stability AI Solutions, we’re bringing that same philosophy to the enterprise. Many organizations have experimented with generative AI, but most tools fall short when it comes to the precision and control required for real production use. What’s needed is a partner, not just a platform. That is exactly what Stability AI Solutions provides.”
What’s available today
Our initial suite of solutions is tailored for the Marketing / Advertising / Design verticals, with more in development for Entertainment and Gaming:
- Stability AI for Product Photography: Transform a single product shot into photorealistic variations across different backgrounds, models, lighting, and styles, accelerating production timelines while scaling product shots.
- Stability AI for Brand Style: Generate media adhering to specific brand style standards, such as visual aesthetic, color palettes, sonic identity, and lighting. This ensures brand consistency across AI-generated creative assets.
- Stability AI for Product Concepting & Design: Develop new products and creative assets through rapid iteration and concept refinement capabilities, with custom sketch-to-image and image-to-image workflows.
- Stability AI for Digital Twins: Train custom models on intellectual property (IP) or likenesses, such as brand mascots or fashion models, to generate new assets with the appropriate usage rights licensed by the IP owner.
We’re already partnering with leading enterprises to accelerate creative production across Marketing / Advertising / Design use cases. Examples include fashion retailers transforming a single product shot into dozens of PDP-ready (product detail page) variants; apparel brands creating photorealistic design concepts from a sketch; and entertainment companies bringing beloved characters to life in new ways.
Options for deployment
Stability AI Solutions can be deployed in a variety of ways to meet different enterprise needs. Workflows can run on-premises for organizations that require full control over infrastructure. They can also be accessed through secure API endpoints for fully managed hosting and integration into existing systems, or used via web-based applications for quick access by creative teams. This flexibility allows teams to adopt generative AI in a way that aligns with their technical, operational, and security requirements.
As part of our ongoing collaboration with WPP, Stability AI Solutions will also be available for deployment through WPP Open. Additionally, we’re actively co-developing new use cases to support the evolving needs of WPP clients.
Stephan Pretorius, Chief Technology Officer of WPP, said:
“Integrating Stability AI’s solutions directly within WPP Open, our AI-powered marketing services platform, helps our clients stay at the forefront of innovation. Stability AI’s ability to deliver precise, customizable solutions across a range of marketing use cases ensures brand consistency on every level, while also unlocking entirely new creative possibilities.”
Getting started
You can learn how to get started on the Stability AI Solutions page or connect with an expert here.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stability AI and NVIDIA Bring Faster Performance and Simplified Enterprise Deployment with the Stable Diffusion 3.5 NIM
Stability AI launches a new NVIDIA NIM microservice for Stable Diffusion 3.5, making enterprise deployment faster and simpler with performance gains, consolidated container support, and availability at build.nvidia.com.
Key Takeaways
Stability AI is launching a new NVIDIA NIM microservice for Stable Diffusion 3.5, making it faster and simpler for enterprises to deploy our most advanced models.
The SD3.5 NIM is available at build.nvidia.com. Users can download model weights directly from Hugging Face.
The models are available for commercial and non-commercial use under our permissive Stability AI Community License. For enterprises with annual revenue over $1M, please contact us to discuss our Enterprise Licensing, which offers additional support and customization options.
We're excited to announce our collaboration with NVIDIA to launch the Stable Diffusion 3.5 NIM microservice, enabling significant performance improvements and streamlined enterprise deployment for our leading image generation models. The SD3.5 NIM supports Stable Diffusion 3.5 Large, with expanded model compatibility planned for future releases.
The SD3.5 NIM delivers faster image generation on enterprise hardware. This enables complex image generation workflows that weren’t as feasible before.
A faster, easier way for enterprises to deploy Stable Diffusion models
A NIM provides a simplified, optimized way to run AI inference by packaging inference engines, APIs, and model configurations into secure, portable containers. Think of it as a pre-configured, enterprise-ready package that eliminates the complexity of setting up and optimizing AI models from scratch.
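In practice, a deployed NIM container is consumed as a plain HTTP service. The sketch below shows the general shape of such a request; the endpoint path and payload fields are illustrative assumptions, and the authoritative API schema lives in the NIM documentation at build.nvidia.com.

```python
# Hedged sketch of calling a locally deployed SD3.5 NIM container over HTTP.
import base64
import requests

resp = requests.post(
    "http://localhost:8000/v1/infer",  # assumed local NIM endpoint path
    json={
        "prompt": "studio photo of a ceramic teapot, softbox lighting",
        "steps": 30,                   # assumed field names; see the NIM docs
    },
    timeout=120,
)
resp.raise_for_status()
# Assumed response shape: a base64-encoded image in the JSON body.
with open("teapot.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["image"]))
```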
The SD3.5 NIM delivers performance gains that improve efficiency and ease of deployment for enterprises:
- Speed improvements: 1.8x performance gains over PyTorch, with testing on NVIDIA H100 GPUs showing TensorRT-optimized generation at 3,700ms compared to 6,800ms for standard PyTorch on SD3.5 Large.
- Consolidated deployment: The SD3.5 NIM supports the SD3.5 Large model with Depth and Canny ControlNets within a single container, meaning that instead of needing separate deployments for each model, users get all versions packaged together.
The SD3.5 NIM supports enterprise and data center Ada and Blackwell GPUs.
Advanced workflows made practical for enterprise creative teams
These optimizations enable faster iteration cycles, larger batch processing, and more complex workflows that were previously less practical due to hardware limitations.
The efficiency gains are particularly valuable for advanced workflows, such as running multiple models simultaneously. The performance improvements make complex, multi-model workflows more feasible, opening new possibilities for advanced users and making rapid scaling a simpler option when utilizing cloud deployments.
Get started
The SD3.5 NIM is available at build.nvidia.com. Users can download Stable Diffusion 3.5 model weights directly from Hugging Face.
For smaller organizations and researchers getting started, the optimized models are available for commercial and non-commercial use under the permissive Stability AI Community License.
For enterprises with annual revenue over $1M, please contact us to discuss our Enterprise Licensing, with implementation support, customization options and professional services available. You can also visit Stability AI Solutions to learn more about customizing models and workflows for specific use cases.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stability AI Introduces Stable Audio 2.5, the First Audio Model Built for Enterprise Sound Production at Scale
Stability AI launches Stable Audio 2.5, an enterprise-focused audio generation model with faster generation, stronger musical composition, and audio inpainting for more control. It’s available on StableAudio.com, via API and partner platforms, with on-premises enterprise options.
Key Takeaways
We’re launching Stable Audio 2.5, the first audio generation model designed specifically for enterprise-grade sound production.
Customized sound is an untapped differentiator for brands. Enterprises need to create their distinct sound for a growing volume of channels, from ads to the in-store experience.
Stable Audio 2.5 is purpose-built for this challenge: creating customizable, high-quality audio at scale. That includes elevated musical composition, fast inference at less than two seconds on a GPU, and support for more control with audio inpainting.
You can try Stable Audio 2.5 now at StableAudio.com or seamlessly deploy through the Stability AI API; partner platforms such as fal, Replicate, and ComfyUI; and on-premises with an enterprise license.
We’re excited to release Stable Audio 2.5, our latest audio model and the first developed for enterprise-grade use cases. Stable Audio 2.5 introduces advancements in quality and control that address the demand for dynamic compositions that can be adapted for custom brand needs.
Custom audio can make a brand eight times more memorable, but only 6% of creative uses a sound identity, according to Ipsos research. To deploy sound more strategically as an extension of their brand, enterprises need to create audio that’s high-quality, commercial-grade, and adaptable for the different places a brand shows up.
With the enterprise-focused capabilities of Stable Audio 2.5, professional creative teams can leverage more advanced, customizable audio generation to give every production the right sound.
What’s new: Faster generation, smarter composition, enhanced workflows
Stable Audio 2.5 brings advancements in speed and output quality that make it well-suited for commercial use cases.
Generate three-minute tracks within seconds: Post-trained using the cutting-edge Adversarial Relativistic-Contrastive (ARC) method pioneered by the Stable Audio research team, Stable Audio 2.5 has an inference speed of less than two seconds on a GPU for tracks up to three minutes.
Produce dynamic musical compositions: Stable Audio 2.5 is optimized for music and has improved musical structure, generating multi-part compositions (intro, development, and outro). The model also has improved prompt adherence, responding more effectively to mood descriptors (such as “uplifting”) and musical language across genres (“lush synthesizers”).
Get more control with audio inpainting support: In addition to text-to-audio and audio-to-audio workflows, Stable Audio 2.5 supports audio inpainting, which means users can input their own audio, select where they want it to start, and the model will use the context to generate the rest of the track. Note: Our Terms of Service require that uploads be free of copyrighted material, and we use advanced content recognition to maintain compliance and prevent infringement.
Like all Stable Audio models, Stable Audio 2.5 is commercially safe and trained on a fully licensed dataset.
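For API users, a text-to-audio call has roughly the shape sketched below. The endpoint route and form fields are assumptions patterned on Stability AI's v2beta REST API family; confirm the exact schema at platform.stability.ai before relying on it.

```python
# Hedged sketch of a Stable Audio 2.5 text-to-audio request.
import os
import requests

resp = requests.post(
    "https://api.stability.ai/v2beta/audio/stable-audio-2/text-to-audio",  # assumed route
    headers={
        "authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
        "accept": "audio/*",
    },
    files={"none": ""},  # forces multipart/form-data, as the image endpoints do
    data={
        "prompt": "uplifting track with lush synthesizers, clear intro and outro",
        "duration": 95,          # seconds; assumed field name
        "output_format": "mp3",
    },
    timeout=300,
)
resp.raise_for_status()
with open("track.mp3", "wb") as f:
    f.write(resp.content)
```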
Produce custom, brand-led audio with creative control and partnership
Audio influences brand engagement by 86%, but few brands are leveraging custom audio at scale. Enterprises have an opportunity to curate more intentional, on-brand audio across a growing variety of touchpoints – whether it’s an ad, the opening credits of a game, in-store music, the chimes of a credit card swipe, or a car stereo.
To help enterprises create the right sound, our team can fine-tune Stable Audio models on an organization’s sound library, embedding signature brand audio into custom generative workflows. This ensures that the music or soundscape is uniquely recognizable as part of a brand’s sonic identity or creative guidelines for a project.
With the launch of Stable Audio 2.5, Stability AI is also partnering with leading sound branding agency amp, part of the Landor Group, a WPP company, to co-develop enterprise solutions for innovative brands who want to create iconic sound identities and experiences. Stable Audio 2.5 will be available to WPP’s global client base through WPP Open, combining advanced technology with creative expertise.
Get started
You can try Stable Audio 2.5 now at StableAudio.com.
Stable Audio 2.5 is available through the Stability AI API, as well as through partner platforms including fal, Replicate, and ComfyUI.
For enterprises interested in deploying our audio models on their own infrastructure, please contact us to discuss our Enterprise Licensing, with implementation support, customization options and professional services available. You can also visit Stability AI Solutions to learn more about customizing audio models and workflows for specific use cases.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stability AI Brings Image Services to Amazon Bedrock, Delivering End-to-End Creative Control with Enterprise-Grade Infrastructure
Stability AI launches Image Services on Amazon Bedrock, bringing professional-grade image editing to AWS as managed API tools. The suite adds granular control for workflows like inpainting, background removal, recoloring, style transfer, and sketch-to-image creation.
Key Takeaways
We are launching our Stability AI Image Services on Amazon Bedrock, bringing professional-grade image editing capabilities to AWS infrastructure.
Image Services are image editing tools packaged as API services. Designed to support end-to-end creative workflows, our Image Services enable granular editing control with actions like inpainting, recoloring a specific object, or transferring a style to another image.
The suite of Image Services tools is now available on Amazon Bedrock.
Today we’re expanding our partnership with Amazon Web Services to bring our Image Services to Amazon Bedrock. Image Services are advanced image editing tools, such as Inpaint, Erase, and Remove Background, that are delivered as managed API services that developers and businesses can easily integrate into enterprise-grade applications.
Professional image editing is not a one-and-done input and output. Creative teams need the ability to go beyond a single generation and iteratively refine visual content to meet exact specifications. Our Image Services provide a range of editing capabilities that give full control over the multi-step creative process.
The suite of Image Services complements the image generation capabilities already available on Bedrock, including Stable Diffusion 3.5, Stable Image Core, and Stable Image Ultra. Enterprises can now support end-to-end generation and editing workflows with powerful AI tools on AWS.
By continuing to make our tools available on AWS infrastructure, enterprises get both cutting-edge AI capabilities and the security, reliability, and scale that production environments require. Large enterprise customers like Mercado Libre and HubSpot are leveraging our image generation and editing API services on Bedrock today to power their production use cases.
Get granular control over the image editing process
We developed our Image Services based on a deep understanding of how image editing workflows work in production: a series of steps to get it exactly right.
The tools in our Image Services suite are designed to support the end-to-end creative process from concepting and ideation, to creating finished lifestyle photography. Instead of changing the entire image each time, creative teams can start with an idea or an image, and evolve it to meet their precise needs.
Nine editing tools are now available as API services on Bedrock. These tools support two general types of image editing workflow: Edit and Control.
Edit: Focuses on precise, targeted modifications to existing images without altering the overall composition or structure. These tools are designed for professional retouching and content adaptation workflows. The tools include Inpaint, Erase Object, Remove Background, Search and Replace, and Search and Recolor.
With Inpaint, you can fill in or replace specified areas with new content based on the content of a mask image. This is especially valuable for use cases like product photography, which often requires adding different products into existing scenes.
Search and Recolor changes object colors while preserving the background; for example, you can generate chair color options to show in a product catalog.
Control: Tools for generating controlled variations of images, such as turning a sketch into a photorealistic product shot, or applying a new style to an image while preserving the structure of the subject. The tools in the Control category include Structure, Style Transfer, Style Guide, and Sketch.
The Structure tool maintains the structural elements of input images while allowing content modification. This tool preserves layouts, compositions, and spatial relationships while changing subjects or styles, useful for recreating scenes with different subjects.
The Sketch tool transforms sketch renderings into photorealistic concepts. Architecture firms might use this to convert drawings into realistic visualizations, and apparel brands to turn design sketches into product mockups.
Leverage fully managed, enterprise-grade infrastructure
Amazon Bedrock's fully managed service architecture allows organizations to integrate Stability AI Image Services into existing workflows without managing infrastructure complexity. Teams can access, test, and deploy professional-grade image editing capabilities while maintaining enterprise security and compliance standards.
Stability AI Image Services are delivered through API endpoints that integrate directly with existing content management systems, digital asset management platforms, and creative pipelines. This API-first approach enables developers to embed sophisticated image editing capabilities into applications without rebuilding core infrastructure.
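As a sketch of that API-first integration, the snippet below calls a Stability AI service through the standard Bedrock runtime client. boto3's bedrock-runtime client and invoke_model are real AWS APIs; the modelId and payload fields shown are placeholders, so check the Bedrock console for the identifiers of the Image Services you enable.

```python
import base64
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-west-2")

# Placeholder payload for a Search and Recolor-style edit; the real field
# names are defined by the service's request schema in the Bedrock docs.
payload = {
    "prompt": "recolor the chair fabric to emerald green velvet",
    "image": base64.b64encode(open("chair.png", "rb").read()).decode(),
}

resp = client.invoke_model(
    modelId="stability.search-and-recolor-v1:0",  # placeholder modelId
    body=json.dumps(payload),
)
result = json.loads(resp["body"].read())
with open("chair_recolored.png", "wb") as f:
    f.write(base64.b64decode(result["images"][0]))  # assumed response shape
```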
Get started
Stability AI Image Services are now available on Amazon Bedrock. To implement these API services in your Amazon Bedrock environment, make sure to enable Stability AI model access in your Bedrock console and set up IAM permissions with the appropriate access policies for image processing.
For enterprises who need customization and support integrating image generation and editing tools into their production workflows, visit Stability AI Solutions to learn more, or chat with an expert here.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Introducing Brand Studio: The creative production platform powered by your brand
Stability AI introduces Brand Studio, an enterprise creative production platform for on-brand content at scale. It adds Brand Central for custom Brand ID models and Campaigns, Producer Mode for step-by-step execution, curated model routing, and precision editing tools.
Key Takeaways
Brand Studio by Stability AI is the creative production platform for professional teams, powered by your brand. Get started here.
- Customize for your brand: Creatives can build their brand identity directly into the platform with deep customization options in Brand Central, including custom Brand ID models and Campaigns that ensure outputs follow brand guidelines.
- Scale production: Turn prompts into step-by-step production plans and execute the plan with Producer Mode. Curated Model Routing automatically chooses the best models for your use case – including your Brand ID models and select industry-leading models.
- Create with precision: Make targeted edits, like placing a product into a scene without changing anything else, with new tools like Precision Inpainting designed for teams who need every element to land exactly right.
Today we’re excited to introduce Brand Studio by Stability AI, the end-to-end creative production platform powered by your brand.
The AI industry is building tools for everyone. But your brand isn’t everyone. It’s everything. Which is why the off-the-shelf AI tools aren’t working for you.
So instead of yet another AI tool that’s supposed to work for everyone, we built a creative platform that works just for you.
It’s time to put your brand first. Get started now:
GET STARTED HERE
Bring your brand identity into the platform with Brand Central
For enterprise teams, Brand Central is the hub where you create and manage all customizations for different brand needs, including:
- Brand ID models: Custom models trained on everything that makes up your brand identity, such as photography style, color palette, design motifs, logo placement, and composition. Enterprise teams can train a Brand ID model in-house using the self-service feature in Brand Studio, or partner with our applied research team for additional support.
- Campaigns: You can build your own Campaigns in Brand Studio by bringing your creative mandatories and guidelines into the platform as reference images. A Campaign can be designed for a specific audience, market, or season. Once you build a Campaign, Brand Studio creates a one-click option that your team can select to create assets for that campaign.
We also offer custom workflows (a combination of models and tools), developed in partnership with our team as a product feature built just for your brand. For instance, a custom product try-on workflow enables an ecommerce brand to place specific product SKUs exactly the same way on different people for their product detail page (PDP), using the same built-in steps each time to keep the outputs consistent.
In Brand Central, you can create and manage Brand ID models and Campaigns. When prompting, select the Brand ID Model and specific Campaign you want to create assets for.
Leading creative teams like Huge, a digital marketing agency that works with major global brands, are already partnering with us to create Brand ID models and workflows in Brand Studio.
“Most AI tools right now are built for speed, not craft, and it shows in the outputs. What drew us to Stability AI is that they think like makers,” said Ez Blaine, Chief Creative Officer, Huge. “They’re also genuinely flexible partners — not off-the-shelf, not rigid. If we need to go deep into the code and build something custom, they’re game for that.”
Create faster with Producer Mode, your partner in production
Know what you want to generate, but not sure which tools and editing steps will get you there?
Describe what you need, and Producer Mode builds a step-by-step plan to get there. Once you approve the plan, Producer Mode gathers the right resources and executes the plan. At each stage, you can evaluate results and re-generate specific steps as needed.
To build the plan, Producer Mode references everything in Brand Central – including relevant Brand ID models and Campaigns – as well as the best tools for the use case.
Don’t use just any model, use the right one for your creative use cases
Brand Studio employs Curated Model Routing to intelligently select the most capable model option – so you can get the best output instead of spending time and credits testing multiple models across fragmented tools to find what will work.
Our team evaluates each model according to its performance on specific marketing and advertising requirements, such as brand consistency and style alignment, product accuracy, text rendering, and audience relevance (whether the image’s content and style are appropriate for the target audience). Based on these criteria, we select the best of our models and third-party providers, including Stable Diffusion, Nano Banana, Seedream, and more.
Curated Model Routing works behind the scenes whether you’re executing step-by-step, or working in Producer Mode. If you have a preferred model, you can also toggle off Curated Model Routing to select it.
Change exactly what you meant with precision editing tools
Professional creative teams need to make highly targeted, pixel-perfect edits, like placing a product into a scene or swapping an element while preserving the rest of the image, so that nothing else changes when you make one edit. Brand Studio includes new tools developed for these use cases:
- Precision Inpainting: Go beyond standard mask-and-replace inpainting. You define the region you want to change with a guide layer that specifies exactly what you want placed. You also have the option to pick up the brush and use a sketch to create the guide.
- Product Insertion: Place a product into a scene, and the context-aware Product Insertion tool handles realistic integration with the environment around it.
Manage creative production in an enterprise-ready workspace
Brand Studio is designed to meet the requirements of enterprise teams with comprehensive governance and team management capabilities, including:
- Single sign-on (SSO) and access controls: Define project access permissions by role, with granular control over who can view or work in specific projects.
- Collaboration features: Review, comment and approve visual content together with direct commenting and annotation on outputs.
Because Brand Studio is a fully managed platform, updates and maintenance are handled for you, so your team has access to the latest features and models without any lift on your end.
Get your Brand Studio started now
Brand Studio offers two plans: the Core plan for creative professionals who want access to powerful AI capabilities and precise control; and the Enterprise plan for teams who need to create on-brand content at scale with deeper customization capabilities.
Try our Core tier now for free here, or get in touch with our team for Enterprise.
YOUR BRAND STUDIO IS WAITING. START CREATING NOW.