AI Image and Video Release Notes
Release notes for AI image, text-to-video, image editing, and generative media platforms
Latest AI Image and Video Updates
- April 2026
- First seen by Releasebot: Apr 16, 2026
Introducing Stable Virtual Camera: Multi-View Video Generation with 3D Camera Control
Stability AI releases Stable Virtual Camera in research preview, turning 2D images into immersive 3D videos with realistic depth and perspective. It adds precise user-controlled camera paths, flexible inputs, and smooth long-form outputs for research use.
Capabilities
Today, we're releasing Stable Virtual Camera, currently in research preview. This multi-view diffusion model transforms 2D images into immersive 3D videos with realistic depth and perspective—without complex reconstruction or scene-specific optimization. We invite the research community to explore its capabilities and contribute to its development.
A virtual camera is a digital tool used in filmmaking and 3D animation to capture and navigate digital scenes in real-time. Stable Virtual Camera builds upon this concept, combining the familiar control of traditional virtual cameras with the power of generative AI to offer precise, intuitive control over 3D video outputs.
Unlike traditional 3D video models that rely on large sets of input images or complex preprocessing, Stable Virtual Camera generates novel views of a scene from one or more input images at user-specified camera angles. The model produces consistent and smooth 3D video outputs, delivering seamless trajectory videos across dynamic camera paths.
The model is available for research use under a Non-Commercial License. You can read the paper here, download the weights on Hugging Face, and access the code on GitHub.
Stable Virtual Camera offers advanced capabilities for generating 3D videos, including:
- Dynamic Camera Control: Supports user-defined camera trajectories as well as multiple dynamic camera paths, including: 360°, Lemniscate (∞-shaped path), Spiral, Dolly Zoom In, Dolly Zoom Out, Zoom In, Zoom Out, Move Forward, Move Backward, Pan Up, Pan Down, Pan Left, Pan Right, and Roll (see the pose-path sketch after this list).
- Flexible Inputs: Generates 3D videos from just one input image or up to 32.
- Multiple Aspect Ratios: Capable of producing videos in square (1:1), portrait (9:16), landscape (16:9), and other custom aspect ratios without additional training.
- Long Video Generation: Ensures 3D consistency in videos up to 1,000 frames, enabling seamless loops and smooth transitions, even when revisiting the same viewpoints.
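To make the camera-path presets concrete, here is a minimal numpy sketch of how a 360° orbit could be expressed as a sequence of camera-to-world poses. The pose convention, function names, and parameters are illustrative assumptions, not the model's actual input format; the GitHub repo documents the trajectory format Stable Virtual Camera expects.

```python
# Illustrative only: build a "360°" orbit of 4x4 camera-to-world poses around
# the world origin. The look-at convention (+z toward the target) is an
# assumption; check the Stable Virtual Camera repo for the expected format.
import numpy as np

def look_at(eye, target, up):
    """Return a 4x4 camera-to-world matrix looking from `eye` toward `target`."""
    forward = (target - eye) / np.linalg.norm(target - eye)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    pose = np.eye(4)
    pose[:3, 0], pose[:3, 1], pose[:3, 2], pose[:3, 3] = right, true_up, forward, eye
    return pose

def orbit_360(num_frames=80, radius=2.0, height=0.5):
    """Evenly spaced poses on a circle around the origin (the 360° preset)."""
    angles = np.linspace(0.0, 2.0 * np.pi, num_frames, endpoint=False)
    target, up = np.zeros(3), np.array([0.0, 1.0, 0.0])
    return np.stack([
        look_at(np.array([radius * np.cos(a), height, radius * np.sin(a)]), target, up)
        for a in angles
    ])

poses = orbit_360()
print(poses.shape)  # (80, 4, 4): one camera pose per generated frame
```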
Research & model architecture
Stable Virtual Camera achieves state-of-the-art results in novel view synthesis (NVS) benchmarks, outperforming models like ViewCrafter and CAT3D. It excels in both large-viewpoint NVS, which emphasizes generation capacity, and small-viewpoint NVS, which prioritizes temporal smoothness.
Stable Virtual Camera is trained as a multi-view diffusion model with a fixed sequence length, using a set number of input and target views (M-in, N-out). During sampling, it functions as a flexible generative renderer, accommodating variable input and output lengths (P-in, Q-out). This is achieved through a two-pass procedural sampling process—first generating anchor views, then rendering target views in chunks to ensure smooth and consistent results.
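In pseudocode, the two-pass procedure reads roughly as below. `sample_views` is a stub standing in for one fixed-length (M-in, N-out) multi-view diffusion call; it is a placeholder for illustration, not the repo's actual API.

```python
def sample_views(cond_views, cameras):
    # Placeholder for one fixed-length multi-view diffusion call; returns
    # dummy frames here so the control flow below is runnable.
    return [f"frame@{i}" for i, _ in enumerate(cameras)]

def generative_render(input_views, target_cameras, chunk_size=8):
    # Pass 1: generate a sparse set of anchor views along the trajectory.
    anchor_cameras = target_cameras[::chunk_size]
    anchors = sample_views(cond_views=input_views, cameras=anchor_cameras)

    # Pass 2: render the remaining target views chunk by chunk, conditioning
    # on the inputs plus the anchors so the full video stays consistent.
    frames = []
    for start in range(0, len(target_cameras), chunk_size):
        chunk = target_cameras[start:start + chunk_size]
        frames += sample_views(cond_views=input_views + anchors, cameras=chunk)
    return frames

print(len(generative_render(["input.png"], list(range(32)))))  # 32 frames
```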
For a deeper dive into the model’s architecture and performance, you can read the full research paper here.
Model limitations
In its initial version, Stable Virtual Camera may produce lower-quality results in certain scenarios. Input images featuring humans, animals, or dynamic textures like water often lead to degraded outputs. Additionally, highly ambiguous scenes, complex camera paths that intersect objects or surfaces, and irregularly shaped objects can cause flickering artifacts, especially when target viewpoints differ significantly from the input images.
Get started
Stable Virtual Camera is free to use for research purposes under a Non-Commercial License. You can read the paper, download the weights on Hugging Face, and access the code on GitHub.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stable Diffusion Now Optimized for AMD Radeon™ GPUs and Ryzen™ AI APUs
Stability AI releases AMD-optimized Stable Diffusion models on Hugging Face, bringing faster ONNX performance for Radeon GPUs and Ryzen AI APUs. The update includes Stable Diffusion 3.5 and SDXL variants, with accelerated inference and support through Amuse 3.0.
Key Takeaways
We’ve collaborated with AMD to deliver select ONNX-optimized versions of the Stable Diffusion family of models, engineered to run faster and more efficiently on AMD Radeon™ GPUs and Ryzen™ AI APUs.
AMD-optimized versions of Stable Diffusion 3.5 Large, Stable Diffusion 3.5 Large Turbo, Stable Diffusion XL 1.0, and Stable Diffusion XL Turbo are now available on Hugging Face and suffixed with “_amdgpu”. End users can try out the AMD optimized models using Amuse 3.0.
You can learn more about the technical details of these speed upgrades on AMD’s blog post.
We’ve collaborated with AMD to deliver select ONNX-optimized versions of the Stable Diffusion model family, engineered to run faster and more efficiently on AMD Radeon™ GPUs and Ryzen™ AI APUs. This joint engineering effort focused on maximizing inference performance without compromising model output quality or our open licensing. The result is a set of accelerated models that integrate into any ONNX Runtime-supported environment, making it easy to drop them into your existing workflows right out of the box. Whether you’re deploying Stable Diffusion 3.5 (SD3.5) variants, our most advanced image model, or Stable Diffusion XL Turbo (SDXL Turbo), these models are ready to power faster creative applications on AMD hardware.
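As a sketch of what dropping these models into an ONNX Runtime workflow can look like, the snippet below loads an SDXL variant through Hugging Face Optimum's ONNX Runtime pipeline on the DirectML execution provider. The repo id is a guess built from the "_amdgpu" suffix noted above, and it assumes the exported layout is compatible with Optimum's pipeline classes; check the Hugging Face listings and AMD's blog post for the exact setup.

```python
# Hedged sketch: pip install optimum[onnxruntime] onnxruntime-directml
from optimum.onnxruntime import ORTStableDiffusionXLPipeline

pipe = ORTStableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0_amdgpu",  # assumed repo id
    provider="DmlExecutionProvider",  # DirectML targets AMD Radeon GPUs on Windows
)
image = pipe("a photo of a lighthouse at dawn", num_inference_steps=30).images[0]
image.save("lighthouse.png")
```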
As generative visual media adoption accelerates, it’s essential our models are optimized for leading hardware. This collaboration ensures builders and businesses can integrate Stable Diffusion into their production pipelines, making workflows faster, more efficient, and ready to scale.
Available models
AMD has optimized four models across SD3.5 and SDXL for improved performance.
SD3.5 versions:
- Stable Diffusion 3.5 Large
- Stable Diffusion 3.5 Large Turbo

AMD-optimized SD3.5 models deliver up to 2.6x faster inference compared to the base PyTorch models.

SDXL versions:
- Stable Diffusion XL 1.0
- Stable Diffusion XL Turbo

With AMD optimization, SDXL 1.0 and SDXL Turbo achieve up to 3.8x faster inference compared to the base PyTorch models.
Analysis compares AMD-optimized model inference speed to the base PyTorch models. Testing was conducted using Amuse 3.0 RC and AMD Adrenalin 24.30.31.05 KB driver - 25.4.1 preview.
Get started
The AMD-optimized Stable Diffusion models are available now on Hugging Face and suffixed with “_amdgpu”. End users can also try out the AMD optimized models using Amuse 3.0. You can learn more about the technical details of these speed upgrades on AMD’s blog post.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stability AI and Arm Collaborate to Release Stable Audio Open Small, Enabling Real-World Deployment for On-Device Audio Generation
Stability AI releases Stable Audio Open Small, a compact text-to-audio model built to run entirely on Arm CPUs for fast on-device generation. It can create short audio samples on smartphones in under 8 seconds and is now free for commercial and non-commercial use.
Key Takeaways
We’re open-sourcing Stable Audio Open Small, a 341-million-parameter text-to-audio model optimized to run entirely on Arm CPUs. Designed for quickly generating short audio samples, it can produce up to 11 seconds of audio on a smartphone in less than 8 seconds.
This release builds on our collaboration with Arm to bring generative audio creation to smartphones, following our recent announcement at Mobile World Congress.
Developers can explore the new Arm Learning Path, which offers hands-on guidance using Stable Audio Open Small on Arm CPUs.
Stable Audio Open Small is now free for commercial and non-commercial use under the permissive Stability AI Community License. You can read the paper on arXiv, download the model weights on Hugging Face, and access the code on GitHub.
Bringing generative audio creation to mobile phones
We’re open-sourcing Stable Audio Open Small in partnership with Arm, whose technology powers 99% of smartphones globally. Building on the industry-leading text-to-audio model Stable Audio Open, the new compact variant is smaller and faster, while preserving output quality and prompt adherence.
This release follows our previously announced breakthrough that Stable Audio Open is now optimized to run on Arm CPUs, powered by Arm KleidiAI to enable AI-generated audio on a mobile phone. After demonstrating the technology in action at Mobile World Congress, Stability AI and Arm are now making the model weights available for anyone to access and deploy.
Technical advancements
To our knowledge, Stable Audio Open Small is the fastest stereo text-to-audio model on the market. You can read more about the technical advancements of the model in the research paper. Here are a few highlights:
- Lightweight: Stable Audio Open Small has 341M parameters, compared to Stable Audio Open’s 1.1B parameters.
- Fast: Stable Audio Open Small is optimized to generate audio on a mobile phone in less than 8 seconds. It’s faster to generate, and faster to fine-tune.
- Efficient: Leveraging Arm’s KleidiAI libraries, we designed this new model to run even more efficiently at the edge, so users get faster results back while lowering costs for compute time. By running entirely on Arm CPUs, Stable Audio Open Small is also accessible without heavy hardware requirements.
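For a sense of the developer workflow, here is a minimal generation sketch using the open-source stable-audio-tools library. The repo id, step count, and CFG scale are assumptions patterned on the published Stable Audio Open example code; the model card and the Arm Learning Path are the authoritative references for the Small variant.

```python
import torch
import torchaudio
from einops import rearrange
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond

device = "cuda" if torch.cuda.is_available() else "cpu"

# Repo id assumed from the naming above; check Hugging Face for the exact one.
model, config = get_pretrained_model("stabilityai/stable-audio-open-small")
model = model.to(device)

# Conditioning: a text prompt plus the requested clip length in seconds.
conditioning = [{"prompt": "128 BPM tech house drum loop",
                 "seconds_start": 0, "seconds_total": 11}]

# Step count and CFG scale are assumed values; see the model card.
audio = generate_diffusion_cond(model, steps=8, cfg_scale=1.0,
                                conditioning=conditioning,
                                sample_size=config["sample_size"], device=device)

audio = rearrange(audio, "b d n -> d (b n)")            # (channels, samples)
audio = (audio / audio.abs().max()).to(torch.float32)   # peak-normalize
torchaudio.save("drum_loop.wav", audio.cpu(), config["sample_rate"])
```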
When to use the model
Like Stable Audio Open, Stable Audio Open Small is optimized for generating short audio samples, sound effects and production elements using text prompts. It is well suited for creating drum loops, foley, instrument riffs, and ambient textures.
Its compact size and fast inference make it a perfect fit for on-device deployment on Arm-powered smartphones and edge devices, where real-time generation and responsiveness matter.
As AI-driven creative media workloads move to the edge, smaller models help align compute resources with task complexity. By using different model sizes, organizations can allocate workloads to the processors best suited to their use case, like generating short sound effects versus full-length songs.
Getting started
Stable Audio Open Small is now free for commercial and non-commercial use under the permissive Stability AI Community License. You can read the paper on arXiv, download the model weights on Hugging Face, and access the code on GitHub.
Visit the Arm Learning Path to walk through deploying Stable Audio Open Small on Arm hardware as well as the Arm Community Blog for a deep technical dive into how Stable Audio Open Small was optimized for on-device performance.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stable Video 4D 2.0: New Upgrades for High-Fidelity Novel-Views and 4D Generation from a Single Video
Stability AI upgrades Stable Video 4D to 2.0, bringing sharper, more consistent 4D outputs from a single video and stronger real-world video performance. The model now supports commercial and non-commercial use under the Stability AI Community License.
Key Takeaways
We’ve upgraded Stable Video Diffusion 4D (SV4D) to Stable Video 4D 2.0 (SV4D 2.0), delivering higher-quality outputs on real-world video.
Our analysis shows that SV4D 2.0 achieves state-of-the-art results in both 4D generation and novel-view synthesis.
Stable Video 4D 2.0 is now available for both commercial and non-commercial use under the permissive Stability AI Community License.
You can download the multi-view generation models on Hugging Face, find the code on GitHub, and read about the 4D asset reconstruction process on arXiv.
Stable Video 4D 2.0
We’ve upgraded Stable Video Diffusion 4D (SV4D) to Stable Video 4D 2.0 (SV4D 2.0), delivering higher-quality outputs on real-world video. This multi-view video diffusion model is ideal for dynamic 4D asset generation from a single object-centric video. These upgrades make it easier to create dynamic 4D assets for professional production workflows, from generating sprite sheets for in-game characters, to supporting assets for film and virtual worlds.
Multi-view generation remains complex due to the inherent ambiguity of visualizing 3D objects from unseen views. This is especially difficult when subjects are in motion. SV4D 2.0 makes incremental progress toward addressing this challenge by producing consistent, multi-angle outputs without relying on large datasets, multi-camera setups, or preprocessing. While this represents a step forward, occasional artifacts may still appear with dynamic motion.
What’s new
We’ve made multiple upgrades to SV4D 2.0, including:
- Sharper and Coherent 4D Outputs: The model was trained in phases, starting with static 3D assets and then adding motion, resulting in clearer and more consistent 4D results.
- No Reference Views Required: Works directly from a single video, eliminating the need for multi-view reference images.
- Redesigned Network Architecture: Utilizes 3D attention, a mechanism that fuses 3D spatial and temporal features, improving spatio-temporal consistency without relying on reference views (see the conceptual sketch after this list).
- Improved Real-World Generalization: Performs more consistently on real-world videos. While trained on synthetic data, the model retains world knowledge from pre-trained video models.
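As a conceptual illustration of the 3D attention idea (not SV4D 2.0's actual block design), the sketch below fuses view, time, and spatial tokens into a single attention pass, which is what lets information flow jointly across viewpoints and frames:

```python
# Conceptual sketch: one attention pass over all views x frames x spatial
# tokens, rather than separate view-only and time-only attention stages.
import torch
import torch.nn as nn

class Joint3DAttention(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (batch, views, frames, tokens, dim)
        b, v, f, t, d = x.shape
        x = x.reshape(b, v * f * t, d)   # flatten views, time, and space together
        out, _ = self.attn(x, x, x)
        return out.reshape(b, v, f, t, d)

x = torch.randn(1, 4, 5, 16, 64)      # 4 views, 5 frames, 16 spatial tokens
print(Joint3DAttention(64)(x).shape)  # torch.Size([1, 4, 5, 16, 64])
```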
Research and benchmarking
Our analysis shows that SV4D 2.0 achieves state-of-the-art results in 4D generation. It ranks first across all major benchmarks: LPIPS (Image fidelity), FVD-V (Multi-view consistency), FVD-F (Temporal coherence), and FV4D (4D consistency). Compared to DreamGaussian4D, L4GM, and SV4D, this version generates sharper and more consistent 4D outputs.
Our analysis also shows that SV4D 2.0 outperforms Diffusion^2, SV3D, and SV4D on novel-view synthesis. The model significantly improves multi-view consistency (FVD-V) and temporal coherence (FVD-F), maintaining high-quality outputs across both changing viewpoints and time. You can read more about the technical advancements of the model in the research paper.
Getting started
Stable Video 4D 2.0 is now available for both commercial and non-commercial use under the permissive Stability AI Community License.
You can download the multi-view generation models on Hugging Face, find the code on GitHub, and read about the 4D asset reconstruction process on arXiv.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stable Diffusion 3.5 Models Optimized with TensorRT Deliver 2X Faster Performance and 40% Less Memory on NVIDIA RTX GPUs
Stability AI releases NVIDIA TensorRT-optimized Stable Diffusion 3.5 models, bringing faster image generation and lower VRAM use to more RTX GPUs. The update improves access for creative professionals and developers, with commercial and non-commercial use available under the Stability AI Community License.
Key Takeaways
We've collaborated with NVIDIA to deliver NVIDIA TensorRT-optimized versions of Stable Diffusion 3.5 (SD3.5), making enterprise-grade image generation available on a wider range of NVIDIA RTX GPUs.
The SD3.5 TensorRT-optimized models deliver up to 2.3x faster generation on SD3.5 Large and 1.7x faster on SD3.5 Medium, while reducing VRAM requirements by 40%.
The optimized models are now available for commercial and non-commercial use under the permissive Stability AI Community License. You can download the weights on Hugging Face and the code on NVIDIA’s GitHub.
In collaboration with NVIDIA, we've optimized the SD3.5 family of models using TensorRT and FP8, improving generation speed and reducing VRAM requirements on supported RTX GPUs.
SD3.5 was developed to run on consumer hardware out of the box. The NVIDIA optimizations extend that accessibility further for creative professionals and developers working across a variety of hardware setups.
Where the models excel
These performance improvements make SD3.5's core strengths more accessible. SD3.5 excels in the following areas, making it one of the most customizable image models on the market, while maintaining top-tier performance in prompt adherence and image quality:
- Versatile Styles: Capable of generating a wide range of styles and aesthetics like 3D, photography, painting, line art, and virtually any visual style imaginable.
- Diverse Outputs: Creates images representative of the world, not just one type of person, with different skin tones and features, without the need for extensive prompting.
- Prompt Adherence: Our analysis shows that SD3.5 Large leads the market in prompt adherence, allowing the model to closely follow a given text prompt, making it a top choice for efficient, high-quality performance.
Now available across more NVIDIA RTX GPUs
TensorRT optimization reduces model size while maintaining quality by streamlining how models run on NVIDIA hardware. The size reduction comes from FP8 quantization, a technique that makes models more efficient while maintaining high output quality. These improvements mean that five RTX 50 Series systems can now run SD3.5 Large entirely in memory, compared to just one system before optimization.
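As a toy illustration of the idea behind FP8 quantization (TensorRT's actual calibration pipeline is considerably more sophisticated), the snippet below scales a weight tensor into FP8's representable range, stores it at one byte per element, and dequantizes it for use:

```python
import torch

w = torch.randn(4096, 4096, dtype=torch.bfloat16)    # a BF16 weight matrix
scale = w.abs().max().float() / 448.0                 # 448 = max normal value of E4M3
w_fp8 = (w.float() / scale).to(torch.float8_e4m3fn)   # 1 byte/elem vs 2 for BF16
w_deq = w_fp8.to(torch.float32) * scale               # dequantize for computation

print(f"storage: {w_fp8.untyped_storage().nbytes() / 2**20:.0f} MiB "
      f"vs {w.untyped_storage().nbytes() / 2**20:.0f} MiB in BF16")
print(f"max abs error: {(w.float() - w_deq).abs().max().item():.4f}")
```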
Enhanced performance across NVIDIA RTX GPUs
SD3.5 TensorRT-optimized models run more efficiently across NVIDIA GeForce RTX 50 and 40 Series GPUs, as well as NVIDIA Blackwell and Ada Lovelace generation NVIDIA RTX PRO GPUs. They deliver up to 2.3x faster generation on SD3.5 Large and 1.7x faster on SD3.5 Medium, while reducing VRAM requirements by 40%.
FP8 TensorRT boosts SD3.5 Large performance by 2.3x vs. BF16 PyTorch, with 40% less memory use. For SD3.5 Medium, BF16 TensorRT delivers a 1.7x speedup.
SD3.5 Large:
- 2.3x faster image generation compared to the base PyTorch model.
- Memory use reduced by 40%, from 19GB to 11GB, while maintaining professional quality.

SD3.5 Medium:
- 1.7x faster image generation for users prioritizing speed and efficiency.
- Lower memory footprint, ideal for creators working on mid-range RTX hardware.
Getting started
The optimized models are now available for commercial and non-commercial use under the permissive Stability AI Community License. You can download the weights on Hugging Face and the code on NVIDIA’s GitHub.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Introducing Stability AI Solutions: Generative AI Solutions to Accelerate Enterprise Creative Production
Stability AI introduces Stability AI Solutions, an enterprise offering for scaling creative production with generative AI. It adds custom models, flexible deployment, brand safety guardrails, compliance, indemnification, and dedicated support, with initial solutions for marketing, advertising, and design.
Today we’re introducing Stability AI Solutions, a new offering designed to help enterprises scale creative production with generative AI.
Each solution delivers custom models and workflows built with leading media generation and editing tools, along with everything needed to meet the standards of enterprise production: professional services, flexible deployment options, and built-in features such as brand safety guardrails, indemnification, compliance, and dedicated support.
Stability AI Solutions was developed in response to what we hear consistently from the market: increased demand to accelerate creative work without sacrificing quality or brand integrity. While many organizations see the potential of generative AI to address this challenge, they often struggle to turn that potential into real-world results.
We created the solutions offering to bridge this gap by partnering with enterprises to provide both the technology and the expertise needed to drive real business outcomes with generative AI.
Prem Akkaraju, Chief Executive Officer of Stability AI, said:
“We believe real transformation happens when technology is built around the needs of creatives, not the other way around. With the launch of Stability AI Solutions, we’re bringing that same philosophy to the enterprise. Many organizations have experimented with generative AI, but most tools fall short when it comes to the precision and control required for real production use. What’s needed is a partner, not just a platform. That is exactly what Stability AI Solutions provides.”
What’s available today
Our initial suite of solutions is tailored for the Marketing / Advertising / Design verticals, with more in development for Entertainment and Gaming:
- Stability AI for Product Photography: Transform a single product shot into photorealistic variations across different backgrounds, models, lighting, and styles, accelerating production timelines while scaling product shots.
- Stability AI for Brand Style: Generate media adhering to specific brand style standards, such as visual aesthetic, color palettes, sonic identity, and lighting. This ensures brand consistency across AI-generated creative assets.
- Stability AI for Product Concepting & Design: Develop new products and creative assets through rapid iteration and concept refinement capabilities, with custom sketch-to-image and image-to-image workflows.
- Stability AI for Digital Twins: Train custom models on intellectual property (IP) or likenesses, such as brand mascots or fashion models, to generate new assets with the appropriate usage rights licensed by the IP owner.
We’re already partnering with leading enterprises to accelerate creative production across Marketing / Advertising / Design use cases. Examples include fashion retailers transforming a single product shot into dozens of PDP-ready (product detail page) variants; apparel brands creating photorealistic design concepts from a sketch; and entertainment companies bringing beloved characters to life in new ways.
Options for deployment
Stability AI Solutions can be deployed in a variety of ways to meet different enterprise needs. Workflows can run on-premises for organizations that require full control over infrastructure. They can also be accessed through secure API endpoints for fully managed hosting and integration into existing systems, or used via web-based applications for quick access by creative teams. This flexibility allows teams to adopt generative AI in a way that aligns with their technical, operational, and security requirements.
As part of our ongoing collaboration with WPP, Stability AI Solutions will also be available for deployment through WPP Open. Additionally, we’re actively co-developing new use cases to support the evolving needs of WPP clients.
Stephan Pretorius, Chief Technology Officer of WPP, said:
“Integrating Stability AI’s solutions directly within WPP Open, our AI-powered marketing services platform, helps our clients stay at the forefront of innovation. Stability AI’s ability to deliver precise, customizable solutions across a range of marketing use cases ensures brand consistency on every level, while also unlocking entirely new creative possibilities.”
Getting started
You can learn how to get started on the Stability AI Solutions page or connect with an expert here.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stability AI and NVIDIA Bring Faster Performance and Simplified Enterprise Deployment with the Stable Diffusion 3.5 NIM
Stability AI launches a new NVIDIA NIM microservice for Stable Diffusion 3.5, making enterprise deployment faster and simpler with performance gains, consolidated container support, and availability at build.nvidia.com.
Key Takeaways
Stability AI is launching a new NVIDIA NIM microservice for Stable Diffusion 3.5, making it faster and simpler for enterprises to deploy our most advanced models.
The SD3.5 NIM is available at build.nvidia.com. Users can download model weights directly from Hugging Face.
The models are available for commercial and non-commercial use under our permissive Stability AI Community License. For enterprises with annual revenue over $1M, please contact us to discuss our Enterprise Licensing, which offers additional support and customization options.
We're excited to announce our collaboration with NVIDIA to launch the Stable Diffusion 3.5 NIM microservice, enabling significant performance improvements and streamlined enterprise deployment for our leading image generation models. The SD3.5 NIM supports Stable Diffusion 3.5 Large, with expanded model compatibility planned for future releases.
The SD3.5 NIM delivers faster image generation on enterprise hardware. This enables complex image generation workflows that weren’t as feasible before.
A faster, easier way for enterprises to deploy Stable Diffusion models
A NIM provides a simplified, optimized way to run AI inference by packaging inference engines, APIs, and model configurations into secure, portable containers. Think of it as a pre-configured, enterprise-ready package that eliminates the complexity of setting up and optimizing AI models from scratch.
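In practice, a deployed NIM container is consumed as a plain HTTP service. The sketch below shows the general shape of such a request; the endpoint path and payload fields are illustrative assumptions, and the authoritative API schema lives in the NIM documentation at build.nvidia.com.

```python
# Hedged sketch of calling a locally deployed SD3.5 NIM container over HTTP.
import base64
import requests

resp = requests.post(
    "http://localhost:8000/v1/infer",  # assumed local NIM endpoint path
    json={
        "prompt": "studio photo of a ceramic teapot, softbox lighting",
        "steps": 30,                   # assumed field names; see the NIM docs
    },
    timeout=120,
)
resp.raise_for_status()
# Assumed response shape: a base64-encoded image in the JSON body.
with open("teapot.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["image"]))
```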
The SD3.5 NIM delivers performance gains that improve efficiency and ease of deployment for enterprises:
- Speed improvements: 1.8x performance gains over PyTorch, with testing on NVIDIA H100 GPUs showing TensorRT-optimized generation at 3,700ms compared to 6,800ms for standard PyTorch on SD3.5 Large.
- Consolidated deployment: The SD3.5 NIM supports the SD3.5 Large model with Depth and Canny ControlNets within a single container, meaning that instead of needing separate deployments for each model, users get all versions packaged together.
The SD3.5 NIM supports enterprise and data center Ada and Blackwell GPUs.
Advanced workflows made practical for enterprise creative teams
These optimizations enable faster iteration cycles, larger batch processing, and more complex workflows that were previously less practical due to hardware limitations.
The efficiency gains are particularly valuable for advanced workflows, such as running multiple models simultaneously. The performance improvements make complex, multi-model workflows more feasible, opening new possibilities for advanced users and making rapid scaling a simpler option when utilizing cloud deployments.
Get started
The SD3.5 NIM is available at build.nvidia.com. Users can download Stable Diffusion 3.5 model weights directly from Hugging Face.
For smaller organizations and researchers getting started, the optimized models are available for commercial and non-commercial use under the permissive Stability AI Community License.
For enterprises with annual revenue over $1M, please contact us to discuss our Enterprise Licensing, with implementation support, customization options and professional services available. You can also visit Stability AI Solutions to learn more about customizing models and workflows for specific use cases.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stability AI Introduces Stable Audio 2.5, the First Audio Model Built for Enterprise Sound Production at Scale
Stability AI launches Stable Audio 2.5, an enterprise-focused audio generation model with faster generation, stronger musical composition, and audio inpainting for more control. It’s available on StableAudio.com, via API and partner platforms, with on-premises enterprise options.
Key Takeaways
We’re launching Stable Audio 2.5, the first audio generation model designed specifically for enterprise-grade sound production.
Customized sound is an untapped differentiator for brands. Enterprises need to create their distinct sound for a growing volume of channels, from ads to the in-store experience.
Stable Audio 2.5 is purpose-built for this challenge: creating customizable, high-quality audio at scale. That includes elevated musical composition, fast inference at less than two seconds on a GPU, and support for more control with audio inpainting.
You can try Stable Audio 2.5 now at StableAudio.com or seamlessly deploy through the Stability AI API; partner platforms such as fal, Replicate, and ComfyUI; and on-premises with an enterprise license.
We’re excited to release Stable Audio 2.5, our latest audio model and the first developed for enterprise-grade use cases. Stable Audio 2.5 introduces advancements in quality and control that address the demand for dynamic compositions that can be adapted for custom brand needs.
Custom audio can make a brand eight times more memorable, but only 6% of creative uses a sound identity, according to Ipsos research. To deploy sound more strategically as an extension of their brand, enterprises need to create audio that’s high-quality, commercial-grade, and adaptable for the different places a brand shows up.
With the enterprise-focused capabilities of Stable Audio 2.5, professional creative teams can leverage more advanced, customizable audio generation to give every production the right sound.
What’s new: Faster generation, smarter composition, enhanced workflows
Stable Audio 2.5 brings advancements in speed and output quality that make it well-suited for commercial use cases.
Generate three-minute tracks within seconds: Post-trained using the cutting-edge Adversarial Relativistic-Contrastive (ARC) method pioneered by the Stable Audio research team, Stable Audio 2.5 has an inference speed of less than two seconds on a GPU for tracks up to three minutes.
Produce dynamic musical compositions: Stable Audio 2.5 is optimized for music and has improved musical structure, generating multi-part compositions (intro, development, and outro). The model also has improved prompt adherence, responding more effectively to mood descriptors (such as “uplifting”) and musical language across genres (“lush synthesizers”).
Get more control with audio inpainting support: In addition to text-to-audio and audio-to-audio workflows, Stable Audio 2.5 supports audio inpainting, which means users can input their own audio, select where they want it to start, and the model will use the context to generate the rest of the track. Note: Our Terms of Service require that uploads be free of copyrighted material, and we use advanced content recognition to maintain compliance and prevent infringement.
Like all Stable Audio models, Stable Audio 2.5 is commercially safe and trained on a fully licensed dataset.
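For API users, a text-to-audio call has roughly the shape sketched below. The endpoint route and form fields are assumptions patterned on Stability AI's v2beta REST API family; confirm the exact schema at platform.stability.ai before relying on it.

```python
# Hedged sketch of a Stable Audio 2.5 text-to-audio request.
import os
import requests

resp = requests.post(
    "https://api.stability.ai/v2beta/audio/stable-audio-2/text-to-audio",  # assumed route
    headers={
        "authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
        "accept": "audio/*",
    },
    files={"none": ""},  # forces multipart/form-data, as the image endpoints do
    data={
        "prompt": "uplifting track with lush synthesizers, clear intro and outro",
        "duration": 95,          # seconds; assumed field name
        "output_format": "mp3",
    },
    timeout=300,
)
resp.raise_for_status()
with open("track.mp3", "wb") as f:
    f.write(resp.content)
```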
Produce custom, brand-led audio with creative control and partnership
Audio influences brand engagement by 86%, but few brands are leveraging custom audio at scale. Enterprises have an opportunity to curate more intentional, on-brand audio across a growing variety of touchpoints – whether it’s an ad, the opening credits of a game, in-store music, the chimes of a credit card swipe, or a car stereo.
To help enterprises create the right sound, our team can fine-tune Stable Audio models on an organization’s sound library, embedding signature brand audio into custom generative workflows. This ensures that the music or soundscape is uniquely recognizable as part of a brand’s sonic identity or creative guidelines for a project.
With the launch of Stable Audio 2.5, Stability AI is also partnering with leading sound branding agency amp, part of the Landor Group, a WPP company, to co-develop enterprise solutions for innovative brands who want to create iconic sound identities and experiences. Stable Audio 2.5 will be available to WPP’s global client base through WPP Open, combining advanced technology with creative expertise.
Get started
You can try Stable Audio 2.5 now at StableAudio.com.
Stable Audio 2.5 is available through the Stability AI API, as well as through partner platforms including fal, Replicate, and ComfyUI.
For enterprises interested in deploying our audio models on their own infrastructure, please contact us to discuss our Enterprise Licensing, with implementation support, customization options and professional services available. You can also visit Stability AI Solutions to learn more about customizing audio models and workflows for specific use cases.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Stability AI Brings Image Services to Amazon Bedrock, Delivering End-to-End Creative Control with Enterprise-Grade Infrastructure
Stability AI launches Image Services on Amazon Bedrock, bringing professional-grade image editing to AWS as managed API tools. The suite adds granular control for workflows like inpainting, background removal, recoloring, style transfer, and sketch-to-image creation.
Key Takeaways
We are launching our Stability AI Image Services on Amazon Bedrock, bringing professional-grade image editing capabilities to AWS infrastructure.
Image Services are image editing tools packaged as API services. Designed to support end-to-end creative workflows, our Image Services enable granular editing control with actions like inpainting, recoloring a specific object, or transferring a style to another image.
The suite of Image Services tools is now available on Amazon Bedrock.
Today we’re expanding our partnership with Amazon Web Services to bring our Image Services to Amazon Bedrock. Image Services are advanced image editing tools, such as Inpaint, Erase, and Remove Background, that are delivered as managed API services that developers and businesses can easily integrate into enterprise-grade applications.
Professional image editing is not a one-and-done input and output. Creative teams need the ability to go beyond a single generation and iteratively refine visual content to meet exact specifications. Our Image Services provide a range of editing capabilities that give full control over the multi-step creative process.
The suite of Image Services complements the image generation capabilities already available on Bedrock, including Stable Diffusion 3.5, Stable Image Core, and Stable Image Ultra. Enterprises can now support end-to-end generation and editing workflows with powerful AI tools on AWS.
By continuing to make our tools available on AWS infrastructure, enterprises get both cutting-edge AI capabilities and the security, reliability, and scale that production environments require. Large enterprise customers like Mercado Libre and HubSpot are leveraging our image generation and editing API services on Bedrock today to power their production use cases.
Get granular control over the image editing process
We developed our Image Services based on a deep understanding of how image editing workflows work in production: a series of steps to get it exactly right.
The tools in our Image Services suite are designed to support the end-to-end creative process from concepting and ideation, to creating finished lifestyle photography. Instead of changing the entire image each time, creative teams can start with an idea or an image, and evolve it to meet their precise needs.
Nine editing tools are now available as API services on Bedrock. These tools support two general types of image editing workflow: Edit and Control.
Edit: Focuses on precise, targeted modifications to existing images without altering the overall composition or structure. These tools are designed for professional retouching and content adaptation workflows. The tools include Inpaint, Erase Object, Remove Background, Search and Replace, and Search and Recolor.
With Inpaint, you can fill in or replace specified areas with new content based on the content of a mask image. This is especially valuable for use cases like product photography, which often requires adding different products into existing scenes.
Search and Recolor changes object colors while preserving the background; for example, you can generate chair color options to show in a product catalog.
Control: Tools for generating controlled variations of images, such as turning a sketch into a photorealistic product shot, or applying a new style to an image while preserving the structure of the subject. The tools in the Control category include Structure, Style Transfer, Style Guide, and Sketch.
The Structure tool maintains the structural elements of input images while allowing content modification. This tool preserves layouts, compositions, and spatial relationships while changing subjects or styles, useful for recreating scenes with different subjects.
The Sketch tool transforms sketch renderings into photorealistic concepts. Architecture firms might use this to convert drawings into realistic visualizations, and apparel brands to turn design sketches into product mockups.
Leverage fully managed, enterprise-grade infrastructure
Amazon Bedrock's fully managed service architecture allows organizations to integrate Stability AI Image Services into existing workflows without managing infrastructure complexity. Teams can access, test, and deploy professional-grade image editing capabilities while maintaining enterprise security and compliance standards.
Stability AI Image Services are delivered through API endpoints that integrate directly with existing content management systems, digital asset management platforms, and creative pipelines. This API-first approach enables developers to embed sophisticated image editing capabilities into applications without rebuilding core infrastructure.
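As a sketch of that API-first integration, the snippet below calls a Stability AI service through the standard Bedrock runtime client. boto3's bedrock-runtime client and invoke_model are real AWS APIs; the modelId and payload fields shown are placeholders, so check the Bedrock console for the identifiers of the Image Services you enable.

```python
import base64
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-west-2")

# Placeholder payload for a Search and Recolor-style edit; the real field
# names are defined by the service's request schema in the Bedrock docs.
payload = {
    "prompt": "recolor the chair fabric to emerald green velvet",
    "image": base64.b64encode(open("chair.png", "rb").read()).decode(),
}

resp = client.invoke_model(
    modelId="stability.search-and-recolor-v1:0",  # placeholder modelId
    body=json.dumps(payload),
)
result = json.loads(resp["body"].read())
with open("chair_recolored.png", "wb") as f:
    f.write(base64.b64decode(result["images"][0]))  # assumed response shape
```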
Get started
Stability AI Image Services are now available on Amazon Bedrock. To implement these API services in your Amazon Bedrock environment, make sure to enable Stability AI model access in your Bedrock console and set up IAM permissions with the appropriate access policies for image processing.
For enterprises who need customization and support integrating image generation and editing tools into their production workflows, visit Stability AI Solutions to learn more, or chat with an expert here.
To stay updated on our progress, follow us on X, LinkedIn, Instagram, and join our Discord Community.
- April 2026
- First seen by Releasebot: Apr 16, 2026
Introducing Brand Studio: The creative production platform powered by your brand
Stability AI introduces Brand Studio, an enterprise creative production platform for on-brand content at scale. It adds Brand Central for custom Brand ID models and Campaigns, Producer Mode for step-by-step execution, curated model routing, and precision editing tools.
Key Takeaways
Brand Studio by Stability AI is the creative production platform for professional teams, powered by your brand. Get started here.
- Customize for your brand: Creatives can build their brand identity directly into the platform with deep customization options in Brand Central, including custom Brand ID models and Campaigns that ensure outputs follow brand guidelines.
- Scale production: Turn prompts into step-by-step production plans and execute the plan with Producer Mode. Curated Model Routing automatically chooses the best models for your use case – including your Brand ID models and select industry-leading models.
- Create with precision: Make targeted edits, like placing a product into a scene without changing anything else, with new tools like Precision Inpainting designed for teams who need every element to land exactly right.
Today we’re excited to introduce Brand Studio by Stability AI, the end-to-end creative production platform powered by your brand.
The AI industry is building tools for everyone. But your brand isn’t everyone. It’s everything. Which is why the off-the-shelf AI tools aren’t working for you.
So instead of yet another AI tool that’s supposed to work for everyone, we built a creative platform that works just for you.
It’s time to put your brand first. Get started now:
GET STARTED HERE
Bring your brand identity into the platform with Brand Central
For enterprise teams, Brand Central is the hub where you create and manage all customizations for different brand needs, including:
- Brand ID models: Custom models trained on everything that makes up your brand identity, such as photography style, color palette, design motifs, logo placement, and composition. Enterprise teams can train a Brand ID model in-house using the self-service feature in Brand Studio, or partner with our applied research team for additional support.
- Campaigns: You can build your own Campaigns in Brand Studio by bringing your creative mandatories and guidelines into the platform as reference images. A Campaign can be designed for a specific audience, market, or season. Once you build a Campaign, Brand Studio creates a one-click option that your team can select to create assets for that campaign.
We also offer custom workflows (a combination of models and tools), developed in partnership with our team as a product feature built just for your brand. For instance, a custom product try-on workflow enables an ecommerce brand to place specific product SKUs exactly the same way on different people for their product detail page (PDP), using the same built-in steps each time to keep the outputs consistent.
In Brand Central, you can create and manage Brand ID models and Campaigns. When prompting, select the Brand ID Model and specific Campaign you want to create assets for.
Leading creative teams like Huge, a digital marketing agency that works with major global brands, are already partnering with us to create Brand ID models and workflows in Brand Studio.
“Most AI tools right now are built for speed, not craft, and it shows in the outputs. What drew us to Stability AI is that they think like makers,” said Ez Blaine, Chief Creative Officer, Huge. “They’re also genuinely flexible partners — not off-the-shelf, not rigid. If we need to go deep into the code and build something custom, they’re game for that.”
Create faster with Producer Mode, your partner in production
Know what you want to generate, but not sure which tools and editing steps will get you there?
Describe what you need, and Producer Mode builds a step-by-step plan to get there. Once you approve the plan, Producer Mode gathers the right resources and executes the plan. At each stage, you can evaluate results and re-generate specific steps as needed.
To build the plan, Producer Mode references everything in Brand Central – including relevant Brand ID models and Campaigns – as well as the best tools for the use case.
Don’t use just any model, use the right one for your creative use cases
Brand Studio employs Curated Model Routing to intelligently select the most capable model option – so you can get the best output instead of spending time and credits testing multiple models across fragmented tools to find what will work.
Our team evaluates each model according to its performance on specific marketing and advertising requirements, such as brand consistency and style alignment, product accuracy, text rendering, and audience relevance (whether the image’s content and style are appropriate for the target audience). Based on these criteria, we select the best of our models and third-party providers, including Stable Diffusion, Nano Banana, Seedream, and more.
Curated Model Routing works behind the scenes whether you’re executing step-by-step, or working in Producer Mode. If you have a preferred model, you can also toggle off Curated Model Routing to select it.
Change exactly what you meant with precision editing tools
Professional creative teams need to make highly targeted, pixel-perfect edits, like placing a product into a scene or swapping an element while preserving the rest of the image, so that nothing else changes when you make one edit. Brand Studio includes new tools developed for these use cases:
- Precision Inpainting: Go beyond standard mask-and-replace inpainting. You define the region you want to change with a guide layer that specifies exactly what you want placed. You also have the option to pick up the brush and use a sketch to create the guide.
- Product Insertion: Place a product into a scene, and the context-aware Product Insertion tool handles realistic integration with the environment around it.
Manage creative production in an enterprise-ready workspace
Brand Studio is designed to meet the requirements of enterprise teams with comprehensive governance and team management capabilities, including:
- Single sign-on (SSO) and access controls: Define project access permissions by role, with granular control over who can view or work in specific projects.
- Collaboration features: Review, comment and approve visual content together with direct commenting and annotation on outputs.
Because Brand Studio is a fully managed platform, updates and maintenance are handled for you, so your team has access to the latest features and models without any lift on your end.
Get your Brand Studio started now
Brand Studio offers two plans: the Core plan for creative professionals who want access to powerful AI capabilities and precise control; and the Enterprise plan for teams who need to create on-brand content at scale with deeper customization capabilities.
Try our Core tier now for free here, or get in touch with our team for Enterprise.
YOUR BRAND STUDIO IS WAITING. START CREATING NOW.