Meta Release Notes
182 release notes curated from 148 sources by the Releasebot Team. Last updated: May 15, 2026
Meta Products
- May 15, 2026
- Date parsed from source: May 15, 2026
- First seen by Releasebot: May 15, 2026
May 15, 2026
Instagram Platform now supports oEmbed API calls without an access token for all versions.
oEmbed API
Applies to all versions.
You can now call the Instagram oEmbed API without an access token. See Embed an Instagram Post for more details.
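A minimal sketch of what a token-free request URL might look like. The host, the `instagram_oembed` path, and the `url` and `maxwidth` parameter names are assumptions recalled from the Graph API documentation, not stated in this note; the post URL is a placeholder. Check the Embed an Instagram Post guide before relying on any of them.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: Graph API host and endpoint path; neither is given in the note above.
GRAPH_BASE = "https://graph.facebook.com"
OEMBED_PATH = "instagram_oembed"


def build_oembed_url(
    post_url: str,
    api_version: Optional[str] = None,
    max_width: Optional[int] = None,
) -> str:
    """Build an oEmbed request URL that carries no access token."""
    # The note says the change applies to all versions, so the version is optional here.
    parts = [GRAPH_BASE]
    if api_version:
        parts.append(api_version)
    parts.append(OEMBED_PATH)

    params = {"url": post_url}  # assumed parameter name
    if max_width is not None:
        params["maxwidth"] = str(max_width)  # assumed parameter name

    return "/".join(parts) + "?" + urlencode(params)


# Placeholder post URL, for illustration only.
example_url = build_oembed_url("https://www.instagram.com/p/EXAMPLE/", max_width=480)
```

The resulting string can be passed to any HTTP client, for example `urllib.request.urlopen`. The point of the sketch is only that no `access_token` parameter is added.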
Original source
- May 13, 2026
- Date parsed from source: May 13, 2026
- First seen by Releasebot: May 13, 2026
Introducing Instants: A New Way to Share in the Moment
Instagram introduces Instants, a new way to share real-time photos with Close Friends or mutual followers that disappear after being viewed or after 24 hours. It adds archive, recap to Stories, undo, snooze, and built-in safety and privacy controls.
Today, we’re introducing Instants, a new way to share photos in the moment with your Close Friends or mutual followers with just a tap. Photos you share on Instants disappear after they’ve been viewed and can’t be viewed after 24 hours. You also can’t edit your instants before sharing, so you can share authentic moments as they’re happening.
How It Works
You can capture an instant by tapping the mini pile of photos at the bottom right corner of your Instagram inbox or by opening the Instants app. From there, snap a photo in real time — no uploads from your phone’s photo gallery. You can also add a caption to your photo, but can’t make any further edits to it.
Next, choose who you want to share your instant with: your Close Friends or followers that you follow back. Recipients can react with emojis, reply, and send instants back to you. The instants you share will show up as a pile of photos in your friends’ inboxes and disappear once they’re viewed.
The Instants app, rolling out in select countries on iOS and Android, gives you immediate access to the Instants camera — just log in with your Instagram account to get started. Instants work the same across both apps, meaning instants shared via the app will reach your friends seeing them on Instagram, too.
Features to Make Sharing With Friends Simple
- Archive: Your shared instants are saved in a private archive that only you can see for up to one year, which you can access on the top right corner of Instants.
- Recap to Stories: Compile instants from your archive into a recap and post it to Instagram Stories for your followers. Just tap Create recap in your archive to get started.
- No screenshots: Friends can’t screenshot or record instants you share.
- Undo: Accidentally sent an instant? Quickly take it back before friends see it by tapping the undo button. You can also delete an instant from your archive to unsend it to friends who haven’t opened it yet.
- Snooze: Hold down the pile of instants in your inbox and swipe right to temporarily stop receiving them. Bring back instants by holding down the same spot and swiping left.
Built-In Safety and Privacy
All of the safety and privacy protections on Instagram apply to Instants. Instagram’s in-app controls like Block, Mute, and Restrict work on Instants, too, so you can limit receiving instants from specific friends. Your instants can only be seen by those you choose to share them with — your close friends or followers you follow back.
For teens, Instants is automatically integrated with Teen Accounts and Family Center. There is no separate setup — if a parent already supervises their teen on Instagram, that supervision automatically extends to Instants.
Notable protections include:
- Shared time limits: Time spent on Instants counts toward a teen’s daily time limit on Instagram.
- Sleep Mode: Notifications are muted and access is restricted by default between 10PM and 7AM for teens.
- Safety tools: Instagram’s safety features — including block, mute, restrict, content filters, and reporting — all work on Instants.
- Parent notification: Parents of supervised teens will be notified the first time their teen downloads the Instants app.
Instants is available globally starting today as a feature on Instagram and as an app in select countries.
Original source
- May 13, 2026
- Date parsed from source: May 13, 2026
- First seen by Releasebot: May 13, 2026
Introducing Incognito Chat with Meta AI: A completely private way to chat with AI
WhatsApp launches Incognito Chat with Meta AI, bringing private, temporary AI conversations that aren’t saved and disappear by default. It also teases Side Chat protected by Private Processing for private help in chats, rolling out over the coming months.
Chatting with AI has quickly become a critical part of how people get information and ask important questions. Many of these questions can be deeply sensitive, or involve private financial, personal, health or work data.
Ten years ago we brought the world end-to-end encryption and now we are extending this privacy to chats with Meta AI.
Today we're launching Incognito Chat with Meta AI, a new way to have completely private conversations with AI. Built on top of our Private Processing technology, Incognito Chat lets you talk to Meta AI in a way that is invisible to anyone else.
Other apps have introduced incognito-style modes, but they can still see the questions coming in and the answers going out. Incognito Chat with Meta AI is truly private — no one can read your conversation, not even us.
Since we started exploring bringing AI to WhatsApp, we've been focused on how to deliver this power privately, at a global scale.
When you start an Incognito Chat with Meta AI, you're creating a private, temporary conversation that only you can see. Your messages are processed in a secure environment that even Meta cannot access. Your conversations are not saved and, by default, your messages disappear, giving you a space to think and explore ideas without anyone watching.
We believe this private way of chatting has potential to be part of several ways people chat with AI on WhatsApp. In the coming months, we’ll also introduce Side Chat protected by Private Processing. Side Chat with Meta AI will give you private help with any chat, with context of what's being discussed, without disrupting the main conversation.
We remain committed to delivering privacy for the world. Incognito Chat with Meta AI is rolling out on WhatsApp and the Meta AI app over the coming months. You can learn more about how Incognito Chat with Meta AI works here.
Original source
- May 13, 2026
- Date parsed from source: May 13, 2026
- First seen by Releasebot: May 13, 2026
Introducing a Completely Private Way to Chat With AI
WhatsApp launches Incognito Chat with Meta AI, bringing a completely private way to chat with AI on WhatsApp and the Meta AI app. Conversations are processed in a secure environment, not saved by default, and disappear for temporary, invisible chats.
We’re launching Incognito Chat with Meta AI on WhatsApp and the Meta AI app, a completely private way to interact with AI.
Your Incognito Chat conversations are processed in a secure environment that even Meta can’t see, and disappear by default.
Chatting with AI has quickly become a critical part of how people get information and ask important questions. These questions can be deeply sensitive or personal, like health issues, loan details, or career advice.
Today, we’re launching Incognito Chat with Meta AI on WhatsApp and the Meta AI app, a new way to have completely private conversations with AI. Built on top of WhatsApp’s Private Processing technology, Incognito Chat lets you talk to Meta AI in a way that is invisible to anyone else.
Other apps have introduced incognito-style modes, but they can still see the questions coming in and the answers going out. Incognito Chat with Meta AI is truly private, meaning no one — not even Meta — can read your conversations.
When you start an Incognito Chat with Meta AI on WhatsApp, you’re creating a private, temporary conversation that only you can see. Your messages are processed in a secure environment that even Meta cannot access. Your conversations are not saved and, by default, your messages disappear, giving you space to ask questions and explore ideas without anyone watching.
We believe this private way of chatting has potential to be part of several ways people chat with AI on WhatsApp. In the coming months, we’ll also introduce Side Chat protected by Private Processing on WhatsApp. Side Chat with Meta AI will give you private help with any WhatsApp chat, with context of what’s being discussed, without disrupting the main conversation.
Incognito Chat with Meta AI is rolling out on WhatsApp and the Meta AI app over the coming months. You can learn more about how Incognito Chat with Meta AI works here.
Original source
- May 6, 2026
- Date parsed from source: May 6, 2026
- First seen by Releasebot: May 7, 2026
May 6, 2026
Instagram Platform brings multiple image sending out of beta to all accounts.
Sending multiple images is now out of beta and available to all accounts.
Original source
- May 6, 2026
- Date parsed from source: May 6, 2026
- First seen by Releasebot: May 6, 2026
19.2.6 (May 6th, 2026)
React improves Server Components with type hardening and performance boosts.
React Server Components
Type hardening and performance improvements
(#36425 by @eps1lon and @unstubbable)
Original source
- May 6, 2026
- Date parsed from source: May 6, 2026
- First seen by Releasebot: May 6, 2026
19.1.7 (May 6th, 2026)
React adds Server Components type hardening and performance improvements.
React Server Components
Type hardening and performance improvements
(#36425 by @eps1lon and @unstubbable)
Original source
- May 6, 2026
- Date parsed from source: May 6, 2026
- First seen by Releasebot: May 6, 2026
19.0.6 (May 6th, 2026)
React ships Server Components type hardening and performance improvements.
React Server Components
Type hardening and performance improvements
(#36425 by @eps1lon and @unstubbable)
Original source
- April 2026
- No date parsed from source.
- First seen by Releasebot: Apr 23, 2026
Segment Anything 2 Demo
Meta AI launches Segment Anything 2 demo for video cutouts and effects with a few clicks.
- April 2026
- No date parsed from source.
- First seen by Releasebot: Apr 23, 2026
FAIRChem v2
Meta AI reports FAIRChem v2 introduces UMA, a universal machine learning potential with state-of-the-art accuracy.
FAIRChem v2 introduces the UMA model — a universal machine learning potential for atoms. This is a breaking change from v1 and is not compatible with previous pretrained models.
UMA is trained on 500M+ DFT calculations across molecules, materials, and catalysts — achieving state-of-the-art accuracy with energy conservation and fast inference.
Original source
- April 2026
- No date parsed from source.
- First seen by Releasebot: Apr 23, 2026
Seamless Communication
Meta AI releases Seamless Communication, a suite of AI translation models that aims to make cross-language speech more natural, expressive and fast. It includes SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2, and is publicly releasing the models, data and tools.
AI research by Meta
Seamless Communication
A significant step towards removing language barriers through expressive, fast and high-quality AI translation
A family of AI research models that enable more natural and authentic communication across languages
The Seamless Communication models
SeamlessExpressive
A model that aims to preserve expression and intricacies of speech across languages.
SeamlessStreaming
A model that can deliver speech and text translations with around two seconds of latency.
SeamlessM4T v2
A foundational multilingual and multitask model that allows people to communicate effortlessly through speech and text.
Seamless
A model that merges capabilities from SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2 into one.
Preserving prosody
SeamlessExpressive
Translations should capture the nuances of human expression. While existing translation tools are skilled at capturing the content within a conversation, they typically rely on monotone, robotic text-to-speech systems for their output. SeamlessExpressive aims to preserve intricacies of speech, such as pauses and speech rate, in addition to vocal style and emotional tone.
Try the SeamlessExpressive demo
- English input (whisper): "Please keep the volume down. We just put the baby to sleep." Spanish output: non-expressive and expressive samples.
- English input (sad): "Please, don't leave. I hate being here alone." French output: non-expressive and expressive samples.
Near real-time translation
SeamlessStreaming
SeamlessStreaming is the first massively multilingual model that delivers translations with around two seconds of latency and nearly the same accuracy as an offline model. Built upon SeamlessM4T v2, SeamlessStreaming supports automatic speech recognition and speech-to-text translation for nearly 100 input and output languages, in addition to speech-to-speech translation for nearly 100 input languages and 36 output languages.
Foundational model for universal translation
SeamlessM4T v2
In August 2023, we introduced the first version of SeamlessM4T, a foundational multilingual and multitask model that delivered state-of-the-art results for translation and transcription across speech and text. Building on this work, our improved model, SeamlessM4T v2, serves as the foundation for our new SeamlessExpressive and SeamlessStreaming models. It features a new architecture with a non-autoregressive text-to-unit decoder that delivers improved consistency between text and speech output.
More model details
Learn more about the research behind Seamless Communication
Try the SeamlessExpressive demo
Try the SeamlessExpressive demo to hear how you sound in a different language while maintaining elements of your expression and tone.
Our approach to research
Open innovation
We believe in the power of collaboration and open research to break down communication barriers. To enable our fellow researchers to build upon this work, we’re publicly releasing the full suite of Seamless Communication models, along with metadata, data and tools.
Safety and responsibility
We’re dedicated to promoting a safe and responsible AI ecosystem. We have taken a number of steps to improve the safety of our Seamless Communication models, significantly reducing the impacts of hallucinated toxicity in translations and implementing a custom watermarking approach for audio outputs from our expressive models.
Resources
More on Seamless Communication
Explore additional resources, including the research paper, model details and more.
Technical overview
More details on how we developed the suite of Seamless Communication models.
Seamless research paper
Methodology, benchmarks, research findings and more from the Seamless Communication project.
AI at Meta blog
Read the full post about the journey, research and milestones achieved.
Download the models
Get access to our suite of publicly available models.
SeamlessExpressive Demo
Hear how you sound in a different language while maintaining elements of your expression and tone.
Original source
- April 2026
- No date parsed from source.
- First seen by Releasebot: Apr 23, 2026
Meta Video Seal
Meta AI introduces Video Seal, an open-source video watermarking model that embeds durable, invisible watermarks and hidden messages to help verify video origin even after editing.
Introducing Meta Video Seal
A state-of-the-art, open-source model for video watermarking
With AI-generated content on the rise, verifying video origins is crucial. Video Seal is a neural watermarking model that embeds durable, invisible watermarks that remain detectable even after video editing.
Imperceptible watermarks
Video Seal embeds an invisible watermark into videos, with the option to include a hidden message.
Robust and Resilient
Video Seal's watermarks are resilient, withstanding distortion efforts such as flipping and blurring.
Origin Verification
The watermark and hidden message can be revealed to verify the video's origin.
How the demo works
- Choose a video from the library to explore the model, or upload your own to get started.
- Embed up to a 6-character hidden message and watermark in your video.
- Use the comparison slider to view an enhanced X-ray visualization of the watermark on the video.
- Stress test the watermark by distorting the video and verifying if the watermark and hidden message remain detectable.
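The demo steps above mention a hidden message of up to 6 characters and a stress test of whether it remains detectable. The source does not describe how Video Seal encodes the message, so the following is only an illustrative sketch. It assumes 8-bit ASCII characters padded to a fixed 48-bit payload, and it uses bit accuracy as a hypothetical way to score how much of a payload survives distortion. None of these function names come from Video Seal.

```python
MAX_CHARS = 6  # message limit stated in the demo steps
PAYLOAD_BITS = MAX_CHARS * 8  # assumption: one 8-bit ASCII code per character


def message_to_bits(message: str) -> list[int]:
    """Pack a short ASCII message into a fixed-length list of bits."""
    if len(message) > MAX_CHARS:
        raise ValueError(f"message is limited to {MAX_CHARS} characters")
    data = message.encode("ascii").ljust(MAX_CHARS, b"\x00")  # zero-pad short messages
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_message(bits: list[int]) -> str:
    """Inverse of message_to_bits for an undamaged payload."""
    if len(bits) != PAYLOAD_BITS:
        raise ValueError(f"expected {PAYLOAD_BITS} bits")
    out = bytearray()
    for start in range(0, PAYLOAD_BITS, 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return out.rstrip(b"\x00").decode("ascii")


def bit_accuracy(sent: list[int], received: list[int]) -> float:
    """Fraction of payload bits that survived, e.g. after a distortion."""
    if len(sent) != len(received):
        raise ValueError("payloads must have the same length")
    return sum(a == b for a, b in zip(sent, received)) / len(sent)
```

In this sketch a watermark extractor would return `received`, and a bit accuracy close to 1.0 would indicate that the hidden message is still recoverable.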
- April 2026
- No date parsed from source.
- First seen by Releasebot: Apr 23, 2026
Introducing Meta Motivo
Meta AI releases Meta Motivo, a behavioral foundation model for zero-shot control of a virtual physics-based humanoid. It also adds a new humanoid benchmark, training code, and a demo, with strong whole-body task performance across motion tracking, pose reaching, and reward optimization.
A Meta FAIR release
Introducing Meta Motivo
A first-of-its-kind behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks.
Try the demo
Download the model
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models
Meta Motivo is a behavioral foundation model pre-trained with a novel unsupervised reinforcement learning algorithm to control the movements of a complex virtual humanoid agent. At test time, our model can be prompted to solve unseen tasks such as motion tracking, pose reaching, and reward optimization without any additional learning or fine-tuning.
Read the research paper
Physics-based environment
The model has learned to control the agent, subject to the physics of its body and environment. Its behaviors are robust to variations and perturbations.
Different prompts for behaviors
The model can be prompted with motions to track, poses to reach, and rewards to optimize.
Zero-shot capability
The model computes the best behavior for each prompt without any additional learning or fine-tuning.
Explore the Research
We are releasing the pre-trained model together with the new humanoid benchmark and the training code. We hope this will encourage the community to further develop research towards building behavioral foundation models that can generalize to more complex tasks, and potentially different types of agents.
Key takeaways
- We introduce a new algorithm grounding the forward-backward unsupervised reinforcement learning method with an imitation objective leveraging a dataset of unsupervised trajectories.
- With this new approach, we train Meta Motivo, a behavioral foundation model that controls a high-dimensional virtual humanoid agent to solve a wide range of tasks.
- We evaluated our model using a new humanoid benchmark across motion tracking, pose reaching, and reward optimization tasks. Meta Motivo achieved competitive performance with task-specific methods, while outperforming state-of-the-art unsupervised RL and model-based baselines.
The Algorithm
Forward-Backward representations with Conditional Policy Regularization (FB-CPR) is a novel algorithm combining unsupervised forward-backward representations [1, 2, 3] with an imitation learning loss regularizing policies to cover states observed in a dataset of unlabeled trajectories. Our algorithm is trained online through direct access to the environment and it crucially learns a representation that aligns the embedding of states, motions, and rewards into the same latent space. As a result, we can train models whose policies are grounded towards useful behaviors, while being capable of zero-shot inference across a wide range of tasks, such as goal-based RL, imitation learning, reward optimization, and tracking.
The final model includes two components: 1) an embedding network that takes the state of the agent as input and returns its embedding; 2) a policy network, parameterized with the same embedding, that takes the state as input and returns the action to take.
Inference from various types of prompts
Our algorithm learns a representation that aligns states, rewards, and policies into the same latent space. We can then leverage this representation to perform zero-shot inference for different tasks:
Motion tracking
Pose reaching
Reward optimization
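As a rough illustration of how the three prompt types could be mapped into one latent space, here is a sketch in the spirit of the forward-backward literature listed in the references. It is not the released Meta Motivo code. The `embed` callable stands in for the learned state embedding network, all function names are hypothetical, and rescaling latents to norm sqrt(d) is an assumed convention.

```python
import math
from typing import Callable, Sequence

Vector = list[float]
Embed = Callable[[Sequence[float]], Vector]  # hypothetical stand-in for the embedding network


def _rescale(z: Vector) -> Vector:
    """Rescale z to norm sqrt(d) (assumed convention); leave a zero vector unchanged."""
    norm = math.sqrt(sum(v * v for v in z))
    if norm == 0.0:
        return list(z)
    scale = math.sqrt(len(z)) / norm
    return [v * scale for v in z]


def _mean(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def latent_from_goal(embed: Embed, goal_state: Sequence[float]) -> Vector:
    """Pose reaching: the prompt is a single target state."""
    return _rescale(embed(goal_state))


def latent_from_reward(embed: Embed, states: Sequence[Sequence[float]],
                       rewards: Sequence[float]) -> Vector:
    """Reward optimization: reward-weighted average of embeddings over sampled states."""
    weighted = [[r * v for v in embed(s)] for s, r in zip(states, rewards)]
    return _rescale(_mean(weighted))


def latents_from_motion(embed: Embed, motion: Sequence[Sequence[float]],
                        window: int) -> list[Vector]:
    """Motion tracking: one latent per step, averaged over the next `window` states."""
    latents = []
    for t in range(len(motion)):
        upcoming = motion[t + 1:t + 1 + window] or motion[-1:]  # hold the last pose at the end
        latents.append(_rescale(_mean([embed(s) for s in upcoming])))
    return latents
```

Each function returns a latent vector in the same space, which a policy network conditioned on that latent could then act on without further training. That shared space is the property the text above describes.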
Performance improvement during pre-training
Meta Motivo is a behavioral foundation model trained on a SMPL-based humanoid simulated with the MuJoCo simulator using a subset of the AMASS motion capture dataset and 30 million online interaction samples.
The videos below illustrate the behaviors corresponding to one motion tracking task (a cartwheel motion), one pose reaching task (an arabesque pose), and one reward optimization task (running) at different stages of the pre-training process. Despite the model not being explicitly trained to optimize any of these tasks, we see the performance improving during training and more human-like behaviors emerge.
Motion tracking
Pose reaching
Reward optimization
Evaluation Results
For evaluation, we have developed a new humanoid benchmark including motions to track, stable poses to reach, and reward functions to optimize. We consider several different baselines including 1) methods that are retrained for each task separately; 2) behavioral foundation models and model-based algorithms. We are releasing the code with the specification files needed to use the simulator and evaluate the model performance on the tasks that are used in the paper.
Quantitative
Our model achieves between 61% and 88% of the performance of top-line methods retrained for each task, while outperforming all other algorithms except on tracking, where it is second best behind Goal-TD3, which cannot be used for reward-based tasks.
Results
Motion tracking
Pose reaching
Reward optimization
Qualitative
To further analyze the performance gap in reward-based and goal-based tasks between Meta Motivo and single-task TD3, we ran a human evaluation with the objective of having a qualitative assessment of the learned behaviors in terms of human-likeness. This evaluation reveals that policies purely optimized for performance (TD3) produce much less natural behaviors than Meta Motivo, which better trades off performance and qualitative behaviors.
Results
Pose reaching
Reward optimization
Understanding the behavioral latent space
One of the crucial aspects of our new algorithm is that it uses the same representation to embed states, rewards, and motions in the same space. We have then investigated the structure of the learned behavioral latent space.
Visualization
Interpolation
In the image above, we visualize the embedding of motions classified by their activity (e.g., jumping, running, crawling) and reward-based tasks. Not only does the representation capture semantically similar motions in similar clusters, but it creates a latent space where rewards and motions are well aligned.
Limitations
Meta Motivo is our first attempt to train behavioral foundation models with zero-shot capabilities across several different prompt types. While the model achieved strong quantitative and qualitative results, it still suffers from several limitations.
Motion tracking
Pose reaching
Reward optimization
Fast movements and motions on the ground are poorly tracked. The model also exhibits unnatural jittering.
Try it yourself
Control the behavior of an embodied virtual agent through various prompts, including creating your own! See how the agent adjusts to changes in physics and environmental conditions, like gravity and wind.
Try the demo
References
- [1] Ahmed Touati, Yann Ollivier, Learning One Representation to Optimize All Rewards, NeurIPS 2021.
- [2] Ahmed Touati, Jérémy Rapin, Yann Ollivier, Does Zero-shot Reinforcement Learning Exist?, ICLR 2023.
- [3] Matteo Pirotta, Andrea Tirinzoni, Ahmed Touati, Alessandro Lazaric, Yann Ollivier, Fast Imitation via Behavior Foundation Models, ICLR 2024.
- [4] Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, Michael J. Black, SMPL: a skinned multi-person linear model, ACM Transactions on Graphics 2015.
- [5] MuJoCo - Advanced physics simulation.
- [6] Naureen Mahmood, Nima Ghorbani, Nikolaus F. Troje, Gerard Pons-Moll, Michael J. Black, AMASS: archive of motion capture as surface shapes, ICCV 2019.
- [7] https://github.com/facebookresearch/humenv
Acknowledgements
Research Authors
Andrea Tirinzoni, Ahmed Touati, Jesse Farebrother, Mateusz Guzek, Anssi Kanervisto, Yingchen Xu, Alessandro Lazaric, Matteo Pirotta
Project Contributors (alphabetical)
Claire Roberts, Dominic Burt, Jiemin Zhang, Leonel Sentana, Maria Ruiz, Matt Hanson, Morteza Behrooz, Ryan Winstead, Spaso Ilievski, Vincent Moens, Vlad Bodurov, William Ngan
© 2024 Meta
Original source
- April 2026
- No date parsed from source.
- First seen by Releasebot: Apr 23, 2026
DINOv3
Meta AI releases DINOv3, a self-supervised vision foundation model that brings stronger universal backbones, dense image features, and broad performance across detection, segmentation, depth estimation, and tracking. It also expands the model suite with efficient options for diverse deployment needs.
INTRODUCING DINOV3
Self-supervised learning for vision at unprecedented scale
DINOv3 scales self-supervised learning (SSL) for images to produce our strongest universal vision backbones, enabling breakthrough performance across diverse domains.
Download DINOv3
Read the research paper
DINOV3 OVERVIEW
Cutting-edge image representations, trained without human supervision
We scaled unsupervised training to 7B-parameter models and 1.7B image datasets, using a fraction of compute compared to weakly-supervised methods. Despite keeping backbones frozen during evaluation, they achieve absolute state-of-the-art performance across diverse domains.
Read the research paper
Exceptional performance across visual domains
SSL unlocks domains where annotations are scarce or costly. Backbones enable state-of-the-art results for tasks including object detection in web imagery as well as canopy height mapping in satellite and aerial imagery.
Versatile backbone with powerful dense image features
High-resolution dense features from a single DINOv3 backbone enable leading performance across vision tasks, including object detection, depth estimation, and segmentation, without any finetuning.
Efficient model sizes and architectures
We release a comprehensive model suite addressing a wide range of use cases, including broad coverage of ViT sizes and efficient ConvNeXt models for on-device deployment.
PERFORMANCE
Evaluating DINOv3's Performance
DINOv3 sets a new standard in vision foundation models. For the first time, a model trained with SSL outperforms weakly-supervised models on a broad range of probing tasks, from fine-grained image classification, to semantic segmentation, to object tracking in video.
APPLICATIONS
DINO in action
From challenging annotation scenarios to efficiency-critical deployments, see how researchers and developers use DINO to build breakthrough applications.
Download DINOv3
World Resources Institute
WRI measures tree canopy heights with DINO, helping civil society organizations worldwide monitor reforestation.
Learn more
NASA JPL
NASA JPL uses DINO for Mars exploration robots, enabling multiple vision tasks with minimal compute.
Learn more
Orakl Oncology & CentraleSupelec
Orakl Oncology & CentraleSupelec pre-train DINO on organoid images, producing a backbone to power prediction of patient responses to cancer treatments.
Learn more
APPROACH
Self-supervised pre-training unlocks simple task adaptation
Pre-training data is curated from a large unlabeled dataset. During pre-training, the model learns general-purpose visual representations, matching features between different augmented views of the same image. In post-training, the model is distilled into more efficient models.
A pre-trained DINOv3 model can be easily tailored by training a lightweight adapter on a small amount of annotated data.
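To make the idea of a frozen backbone plus a lightweight adapter concrete, here is a toy sketch. It does not use the DINOv3 API. `frozen_backbone` is a hypothetical stand-in that returns fixed features and is never updated, the "images" are short lists of pixel intensities, and the adapter is a simple perceptron chosen for brevity. In practice the adapter would more likely be a linear layer trained with a deep learning framework.

```python
from typing import Sequence


def frozen_backbone(image: Sequence[float]) -> list[float]:
    """Hypothetical stand-in for a frozen pre-trained backbone: fixed features, no training."""
    return [sum(image) / len(image), max(image)]


def train_linear_adapter(features: Sequence[Sequence[float]], labels: Sequence[int],
                         max_epochs: int = 100) -> tuple[list[float], float]:
    """Train a perceptron on frozen features; labels are -1 or +1."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(features, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            if y * score <= 0:  # misclassified: move the boundary toward this example
                weights = [w + y * v for w, v in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights, bias


def predict(weights: Sequence[float], bias: float, feature: Sequence[float]) -> int:
    """Classify one feature vector with the trained adapter."""
    score = sum(w * v for w, v in zip(weights, feature)) + bias
    return 1 if score > 0 else -1


# A tiny annotated set: dark images (-1) versus bright images (+1).
images = [[0.1, 0.2, 0.1], [0.0, 0.1, 0.2], [0.9, 0.8, 1.0], [0.7, 0.9, 0.8]]
labels = [-1, -1, 1, 1]

features = [frozen_backbone(img) for img in images]  # backbone is only ever evaluated
weights, bias = train_linear_adapter(features, labels)
predictions = [predict(weights, bias, f) for f in features]
```

Only `weights` and `bias` are learned here. The feature extractor is reused as-is, which is why a small amount of labeled data can be enough.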
DINO Evolution
DINOv3 marks a new milestone in self-supervised training at scale. It builds upon the scaling progress of DINOv2, further increasing the model size 6x and the training data 12x.
DINO
Initial research proof-of-concept, with 80M-parameter models trained on 1M images.
Read the research paper
Download the model
DINOv2
First successful scaling of an SSL algorithm. 1B-parameter models trained on 142M images.
Read the research paper
Download the model
DINOv3
An order of magnitude larger training compared to v2, with particular focus on dense features.
Read the research paper
Download the model
Explore additional resources
Read the AI at Meta blog
Read the research paper
Download DINOv3
DINOv3 on Hugging Face
Original source
- April 2026
- No date parsed from source.
- First seen by Releasebot: Apr 23, 2026
Introducing Meta Segment Anything Model Audio (SAM Audio)
Meta AI launches SAM Audio, a multimodal sound separation model that uses text, visual, and span prompts to isolate target audio from complex mixes. It also adds PE-AV to Perception Encoder and releases a new OSS evaluation set with a judge model.
With SAM Audio, you can use simple text prompts to accurately separate any sound from any audio or audio-visual source.
SAM AUDIO CAPABILITIES
SAM Audio separates target and residual sounds from any audio or audiovisual source—across general sound, music, and speech.
Text prompts
SAM Audio enables you to use text-based prompts to describe the specific target audio you want to separate.
Visual prompts
SAM Audio lets you pick out and separate sounds by clicking on the part of the video where you hear them.
Span prompts
SAM Audio is the first model to introduce span prompting, in which you select the time span that contains the target audio.
Multi-modal prompts
SAM Audio gives you flexibility by unifying three prompt modalities (text, visual, and time span).
A NEW WAY TO EXPERIENCE SOUND
State-of-the-art model for all sound
SAM Audio is a state-of-the-art, unified multimodal model that sets a new standard for audio separation, enabling users to isolate general sounds, music, and speech from complex mixtures using intuitive prompts.
PERFORMANCE
State-of-the-art model performance
SAM Audio achieves beyond state-of-the-art performance for all prompting capabilities.
OUR APPROACH
Model architecture
SAM Audio is a generative separation model that extracts both target and residual stems from an audio mixture using text, visual, or temporal prompts. It is powered by a flow-matching Diffusion Transformer and operates in a DAC-VAE latent space, enabling high-quality joint generation of target and residual audio.
OUR APPROACH
Audiovisual Perception Encoder
PE-AV is a new open source model, bringing audio capabilities to Meta's Perception Encoder.
THE SAM AUDIO EVALUATION DATASET
A first-of-its-kind audio separation OSS evaluation set
SAM Audio is releasing a first-of-its-kind OSS evaluation set for prompted audio separation and a judge model highly correlated with human subjective evaluation.
Real world opportunities
"Artificial Intelligence has been a game changer for the disabled community and the use cases for AI-focused start-ups in our ecosystem are vast. By incorporating open source models like SAM Audio into their work, 2GI’s cohort participants can advance their missions while gaining competitive advantage, showcasing that disabled founders are on the cutting edge of technology."
- Diego Mariscal, CEO of 2gether-International
2gether-International empowers disabled founders with resources to launch high-impact startups. In partnership with Meta’s AI for Good team, 2GI leverages open AI models like SAM Audio to accelerate innovation for early-stage, founder-led AI companies.
"For years, Starkey has led the industry in applying artificial intelligence to revolutionize hearing technology. Our ground-breaking work continues to elevate what hearing aids can achieve, particularly in challenging listening situations like noisy environments and overlapping speech. With open models like SAM Audio, we see tremendous opportunity to build on our innovations and further our mission to help people hear better and live better."
- Achin Bhowmik, Chief Technology Officer and Executive Vice President of Engineering at Starkey
Starkey is the global leader in hearing technology and the only global American-owned hearing aid manufacturer. Using AI, Starkey transforms hearing aids into smart health and communication devices, delivering innovative, connected solutions that enhance lives.
Original source
Curated by the Releasebot team
Releasebot is an aggregator of official release notes from hundreds of software vendors and thousands of sources.
Our editorial process involves the manual review and audit of release notes procured with the help of automated systems.