Meta Release Notes

182 release notes curated from 148 sources by the Releasebot Team. Last updated: May 15, 2026


Meta Products

  • May 15, 2026
    • Date parsed from source:
      May 15, 2026
    • First seen by Releasebot:
      May 15, 2026

    Instagram Platform by Meta

    May 15, 2026

    Instagram Platform now supports oEmbed API calls without an access token for all versions.

    oEmbed API

    Applies to all versions.

    You can now call the Instagram oEmbed API without an access token. See Embed an Instagram Post for more details.
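
    A minimal sketch of what a tokenless request URL could look like. It assumes the Graph API `instagram_oembed` endpoint and its `url` query parameter; the helper name, the version string, and the example post URL are illustrative, so check the Embed an Instagram Post guide for the exact request shape.

```python
from urllib.parse import urlencode

# Assumed host and endpoint name; confirm both against the official guide.
GRAPH_HOST = "https://graph.facebook.com"


def build_oembed_url(post_url: str, api_version: str) -> str:
    """Build an oEmbed request URL that carries no access_token parameter.

    api_version is supplied by the caller, e.g. the Graph API version
    your app already targets.
    """
    query = urlencode({"url": post_url})
    return f"{GRAPH_HOST}/{api_version}/instagram_oembed?{query}"


if __name__ == "__main__":
    # Hypothetical post URL and placeholder version, for illustration only.
    print(build_oembed_url("https://www.instagram.com/p/EXAMPLE/", "v1.0"))
```

    The only difference from a token-based call in this sketch is that no `access_token` parameter is appended to the query string.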

    Original source
  • May 13, 2026
    • Date parsed from source:
      May 13, 2026
    • First seen by Releasebot:
      May 13, 2026

    Instagram by Meta

    Introducing Instants: A New Way to Share in the Moment

    Instagram introduces Instants, a new way to share real-time photos with Close Friends or mutual followers that disappear after being viewed or after 24 hours. It adds archive, recap to Stories, undo, snooze, and built-in safety and privacy controls.

    Today, we’re introducing Instants, a new way to share photos in the moment with your Close Friends or mutual followers with just a tap. Photos you share on Instants disappear after they’ve been viewed and can’t be viewed after 24 hours. You also can’t edit your instants before sharing, so you can share authentic moments as they’re happening.

    How It Works

    You can capture an instant by tapping the mini pile of photos at the bottom right corner of your Instagram inbox or by opening the Instants app. From there, snap a photo in real time — no uploads from your phone’s photo gallery. You can also add a caption to your photo, but can’t make any further edits to it.

    Next, choose who you want to share your instant with: your Close Friends or followers you follow back. Recipients can react with emojis, reply, and send instants back to you. The instants you share will show up as a pile of photos in your friends’ inboxes and disappear once they’re viewed.

    The Instants app, rolling out in select countries on iOS and Android, gives you immediate access to the Instants camera — just log in with your Instagram account to get started. Instants work the same across both apps, meaning instants shared via the app will reach your friends seeing them on Instagram, too.

    Features to Make Sharing With Friends Simple

    • Archive: Your shared instants are saved in a private archive that only you can see for up to one year, which you can access on the top right corner of Instants.
    • Recap to Stories: Compile instants from your archive into a recap and post it to Instagram Stories for your followers. Just tap Create recap in your archive to get started.
    • No screenshots: Friends can’t screenshot or record instants you share.
    • Undo: Accidentally sent an instant? Quickly take it back before friends see it by tapping the undo button. You can also delete an instant from your archive to unsend it to friends who haven’t opened it yet.
    • Snooze: Hold down the pile of instants in your inbox and swipe right to temporarily stop receiving them. Bring back instants by holding down the same spot and swiping left.

    Built-In Safety and Privacy

    All of the safety and privacy protections on Instagram apply to Instants. Instagram’s in-app controls like Block, Mute, and Restrict work on Instants, too, so you can limit receiving instants from specific friends. Your instants can only be seen by those you choose to share them with — your close friends or followers you follow back.

    For teens, Instants is automatically integrated with Teen Accounts and Family Center. There is no separate setup — if a parent already supervises their teen on Instagram, that supervision automatically extends to Instants.

    Notable protections include:

    • Shared time limits: Time spent on Instants counts toward a teen’s daily time limit on Instagram.
    • Sleep Mode: Notifications are muted and access is restricted by default between 10 PM and 7 AM for teens.
    • Safety tools: Instagram’s safety features — including block, mute, restrict, content filters, and reporting — all work on Instants.
    • Parent notification: Parents of supervised teens will be notified the first time their teen downloads the Instants app.

    Instants is available globally starting today as a feature on Instagram and as an app in select countries.

    Original source
  • May 13, 2026
    • Date parsed from source:
      May 13, 2026
    • First seen by Releasebot:
      May 13, 2026

    WhatsApp by Meta

    Introducing Incognito Chat with Meta AI: A completely private way to chat with AI

    WhatsApp launches Incognito Chat with Meta AI, bringing private, temporary AI conversations that aren’t saved and disappear by default. It also teases Side Chat protected by Private Processing for private help in chats, rolling out over the coming months.

    Chatting with AI has quickly become a critical part of how people get information and ask important questions. Many of these questions are deeply sensitive, or include private financial, personal, health, or work data.

    Ten years ago we brought the world end-to-end encryption and now we are extending this privacy to chats with Meta AI.

    Today we're launching Incognito Chat with Meta AI, a new way to have completely private conversations with AI. Built on top of our Private Processing technology, Incognito Chat lets you talk to Meta AI in a way that is invisible to anyone else.

    Other apps have introduced incognito-style modes, but they can still see the questions coming in and the answers going out. Incognito Chat with Meta AI is truly private — no one can read your conversation, not even us.

    Since we started exploring bringing AI to WhatsApp, we've been focused on how to deliver this power privately, at a global scale.

    When you start an Incognito Chat with Meta AI, you're creating a private, temporary conversation that only you can see. Your messages are processed in a secure environment that even Meta cannot access. Your conversations are not saved and by default, your messages disappear — giving you a space to think and explore ideas without anyone watching.

    We believe this private way of chatting has potential to be part of several ways people chat with AI on WhatsApp. In the coming months, we’ll also introduce Side Chat protected by Private Processing. Side Chat with Meta AI will give you private help with any chat, with context of what's being discussed, without disrupting the main conversation.

    We remain committed to delivering privacy for the world. Incognito Chat with Meta AI is rolling out on WhatsApp and the Meta AI app over the coming months. You can learn more about how Incognito Chat with Meta AI works here.

    Original source
  • May 13, 2026
    • Date parsed from source:
      May 13, 2026
    • First seen by Releasebot:
      May 13, 2026

    WhatsApp by Meta

    Introducing a Completely Private Way to Chat With AI

    WhatsApp launches Incognito Chat with Meta AI, bringing a completely private way to chat with AI on WhatsApp and the Meta AI app. Conversations are processed in a secure environment, not saved by default, and disappear for temporary, invisible chats.

    We’re launching Incognito Chat with Meta AI on WhatsApp and the Meta AI app, a completely private way to interact with AI.

    Your Incognito Chat conversations are processed in a secure environment that even Meta can’t see, and disappear by default.

    Chatting with AI has quickly become a critical part of how people get information and ask important questions. These questions can be deeply sensitive or personal, like health issues, loan details, or career advice.

    Today, we’re launching Incognito Chat with Meta AI on WhatsApp and the Meta AI app, a new way to have completely private conversations with AI. Built on top of WhatsApp’s Private Processing technology, Incognito Chat lets you talk to Meta AI in a way that is invisible to anyone else.

    Other apps have introduced incognito-style modes, but they can still see the questions coming in and the answers going out. Incognito Chat with Meta AI is truly private, meaning no one — not even Meta — can read your conversations.

    When you start an Incognito Chat with Meta AI on WhatsApp, you’re creating a private, temporary conversation that only you can see. Your messages are processed in a secure environment that even Meta cannot access. Your conversations are not saved and by default, your messages disappear — giving you space to ask questions and explore ideas without anyone watching.

    We believe this private way of chatting has potential to be part of several ways people chat with AI on WhatsApp. In the coming months, we’ll also introduce Side Chat protected by Private Processing on WhatsApp. Side Chat with Meta AI will give you private help with any WhatsApp chat, with context of what’s being discussed, without disrupting the main conversation.

    Incognito Chat with Meta AI is rolling out on WhatsApp and the Meta AI app over the coming months. You can learn more about how Incognito Chat with Meta AI works here.

    Original source
  • May 6, 2026
    • Date parsed from source:
      May 6, 2026
    • First seen by Releasebot:
      May 7, 2026

    Instagram Platform by Meta

    May 6, 2026

    Instagram Platform brings multiple image sending out of beta to all accounts.

    Sending multiple images is now out of beta and available to all accounts.

    Original source
  • May 6, 2026
    • Date parsed from source:
      May 6, 2026
    • First seen by Releasebot:
      May 6, 2026

    React by Meta

    19.2.6 (May 6th, 2026)

    React improves Server Components with type hardening and performance boosts.

    React Server Components

    Type hardening and performance improvements

    (#36425 by @eps1lon and @unstubbable)

    Original source
  • May 6, 2026
    • Date parsed from source:
      May 6, 2026
    • First seen by Releasebot:
      May 6, 2026

    React by Meta

    19.1.7 (May 6th, 2026)

    React adds Server Components type hardening and performance improvements.

    React Server Components

    Type hardening and performance improvements

    (#36425 by @eps1lon and @unstubbable)

    Original source
  • May 6, 2026
    • Date parsed from source:
      May 6, 2026
    • First seen by Releasebot:
      May 6, 2026

    React by Meta

    19.0.6 (May 6th, 2026)

    React ships Server Components type hardening and performance improvements.

    React Server Components

    Type hardening and performance improvements

    (#36425 by @eps1lon and @unstubbable)

    Original source
  • April 2026
    • No date parsed from source.
    • First seen by Releasebot:
      Apr 23, 2026

    Meta AI by Meta

    Segment Anything 2 Demo

    Meta AI launches Segment Anything 2 demo for video cutouts and effects with a few clicks.

    Segment Anything 2 Demo

    Create video cutouts and effects with a few clicks

    Original source
  • April 2026
    • No date parsed from source.
    • First seen by Releasebot:
      Apr 23, 2026

    Meta AI by Meta

    FAIRChem v2

    Meta AI reports FAIRChem v2 introduces UMA, a universal machine learning potential with state-of-the-art accuracy.

    FAIRChem v2 introduces the UMA model — a universal machine learning potential for atoms. This is a breaking change from v1 and is not compatible with previous pretrained models.

    UMA is trained on 500M+ DFT calculations across molecules, materials, and catalysts — achieving state-of-the-art accuracy with energy conservation and fast inference.

    Original source
  • April 2026
    • No date parsed from source.
    • First seen by Releasebot:
      Apr 23, 2026

    Meta AI by Meta

    Seamless Communication

    Meta AI releases Seamless Communication, a suite of AI translation models that aims to make cross-language speech more natural, expressive and fast. It includes SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2, and is publicly releasing the models, data and tools.

    AI research by Meta

    Seamless Communication

    A significant step towards removing language barriers through expressive, fast and high-quality AI translation

    A family of AI research models that enable more natural and authentic communication across languages

    The Seamless Communication models

    SeamlessExpressive

    A model that aims to preserve expression and intricacies of speech across languages.

    SeamlessStreaming

    A model that can deliver speech and text translations with around two seconds of latency.

    SeamlessM4T v2

    A foundational multilingual and multitask model that allows people to communicate effortlessly through speech and text.

    Seamless

    A model that merges capabilities from SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2 into one.

    Preserving prosody

    SeamlessExpressive

    Translations should capture the nuances of human expression. While existing translation tools are skilled at capturing the content within a conversation, they typically rely on monotone, robotic text-to-speech systems for their output. SeamlessExpressive aims to preserve intricacies of speech, such as pauses and speech rate, in addition to vocal style and emotional tone.

    SeamlessExpressive demo examples (audio omitted):

    • English input (whisper): "Please keep the volume down. We just put the baby to sleep." Spanish output: non-expressive and expressive versions.
    • English input (sad): "Please, don't leave. I hate being here alone." French output: non-expressive and expressive versions.

    Near real-time translation

    SeamlessStreaming

    SeamlessStreaming is the first massively multilingual model that delivers translations with around two seconds of latency and nearly the same accuracy as an offline model. Built upon SeamlessM4T v2, SeamlessStreaming supports automatic speech recognition and speech-to-text translation for nearly 100 input and output languages, in addition to speech-to-speech translation for nearly 100 input languages and 36 output languages.

    Foundational model for universal translation

    SeamlessM4T v2

    In August 2023, we introduced the first version of SeamlessM4T, a foundational multilingual and multitask model that delivered state-of-the-art results for translation and transcription across speech and text. Building on this work, our improved model, SeamlessM4T v2, serves as the foundation for our new SeamlessExpressive and SeamlessStreaming models. It features a new architecture with a non-autoregressive text-to-unit decoder that delivers improved consistency between text and speech output.

    More model details

    Learn more about the research behind Seamless Communication

    Try the SeamlessExpressive demo

    Try the SeamlessExpressive demo to hear how you sound in a different language while maintaining elements of your expression and tone.

    Our approach to research

    Open innovation

    We believe in the power of collaboration and open research to break down communication barriers. To enable our fellow researchers to build upon this work, we’re publicly releasing the full suite of Seamless Communication models, along with metadata, data and tools.

    Safety and responsibility

    We’re dedicated to promoting a safe and responsible AI ecosystem. We have taken a number of steps to improve the safety of our Seamless Communication models, significantly reducing the impacts of hallucinated toxicity in translations and implementing a custom watermarking approach for audio outputs from our expressive models.

    Resources

    More on Seamless Communication

    Explore additional resources, including the research paper, model details and more.

    Technical overview

    More details on how we developed the suite of Seamless Communication models.

    Seamless research paper

    Methodology, benchmarks, research findings and more from the Seamless Communication project.

    AI at Meta blog

    Read the full post about the journey, research and milestones achieved.

    Download the models

    Get access to our suite of publicly available models.

    Original source
  • April 2026
    • No date parsed from source.
    • First seen by Releasebot:
      Apr 23, 2026

    Meta AI by Meta

    Meta Video Seal

    Meta AI introduces Video Seal, an open-source video watermarking model that embeds durable, invisible watermarks and hidden messages to help verify video origin even after editing.

    Introducing Meta Video Seal

    A state-of-the-art, open-source model for video watermarking

    With AI-generated content on the rise, verifying video origins is crucial. Video Seal is a neural watermarking model that embeds durable, invisible watermarks that remain detectable even after video editing.

    Imperceptible watermarks

    Video Seal embeds an invisible watermark into videos, with the option to include a hidden message.

    Robust and Resilient

    Video Seal's watermarks are resilient, withstanding distortion efforts such as flipping and blurring.

    Origin Verification

    The watermark and hidden message can be revealed to verify the video's origin.

    How the demo works

    1. Choose a video from the library to explore the model, or upload your own to get started.
    2. Embed a watermark and a hidden message of up to 6 characters in your video.
    3. Use the comparison slider to view an enhanced X-ray visualization of the watermark on the video.
    4. Stress test the watermark by distorting the video and verifying if the watermark and hidden message remain detectable.
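
    Step 4 can be pictured with a toy check, which is not Video Seal's algorithm. Assume the hidden message travels as a bit string, a distortion flips some of those bits, and verification compares the recovered bits with the ones that were sent. The 8-bit ASCII encoding, the helper names, and the flipped positions are all illustrative assumptions.

```python
def message_to_bits(message: str) -> list[int]:
    """Encode an ASCII message as a flat list of bits (8 per character)."""
    if len(message) > 6:
        raise ValueError("the demo accepts at most 6 characters")
    data = message.encode("ascii")
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_message(bits: list[int]) -> str:
    """Decode a flat bit list back into text, replacing undecodable bytes."""
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    data = bytes(int("".join(map(str, chunk)), 2) for chunk in chunks)
    return data.decode("ascii", errors="replace")


def bit_accuracy(sent: list[int], recovered: list[int]) -> float:
    """Fraction of bit positions that survived the distortion."""
    matches = sum(a == b for a, b in zip(sent, recovered))
    return matches / len(sent)


if __name__ == "__main__":
    sent = message_to_bits("META26")
    # Simulate a distortion that corrupts three arbitrarily chosen bits.
    recovered = list(sent)
    for position in (0, 17, 40):
        recovered[position] ^= 1
    print(f"bit accuracy: {bit_accuracy(sent, recovered):.3f}")
    print("decoded:", bits_to_message(recovered))
```

    A real detector would report a confidence for the watermark itself as well; this sketch only covers the hidden-message part of the check.
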
    Original source
  • April 2026
    • No date parsed from source.
    • First seen by Releasebot:
      Apr 23, 2026

    Meta AI by Meta

    Introducing Meta Motivo

    Meta AI releases Meta Motivo, a behavioral foundation model for zero-shot control of a virtual physics-based humanoid. It also adds a new humanoid benchmark, training code, and a demo, with strong whole-body task performance across motion tracking, pose reaching, and reward optimization.

    A Meta FAIR release

    Introducing Meta Motivo

    A first-of-its-kind behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks.

    Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models

    Meta Motivo is a behavioral foundation model pre-trained with a novel unsupervised reinforcement learning algorithm to control the movements of a complex virtual humanoid agent. At test time, our model can be prompted to solve unseen tasks such as motion tracking, pose reaching, and reward optimization without any additional learning or fine-tuning.

    Physics-based environment

    The model has learned to control the agent, subject to the physics of its body and environment. Its behaviors are robust to variations and perturbations.

    Different prompts for behaviors

    The model can be prompted with motions to track, poses to reach, and rewards to optimize.

    Zero-shot capability

    The model computes the best behavior for each prompt without any additional learning or fine-tuning.

    Explore the Research

    We are releasing the pre-trained model together with the new humanoid benchmark and the training code. We hope this will encourage the community to further develop research towards building behavioral foundation models that can generalize to more complex tasks, and potentially different types of agents.

    Key takeaways

    • We introduce a new algorithm grounding the forward-backward unsupervised reinforcement learning method with an imitation objective leveraging a dataset of unsupervised trajectories.
    • With this new approach, we train Meta Motivo, a behavioral foundation model that controls a high-dimensional virtual humanoid agent to solve a wide range of tasks.
    • We evaluated our model using a new humanoid benchmark across motion tracking, pose reaching, and reward optimization tasks. Meta Motivo achieved competitive performance with task-specific methods, while outperforming state-of-the-art unsupervised RL and model-based baselines.

    The Algorithm

    Forward-Backward representations with Conditional Policy Regularization (FB-CPR) is a novel algorithm combining unsupervised forward-backward representations [1, 2, 3] with an imitation learning loss regularizing policies to cover states observed in a dataset of unlabeled trajectories. Our algorithm is trained online through direct access to the environment and it crucially learns a representation that aligns the embedding of states, motions, and rewards into the same latent space. As a result, we can train models whose policies are grounded towards useful behaviors, while being capable of zero-shot inference across a wide range of tasks, such as goal-based RL, imitation learning, reward optimization, and tracking.

    The final model includes two components: 1) an embedding network that takes the agent's state as input and returns its embedding; 2) a policy network, parameterized by the same embedding, that takes the state as input and returns the action to take.

    Inference from various types of prompts

    Our algorithm learns a representation that aligns states, rewards, and policies into the same latent space. We can then leverage this representation to perform zero-shot inference for different tasks: motion tracking, pose reaching, and reward optimization.
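
    The sketch below illustrates how each prompt type could be turned into a vector in one shared latent space. It follows the general forward-backward recipe from the work cited in the references; the toy embedding function, the unit-norm scaling, and the look-ahead window are assumptions for illustration, not the released implementation.

```python
import math
from typing import Callable, Sequence

Vector = list[float]
Embed = Callable[[Sequence[float]], Vector]


def toy_embedding(state: Sequence[float]) -> Vector:
    """Hypothetical stand-in for the learned embedding network."""
    x, y = state
    return [x, y, x * y]


def normalize(z: Vector) -> Vector:
    """Rescale to unit length (the real model's scale is an assumption here)."""
    norm = math.sqrt(sum(v * v for v in z))
    return [v / norm for v in z] if norm > 0 else list(z)


def latent_from_pose(embed: Embed, target: Sequence[float]) -> Vector:
    """Pose reaching: embed the single target state."""
    return normalize(embed(target))


def latent_from_rewards(embed: Embed, states: Sequence[Sequence[float]],
                        rewards: Sequence[float]) -> Vector:
    """Reward optimization: reward-weighted sum of state embeddings."""
    total = [0.0] * len(embed(states[0]))
    for state, reward in zip(states, rewards):
        for i, value in enumerate(embed(state)):
            total[i] += reward * value
    return normalize(total)


def latents_from_motion(embed: Embed, motion: Sequence[Sequence[float]],
                        window: int = 2) -> list[Vector]:
    """Motion tracking: one latent per step, summing the next `window` embeddings."""
    latents = []
    for t in range(len(motion) - 1):
        future = [embed(s) for s in motion[t + 1:t + 1 + window]]
        latents.append(normalize([sum(column) for column in zip(*future)]))
    return latents


if __name__ == "__main__":
    motion = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
    print(latent_from_pose(toy_embedding, (3.0, 4.0)))
    print(latent_from_rewards(toy_embedding, motion, [0.0, 1.0, 0.0]))
    print(latents_from_motion(toy_embedding, motion))
```

    In each case the resulting vector would condition the policy network, which maps the current state to an action. That network is omitted here.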

    Performance improvement during pre-training

    Meta Motivo is a behavioral foundation model trained on an SMPL-based humanoid simulated with the MuJoCo simulator, using a subset of the AMASS motion capture dataset and 30 million online interaction samples.

    The videos below illustrate the behaviors corresponding to one motion tracking task (a cartwheel motion), one pose reaching task (an arabesque pose), and one reward optimization task (running) at different stages of the pre-training process. Despite the model not being explicitly trained to optimize any of these tasks, we see the performance improving during training and more human-like behaviors emerge.

    [Videos: motion tracking, pose reaching, and reward optimization at different stages of pre-training]

    Evaluation Results

    For evaluation, we have developed a new humanoid benchmark including motions to track, stable poses to reach, and reward functions to optimize. We consider several different baselines including 1) methods that are retrained for each task separately; 2) behavioral foundation models and model-based algorithms. We are releasing the code with the specification files needed to use the simulator and evaluate the model performance on the tasks that are used in the paper.

    Quantitative

    Our model achieves between 61% and 88% of the performance of top-line methods retrained for each task, while outperforming all other algorithms except on tracking, where it is second best behind Goal-TD3, which cannot be used for reward-based tasks.

    [Charts: quantitative results for motion tracking, pose reaching, and reward optimization]

    Qualitative

    To further analyze the performance gap in reward-based and goal-based tasks between Meta Motivo and single-task TD3, we ran a human evaluation with the objective of having a qualitative assessment of the learned behaviors in terms of human-likeness. This evaluation reveals that policies purely optimized for performance (TD3) produce much less natural behaviors than Meta Motivo, which better trades off performance and qualitative behaviors.

    [Charts: human evaluation results for pose reaching and reward optimization]

    Understanding the behavioral latent space

    One of the crucial aspects of our new algorithm is that it uses the same representation to embed states, rewards, and motions in the same space. We therefore investigated the structure of the learned behavioral latent space.

    [Figure: visualization and interpolation views of the behavioral latent space]

    In the image above, we visualize the embedding of motions classified by their activity (e.g., jumping, running, crawling) and reward-based tasks. Not only does the representation capture semantically similar motions in similar clusters, but it creates a latent space where rewards and motions are well aligned.

    Limitations

    Meta Motivo is our first attempt to train behavioral foundation models with zero-shot capabilities across several different prompt types. While the model achieved strong quantitative and qualitative results, it still suffers from several limitations.

    [Videos: examples of limitations in motion tracking, pose reaching, and reward optimization]

    Fast movements and motions on the ground are poorly tracked. The model also exhibits unnatural jittering.

    Try it yourself

    Control the behavior of an embodied virtual agent through various prompts, including creating your own! See how the agent adjusts to changes in physics and environmental conditions, like gravity and wind.

    Try the demo

    References

    1. Ahmed Touati, Yann Ollivier. Learning One Representation to Optimize All Rewards. NeurIPS 2021.
    2. Ahmed Touati, Jérémy Rapin, Yann Ollivier. Does Zero-shot Reinforcement Learning Exist? ICLR 2023.
    3. Matteo Pirotta, Andrea Tirinzoni, Ahmed Touati, Alessandro Lazaric, Yann Ollivier. Fast Imitation via Behavior Foundation Models. ICLR 2024.
    4. Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, Michael J. Black. SMPL: A Skinned Multi-Person Linear Model. ACM Transactions on Graphics, 2015.
    5. MuJoCo - Advanced physics simulation
    6. Naureen Mahmood, Nima Ghorbani, Nikolaus F. Troje, Gerard Pons-Moll, Michael J. Black. AMASS: Archive of Motion Capture as Surface Shapes. ICCV 2019.
    7. https://github.com/facebookresearch/humenv

    Acknowledgements

    Research Authors

    Andrea Tirinzoni, Ahmed Touati, Jesse Farebrother, Mateusz Guzek, Anssi Kanervisto, Yingchen Xu, Alessandro Lazaric, Matteo Pirotta

    Project Contributors (alphabetical)

    Claire Roberts, Dominic Burt, Jiemin Zhang, Leonel Sentana, Maria Ruiz, Matt Hanson, Morteza Behrooz, Ryan Winstead, Spaso Ilievski, Vincent Moens, Vlad Bodurov, William Ngan

    © 2024 Meta

    Original source
  • April 2026
    • No date parsed from source.
    • First seen by Releasebot:
      Apr 23, 2026

    Meta AI by Meta

    DINOv3

    Meta AI releases DINOv3, a self-supervised vision foundation model that brings stronger universal backbones, dense image features, and broad performance across detection, segmentation, depth estimation, and tracking. It also expands the model suite with efficient options for diverse deployment needs.

    INTRODUCING DINOV3

    Self-supervised learning for vision at unprecedented scale

    DINOv3 scales self-supervised learning (SSL) for images to produce our strongest universal vision backbones, enabling breakthrough performance across diverse domains.

    DINOV3 OVERVIEW

    Cutting-edge image representations, trained without human supervision

    We scaled unsupervised training to 7B-parameter models and a dataset of 1.7B images, using a fraction of the compute required by weakly-supervised methods. Even with backbones kept frozen during evaluation, they achieve absolute state-of-the-art performance across diverse domains.

    Exceptional performance across visual domains

    SSL unlocks domains where annotations are scarce or costly. Backbones enable state-of-the-art results on tasks ranging from object detection in web imagery to canopy height mapping in satellite and aerial imagery.

    Versatile backbone with powerful dense image features

    High-resolution dense features from a single DINOv3 backbone enable leading performance across vision tasks, including object detection, depth estimation, and segmentation, without any finetuning.

    Efficient model sizes and architectures

    We release a comprehensive model suite addressing a wide range of use cases, including broad coverage of ViT sizes and efficient ConvNeXt models for on-device deployment.

    PERFORMANCE

    Evaluating DINOv3's Performance

    DINOv3 sets a new standard in vision foundation models. For the first time, a model trained with SSL outperforms weakly-supervised models on a broad range of probing tasks, from fine-grained image classification, to semantic segmentation, to object tracking in video.

    APPLICATIONS

    DINO in action

    From challenging annotation scenarios to efficiency-critical deployments, see how researchers and developers use DINO to build breakthrough applications.

    World Resources Institute

    WRI measures tree canopy heights with DINO, helping civil society organizations worldwide monitor reforestation.

    NASA JPL

    NASA JPL uses DINO for Mars exploration robots, enabling multiple vision tasks with minimal compute.

    Orakl Oncology & CentraleSupelec

    Orakl Oncology & CentraleSupelec pre-train DINO on organoid images, producing a backbone to power prediction of patient responses to cancer treatments.

    APPROACH

    Self-supervised pre-training unlocks simple task adaptation

    Pre-training data is curated from a large unlabeled dataset. During pre-training, the model learns general-purpose visual representations, matching features between different augmented views of the same image. In post-training, the model is distilled into more efficient models.

    A pre-trained DINOv3 model can be easily tailored by training a lightweight adapter on a small amount of annotated data.
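
    One common form of lightweight adapter is a linear probe: the backbone stays frozen and only a small classifier is fitted on its features. The sketch below illustrates that idea with a stand-in `frozen_backbone` function and made-up toy data. It does not load DINOv3, and none of the names reflect the released code.

```python
import math


def frozen_backbone(image: list[float]) -> list[float]:
    """Hypothetical stand-in for a frozen backbone: a fixed, never-updated feature map."""
    return [sum(image) / len(image), max(image) - min(image)]


def train_linear_adapter(features, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression head with per-sample gradient descent."""
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            logit = sum(w * v for w, v in zip(weights, x)) + bias
            error = 1.0 / (1.0 + math.exp(-logit)) - y  # d(log loss)/d(logit)
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x) -> int:
    """Return 1 when the linear head scores the features above zero."""
    return int(sum(w * v for w, v in zip(weights, x)) + bias > 0)


# Made-up annotated data: four dark (label 0) and four bright (label 1) "images".
IMAGES = [
    [0.0, 0.1, 0.2, 0.1], [0.1, 0.1, 0.0, 0.2],
    [0.2, 0.0, 0.1, 0.0], [0.1, 0.2, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.9], [1.0, 0.9, 0.9, 0.8],
    [0.8, 0.8, 1.0, 0.9], [0.9, 1.0, 1.0, 0.8],
]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    # Only the adapter's weights and bias are trained; the backbone is untouched.
    feats = [frozen_backbone(img) for img in IMAGES]
    weights, bias = train_linear_adapter(feats, LABELS)
    print(predict(weights, bias, frozen_backbone([0.1, 0.0, 0.1, 0.2])))
    print(predict(weights, bias, frozen_backbone([0.9, 0.9, 1.0, 0.8])))
```

    Because only the small head is trained, a handful of labeled examples is enough in this toy setting.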

    DINO Evolution

    DINOv3 marks a new milestone in self-supervised training at scale. It builds upon the scaling progress of DINOv2, increasing model size 6x and training data 12x.

    DINO

    Initial research proof-of-concept, with 80M-parameter models trained on 1M images.

    DINOv2

    First successful scaling of an SSL algorithm. 1B-parameter models trained on 142M images.

    DINOv3

    An order of magnitude larger training compared to v2, with particular focus on dense features.

    Explore additional resources

    Read the AI at Meta blog

    Read the research paper

    Download DINOv3

    DINOv3 on Hugging Face

    Original source
  • April 2026
    • No date parsed from source.
    • First seen by Releasebot:
      Apr 23, 2026

    Meta AI by Meta

    Introducing Meta Segment Anything Model Audio (SAM Audio)

    Meta AI launches SAM Audio, a multimodal sound separation model that uses text, visual, and span prompts to isolate target audio from complex mixes. It also adds PE-AV to Perception Encoder and releases a new OSS evaluation set with a judge model.

    With SAM Audio, you can use simple text prompts to accurately separate any sound from any audio or audio-visual source.

    SAM AUDIO CAPABILITIES

    SAM Audio separates target and residual sounds from any audio or audiovisual source—across general sound, music, and speech.

    Text prompts

    SAM Audio enables you to use text-based prompts to describe the specific target audio you want to separate.

    Visual prompts

    SAM Audio lets you pick out and separate sounds by clicking on the part of the video where you hear them.

    Span prompts

    SAM Audio is the first model to introduce span prompting: selecting the time span that contains the target audio.

    Multi-modal prompts

    SAM Audio gives you the flexibility to combine three prompt modalities (text, visual, and time span).

    A NEW WAY TO EXPERIENCE SOUND

    State-of-the-art model for all sound

    SAM Audio is a state-of-the-art, unified multimodal model that sets a new standard for audio separation, enabling users to isolate general sounds, music, and speech from complex mixtures using intuitive prompts.

    PERFORMANCE

    State-of-the-art model performance

    SAM Audio achieves state-of-the-art performance across all prompting capabilities.

    OUR APPROACH

    Model architecture

    SAM Audio is a generative separation model that extracts both target and residual stems from an audio mixture using text, visual, or temporal prompts. It is powered by a flow-matching Diffusion Transformer and operates in a DAC-VAE latent space, enabling high-quality joint generation of target and residual audio.

    OUR APPROACH

    Audiovisual Perception Encoder

    PE-AV is a new open source model, bringing audio capabilities to Meta's Perception Encoder.

    THE SAM AUDIO EVALUATION DATASET

    A first-of-its-kind audio separation OSS evaluation set

    SAM Audio is releasing a first-of-its-kind OSS evaluation set for prompted audio separation and a judge model highly correlated with human subjective evaluation.

    Real world opportunities

    "Artificial Intelligence has been a game changer for the disabled community and the use cases for AI-focused start-ups in our ecosystem are vast. By incorporating open source models like SAM Audio into their work, 2GI’s cohort participants can advance their missions while gaining competitive advantage, showcasing that disabled founders are on the cutting edge of technology."

    • Diego Mariscal, CEO of 2gether-International

    2gether-International empowers disabled founders with resources to launch high-impact startups. In partnership with Meta’s AI for Good team, 2GI leverages open AI models like SAM Audio to accelerate innovation for early-stage, founder-led AI companies.

    "For years, Starkey has led the industry in applying artificial intelligence to revolutionize hearing technology. Our ground-breaking work continues to elevate what hearing aids can achieve, particularly in challenging listening situations like noisy environments and overlapping speech. With open models like SAM Audio, we see tremendous opportunity to build on our innovations and further our mission to help people hear better and live better."

    • Achin Bhowmik, Chief Technology Officer and Executive Vice President of Engineering at Starkey

    Starkey is the global leader in hearing technology and the only global American-owned hearing aid manufacturer. Using AI, Starkey transforms hearing aids into smart health and communication devices, delivering innovative, connected solutions that enhance lives.

    Original source
Releasebot

Curated by the Releasebot team

Releasebot is an aggregator of official release notes from hundreds of software vendors and thousands of sources.

Our editorial process involves the manual review and audit of release notes procured with the help of automated systems.
