AI Language Models Release Notes
Release notes for large language models, APIs and AI platforms
Latest AI Language Models Updates
- Mar 6, 2026
- Date parsed from source: Mar 6, 2026
- First seen by Releasebot: Mar 6, 2026
The latest AI news we announced in February
Google highlights February AI updates with Gemini 3.1 Pro and Nano Banana 2 launches, Flow enhancements, and Lyria 3 music generation. The roundup covers real product releases and availability for developers and consumers, plus a Deep Think upgrade and new AI partnerships shaping user-facing tools.
Here’s a recap of our biggest AI updates from February, including highlights from the AI Impact Summit in India and the releases of Gemini 3.1 Pro and Nano Banana 2.
For more than 20 years, we’ve invested in machine learning and AI research, tools and infrastructure to build products that make everyday life better for more people. Teams across Google are working on ways to unlock AI’s benefits in fields as wide-ranging as healthcare, crisis response and education. To keep you posted on our progress, we're doing a regular roundup of Google's most recent AI news.
Here’s a look back at some of our AI announcements from February.
For us, February was about global impact. At the AI Impact Summit in India, we demonstrated how our ongoing breakthroughs in AI are now solving real-world challenges for people everywhere — and we launched new partnerships and investments to make sure everyone benefits. We see AI as an enabling technology that can help people achieve their goals — whether you're a researcher, entrepreneur or Olympic athlete. On the slopes, in a research lab, or right in the palm of your hand, Google's latest AI announcements are here to help you.
AI to help everyone dream bigger
We announced new partnerships and investments at the AI Impact Summit. As world leaders gathered in New Delhi, India, we shared how we’re partnering to make AI work for everyone. That includes new Impact Challenges to help advance science and spark innovation for governments, as well as new national partnerships in India for AI and collaborations to accelerate scalable AI solutions in science and education.
CEO Sundar Pichai delivered opening remarks at the AI Impact Summit. Sundar explained why “no technology has [him] dreaming bigger than AI” and called on leaders to pursue AI boldly, approach it responsibly, and work through this moment in AI’s development together. He shared ways that Google is ensuring everyone benefits with major infrastructure investments and new AI skills training.
AI to help express your creativity
We released Nano Banana 2, combining Pro image capabilities with Flash image speed. That means you can now access high-quality image generation with faster results across products like the Gemini app and Google Search. We’re also continuing to improve tools like SynthID to help you identify AI-generated content. Developers can now build with Nano Banana 2 and deploy sophisticated visual creation at scale with an amazing price-performance ratio.
We released our most advanced music generation tools. Lyria 3 allows you to create custom music in the Gemini app. That means you can describe an idea or upload a photo or video, and Gemini will generate a 30-second track with custom cover art. On top of sharing that news, we also shared six tips to get you started prompting Lyria 3. And as an added creative tool, we also announced that ProducerAI is joining Google Labs. Whether you’re refining lyrics or a melody, ProducerAI is a music creation partner that can help turn your imagination into dynamic, comprehensive songs.
We shared new ways to create images and videos in Flow. To help you generate, edit and animate images and videos in a single workspace, we’re bringing our top AI capabilities into Flow. You can create high-fidelity images and instantly use them as building blocks for video generation, all in one place. With an updated interface, it’s now even easier to search, filter and manage your assets.
AI to help clarify and manage complex challenges
We released Gemini 3.1 Pro to help tackle your most complex tasks. Gemini 3.1 Pro is a smarter, more capable baseline model for complex problem-solving, demonstrating more than double the reasoning performance of Gemini 3 Pro. It’s designed to help you when a simple answer isn’t enough, whether you’re looking for a clear, visual explanation of a topic, synthesizing data into a single view or pulling together a creative project. Gemini 3.1 Pro is available to developers, enterprises and consumers via various platforms.
We released a major upgrade to Gemini 3 Deep Think. We collaborated with world-class scientists and researchers to improve Gemini 3 Deep Think. Designed specifically for the complexities of science and engineering, the updated Gemini 3 Deep Think excels where data is messy and solutions aren't black-and-white. It moves beyond abstract theory to deliver practical, actionable results for technical challenges. The new Deep Think is now available in the Gemini app for Google AI Ultra subscribers. Researchers, engineers and enterprises can express interest in early access to test Deep Think via the Gemini API.
We shared our view on what’s required to achieve digital resilience in the AI era at MSC. New technologies mean new frontiers for strategic competition. We’re already seeing how threats are evolving, and how old ways of responding are failing to meet the moment. That’s why at the 62nd Munich Security Conference, Google President of Global Affairs Kent Walker called for a collaborative approach to security and outlined how partners could work together to build resilience without sacrificing control over their data.
AI to help athletes (and their fans) elevate their game
We shared how Google Cloud helped Team USA find their edge with AI. Ahead of the Olympic Winter Games, Google Cloud and Google DeepMind built an AI video analysis tool to help Team USA and U.S. Ski & Snowboard elite athletes analyze their tricks. Using Google DeepMind’s research into spatial intelligence, the platform maps an athlete’s motion directly from 2D video images — even through bulky winter gear. The tool, which runs on Google Cloud, processes this data in minutes, providing near real-time feedback that athletes and coaches could use to make adjustments and help elevate performance.
We shared our new Gemini ad for football’s biggest weekend. In our national in-game spot, "New Home," a mother and son use Gemini to bring their new house to life, imagining how different spaces will look and feel. The spot, named by the Kellogg School as the best in-game ad in its annual ranking, played during the big game and highlighted just a few of the amazing things people can do — and are doing — with Gemini.
- Mar 5, 2026
- Date parsed from source: Mar 5, 2026
- First seen by Releasebot: Mar 6, 2026
Introducing ChatGPT for Excel and new financial data integrations
OpenAI announces ChatGPT for Excel in beta, an Excel add-in that builds, updates, and analyzes models directly in workbooks. It adds financial data integrations and uses GPT-5.4 Thinking to boost finance workflows. Rollout starts for select business plans in US, Canada, Australia with enterprise controls.
Use ChatGPT in Excel to build, update, and analyze spreadsheets faster, and explore new integrations in ChatGPT for financial workflows.
Today, we’re introducing ChatGPT for Excel in beta, an Excel add-in that brings ChatGPT directly into workbooks to help build and update models, run scenarios, and generate outputs based on cells and formulas. Powered by GPT‑5.4, it helps users do more in Excel, supports power users in moving faster, and can improve consistency across teams.
We’re also adding financial data integrations directly in ChatGPT for FactSet, Dow Jones Factiva, LSEG, Daloopa, S&P Global, and more, making it easier to work with trusted financial data inside ChatGPT. Together, these capabilities help teams spend less time on manual work and more time on analysis, decisions, and execution.
An AI model optimized for finance workflows
GPT‑5.4 (as GPT‑5.4 Thinking) is available today in ChatGPT, Codex, and the API. It’s our most advanced model, ideal for financial reasoning and Excel-based modeling. We’ve worked closely with industry practitioners to improve GPT‑5.4 on real-world finance workflows that often take analysts hours or days to complete, including financial modeling, scenario analysis, data extraction, and long-form research. The result is stronger performance on the tasks finance professionals rely on every day.
On OpenAI’s internal investment banking benchmark, which evaluates real-world workflows such as building a three-statement model with proper formatting and citations, performance improved from 43.7% with GPT‑5 to 87.3% with GPT‑5.4 Thinking.
ChatGPT for Excel in beta: build, update, and analyze spreadsheet models directly in your workbook
We’re introducing ChatGPT for Excel in beta—a version of ChatGPT embedded directly in spreadsheets that can build, analyze, and update models using the same formulas and structures teams already rely on. Analysts, strategists, researchers, and accountants can move faster, reduce manual work, and focus on judgment and decision-making instead of writing formulas, tracing links, and fixing models.
How it works
Build and update spreadsheet models faster. Instead of building spreadsheet models or running scenario analysis manually, teams can describe what they need in plain language, and ChatGPT will create or update live Excel models directly in the workbook. Teams can run data analysis, reporting, inventory management, budgeting—all while preserving structure, formulas, and assumptions in a formatted, Excel-native workbook.
Get insights from large spreadsheets without manual reconciliation. ChatGPT can reason across workbooks, understand how sheets and formulas connect across the model, explain why outputs changed, trace and fix errors, and show how assumptions flow through a model. This is especially useful when users inherit existing templates, need to get up to speed quickly, or want to understand and test a workbook before making decisions.
Follow the logic and trust the outputs. ChatGPT explains what it’s doing as it works and links answers to the exact cells it references and updates. Because calculations run directly in Excel, teams can trace assumptions, audit formulas, and verify how results were produced. Before making changes to a workbook, ChatGPT asks for permission, so users can review each step and undo edits if needed.
Known limitations in beta
We are improving ChatGPT for Excel quickly based on user feedback. Some responses may take longer as we optimize performance, and generated outputs may occasionally require cleanup or adjustment to match preferred spreadsheet formatting or layout conventions. ChatGPT can generate and explain formulas, but complex formulas or edge cases may still require manual refinement.
Getting started
Starting today, ChatGPT for Excel in beta is rolling out for ChatGPT Business, Enterprise, Edu, Teachers, Pro, and Plus users in the U.S., Canada, and Australia. ChatGPT for Google Sheets is coming soon.
In Enterprise, Edu, and Teacher workspaces, access is off by default. Admins can enable it for specific users with custom roles and group permissions.
Financial data integrations in ChatGPT
For teams working in financial workflows, new data integrations and support for proprietary data through building your own apps using Model Context Protocol (MCP) make it easier to bring market, company, and internal data into a single workflow in ChatGPT. With GPT‑5.4, ChatGPT can handle longer context and more complex tasks, helping teams move faster on company research, model refreshes, and cited outputs for valuation, diligence, underwriting, and related work.
Simplify research and analysis
Integrations released today—including Moody’s, Dow Jones Factiva, MSCI, Third Bridge, and MT Newswire, with FactSet coming soon—bring market, company, and internal data into a single workflow in ChatGPT, as part of a growing ecosystem of apps. This helps users spend less time gathering inputs and produce cited outputs such as earnings summaries, valuation snapshots, and credit memos faster.
Quickly conduct due diligence
Teams can also use apps with research in ChatGPT to pull from filings, transcripts, decks, and spreadsheets to produce structured, cited outputs that export to PDF or Microsoft Word. Recent updates give users more control over the research process, including the ability to focus on specific websites and data sources, shape the research plan before and during a run, and review sources and citations in a redesigned workspace.
Security, governance, and control
For organizations adopting ChatGPT at work, ChatGPT Enterprise includes the security, governance, and access controls needed to use ChatGPT confidently, especially in regulated or data-sensitive environments:
- Manage and monitor access with RBAC, SAML SSO, SCIM, and audit logs, with support for common DLP and SIEM tools.
- Protect firm data with encryption in transit (TLS 1.2+) and at rest (AES-256), plus enterprise key management support.
- Meet regional data requirements with data residency and regional processing controls.
- By default, data shared with ChatGPT Enterprise is not used to train or improve our models.
Learn more about our enterprise-grade security, privacy, and compliance programs. Explore apps today, or contact our team to learn more.
Customer impact
We’re working closely with financial institutions as they apply ChatGPT across research, underwriting, auditing, client engagement, code modernization, and operations. Across banks, asset managers, and insurance, we’re seeing impact in workflows like due diligence, client experience, and investment research—and we’ll keep learning alongside customers as they scale their AI deployments.
“ChatGPT has materially accelerated our research and due diligence workflows—from financial analysis and market research to legal review and writing internal memos—while improving consistency across teams. It has expanded our team’s capacity, freeing our investment professionals to focus more time on judgment, debate, and conviction. We’re excited to be early adopters of new capabilities and to help shape how AI transforms financial services in the years ahead.”
—Amr Ellabban, PhD, Head of AI, Hg
Looking ahead
This launch builds on OpenAI’s ongoing work with analysts, strategists, researchers, and accountants. We’re learning from real-world deployments to improve our products and models and help institutions move faster while operating responsibly in regulated environments.
To learn more, contact our team. Enterprise customers can build directly with OpenAI or work alongside experienced partners such as Accenture, Bain, Boston Consulting Group (BCG), McKinsey & Company, and PwC to integrate AI into existing data, applications, and operating models.
- Mar 5, 2026
- Date parsed from source: Mar 5, 2026
- First seen by Releasebot: Mar 6, 2026
March 5, 2026
OpenAI unveils GPT-5.4 Thinking in ChatGPT, a major upgrade that fuses advanced reasoning, coding, and agentic workflows into a single frontier model. It adds an upfront thinking plan for midcourse tweaks, stronger deep web research, and better context retention for long tasks.
GPT-5.4 Thinking in ChatGPT
GPT-5.4 brings together the best of our recent advances in reasoning, coding, and agentic workflows into a single frontier model. It incorporates the industry-leading coding capabilities of GPT-5.3-Codex while improving how the model works across tools, software environments, and professional tasks involving spreadsheets, presentations, and documents. The result is a model that gets complex real work done accurately, effectively, and efficiently—delivering what you asked for with less back and forth.
In ChatGPT, GPT-5.4 Thinking can now provide an upfront plan of its thinking, so you can adjust course mid-response while it's working, and arrive at a final output that's more closely aligned with what you need without additional turns. GPT-5.4 Thinking also improves deep web research, particularly for highly specific queries, while better maintaining context for questions that require longer thinking. Together, these improvements mean higher-quality answers that arrive faster and stay relevant to the task at hand.
- Mar 5, 2026
- Date parsed from source: Mar 5, 2026
- First seen by Releasebot: Mar 6, 2026
Introducing GPT-5.4 | OpenAI
GPT‑5.4 debuts across ChatGPT, API and Codex with Thinking and Pro variants, boosting reasoning, coding and real‑world workflow ability. New tool search, 1M context, native computer use, faster token efficiency, and stronger web search. Includes steerable conversations and enhanced safety measures.
Today, we’re releasing GPT‑5.4 in ChatGPT (as GPT‑5.4 Thinking), the API, and Codex. It’s our most capable and efficient frontier model for professional work. We’re also releasing GPT‑5.4 Pro in ChatGPT and the API, for people who want maximum performance on complex tasks.
GPT‑5.4 brings together the best of our recent advances in reasoning, coding, and agentic workflows into a single frontier model. It incorporates the industry-leading coding capabilities of GPT‑5.3‑Codex while improving how the model works across tools, software environments, and professional tasks involving spreadsheets, presentations, and documents. The result is a model that gets complex real work done accurately, effectively, and efficiently—delivering what you asked for with less back and forth.
In ChatGPT, GPT‑5.4 Thinking can now provide an upfront plan of its thinking, so you can adjust course mid-response while it’s working, and arrive at a final output that’s more closely aligned with what you need without additional turns. GPT‑5.4 Thinking also improves deep web research, particularly for highly specific queries, while better maintaining context for questions that require longer thinking. Together, these improvements mean higher-quality answers that arrive faster and stay relevant to the task at hand.
In Codex and the API, GPT‑5.4 is the first general-purpose model we’ve released with native, state-of-the-art computer-use capabilities, enabling agents to operate computers and carry out complex workflows across applications. It supports up to 1M tokens of context, allowing agents to plan, execute, and verify tasks across long horizons. GPT‑5.4 also improves how models work across large ecosystems of tools and connectors with tool search, helping agents find and use the right tools more efficiently without sacrificing intelligence. Finally, GPT‑5.4 is our most token efficient reasoning model yet, using significantly fewer tokens to solve problems when compared to GPT‑5.2—translating to reduced token usage and faster speeds.
Together with advances in general reasoning, coding, and professional knowledge work, GPT‑5.4 enables more reliable agents, faster developer workflows, and higher-quality outputs across ChatGPT, the API, and Codex.
Knowledge work
Building on GPT‑5.2’s general reasoning capabilities, GPT‑5.4 delivers even more consistent and polished results on real-world tasks that matter to professionals.
On GDPval, which tests agents’ abilities to produce well-specified knowledge work across 44 occupations, GPT‑5.4 achieves a new state of the art, matching or exceeding industry professionals in 83.0% of comparisons, compared to 70.9% for GPT‑5.2.
In GDPval, models attempt well-specified knowledge work spanning 44 occupations from the top 9 industries contributing to U.S. GDP. Tasks request real work products, such as sales presentations, accounting spreadsheets, urgent care schedules, manufacturing diagrams, or short videos. Reasoning effort was set to xhigh for GPT‑5.4 and heavy for GPT‑5.2 (a slightly lower level in ChatGPT).
“GPT-5.4 is the best model we’ve ever tried. It’s now top of the leaderboard on our APEX-Agents benchmark, which measures model performance for professional services work. It excels at creating long-horizon deliverables such as slide decks, financial models, and legal analysis, delivering top performance while running faster and at a lower cost than competitive frontier models.”
— Brendan Foody, CEO at Mercor

We put a particular focus on improving GPT‑5.4’s ability to create and edit spreadsheets, presentations, and documents. On an internal benchmark of spreadsheet modeling tasks that a junior investment banking analyst might do, GPT‑5.4 achieves a mean score of 87.3%, compared to 68.4% for GPT‑5.2. On a set of presentation evaluation prompts, human raters preferred presentations from GPT‑5.4 68.0% of the time over those from GPT‑5.2 due to stronger aesthetics, greater visual variety, and more effective use of image generation.
Documents were generated with reasoning effort set to xhigh
You can try these capabilities in ChatGPT using GPT‑5.4 Thinking or Pro. If you’re an Enterprise customer, we recommend using our newly released ChatGPT for Excel add-in, which was also launched today. We've also updated our spreadsheet and presentation skills available in Codex and the API.
To make GPT‑5.4 better at real-world work, we continued our progress at driving down hallucinations and errors. GPT‑5.4 is our most factual model yet: on a set of de-identified prompts where users flagged factual errors, GPT‑5.4’s individual claims are 33% less likely to be false and its full responses are 18% less likely to contain any errors, relative to GPT‑5.2.
“GPT-5.4 sets a new bar for document-heavy legal work. On our BigLaw Bench eval, it scored 91%. Compared to other models, GPT‑5.4 is currently better at structuring complex transactional analysis, maintaining accuracy across lengthy contracts, and delivering the high level of detail legal practitioners require.”
— Niko Grupen, Head of Applied Research at Harvey

Computer use and vision
GPT‑5.4 is our first general-purpose model with native computer-use capabilities and marks a major step forward for developers and agents alike. It’s the best model currently available for developers building agents that complete real tasks across websites and software systems.
We’ve designed GPT‑5.4 to be performant across a wide range of computer-use workloads. It is excellent at writing code to operate computers via libraries like Playwright, as well as issuing mouse and keyboard commands in response to screenshots. Its behavior is steerable via developer messages, meaning that developers can adjust behavior to suit particular use cases. Developers can even configure the model’s safety behavior to suit different levels of risk tolerance by specifying custom confirmation policies.
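The screenshot-and-action loop described above can be sketched roughly as follows. This is a hypothetical illustration only: `request_action`, `execute`, and the click/done action schema are invented names standing in for a real computer-use model call and an automation backend, not the actual API.

```python
# Hypothetical sketch of a screenshot-driven computer-use loop.
# `request_action` stands in for a call to a computer-use model; the
# action schema ({"type": "click"/"done"/...}) is illustrative only.

def run_agent(request_action, execute, max_steps=10):
    """Alternate between asking the model for an action and executing it."""
    transcript = []
    for _ in range(max_steps):
        screenshot = execute({"type": "screenshot"})
        action = request_action(screenshot)   # model decides the next step
        transcript.append(action)
        if action["type"] == "done":
            break
        execute(action)                       # e.g. a coordinate-based click
    return transcript
```

In practice the same shape works whether the backend issues raw mouse/keyboard events or drives a browser through a library like Playwright.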
The model’s performance and flexibility are reflected across benchmarks that test computer use across different settings. On OSWorld-Verified, which measures a model’s ability to navigate a desktop environment through screenshots and keyboard/mouse actions, GPT‑5.4 achieves a state-of-the-art 75.0% success rate, far exceeding GPT‑5.2’s 47.3%, and surpassing human performance at 72.4%.
On WebArena-Verified, which tests browser use, GPT‑5.4 achieves a leading 67.3% success rate when using both DOM- and screenshot-driven interaction, compared to GPT‑5.2’s 65.4%. On Online-Mind2Web, which also tests browser use, GPT‑5.4 achieves a 92.8% success rate using screenshot-based observations alone, improving over ChatGPT Atlas’s Agent Mode, which achieves a success rate of 70.9%.
GPT‑5.4 interprets screenshots of a browser interface and interacts with UI elements through coordinate-based clicking to send emails and schedule a calendar event. Video is not sped up.
GPT‑5.4’s improved computer use is built on the model’s improved general visual perception capabilities. On MMMU-Pro, a test of a model’s visual understanding and reasoning, GPT‑5.4 achieves an 81.2% success rate without tool use, an improvement over GPT‑5.2’s 79.5%. Improved visual perception also translates into better document parsing capabilities. On OmniDocBench, GPT‑5.4 without reasoning effort achieves an average error (measured by normalized edit distance between model prediction and ground truth) of 0.109, improved from GPT‑5.2’s 0.140.
MMMU-Pro was run with reasoning effort set to xhigh. OmniDocBench was run with reasoning effort set to none, to reflect low-cost, low-latency performance.
We’re also improving visual understanding for dense, high-resolution images where full fidelity matters. Starting with GPT‑5.4, we’re introducing an original image input detail level which supports full-fidelity perception up to 10.24M total pixels or a 6000-pixel maximum dimension, whichever is lower; the high image input detail level now supports up to 2.56M total pixels or a 2048-pixel maximum dimension. In early testing with API users, we observed strong gains in localization ability, image understanding, and click accuracy when using original or high detail.
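The detail-level limits above can be checked with a small helper. The numeric caps come straight from this announcement; the function name and dict structure are our own sketch, not an official SDK utility.

```python
# Sketch of the image input detail limits described above:
#   "original": up to 10.24M total pixels or a 6000-pixel max dimension
#   "high":     up to 2.56M total pixels or a 2048-pixel max dimension
# An image must satisfy both caps for its level.

def fits_detail_level(width, height, level):
    limits = {
        "original": (10_240_000, 6000),  # (max total pixels, max dimension)
        "high": (2_560_000, 2048),
    }
    max_pixels, max_dim = limits[level]
    return width * height <= max_pixels and max(width, height) <= max_dim

# A 3000x3000 image (9M pixels) fits "original" but exceeds "high";
# a 7000x100 image is under the pixel cap but over the 6000-px dimension cap.
```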
“In our evals measuring computer use performance across ~30K HOA and property tax portals, GPT-5.4 achieved a 95% success rate on the first attempt and 100% within three attempts, compared to ~73–79% with prior CUA models. It also completed sessions ~3x faster while using ~70% fewer tokens, materially improving reliability and cost efficiency at scale.”
— Dod Fraser, CEO at Mainstay

In the API, developers can access these capabilities using the updated computer tool. Please see our updated documentation for recommended best practices.
Coding
GPT‑5.4 combines the coding strengths of GPT‑5.3‑Codex with leading knowledge work and computer-use capabilities, which matter most on longer-running tasks where the model can use tools, iterate, and push work further with less manual intervention. It matches or outperforms GPT‑5.3‑Codex on SWE-Bench Pro while being lower latency across reasoning efforts.
We estimate latency by looking at the production behavior of our models, and simulating this offline. The latency estimate accounts for tool call duration (code execution time), sampled tokens, and input tokens. Real-world latency may vary substantially, and depends on many factors not captured in our simulation. Reasoning efforts were swept from none to xhigh.
When toggled on, /fast mode in Codex delivers up to 1.5x faster token velocity with GPT‑5.4. It’s the same model and the same intelligence, just faster. That means users can move through coding tasks, iteration, and debugging while staying in flow. Developers can access GPT‑5.4 at the same fast speeds via the API by using priority processing.
In evaluation and internal testing we found that GPT‑5.4 excels at complex frontend tasks, with noticeably more aesthetic and more functional results than any models we’ve launched previously.
As a demonstration of the model’s improved computer-use and coding capabilities working in tandem, we’re also releasing an experimental Codex skill called “Playwright (Interactive)”. This allows Codex to visually debug web and Electron apps; it can even be used to test an app it’s building, as it’s building it.
“GPT-5.4 is currently the leader on our internal benchmarks. Our engineers find it to be more natural and assertive than previous models. It works through ambiguous problems without second-guessing itself, and it's proactive about parallelizing work to keep things moving.”
— Lee Robinson, VP of Developer Education at Cursor

Tool use
With GPT‑5.4, we’ve significantly improved how models work with external tools. Agents can now operate across larger tool ecosystems, choose the right tools more reliably, and complete multi-step workflows with lower cost and latency.
Tool search
In the API, GPT‑5.4 introduces tool search, which allows models to work efficiently when given many tools.
Previously, when a model was given tools, all tool definitions were included in the prompt upfront. For systems with many tools, this could add thousands—or even tens of thousands—of tokens to every request, increasing cost, slowing responses, and crowding the context with information the model might never use.
With tool search, GPT‑5.4 instead receives a lightweight list of available tools along with a tool search capability. When the model needs to use a tool, it can look up that tool’s definition and append it to the conversation at that moment.
This approach dramatically reduces the number of tokens required for tool-heavy workflows and preserves the cache, making requests faster and cheaper. It also enables agents to reliably work with much larger tool ecosystems. For MCP servers that may contain tens of thousands of tokens of tool definitions, the efficiency gains can be substantial.
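The pattern is straightforward to illustrate: keep a cheap name-and-description index in every request, and resolve a full tool definition only when the model asks for it. Everything below is a hypothetical sketch of that shape, not the actual API surface or wire format.

```python
# Illustrative sketch of the tool-search pattern: expose a lightweight
# index upfront and fetch full definitions on demand. Tool names and
# schemas here are invented for the example.

FULL_DEFINITIONS = {
    "get_weather": {
        "description": "Look up the current weather for a city.",
        "parameters": {"city": {"type": "string"}},
    },
    "send_email": {
        "description": "Send an email to a recipient.",
        "parameters": {"to": {"type": "string"}, "body": {"type": "string"}},
    },
}

def tool_index():
    """Cheap listing included in every request (no parameter schemas)."""
    return [{"name": name, "description": defn["description"]}
            for name, defn in FULL_DEFINITIONS.items()]

def lookup_tool(name):
    """Called only when the model decides it needs a specific tool."""
    return {"name": name, **FULL_DEFINITIONS[name]}
```

The savings come from the asymmetry: the index costs a few tokens per tool, while full definitions (especially for large MCP servers) can run to thousands of tokens each.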
To demonstrate the efficiency gains, we evaluated 250 tasks from Scale’s MCP Atlas benchmark with all 36 MCP servers enabled in two modes: (1) exposing every MCP function directly in the model context, and (2) placing all MCP servers behind tool search. The tool-search configuration reduced total token usage by 47% while achieving the same accuracy.
GPT‑5.4 also improves tool calling, making it more accurate and efficient when deciding when and how to use tools during reasoning, particularly in the API. Compared to GPT‑5.2, it achieves higher accuracy in fewer turns on Toolathlon, a benchmark that tests how well AI agents can use real-world tools and APIs to complete multi-step tasks. For example, an agent needs to read emails, extract assignment attachments, upload them, grade them and record results in a spreadsheet.
For latency-sensitive use cases where reasoning effort None is preferred, GPT‑5.4 further improves upon its predecessors.
Improved web search
GPT‑5.4 is better at agentic web search. On BrowseComp, a measurement of how well AI agents can persistently browse the web to find hard-to-locate information, GPT‑5.4 improves 17 percentage points over GPT‑5.2, and GPT‑5.4 Pro sets a new state of the art of 89.3%.
In practice, this means GPT‑5.4 Thinking is stronger at answering questions that require pulling together information from many sources on the web. It can more persistently search across multiple rounds to identify the most relevant sources, particularly for “needle-in-a-haystack” questions, and synthesize them into a clear, well-reasoned answer.
“GPT-5.4 xhigh is the new state of the art for multi-step tool use. Zapier runs some of the most rigorous tool use benchmarks in the industry, testing models across hundreds of advanced real-world workflows. GPT-5.4 finished the job where previous models gave up - the most persistent model to date.”
— Wade, CEO at Zapier

Steerability
Similarly to how Codex outlines its approach when it starts working, GPT‑5.4 Thinking in ChatGPT will now outline its work with a preamble for longer, more complex queries. You can also add instructions or adjust its direction mid-response. This makes it easier to guide the model toward the exact outcome you want without starting over or requiring multiple additional turns. This feature is available now on chatgpt.com and the Android app, and is coming soon to the iOS app.
The model can also think longer on difficult tasks while maintaining stronger awareness of earlier steps in the conversation. This allows it to handle longer workflows and more complex prompts while keeping answers coherent and relevant throughout.
Safety
Over recent months, we’ve continued improving the safeguards we introduced with GPT‑5.3‑Codex while preparing GPT‑5.4 for deployment. Similar to GPT‑5.3‑Codex, we are treating GPT‑5.4 as High cyber capability under our Preparedness Framework, and we are deploying it with the corresponding protections as documented in the system card. These include an expanded cyber safety stack with monitoring systems, trusted access controls, and asynchronous blocking for higher-risk requests for customers on Zero Data Retention (ZDR) surfaces, alongside ongoing investment in the broader security ecosystem.
Because cybersecurity capabilities are inherently dual-use, we maintain a precautionary approach to deployment while continuing to calibrate our policies and classifiers. For certain customers on ZDR surfaces, request-level blocking remains part of our cyber risk mitigation stack; because classifiers are still improving, some false positives may occur as we continue refining these safeguards. These updates aim to improve how safeguards operate in practice, including reducing unnecessary refusals and overly caveated responses, while preserving strong protections against misuse.
We’ve continued our safety research on Chain-of-Thought (CoT) monitorability to better understand how models reason and help detect potential misbehavior. As part of this work, we introduce a new open-source evaluation, CoT controllability, measuring whether models can deliberately obfuscate their reasoning to evade monitoring. We find that GPT‑5.4 Thinking’s ability to control its CoT is low, which is a positive property for safety, suggesting that the model lacks the ability to hide its reasoning and that CoT monitoring remains an effective safety tool.
Availability and pricing
GPT‑5.4 is rolling out gradually today across ChatGPT and Codex. In the API, GPT‑5.4 is available now as gpt-5.4. GPT‑5.4 Pro is also available in the API as gpt-5.4-pro for developers who need maximum performance on the most complex tasks.
In ChatGPT, GPT‑5.4 Thinking is available starting today to ChatGPT Plus, Team, and Pro users, replacing GPT‑5.2 Thinking. GPT‑5.2 Thinking will remain available to paid users for three months in the model picker under the Legacy Models section, after which it will be retired on June 5, 2026. Those on Enterprise and Edu plans can enable early access via admin settings. GPT‑5.4 Pro is available to Pro and Enterprise plans. Context windows in ChatGPT for GPT‑5.4 Thinking remain unchanged from GPT‑5.2 Thinking.
GPT‑5.4 is our first mainline reasoning model to incorporate the frontier coding capabilities of GPT‑5.3‑Codex, and it is rolling out across ChatGPT, the API and Codex. We're calling it GPT‑5.4 to reflect that jump, and to simplify the choice between models when using Codex. Over time, you can expect our Instant models and Thinking models to evolve at different speeds.
GPT‑5.4 in Codex includes experimental support for the 1M context window. Developers can try this by configuring model_context_window and model_auto_compact_token_limit. Requests that exceed the standard 272K context window count against usage limits at 2x the normal rate.
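As a sketch of what opting in might look like, the two options named above can be set in Codex's TOML configuration. The file location (`~/.codex/config.toml`) and the specific token values below are assumptions for illustration, not documented defaults:

```toml
# ~/.codex/config.toml (assumed location)
# Experimental: opt in to the 1M-token context window for GPT-5.4 in Codex.
# Note: requests beyond the standard 272K window count 2x against usage limits.
model = "gpt-5.4"

# Raise the context window toward 1M tokens (illustrative value).
model_context_window = 1000000

# Auto-compact the conversation before the window fills (illustrative value).
model_auto_compact_token_limit = 900000
```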
In the API, GPT‑5.4 is priced higher per token than GPT‑5.2 to reflect its improved capabilities, while its greater token efficiency helps reduce the total number of tokens required for many tasks. Batch and Flex pricing are available at half the standard API rate, while Priority processing is available at twice the standard API rate.
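To make the tier relationships concrete, here is a minimal sketch of the multipliers described above. The standard rate used is a made-up placeholder, not the actual published price; only the 0.5x/2x ratios come from the announcement:

```python
# Relative pricing tiers for GPT-5.4 API processing, per the release note:
# Batch and Flex run at half the standard rate, Priority at twice it.
# STANDARD_RATE is a hypothetical placeholder, not a real published price.
STANDARD_RATE = 10.00  # hypothetical $ per 1M input tokens

TIER_MULTIPLIERS = {
    "standard": 1.0,
    "batch": 0.5,     # half the standard API rate
    "flex": 0.5,      # half the standard API rate
    "priority": 2.0,  # twice the standard API rate
}

def tier_cost(tokens: int, tier: str = "standard") -> float:
    """Dollar cost for `tokens` input tokens at the given service tier."""
    return tokens / 1_000_000 * STANDARD_RATE * TIER_MULTIPLIERS[tier]
```

With these placeholder numbers, a 1M-token batch request would cost half of a standard one, and a priority request twice as much.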
Original source Report a problem - Mar 5, 2026
- Date parsed from source:Mar 5, 2026
- First seen by Releasebot:Mar 5, 2026
0.110.0
A substantial feature release: a plugin system that loads skills and app connectors from config or a marketplace, plus an install endpoint; a richer TUI multi-agent flow with approvals, ordinal nicknames, and better handoff; fast/flex service tiers with a /fast toggle; memory upgrades; and a Windows installer script.
New Features
- Added a plugin system that can load skills, MCP entries, and app connectors from config or a local marketplace, with an install endpoint for enabling plugins from the app server. (#12864, #13333, #13401, #13422)
- Expanded the TUI multi-agent flow with approval prompts, /agent-based enablement, clearer prompts, ordinal nicknames, and role-labeled handoff context. (#12995, #13246, #13404, #13412, #13505)
- Added a persisted /fast toggle in the TUI and app-server support for fast and flex service tiers. (#13212, #13334, #13391)
- Improved memories with workspace-scoped writes, renamed memory settings, and guardrails against saving stale or polluted facts. (#13008, #13088, #13237, #13467)
- Added a direct Windows installer script to published release artifacts. (#12741)
Bug Fixes
- Fixed @ file mentions so parent-directory .gitignore rules no longer hide valid repository files. (#13250)
- Made sub-agents faster and more reliable by reusing shell state correctly and fixing /status, Esc, pending-message handling, and startup/profile race conditions. (#12935, #13052, #13130, #13131, #13235, #13240, #13248)
- Fixed project trust parsing so CLI overrides apply correctly to trusted project-local MCP transports. (#13090)
- Fixed read-only sandbox policies so network access is preserved when it is explicitly enabled. (#13409)
- Fixed multiline environment export capture and Windows state DB path handling in session state. (#12642, #13336)
- Fixed ANSI/base16 syntax highlighting so terminal-themed colors render correctly in the TUI. (#13382)
Documentation
- Expanded app-server docs around service tiers, plugin installation, renaming unloaded threads, and the new skills/changed notification. (#13282, #13391, #13414, #13422)
Chores
- Removed the remaining legacy app-server v1 websocket/RPC surfaces in favor of the current protocol. (#13364, #13375, #13397)
Changelog
Full Changelog:
rust-v0.107.0...rust-v0.110.0
- #13086 Fix CLI feedback link (@etraut-openai)
- #13063 Make cloud_requirements fail close (@alexsong-oai)
- #13083 Enable analytics in codex exec and codex mcp-server (@etraut-openai)
- #12995 feat: approval for sub-agent in the TUI (@jif-oai)
- #13027 feat: skill disable respect config layer (@jif-oai)
- #13125 chore: change mem default (@jif-oai)
- #13088 Tune memory read-path for stale facts (@andi-oai)
- #13128 nit: ignore `resume_startup_does_not_consume_model_availability_nux_c… (@jif-oai)
- #12935 Speed up subagent startup (@daveaitel-openai)
- #13127 nit: disable on windows (@jif-oai)
- #13129 fix: package models.json for Bazel tests (@jif-oai)
- #13065 core: resolve host_executable() rules during preflight (@bolinfest)
- #12864 feat: load from plugins (@xl-openai)
- #12989 fix: MacOSAutomationPermission::BundleIDs should allow communicating … (@leoshimo-oai)
- #13181 [codex] include plan type in account updates (@tibo-openai)
- #13058 Record realtime close marker on replacement (@aibrahim-oai)
- #13215 Fix issue deduplication workflow for Codex issues (@etraut-openai)
- #13197 Improve subagent contrast in TUI (@gabec-openai)
- #13008 feat: polluted memories (@jif-oai)
- #13237 feat: update memories config names (@jif-oai)
- #13052 core: reuse parent shell snapshot for thread-spawn subagents (@daveaitel-openai)
- #13249 chore: /multiagent alias for /agent (@jif-oai)
- #13057 fix: use https://git.savannah.gnu.org/git/bash instead of https://github.com/bolinfest/bash (@bolinfest)
- #13090 Fix project trust config parsing so CLI overrides work (@etraut-openai)
- #13202 tui: restore draft footer hints (@charley-oai)
- #13246 feat: enable ma through /agent (@jif-oai)
- #12642 fix(core) shell_snapshot multiline exports (@dylan-hurd-oai)
- #11814 test(app-server): increase flow test timeout to reduce flake (@joshka-oai)
- #13282 app-server: Update thread/name/set to support not-loaded threads (@euroelessar)
- #13285 feat(app-server): add tracing to all app-server APIs (@owenlin0)
- #13265 Update realtime websocket API (@aibrahim-oai)
- #13261 fix(app-server): emit turn/started only when turn actually starts (@owenlin0)
- #13079 app-server: Silence thread status changes caused by thread being created (@euroelessar)
- #13284 Adjusting plan prompt for clarity and verbosity (@bfioca-openai)
- #13286 feat(app-server-test-client): support tracing (@owenlin0)
- #13061 chore: remove SkillMetadata.permissions and derive skill sandboxing from permission_profile (@celia-oai)
- #12006 tui: preserve kill buffer across submit and slash-command clears (@rakan-oai)
- #13212 add fast mode toggle (@pash-openai)
- #13250 fix(core): scope file search gitignore to repository context (@fcoury)
- #13313 Renaming Team to Business plan during TUI onboarding (@bwanner-oai)
- #13248 fix: agent race (@jif-oai)
- #13235 fix: agent when profile (@jif-oai)
- #13336 fix: db windows path (@jif-oai)
- #13334 app-server service tier plumbing (plus some cleanup) (@pash-openai)
- #13341 feat: presentation artifact p1 (@jif-oai)
- #13344 feat: pres artifact 2 (@jif-oai)
- #13346 feat: pres artifact 3 (@jif-oai)
- #13345 feat: spreadsheet artifact (@jif-oai)
- #13347 feat: spreadsheet v2 (@jif-oai)
- #13348 feat: presentation part 4 (@jif-oai)
- #13350 feat: spreadsheet part 3 (@jif-oai)
- #13355 feat: pres artifact part 5 (@jif-oai)
- #13357 feat: add multi-actions to presentation tool (@jif-oai)
- #13360 feat: artifact presentation part 7 (@jif-oai)
- #13362 feat: wire spreadsheet artifact (@jif-oai)
- #12741 Add Windows direct install script (@efrazer-oai)
- #13376 realtime prompt changes (@aibrahim-oai)
- #13324 app-server-protocol: export flat v2 schema bundle (@apanasenko-oai)
- #13364 Remove Responses V1 websocket implementation (@pakrym-oai)
- #12969 app-server: source /feedback logs from sqlite at trace level (@charley-oai)
- #13381 chore: rm --all-features flag from rust-analyzer (@sayan-oai)
- #13043 Collapse parsed command summaries when any stage is unknown (@nornagon-openai)
- #13385 Revert "realtime prompt changes" (@aibrahim-oai)
- #13389 fix (@aibrahim-oai)
- #13375 chore(app-server): delete v1 RPC methods and notifications (@owenlin0)
- #13397 chore(app-server): restore EventMsg TS types (@owenlin0)
- #13395 Build delegated realtime handoff text from all messages (@aibrahim-oai)
- #13399 Require deduplicator success before commenting (@etraut-openai)
- #13398 Revert "Revert "realtime prompt changes"" (@aibrahim-oai)
- #13333 Refactor plugin config and cache path (@xl-openai)
- #13275 fix(network-proxy): reject mismatched host headers (@viyatb-oai)
- #12868 tui: align pending steers with core acceptance (@charley-oai)
- #13280 Add thread metadata update endpoint to app server (@joeytrasatti-openai)
- #13050 Add under-development original-resolution view_image support (@fjord-oai)
- #13402 Ensure the env values of imported shell_environment_policy.set is string (@alexsong-oai)
- #13331 Make js_repl image output controllable (@fjord-oai)
- #13401 feat: load plugin apps (@sayan-oai)
- #13292 [feedback] diagnostics (@rhan-oai)
- #13414 feat(app-server): add a skills/changed v2 notification (@owenlin0)
- #13368 feat(app-server): propagate app-server trace context into core (@owenlin0)
- #13413 copy command-runner to CODEX_HOME so sandbox users can always execute it (@iceweasel-oai)
- #13366 [bazel] Bump rules_rs and llvm (@zbarsky-openai)
- #13409 Feat: Preserve network access on read-only sandbox policies (@celia-oai)
- #13388 config: enforce enterprise feature requirements (@bolinfest)
- #13218 Add role-specific subagent nickname overrides (@gabec-openai)
- #13427 chore: Nest skill and protocol network permissions under network.enabled (@celia-oai)
- #13429 core: box wrapper futures to reduce stack pressure (@bolinfest)
- #13391 support 'flex' tier in app-server in addition to 'fast' (@kharvd)
- #13290 image-gen-core (@won-openai)
- #13404 feat: better multi-agent prompt (@jif-oai)
- #13412 feat: ordinal nick name (@jif-oai)
- #13454 add metric for per-turn token usage (@jif-oai)
- #13240 fix: pending messages in /agent (@jif-oai)
- #13461 fix: bad merge (@jif-oai)
- #13460 feat: disable request input on sub agent (@jif-oai)
- #13456 feat: add metric for per-turn tool count and add tmp_mem flag (@jif-oai)
- #13468 nit: citation prompt (@jif-oai)
- #13467 feat: memories in workspace write (@jif-oai)
- #12383 add new scopes to login (@adaley-openai)
- #13484 allow apps to specify cwd for sandbox setup. (@iceweasel-oai)
- #13424 feat(core, tracing): add a span representing a turn (@owenlin0)
- #13489 remove serviceTier from app-server examples (@kharvd)
- #13458 [tui] Update Fast slash command description (@pash-openai)
- #13382 fix(tui): decode ANSI alpha-channel encoding in syntax themes (@fcoury)
- #13485 feat: external artifacts builder (@jif-oai)
- #13493 feat(app-server-test-client): OTEL setup for tracing (@owenlin0)
- #13501 add metrics for external config import (@alexsong-oai)
- #13495 Notify TUI about plan mode prompts and user input requests (@etraut-openai)
- #13506 [release] temporarily use thin LTO for releases (@bolinfest)
- #13505 Prefix handoff messages with role (@aibrahim-oai)
- #13422 plugin: support local-based marketplace.json + install endpoint. (@xl-openai)
- Mar 4, 2026
- Date parsed from source:Mar 4, 2026
- First seen by Releasebot:Mar 5, 2026
ChatGPT Enterprise/EDU by OpenAI
March 4, 2026
Codex app lands on Windows for ChatGPT Enterprise and Edu, enabling parallel agents with isolated worktrees and reviewable diffs synced to CLI and IDE
Codex app is now available on Windows for ChatGPT Enterprise and Edu workspaces that include Codex. Members can run multiple Codex agents in parallel from a Windows desktop surface, with isolated worktrees and reviewable diffs that stay interoperable with Codex in the CLI and IDE.
Permissions
Admins do not need to set up separate app-specific permissions for the app. Codex Local permissions continue to govern local usage, and Codex Cloud permissions govern delegated cloud tasks from the app and other cloud-based Codex surfaces. Learn more: Using Codex with your ChatGPT plan.
Original source Report a problem - Mar 4, 2026
- Date parsed from source:Mar 4, 2026
- First seen by Releasebot:Mar 5, 2026
March 4, 2026
Codex App for Windows arrives in ChatGPT Business workspaces with parallel agents, isolated worktrees, and CLI/IDE interoperability.
Codex app on Windows
Codex app is now available on Windows for ChatGPT Business workspaces that include Codex. It gives members a Windows desktop surface for running multiple Codex agents in parallel, with isolated worktrees, reviewable diffs, and interoperability with Codex in the CLI and IDE.
Members can sign in with ChatGPT from the app to start work from a Windows environment, and admins do not need to configure a separate app-specific permission model. The app uses the same workspace controls as other Codex surfaces. Learn more: Using Codex with your ChatGPT plan.
Original source Report a problem - Mar 4, 2026
- Date parsed from source:Mar 4, 2026
- First seen by Releasebot:Mar 5, 2026
March 4, 2026
Codex app lands on Windows for ChatGPT Codex plans, enabling parallel agents, editable worktrees, and seamless cross‑app sign‑in.
Codex app on Windows
The Codex app is now available on Windows for ChatGPT plans that include Codex. The app gives users a Windows desktop surface for running multiple Codex agents in parallel, with isolated worktrees and reviewable diffs that can be edited, discarded, or turned into a pull request.
Users can sign in with ChatGPT from the app and keep work moving across the app, CLI, and IDE without switching between tools for every task. Learn more: Using Codex with your ChatGPT plan.
Original source Report a problem - Mar 4, 2026
- Date parsed from source:Mar 4, 2026
- First seen by Releasebot:Mar 5, 2026
Music 2.5+: Unlock instrumental music, break through style boundaries
MiniMax Music 2.5+ launches instrumental music creation, expanding from song generation to full instrumental scores across classical, electronic, ambient, and ethnic timbres. It enables film and TV scoring, advertising, and game soundtracks with cross-style fusion and studio-quality production.
MiniMax Music 2.5 Launch
Today, we are pleased to announce that MiniMax Music 2.5 has officially launched its instrumental music creation capability. MiniMax Music has always centered on song generation; today, we are extending our capabilities to a more essential form of music: instrumental music. No vocals needed; the music itself becomes the expression.
Unlock All Styles
MiniMax Music supports diverse generation styles including classical orchestration, minimalism, modern electronic, ambient sounds, and natural soundscapes. It covers the full spectrum from quiet atmospheres to powerful, high-energy tracks, adapting to meditation, sleep aids, advertising, game scoring and other scenarios. The model can handle everything from "pure natural sound without instruments" to "multi-track instrumental arrangements," with style switching requiring no additional tuning: generate and use immediately.
Sleep Aid Music
Prompt: A lullaby featuring a music box as the primary timbre, with an extremely slow tempo and a gentle melody, suitable for falling asleep late at night.
Meditation
Prompt: Create extremely serene, extremely slow-paced meditation music. The background features soft, water-like flowing synth ambient pads, decorated with crisp Tibetan singing bowls and minimalist xylophone taps. The overall atmosphere is ethereal and deep, as if standing on a temple above the clouds, with no heavy percussion, designed to help listeners achieve deep inner peace.
Natural Soundscape
Prompt: A healing late-night rainstorm, the crisp sound of raindrops hitting the roof and leaves, distant low and gentle thunder, minimalist, white noise.
Advertising / Brand Video Intro
Prompt: A minimalist, tech-inspired brand intro track centered on pulsing synthesizers, precise and restrained in tone.
Game Music
Prompt: An electric guitar-driven uplifting melody, adding passion to adventure and combat.

The instrumental music capability also enables MiniMax Music to serve film and TV scoring directly. Films, short dramas, documentaries, and TV series each have different scoring requirements. The model generates complete soundtracks matching narrative rhythm based on scene descriptions, covering various emotional types and atmospheric needs.
Film scoring
Prompt: A minimalist cinematic score driven by pulsing synthesizers, with tight and precise rhythms.
Cross-Genre Fusion, Unleash Imagination
Beyond existing styles, MiniMax Music has strong style generalization capabilities, supporting cross-style tag combinations for generation.
Whether traditional instruments with modern electronic, or Eastern timbres with Western structures, the model can understand the tension between different styles and transform them into coherent musical language, rather than simple element collage.
This fusion is built on solid musicality: rich harmonic layers; complete melodic progression with proper beginning, development, transition and conclusion; and natural transitions from motif development to climax release. The more cross-style the work, the more it demonstrates the model's deep understanding of musical structure. In terms of audio quality, the sound field shows distinct separation across low, mid and high frequencies, with clear instrument separation and dynamic balance; each track has independent spatial positioning, ensuring professional production standards across different styles.
It is worth mentioning that MiniMax Music's understanding and reproduction of traditional Chinese musical instruments is at an industry-leading level. MiniMax Music can accurately present the tonal expressiveness and performance details of ethnic instruments such as flute, pipa, and guzheng, naturally integrating them into orchestral arrangements and modern production contexts.
Epic orchestral music
Prompt: Epic cinematic East Asian fusion, 136 BPM, virtuosic Chinese bamboo flute (Dizi) leading a powerful orchestra. Intense Taiko drum beats, martial arts atmosphere, heroic and urgent. Dramatic shifts between fierce action and lyrical reflection. High energy, triumphant climax.
Baroque Metal: Baroque × Hardcore Heavy Metal
Prompt: A gorgeous auditory metamorphosis. A crisp, rigorous Baroque harpsichord polyphonic melody, suddenly invaded by violent blast beats and heavily distorted heavy metal guitars. Complex classical harmonies perfectly fused with modern metal's aggressiveness, creating a grand and chaotic opera-style metal listening experience.
Chinese Style × Fantasy Epic
Prompt: A Chinese-style instrumental piece depicting an adventure in a vast fantasy world. The music atmosphere is hopeful, led by a retro cello solo melody and accompanied by rhythmic percussion. The overall dynamic range is wide, creating a rich sense of layering.
Welcome to MiniMax Music 2.5+, unlock your musical creativity!
Product Experience:
https://www.minimax.io/audio/music
API Interface:
https://platform.minimax.io/docs/api-reference/music-generation
Original source Report a problem - Mar 3, 2026
- Date parsed from source:Mar 3, 2026
- First seen by Releasebot:Mar 4, 2026
Gemini CLI by Google
Release v0.32.1
What's Changed
- fix(patch): cherry-pick 0659ad1 to release/v0.32.0-pr-21042 to patch version v0.32.0 and create version 0.32.1 by @gemini-cli-robot in #21048
Full Changelog
v0.32.0...v0.32.1
Original source Report a problem