OpenAI Release Notes

Last updated: Apr 3, 2026

  • Apr 2, 2026
    • Date parsed from source:
      Apr 2, 2026
    • First seen by Releasebot:
      Apr 3, 2026

    OpenAI

    Codex now offers pay-as-you-go pricing for teams

    OpenAI expands ChatGPT Business and Enterprise with Codex-only seats on pay-as-you-go pricing, no rate limits, and clearer token-based billing. It also lowers ChatGPT Business annual pricing and adds Plugins and Automations to help teams connect Codex to existing systems.

    We’re making it easier to just build things. Starting today, teams on ChatGPT Business and Enterprise can add Codex-only seats to their workspaces with pay-as-you-go pricing, giving full access to Codex without a fixed seat fee. Now, small groups can begin pilots, prove value in a few critical workflows, and easily expand from there.

    We’re also making Codex pricing easier to understand. Codex-only seats have no rate limits, and usage is billed on token consumption. This gives you a clearer view of how usage turns into spend and makes it easier to track costs across budgets, workflows, and teams.

    Teams that need broad ChatGPT access can continue using standard ChatGPT Business seats, which include Codex with usage limits. To make that path more accessible, we’re lowering the annual price of ChatGPT Business from $25 to $20 per seat.

    The best way to get started is with the Codex app for macOS and Windows, and new capabilities like Plugins⁠ (opens in a new window) and Automations⁠ (opens in a new window) make it easier than ever to connect Codex to the systems teams already use.

    To support your adoption, eligible ChatGPT Business workspaces can receive $100 in credits for each new Codex-only team member that joins and starts using Codex, up to $500 per team, for a limited time. To activate the offer, add Codex-only seats to your workspace or create a new ChatGPT Business workspace⁠ (opens in a new window)¹.

    These changes are designed to make adoption easier at a time when Codex usage across teams is already accelerating.

    Codex adoption within teams has grown 6x this year

    Already, more than 9 million paying business users rely on ChatGPT for work and more than 2 million builders now use Codex every week. Within ChatGPT Business and Enterprise, the number of Codex users has grown 6x since January.

    Teams at companies like Notion, Ramp, Braintrust, and Wasmer are already using Codex to accelerate their engineering workflows. Across these teams, Codex provides faster execution, more repeatable workflows, and a clearer path from individual AI experiments to broad adoption.

    ¹ Learn more about promo terms and conditions here⁠ (opens in a new window).

  • Mar 24, 2026
    • Date parsed from source:
      Mar 24, 2026
    • First seen by Releasebot:
      Mar 25, 2026

    OpenAI

    Powering Product Discovery in ChatGPT

    OpenAI expands shopping in ChatGPT with richer, more visual product discovery, side-by-side comparisons, and more up-to-date results powered by the Agentic Commerce Protocol. It also adds a Walmart in-ChatGPT app experience and rolls out to free, Go, Plus, and Pro users.

    Launching richer, more visually immersive shopping experiences powered by the Agentic Commerce Protocol

    More and more, people are starting their shopping in ChatGPT—to explore, compare, and figure out what to buy.

    Shopping on the web is easy if you already know what you want. But when you’re still deciding, it often means jumping between tabs, reading the same “best of” lists, and trying to piece together the right answer.

    ChatGPT solves the hard part: figuring out what to buy. You can describe what you’re looking for, refine it in a conversation, and quickly compare options that fit your specific needs. Today, we’re making that experience better with richer and more visual shopping in ChatGPT. You can now browse products visually, compare options side-by-side, and get detailed, up-to-date information—all in one place. What used to take hours of searching and tab-hopping now happens in seconds.

    To power this, we’re expanding the Agentic Commerce Protocol (ACP) to support product discovery—bringing more complete, relevant, and up-to-date information directly into ChatGPT.

    Richer shopping experiences

    Instead of scrolling through pages of results, ChatGPT can help you find what you’re looking for based on your budget, preferences, and constraints—and surface products that fit. You can browse visually, upload images as inspiration for similar items, and refine results conversationally until you land on the right option.

    It’s also much easier to compare. Products are presented side by side with key details like price, reviews, and features, so you can quickly evaluate your options without jumping between sites.

    Under the hood, we’ve improved speed, relevance, and product coverage—so results are more up-to-date and more useful. That means less searching, fewer tabs, and faster decisions.

    For users, this turns shopping from a fragmented, time-consuming process into a single, seamless experience. For merchants, it brings higher-intent shoppers who are closer to making a decision.

    These updates are rolling out to all ChatGPT free, Go, Plus, and Pro users this week, with more to come as we continue to invest in product discovery with ChatGPT.

    The foundation for AI-native commerce

    To power this shopping experience, we’re extending ACP to be the connective layer between merchants and users throughout discovery.

    Through ACP, merchants share product feeds and promotions so their catalogs are fully represented in ChatGPT. We support multiple delivery paths, including through third-party providers like Salesforce and Stripe, so merchants can participate with the systems they already use. Over time, ACP will serve as a foundation for broader AI-native commerce experiences, including personalization, local availability, and ETAs.

    Already, leading retailers including Target, Sephora, Nordstrom, Lowe’s, Best Buy, The Home Depot, and Wayfair have integrated into ACP for discovery.

    For merchants on Shopify, product data is already integrated into ChatGPT through Shopify Catalog, helping products appear more accurately and completely in relevant user conversations. No additional work is required from individual merchants.

    "Millions of Shopify merchants are open for business in ChatGPT. When buyers look for products, the Shopify Catalog offers the most relevant options from the brands that people love. They can complete purchases on the merchant's online store through an in-app browser, so everything happens seamlessly, with the merchant's brand front and center. This is AI shopping at scale."

    — Mani Fazeli, VP, Product at Shopify

    Connecting ChatGPT users and merchants

    Our north star is providing the best experience for both consumers and merchants. The best consumer experiences are powered by deep partnership with merchants. To build these deep partnerships, we want to offer merchants options for how they convert consumers. We’ve found that the initial version of Instant Checkout did not offer the level of flexibility that we aspire to provide, so we’re allowing merchants to use their own checkout experiences while we focus our efforts on product discovery.

    Merchants interested in experimenting with deeper integrations and native experiences on ChatGPT continue to have the option to develop ChatGPT apps. Today, we are excited to share that Walmart is introducing an in-ChatGPT app experience that takes users from discovery in ChatGPT into a tailored Walmart environment that supports account linking, loyalty and Walmart payments. This new experience is now available in web browsers, with app access in iOS and Android to follow shortly.

    “By partnering closely with OpenAI, we’ve been able to learn together as we move quickly to shape what agentic commerce can become. Today’s launch brings Walmart directly into the ChatGPT experience, combining leading conversational AI with the decades of retail expertise we’ve built serving customers.”

    — Daniel Danker, EVP, AI Acceleration, Product and Design at Walmart

    Building for the future of shopping

    As with other parts of ChatGPT, we’re building iteratively. We’re learning from early launches, incorporating feedback from users and merchants, and continuing to improve the experience over time. These updates reflect our focus on where ChatGPT can add the most value in shopping today. We’re grateful to the partners building with us, and we look forward to sharing more as this work evolves.


  • Mar 17, 2026
    • Date parsed from source:
      Mar 17, 2026
    • First seen by Releasebot:
      Mar 18, 2026

    OpenAI

    Introducing GPT‑5.4 mini and nano

    OpenAI releases GPT‑5.4 mini and nano, the smallest, fastest GPT‑5.4 variants for coding and subagents. Mini outperforms GPT‑5 mini on coding, reasoning, and multimodal tasks while running over 2x faster and approaching GPT‑5.4 on several benchmarks; nano offers a cheaper, high‑volume option. Now available in API, Codex, and ChatGPT.

    Fast and efficient models optimized for coding and subagents

    Today we’re releasing GPT‑5.4 mini and nano, our most capable small models yet. They bring many of the strengths of GPT‑5.4 to faster, more efficient models designed for high-volume workloads.

    GPT‑5.4 mini significantly improves over GPT‑5 mini across coding, reasoning, multimodal understanding, and tool use, while running more than 2x faster. It also approaches the performance of the larger GPT‑5.4 model on several evaluations, including SWE-Bench Pro and OSWorld-Verified.

    GPT‑5.4 nano is the smallest, cheapest version of GPT‑5.4 for tasks where speed and cost matter most. It is also a significant upgrade over GPT‑5 nano. We recommend it for classification, data extraction, ranking, and coding subagents that handle simpler supporting tasks.

    These models are built for the kinds of workloads where latency directly shapes the product experience: coding assistants that need to feel responsive, subagents that quickly complete supporting tasks, computer-using systems that capture and interpret screenshots, and multimodal applications that can reason over images in real-time. In these settings, the best model is often not the largest one—it’s the one that can respond quickly, use tools reliably, and still perform well on complex professional tasks.

    • The highest reasoning_effort available for GPT‑5 mini is 'high'.

    Here’s what our customers think after testing GPT‑5.4 mini and nano in their workflows:

    "GPT-5.4 mini delivers strong end-to-end performance for a model in this class. In our evaluations it matched or exceeded competitive models on several output tasks and citation recall at a much lower cost. It also achieved higher end-to-end pass rates and stronger source attribution than the larger GPT‑5.4 model."
    — Aabhas Sharma, CTO at Hebbia

    Coding

    GPT‑5.4 mini and nano are especially effective in coding workflows that benefit from fast iteration. The models handle targeted edits, codebase navigation, front-end generation, and debugging loops with low latency, making them a strong fit for coding tasks that need to be completed at faster speeds and lower costs.

    In benchmarks, GPT‑5.4 mini consistently outperforms GPT‑5 mini at similar latencies and approaches GPT‑5.4-level pass rates while running much faster, delivering one of the strongest performance-per-latency tradeoffs for coding workflows.

    We estimate latency by looking at the production behavior of our models, and simulating this offline. The latency estimate accounts for tool call duration (code execution time), sampled tokens, and input tokens. Real-world latency may vary substantially, and depends on many factors not captured in our simulation. Similarly, costs are estimated based on API pricing of these models at the time of writing. Costs may change in the future. Reasoning efforts were swept from low to xhigh.

    Subagents

    GPT‑5.4 mini is also a strong fit for systems that combine models of different sizes. In Codex, for example, a larger model like GPT‑5.4 can handle planning, coordination, and final judgment, while delegating to GPT‑5.4 mini subagents that handle narrower subtasks in parallel—like searching a codebase, reviewing a large file, or processing supporting documents. Learn how subagents work in Codex in the docs (opens in a new window).

    This pattern becomes more useful as smaller models get faster and more capable. Instead of using one model for everything, developers can compose systems where larger models decide what to do and smaller models execute quickly at scale. GPT‑5.4 mini is our strongest mini model yet for that style of workflow.
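    The planner/subagent composition described above can be sketched in a few lines. This is an illustrative pattern only, not the Codex implementation: `call_model` is a stub standing in for a real model call, and the model names simply mirror the ones discussed in this post.

```python
# Sketch of the planner/subagent pattern: a larger model plans and judges,
# while smaller, faster subagents handle narrow subtasks in parallel.
from concurrent.futures import ThreadPoolExecutor

PLANNER_MODEL = "gpt-5.4"        # planning, coordination, final judgment
SUBAGENT_MODEL = "gpt-5.4-mini"  # narrow subtasks, run in parallel

def call_model(model: str, task: str) -> str:
    # Stub: a real implementation would invoke the model API here.
    return f"[{model}] completed: {task}"

def run_with_subagents(goal: str, subtasks: list) -> str:
    # Fan narrow subtasks out to the smaller model in parallel...
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda t: call_model(SUBAGENT_MODEL, t), subtasks))
    # ...then let the larger model integrate the results and decide.
    summary = "\n".join(results)
    return call_model(PLANNER_MODEL, f"{goal}\nSubagent results:\n{summary}")

report = run_with_subagents(
    "Review this pull request",
    ["search the codebase for callers", "review the large diff"],
)
```

    The design point is simply that only the integration step needs the larger model; everything fanned out in the middle can run on the cheaper, faster one.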

    Computer use

    GPT‑5.4 mini is also strong on multimodal tasks, particularly those related to computer use. The model can quickly interpret screenshots of dense user interfaces to complete computer use tasks with speed. On OSWorld-Verified, GPT‑5.4 mini approaches GPT‑5.4 while substantially outperforming GPT‑5 mini.

    Availability & pricing

    GPT‑5.4 mini is available today in the API, Codex, and ChatGPT.

    In the API, GPT‑5.4 mini supports text and image inputs, tool use, function calling, web search, file search, computer use, and skills. It has a 400k context window and costs $0.75 per 1M input tokens and $4.50 per 1M output tokens.

    In Codex, GPT‑5.4 mini is available across the Codex app, CLI, IDE extension and web. It uses only 30% of the GPT‑5.4 quota, letting developers quickly handle simpler coding tasks in Codex for about one-third the cost. Codex can also delegate to GPT‑5.4 mini subagents so that less reasoning-intensive work runs on the cheaper model.

    In ChatGPT, GPT‑5.4 mini is available to Free and Go users via the “Thinking” feature in the + menu. For all other users, GPT‑5.4 mini is available as a rate limit fallback for GPT‑5.4 Thinking.

    GPT‑5.4 nano is only available in the API and costs $0.20 per 1M input tokens and $1.25 per 1M output tokens.
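    Token-based billing makes spend easy to estimate: cost scales linearly with tokens at the per-million rates quoted above. A minimal back-of-the-envelope sketch (prices taken from this post; subject to change):

```python
# Estimate API spend from token counts, using the quoted per-1M-token prices.
PRICES_PER_1M = {  # model -> (input USD, output USD) per 1M tokens
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return estimated USD cost for one model at the quoted rates."""
    in_price, out_price = PRICES_PER_1M[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# e.g. a workload with 2M input and 500k output tokens on mini:
mini_cost = estimate_cost("gpt-5.4-mini", 2_000_000, 500_000)  # 1.50 + 2.25 = 3.75
```

    At these rates, that example workload comes to $3.75 on mini; the same token counts on nano would cost $1.025.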

    For more information on the models’ safeguards, please check out the System Card addendum on our Deployment Safety Hub (opens in a new window).


    • Overall Edit Distance. OmniDocBench was run with reasoning_effort set to 'none' to reflect low-cost, low-latency performance.

  • Mar 10, 2026
    • Date parsed from source:
      Mar 10, 2026
    • First seen by Releasebot:
      Mar 11, 2026

    OpenAI

    New ways to learn math and science in ChatGPT

    OpenAI introduces dynamic interactive visual explanations in ChatGPT for over 70 core math and science concepts. Users can manipulate variables and see real time how formulas and graphs respond, turning abstract ideas into hands-on experiments. Rolling out today to all logged-in users worldwide.

    Understanding concepts through interactive visuals

    Explore concepts with interactive visual explanations.

    ChatGPT has quickly become one of the most widely used tools for learning. Each week, 140 million people use ChatGPT for help understanding math and science concepts alone. People also come to ChatGPT to explore new topics, work through homework problems, prepare for exams, and break down concepts they’ve always found difficult.

    For many learners, math and science concepts feel abstract and hard to understand. In a recent Gallup survey⁠ (opens in a new window), more than half of U.S. adults said they struggle with math, and many parents reported they don’t feel confident helping their children learn it.

    Today, we’re making learning these concepts in ChatGPT even more interactive with new dynamic visual explanations. Starting with more than 70 core math and science concepts, ChatGPT will guide learners by showing how formulas, variables, and relationships behave in real time. These experiences will be available globally across all plans starting today.


    Research suggests⁠ (opens in a new window) that visual, interaction-based learning can lead to stronger conceptual understanding than traditional instruction for many students. When learners can manipulate variables and instantly see the effects, they may be better able to internalize the relationships behind mathematical and scientific concepts.

    Now when someone asks ChatGPT about one of the core topics, it can explain it and present an interactive visual module. Users can adjust variables, manipulate formulas, and instantly see how those changes affect graphs and outcomes—turning abstract equations into something they can experiment with directly.

    “What stands out is how strongly this feature emphasizes conceptual understanding. When learning math, understanding why something works and how ideas connect helps concepts stick long term. I especially appreciate how it doesn’t stop at the original question but actively prompts you to extend thinking and explore deeper connections.”
    — Anjini Grover, High School Mathematics Teacher

    To try it out, you can ask ChatGPT:

    • Help me understand the Pythagorean Theorem
    • Explain how PV=nRT works
    • How can I find the area of a circle?

    (Object distance and focal length sliders update solved image distance and reflected rays.)
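    The slider behavior described in that caption follows the thin-lens/mirror equation, 1/f = 1/d_o + 1/d_i, solved for the image distance. A minimal sketch of the underlying math (the numbers are illustrative slider positions, not taken from the product):

```python
def image_distance(focal_length: float, object_distance: float) -> float:
    # Thin-lens/mirror equation: 1/f = 1/d_o + 1/d_i, solved for d_i.
    return 1.0 / (1.0 / focal_length - 1.0 / object_distance)

# Dragging the object from 30 cm to 15 cm in front of a 10 cm lens:
d1 = image_distance(10.0, 30.0)  # 15.0 cm
d2 = image_distance(10.0, 15.0)  # 30.0 cm
```

    Halving the object distance here doubles the image distance, which is exactly the kind of relationship the interactive module lets learners discover by moving a slider.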

    Our work to strengthen learning with ChatGPT

    Helping people explore ideas, experiment with concepts, and build deeper understanding is one of the most meaningful ways we can bring the benefits of AI to people everywhere.

    In early testing, college and high school age students said the interactive experience helped them better understand how variables relate to one another. Parents said it gave them a more dynamic way to walk through problems alongside their children. Educators said tools like this could help students understand how concepts work, instead of simply memorizing formulas.

    This is just the beginning. Over time, we plan to expand interactive learning with additional subjects and continue building tools that strengthen learning with ChatGPT. This work builds on experiences like study mode, introduced last year to help students work through problems step by step, and quizzes⁠ (opens in a new window), which help users strengthen recall and prepare for exams.

    The research landscape on how AI affects learning is still taking shape, but recent studies—including our findings on study mode—show promising early signals. Through partners in OpenAI’s NextGenAI initiative and the OpenAI Learning Lab, we will continue to advance research to better understand how AI shapes learning over time. We intend to publish findings, shape future product experiences based on these insights, and work side by side with the broader education ecosystem to ensure AI benefits learners worldwide.

    Editor’s note: Interactive learning is rolling out starting today to all logged-in ChatGPT users. Today, the list of math and science topics is most relevant to high school and college age learners, and includes topics like binomial square, Charles’ law, circle area, circle equation, compound interest, cone surface area, cone volume, Coulomb’s law, cylinder volume, degrees of freedom, difference of squares, exponential decay, Hooke’s law, kinetic energy, lens equation, linear equation, Ohm’s law, period–frequency relation, potential energy, PV = nRT equation, Pythagorean theorem, slope–intercept form, surface area of sphere, triangle area, trig angle sum identity, and others.

  • Mar 6, 2026
    • Date parsed from source:
      Mar 6, 2026
    • First seen by Releasebot:
      Mar 7, 2026

    OpenAI

    Codex Security: now in research preview

    OpenAI unveils Codex Security, an application security agent rolling out in research preview to ChatGPT Pro, Enterprise, Business and Edu. It promises high‑confidence findings, context‑driven validation, and actionable fixes, with early beta gains like reduced noise and faster remediation.

    Codex Security

    Today we’re introducing Codex Security, our application security agent. It builds deep context about your project to identify complex vulnerabilities that other agentic tools miss, surfacing higher-confidence findings with fixes that meaningfully improve the security of your system while sparing you from the noise of insignificant bugs.

    Context is essential when evaluating real security risks, but most AI security tools simply flag low-impact findings and false positives, forcing security teams to spend significant time on triage. At the same time, agents are accelerating software development, making security review an increasingly critical bottleneck. Codex Security addresses both challenges. By combining agentic reasoning from our frontier models with automated validation, it delivers high-confidence findings and actionable fixes so teams can focus on the vulnerabilities that matter and ship secure code faster.

    Formerly known as Aardvark⁠, Codex Security began last year as a private beta with a small group of customers. In early internal deployments, it surfaced a real SSRF, a critical cross-tenant authentication vulnerability, and many other issues, which our security team patched within hours. Early deployments with external testers helped us improve how users provide relevant product context and move from onboarding to securing their code. We also significantly improved the quality of our findings over the course of the beta: scans of the same repositories over time show increasing precision, in one case cutting noise by 84% since initial rollout. We’ve reduced the rate of findings with over-reported severity by more than 90%, and false positive rates on detections have fallen by more than 50% across all repositories. These improvements help Codex Security better align reported severity with real-world risk and reduce unnecessary triage burden for security teams, and we expect the signal-to-noise ratio to continue to improve.

    Starting today, Codex Security is rolling out in research preview to ChatGPT Pro, Enterprise, Business, and Edu customers via Codex web with free usage for the next month.

    How Codex Security works

    Codex Security leverages OpenAI’s frontier models and the Codex agent. It can reduce noise and accelerate remediation by grounding vulnerability discovery, validation, and patching in system-specific context.

    • Build system context and create an editable threat model:
      After configuring a scan, it analyzes your repository to understand the security-relevant structure of the system and generates a project-specific threat model that can capture what the system does, what it trusts, and where it is most exposed. Threat models can be edited to keep the agent aligned with your team.

    • Prioritize and validate issues:
      Using the threat model as context, it searches for vulnerabilities and categorizes findings based on expected real-world impact in your system. Where possible, it pressure-tests findings in sandboxed validation environments to distinguish signal from noise. Users can see this analysis in the validated findings. When Codex Security is configured with an environment tailored to your project, it can validate potential issues directly in the context of the running system. That deeper validation can reduce false positives even further and enable the creation of working proof-of-concepts, giving security teams stronger evidence and a clearer path to remediation.

    • Patch issues with full system context:
      Finally, Codex Security proposes fixes to the discovered issues that align with system intent and surrounding behavior. This enables patches that can improve security while minimizing regressions, making them safer to review and land. Users can filter the findings so they stay focused on what matters most to their team and has the highest security impact.

    Codex Security can also learn from your feedback over time to improve the quality of its findings. When you adjust the criticality of a finding, it can use that feedback to refine the threat model and improve precision on subsequent runs as it learns what matters in your architecture and risk posture.

    It’s designed to operate at scale and surface the highest-confidence findings with easy-to-accept patches. Over the last 30 days, Codex Security scanned more than 1.2 million commits across external repositories in our beta cohort, identifying 792 critical findings and 10,561 high-severity findings. Critical issues appeared in under 0.1% of scanned commits, showing that the system can identify security impacting issues in large volumes of code while minimizing noise to reviewers.
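    A quick back-of-the-envelope check confirms the quoted rate: 792 critical findings across 1.2 million scanned commits works out to roughly 0.066% of commits, comfortably under the stated 0.1%.

```python
# Sanity check of the scan statistics quoted above.
commits = 1_200_000
critical_findings = 792

critical_rate = critical_findings / commits  # fraction of commits with a critical finding
# 792 / 1,200,000 = 0.00066, i.e. about 0.066% — under the stated 0.1%.
```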

    "As a company laser-focused on product security, NETGEAR was pleased to join the early access program, and the results exceeded expectations. Codex Security integrated effortlessly into our robust security development environment, strengthening the pace and depth of our review processes. Its findings were impressively clear and comprehensive, often giving the sense that an experienced product security researcher was working alongside us."
    — Chandan Nandakumaraiah, Head of Product Security at NETGEAR and Member of CVE Board

    Supporting the open source community

    Open source software forms the foundation of modern systems, including our own. We've been using Codex Security to scan the open-source repositories we rely on most, sharing high impact security findings we identify with maintainers to help strengthen that foundation.

    In our conversations with maintainers, a consistent theme emerged: the challenge isn’t a lack of vulnerability reports, but too many low-quality ones. Maintainers told us they need fewer false positives and a more sustainable way to surface real security issues without creating additional triage burden. These conversations helped shape how we’re supporting the open source community with Codex Security. Rather than generating large volumes of speculative findings, we are building a system that prioritizes high-confidence issues that maintainers can act on quickly.

    As part of this work, we reported critical vulnerabilities to a number of widely used open-source projects including OpenSSH⁠ (opens in a new window), GnuTLS⁠ (opens in a new window), GOGS⁠ (opens in a new window), Thorium⁠ (opens in a new window), libssh, PHP, Chromium, and more. Fourteen CVEs have been assigned, with dual reporting on two—we've shared some examples in the Appendix.

    We recently started onboarding an initial cohort of open-source maintainers into Codex for OSS, our program to support the ecosystem with free ChatGPT Pro and Plus accounts, code review, and Codex Security. Projects like vLLM have already used Codex Security to find and patch issues as part of their normal workflow.

    We plan to expand the program in the coming weeks so more maintainers have a direct path to better security, stronger review workflows, and support for the open-source work the ecosystem depends on. If you’re an open-source maintainer and interested, please get in touch⁠.

    Get started

    We’ll be rolling out Codex Security access to ChatGPT Enterprise, Business, and Edu customers over the coming days. Check out our docs⁠ (opens in a new window) to learn more about setting up Codex Security for your team.

    Appendix

    Examples of high impact OSS vulnerabilities discovered by Codex Security:

    • GnuTLS certtool Heap-Buffer Overflow (Off-by-One) — CVE-2025-32990 ⁠ (opens in a new window)
    • GnuTLS Heap Buffer Overread in SCT Extension Parsing — CVE-2025-32989 ⁠ (opens in a new window)
    • GnuTLS Double-Free in otherName SAN Export — CVE-2025-32988 ⁠ (opens in a new window)
    • 2FA Bypass GOGS — CVE-2025-64175 ⁠ (opens in a new window)
    • Unauth bypass GOGS — CVE-2026-25242 ⁠ (opens in a new window)
    • Path traversal (arbitrary write) — download_ephemeral, download_children (agent) — CVE-2025-35430 ⁠ (opens in a new window)
    • LDAP injection (filters & DN) — LdapUserMap::new / get_unix_info / basic_auth_ldap — CVE-2025-35431 ⁠ (opens in a new window)
    • Unauthenticated DoS & mail abuse — resend_email_verification — CVE-2025-35432 ⁠ (opens in a new window) , CVE-2025-35436 ⁠ (opens in a new window)
    • Session not rotated on password change — User::update_user — CVE-2025-35433 ⁠ (opens in a new window)
    • Disabled TLS verification — Elasticsearch client — CVE-2025-35434 ⁠ (opens in a new window)
    • DoS: division by zero — /api/streams/depth/.../{split} — CVE-2025-35435 ⁠ (opens in a new window)
    • gpg-agent stack buffer overflow via PKDECRYPT --kem=CMS (ECC KEM) — CVE-2026-24881 ⁠ (opens in a new window)
    • Stack-based buffer overflow in TPM2 PKDECRYPT for RSA and ECC due to missing ciphertext length validation — CVE-2026-24882 ⁠ (opens in a new window)
    • CMS/PKCS7 AES-GCM ASN.1 params stack buffer overflow — CVE-2025-15467 ⁠ (opens in a new window)
    • PKCS#12 PBMAC1 PBKDF2 keyLength overflow + MAC bypass — CVE-2025-11187 ⁠ (opens in a new window)
  • Mar 5, 2026
    • Date parsed from source:
      Mar 5, 2026
    • First seen by Releasebot:
      Mar 6, 2026

    OpenAI

    Introducing ChatGPT for Excel and new financial data integrations

    OpenAI announces ChatGPT for Excel in beta, an Excel add-in that builds, updates, and analyzes models directly in workbooks. It adds financial data integrations and uses GPT-5.4 Thinking to boost finance workflows. Rollout starts for select business plans in US, Canada, Australia with enterprise controls.

    Use ChatGPT in Excel to build, update, and analyze spreadsheets faster, plus new integrations in ChatGPT for financial workflows.

    Today, we’re introducing ChatGPT for Excel⁠ (opens in a new window) in beta, an Excel add-in that brings ChatGPT directly into workbooks to help build and update models, run scenarios, and generate outputs based on cells and formulas. Powered by GPT‑5.4, it helps users do more in Excel, supports power users in moving faster, and can improve consistency across teams.

    We’re also adding financial data integrations directly in ChatGPT for FactSet, Dow Jones Factiva, LSEG, Daloopa, S&P Global, and more, making it easier to work with trusted financial data inside ChatGPT. Together, these capabilities help teams spend less time on manual work and more time on analysis, decisions, and execution.

    An AI model optimized for finance workflows

    GPT‑5.4 (as GPT‑5.4 Thinking) is available today in ChatGPT, Codex, and the API. It’s our most advanced model, ideal for financial reasoning and Excel-based modeling. We’ve worked closely with industry practitioners to improve GPT‑5.4 on real-world finance workflows that often take analysts hours or days to complete, including financial modeling, scenario analysis, data extraction, and long-form research. The result is stronger performance on the tasks finance professionals rely on every day.

    On OpenAI’s internal investment banking benchmark, which evaluates real-world workflows such as building a three-statement model with proper formatting and citations, performance improved from 43.7% with GPT‑5 to 87.3% with GPT‑5.4 Thinking.

    ChatGPT for Excel in beta: build, update, and analyze spreadsheet models directly in your workbook

    We’re introducing ChatGPT for Excel in beta—a version of ChatGPT embedded directly in spreadsheets that can build, analyze, and update models using the same formulas and structures teams already rely on. Analysts, strategists, researchers, and accountants can move faster, reduce manual work, and focus on judgment and decision-making instead of writing formulas, tracing links, and fixing models.

    How it works

    • Build and update spreadsheet models faster. Instead of building spreadsheet models or running scenario analysis manually, teams can describe what they need in plain language, and ChatGPT will create or update live Excel models directly in the workbook. Teams can run data analysis, reporting, inventory management, budgeting—all while preserving structure, formulas, and assumptions in a formatted, Excel-native workbook.

    • Get insights from large spreadsheets without manual reconciliation. ChatGPT can reason across workbooks, understand how sheets and formulas connect across the model, explain why outputs changed, trace and fix errors, and show how assumptions flow through a model. This is especially useful when users inherit existing templates, need to get up to speed quickly, or want to understand and test a workbook before making decisions.

    • Follow the logic and trust the outputs. ChatGPT explains what it’s doing as it works and links answers to the exact cells it references and updates. Because calculations run directly in Excel, teams can trace assumptions, audit formulas, and verify how results were produced. Before making changes to a workbook, ChatGPT asks for permission, so users can review each step and undo edits if needed.

    Known limitations in beta

    We are improving ChatGPT for Excel quickly based on user feedback. Some responses may take longer as we optimize performance, and generated outputs may occasionally require cleanup or adjustment to match preferred spreadsheet formatting or layout conventions. ChatGPT can generate and explain formulas, but complex formulas or edge cases may still require manual refinement.

    Getting started

    Starting today, ChatGPT for Excel⁠ (opens in a new window) in beta is rolling out for ChatGPT Business, Enterprise, Edu, Teachers, Pro, and Plus users in the U.S., Canada, and Australia. ChatGPT for Google Sheets is coming soon.

    In Enterprise, Edu, and Teacher workspaces, access is off by default. Admins can enable it for specific users with custom roles and group permissions.

    Financial data integrations in ChatGPT

    For teams working in financial workflows, new data integrations and support for proprietary data by building your own apps using Model Context Protocol (MCP)⁠ (opens in a new window) make it easier to bring market, company, and internal data into a single workflow in ChatGPT. With GPT‑5.4, ChatGPT can handle longer context and more complex tasks, helping teams move faster on company research, model refreshes, and cited outputs for valuation, diligence, underwriting, and related work.
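    To make the MCP route concrete, here is a minimal, hypothetical sketch of the request handling an MCP-style app performs: it advertises one tool backed by proprietary data and serves calls to it. The tool name, schema, and portfolio data are invented for illustration, and this does not use the official MCP SDK.

```python
import json

# Hypothetical MCP-style handler: it advertises a "lookup_position" tool
# backed by internal data, so a ChatGPT-style client can discover and call it.
# The registry contents and schema below are illustrative assumptions.

PORTFOLIO = {"ACME": {"shares": 1200, "cost_basis": 41.50}}  # stand-in internal data

def handle_request(request: dict) -> dict:
    """Dispatch a minimal JSON-RPC-shaped request to the advertised tool."""
    if request["method"] == "tools/list":
        return {"tools": [{
            "name": "lookup_position",
            "description": "Return an internal portfolio position by ticker.",
            "inputSchema": {"type": "object",
                            "properties": {"ticker": {"type": "string"}},
                            "required": ["ticker"]},
        }]}
    if request["method"] == "tools/call":
        ticker = request["params"]["arguments"]["ticker"]
        position = PORTFOLIO.get(ticker)
        return {"content": [{"type": "text", "text": json.dumps(position)}]}
    return {"error": "unknown method"}
```

    A production app would implement the same list/call shape over a real transport; the point here is only that proprietary data stays behind a tool boundary the model queries on demand.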

    Simplify research and analysis

    Integrations released today (including Moody’s, Dow Jones Factiva, MSCI, Third Bridge, and MT Newswire, with FactSet coming soon) bring market, company, and internal data into a single workflow in ChatGPT, as part of a growing ecosystem of apps. This helps users spend less time gathering inputs and produce cited outputs such as earnings summaries, valuation snapshots, and credit memos faster.

    Quickly conduct due diligence

    Teams can also use apps with research in ChatGPT to pull from filings, transcripts, decks, and spreadsheets to produce structured, cited outputs that export to PDF or Microsoft Word. Recent updates give users more control over the research process, including the ability to focus on specific websites and data sources, shape the research plan before and during a run, and review sources and citations in a redesigned workspace.

    Security, governance, and control

    For organizations adopting ChatGPT at work, ChatGPT Enterprise includes the security, governance, and access controls needed to use ChatGPT confidently, especially in regulated or data-sensitive environments:

    • Manage and monitor access with RBAC, SAML SSO, SCIM, and audit logs, with support for common DLP and SIEM tools.

    • Protect firm data with encryption in transit with TLS 1.2+ and at rest with AES-256, plus enterprise key management support.

    • Meet regional data requirements with data residency and regional processing controls.

    • By default, data shared with ChatGPT Enterprise is not used to train or improve our models.

    Learn more about our enterprise-grade security, privacy, and compliance programs. Explore apps today⁠ (opens in a new window), or contact our team⁠ to learn more.

    Customer impact

    We’re working closely with financial institutions as they apply ChatGPT across research, underwriting, auditing, client engagement, code modernization, and operations. Across banks, asset managers, and insurance, we’re seeing impact in workflows like due diligence, client experience, and investment research—and we’ll keep learning alongside customers as they scale their AI deployments.

    “ChatGPT has materially accelerated our research and due diligence workflows—from financial analysis and market research to legal review and writing internal memos—while improving consistency across teams. It has expanded our team’s capacity, freeing our investment professionals to focus more time on judgment, debate, and conviction. We’re excited to be early adopters of new capabilities and to help shape how AI transforms financial services in the years ahead.”

    —Amr Ellabban, PhD, Head of AI, Hg

    Looking ahead

    This launch builds on OpenAI’s ongoing work with analysts, strategists, researchers, and accountants. We’re learning from real-world deployments to improve our products and models and help institutions move faster while operating responsibly in regulated environments.

    To learn more, contact our team⁠. Enterprise customers can build directly with OpenAI or work alongside experienced partners such as Accenture, Bain, Boston Consulting Group (BCG), McKinsey & Company, and PwC to integrate AI into existing data, applications, and operating models.

  • Mar 5, 2026
    • Date parsed from source:
      Mar 5, 2026
    • First seen by Releasebot:
      Mar 6, 2026
    OpenAI logo

    OpenAI

    Introducing GPT-5.4 | OpenAI

    GPT‑5.4 debuts across ChatGPT, API and Codex with Thinking and Pro variants, boosting reasoning, coding and real‑world workflow ability. New tool search, 1M context, native computer use, faster token efficiency, and stronger web search. Includes steerable conversations and enhanced safety measures.

    Today, we’re releasing GPT‑5.4 in ChatGPT (as GPT‑5.4 Thinking), the API, and Codex. It’s our most capable and efficient frontier model for professional work. We’re also releasing GPT‑5.4 Pro in ChatGPT and the API, for people who want maximum performance on complex tasks.

    GPT‑5.4 brings together the best of our recent advances in reasoning, coding, and agentic workflows into a single frontier model. It incorporates the industry-leading coding capabilities of GPT‑5.3‑Codex⁠ while improving how the model works across tools, software environments, and professional tasks involving spreadsheets, presentations, and documents. The result is a model that gets complex real work done accurately, effectively, and efficiently—delivering what you asked for with less back and forth.

    In ChatGPT, GPT‑5.4 Thinking can now provide an upfront plan of its thinking, so you can adjust course mid-response while it’s working, and arrive at a final output that’s more closely aligned with what you need without additional turns. GPT‑5.4 Thinking also improves deep web research, particularly for highly specific queries, while better maintaining context for questions that require longer thinking. Together, these improvements mean higher-quality answers that arrive faster and stay relevant to the task at hand.

    In Codex and the API, GPT‑5.4 is the first general-purpose model we’ve released with native, state-of-the-art computer-use capabilities, enabling agents to operate computers and carry out complex workflows across applications. It supports up to 1M tokens of context, allowing agents to plan, execute, and verify tasks across long horizons. GPT‑5.4 also improves how models work across large ecosystems of tools and connectors with tool search, helping agents find and use the right tools more efficiently without sacrificing intelligence. Finally, GPT‑5.4 is our most token efficient reasoning model yet, using significantly fewer tokens to solve problems when compared to GPT‑5.2—translating to reduced token usage and faster speeds.

    Together with advances in general reasoning, coding, and professional knowledge work, GPT‑5.4 enables more reliable agents, faster developer workflows, and higher-quality outputs across ChatGPT, the API, and Codex.

    Knowledge work

    Building on GPT‑5.2’s general reasoning capabilities, GPT‑5.4 delivers even more consistent and polished results on real-world tasks that matter to professionals.

    On GDPval⁠, which tests agents’ abilities to produce well-specified knowledge work across 44 occupations, GPT‑5.4 achieves a new state of the art, matching or exceeding industry professionals in 83.0% of comparisons, compared to 70.9% for GPT‑5.2.

    In GDPval, models attempt well-specified knowledge work spanning 44 occupations from the top 9 industries contributing to U.S. GDP. Tasks request real work products, such as sales presentations, accounting spreadsheets, urgent care schedules, manufacturing diagrams, or short videos. Reasoning effort was set to xhigh for GPT‑5.4 and heavy for GPT‑5.2 (a slightly lower level in ChatGPT).

    “GPT-5.4 is the best model we’ve ever tried. It’s now top of the leaderboard on our APEX-Agents benchmark, which measures model performance for professional services work. It excels at creating long-horizon deliverables such as slide decks, financial models, and legal analysis, delivering top performance while running faster and at a lower cost than competitive frontier models.”
    — Brendan Foody, CEO at Mercor

    We put a particular focus on improving GPT‑5.4’s ability to create and edit spreadsheets, presentations, and documents. On an internal benchmark of spreadsheet modeling tasks that a junior investment banking analyst might do, GPT‑5.4 achieves a mean score of 87.3%, compared to 68.4% for GPT‑5.2. On a set of presentation evaluation prompts, human raters preferred presentations from GPT‑5.4 68.0% of the time over those from GPT‑5.2 due to stronger aesthetics, greater visual variety, and more effective use of image generation.

    Documents were generated with reasoning effort set to xhigh.

    You can try these capabilities in ChatGPT using GPT‑5.4 Thinking or Pro. If you’re an Enterprise customer, we recommend using our newly released ChatGPT for Excel add-in⁠ (opens in a new window), which was also launched today. We've also updated our spreadsheet⁠ (opens in a new window) and presentation skills⁠ (opens in a new window) available in Codex and the API.

    To make GPT‑5.4 better at real-world work, we continued our progress at driving down hallucinations and errors. GPT‑5.4 is our most factual model yet: on a set of de-identified prompts where users flagged factual errors, GPT‑5.4’s individual claims are 33% less likely to be false and its full responses are 18% less likely to contain any errors, relative to GPT‑5.2.

    “GPT-5.4 sets a new bar for document-heavy legal work. On our BigLaw Bench eval, it scored 91%. Compared to other models, GPT‑5.4 is currently better at structuring complex transactional analysis, maintaining accuracy across lengthy contracts, and delivering the high level of detail legal practitioners require.”
    — Niko Grupen, Head of Applied Research at Harvey

    Computer use and vision

    GPT‑5.4 is our first general-purpose model with native computer-use capabilities and marks a major step forward for developers and agents alike. It’s the best model currently available for developers building agents that complete real tasks across websites and software systems.

    We’ve designed GPT‑5.4 to be performant across a wide range of computer-use workloads. It is excellent at writing code to operate computers via libraries like Playwright, as well as issuing mouse and keyboard commands in response to screenshots. Its behavior is steerable via developer messages, meaning that developers can adjust behavior to suit particular use cases. Developers can even configure the model’s safety behavior to suit different levels of risk tolerance by specifying custom confirmation policies.
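    The screenshot-driven mode described above can be sketched as a simple observe-act loop. Everything here is illustrative: `plan_action` stands in for a computer-use model call, and `execute` for the harness that issues mouse and keyboard commands.

```python
from dataclasses import dataclass

# Illustrative agent loop for screenshot-driven computer use. The model call
# and action shape are hypothetical; only the loop structure reflects the
# pattern described above.

@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""

def run_agent(plan_action, take_screenshot, execute, max_steps=10):
    """Alternate between observing the screen and issuing mouse/keyboard actions."""
    trace = []
    for _ in range(max_steps):
        action = plan_action(take_screenshot())  # model decides the next action
        if action.kind == "done":
            break
        execute(action)                          # harness performs it
        trace.append(action)
    return trace
```

    A code-writing variant of the same loop would have the model emit Playwright-style scripts instead of coordinates; the confirmation policies mentioned above would sit between `plan_action` and `execute`.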

    The model’s performance and flexibility are reflected across benchmarks that test computer use across different settings. On OSWorld-Verified, which measures a model’s ability to navigate a desktop environment through screenshots and keyboard/mouse actions, GPT‑5.4 achieves a state-of-the-art 75.0% success rate, far exceeding GPT‑5.2’s 47.3%, and surpassing human performance at 72.4%.

    On WebArena-Verified, which tests browser use, GPT‑5.4 achieves a leading 67.3% success rate when using both DOM- and screenshot-driven interaction, compared to GPT‑5.2’s 65.4%. On Online-Mind2Web, which also tests browser use, GPT‑5.4 achieves a 92.8% success rate using screenshot-based observations alone, improving over ChatGPT Atlas’s Agent Mode, which achieves a success rate of 70.9%.

    GPT‑5.4 interprets screenshots of a browser interface and interacts with UI elements through coordinate-based clicking to send emails and schedule a calendar event. Video is not sped up.

    GPT‑5.4’s improved computer use is built on the model’s improved general visual perception capabilities. On MMMU-Pro, a test of a model’s visual understanding and reasoning, GPT‑5.4 achieves an 81.2% success rate without tool use, an improvement over GPT‑5.2’s 79.5%. Improved visual perception also translates into better document parsing capabilities. On OmniDocBench, GPT‑5.4 without reasoning effort achieves an average error (measured by normalized edit distance between model prediction and ground truth) of 0.109, improved from GPT‑5.2’s 0.140.

    MMMU-Pro was run with reasoning effort set to xhigh. OmniDocBench was run with reasoning effort set to none, to reflect low-cost, low-latency performance.
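    Normalized edit distance, the OmniDocBench error metric above, is commonly computed as Levenshtein distance divided by the length of the longer string. The exact normalization the benchmark uses may differ, so treat this as an assumed formulation:

```python
# Assumed formulation of normalized edit distance: Levenshtein distance
# between prediction and ground truth, divided by the longer string's length.

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (two-row variant)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def normalized_edit_distance(prediction: str, truth: str) -> float:
    if not prediction and not truth:
        return 0.0
    return levenshtein(prediction, truth) / max(len(prediction), len(truth))
```

    Under this definition, a perfect parse scores 0.0 and a completely wrong one approaches 1.0, so the 0.109 vs. 0.140 figures above read as "lower is better."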

    We’re also improving visual understanding for dense, high-resolution images where full fidelity matters. Starting with GPT‑5.4, we’re introducing an original image input detail⁠ (opens in a new window) level which supports full-fidelity perception up to 10.24M total pixels or 6000-pixel maximum dimension, whichever is lower; the high image input detail level now supports up to 2.56M total pixels or a 2048-pixel maximum dimension. In early testing with API users, we observed strong gains in localization ability, image understanding, and click accuracy when using original or high detail.
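    A small sketch of the stated pixel caps. The limits come from the paragraph above; the uniform-downscale behavior is an assumption about how an oversized image might be brought within them.

```python
import math

# Caps as stated above: "original" allows up to 10.24M total pixels with a
# 6000 px max dimension; "high" allows up to 2.56M total pixels with a
# 2048 px max dimension. The resizing policy itself is an assumption.

CAPS = {
    "original": {"pixels": 10_240_000, "dimension": 6000},
    "high": {"pixels": 2_560_000, "dimension": 2048},
}

def fits(width: int, height: int, detail: str) -> bool:
    cap = CAPS[detail]
    return width * height <= cap["pixels"] and max(width, height) <= cap["dimension"]

def downscale_factor(width: int, height: int, detail: str) -> float:
    """Largest uniform scale (<= 1.0) that brings an image within both caps."""
    cap = CAPS[detail]
    by_pixels = math.sqrt(cap["pixels"] / (width * height))
    by_dimension = cap["dimension"] / max(width, height)
    return min(1.0, by_pixels, by_dimension)
```

    For example, a 3200x3200 screenshot (exactly 10.24M pixels) fits `original` but not `high`, which is the kind of case where choosing the detail level matters for click accuracy.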

    “In our evals measuring computer use performance across ~30K HOA and property tax portals, GPT-5.4 achieved a 95% success rate on the first attempt and 100% within three attempts, compared to ~73–79% with prior CUA models. It also completed sessions ~3x faster while using ~70% fewer tokens, materially improving reliability and cost efficiency at scale.”
    — Dod Fraser, CEO at Mainstay

    In the API, developers can access these capabilities using the updated computer tool. Please see our updated documentation⁠ (opens in a new window) for recommended best practices.

    Coding

    GPT‑5.4 combines the coding strengths of GPT‑5.3‑Codex with leading knowledge work and computer-use capabilities, which matter most on longer-running tasks where the model can use tools, iterate, and push work further with less manual intervention. It matches or outperforms GPT‑5.3‑Codex on SWE-Bench Pro while being lower latency across reasoning efforts.

    We estimate latency by looking at the production behavior of our models, and simulating this offline. The latency estimate accounts for tool call duration (code execution time), sampled tokens, and input tokens. Real-world latency may vary substantially, and depends on many factors not captured in our simulation. Reasoning efforts were swept from none to xhigh.
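    As a toy version of that accounting, total latency can be modeled as tool execution time, plus decode time for sampled tokens, plus prefill time for input tokens. The throughput defaults below are made-up illustrative numbers, not measured figures.

```python
# Toy latency model mirroring the accounting described above. The default
# throughput values are invented for illustration only.

def estimate_latency(tool_seconds: float, output_tokens: int, input_tokens: int,
                     decode_tps: float = 100.0, prefill_tps: float = 5000.0) -> float:
    """Sum tool execution time, output decoding time, and input prefill time."""
    return tool_seconds + output_tokens / decode_tps + input_tokens / prefill_tps
```

    The model makes the efficiency claim legible: a model that samples fewer tokens to reach the same answer shrinks the middle term directly, which is where token-efficient reasoning shows up as wall-clock speed.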

    When toggled on, /fast mode in Codex delivers up to 1.5x faster token velocity with GPT‑5.4. It’s the same model and the same intelligence, just faster. That means users can move through coding tasks, iteration, and debugging while staying in flow. Developers can access GPT‑5.4 at the same fast speeds via the API by using priority processing⁠ (opens in a new window).

    In evaluation and internal testing we found that GPT‑5.4 excels at complex frontend tasks, with noticeably more aesthetic and more functional results than any models we’ve launched previously.

    As a demonstration of the model’s improved computer-use and coding capabilities working in tandem, we’re also releasing an experimental Codex skill called “Playwright (Interactive)⁠ (opens in a new window)”. This allows Codex to visually debug web and Electron apps; it can even be used to test an app it’s building, as it’s building it.

    “GPT-5.4 is currently the leader on our internal benchmarks. Our engineers find it to be more natural and assertive than previous models. It works through ambiguous problems without second-guessing itself, and it's proactive about parallelizing work to keep things moving.”
    — Lee Robinson, VP of Developer Education at Cursor

    Tool use

    With GPT‑5.4, we’ve significantly improved how models work with external tools. Agents can now operate across larger tool ecosystems, choose the right tools more reliably, and complete multi-step workflows with lower cost and latency.

    Tool search

    In the API, GPT‑5.4 introduces tool search⁠ (opens in a new window), which allows models to work efficiently when given many tools.

    Previously, when a model was given tools, all tool definitions were included in the prompt upfront. For systems with many tools, this could add thousands—or even tens of thousands—of tokens to every request, increasing cost, slowing responses, and crowding the context with information the model might never use.

    With tool search, GPT‑5.4 instead receives a lightweight list of available tools along with a tool search capability. When the model needs to use a tool, it can look up that tool’s definition and append it to the conversation at that moment.

    This approach dramatically reduces the number of tokens required for tool-heavy workflows and preserves the cache, making requests faster and cheaper. It also enables agents to reliably work with much larger tool ecosystems. For MCP servers that may contain tens of thousands of tokens of tool definitions, the efficiency gains can be substantial.
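    The deferred-definition pattern can be sketched in a few lines: the upfront context carries only tool names, and a full definition is appended to the conversation only when looked up. The registry contents are hypothetical, and this is not the actual API surface.

```python
# Sketch of the deferred-definition pattern behind tool search: names upfront,
# full definitions appended on demand. Registry contents are hypothetical.

REGISTRY = {
    "create_invoice": {"description": "Create an invoice in the billing system.",
                       "parameters": {"customer_id": "string", "amount": "number"}},
    "send_email": {"description": "Send an email on the user's behalf.",
                   "parameters": {"to": "string", "body": "string"}},
}

def initial_context() -> dict:
    """Lightweight listing included upfront instead of full tool definitions."""
    return {"available_tools": sorted(REGISTRY)}

def search_tool(name: str, context: dict) -> dict:
    """Look up one definition on demand and append it to the conversation."""
    definition = REGISTRY[name]
    context.setdefault("loaded_definitions", {})[name] = definition
    return definition
```

    With thousands of tools, the upfront cost stays proportional to the name list rather than to every schema, which is where the token savings reported below come from.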

    To demonstrate the efficiency gains, we evaluated 250 tasks from Scale’s MCP Atlas⁠ (opens in a new window) benchmark with all 36 MCP servers enabled in two modes: (1) exposing every MCP function directly in the model context, and (2) placing all MCP servers behind tool search. The tool-search configuration reduced total token usage by 47% while achieving the same accuracy.

    GPT‑5.4 also improves tool calling, making it more accurate and efficient when deciding when and how to use tools during reasoning, particularly in the API. Compared to GPT‑5.2, it achieves higher accuracy in fewer turns on Toolathlon, a benchmark that tests how well AI agents can use real-world tools and APIs to complete multi-step tasks. For example, an agent needs to read emails, extract assignment attachments, upload them, grade them and record results in a spreadsheet.

    For latency-sensitive use cases where reasoning effort None is preferred, GPT‑5.4 further improves upon its predecessors.

    Improved web search

    GPT‑5.4 is better at agentic web search. On BrowseComp, a measurement of how well AI agents can persistently browse the web to find hard-to-locate information, GPT‑5.4 improves by 17 percentage points over GPT‑5.2, and GPT‑5.4 Pro sets a new state of the art of 89.3%.

    In practice, this means GPT‑5.4 Thinking is stronger at answering questions that require pulling together information from many sources on the web. It can more persistently search across multiple rounds to identify the most relevant sources, particularly for “needle-in-a-haystack” questions, and synthesize them into a clear, well-reasoned answer.

    “GPT-5.4 xhigh is the new state of the art for multi-step tool use. Zapier runs some of the most rigorous tool use benchmarks in the industry, testing models across hundreds of advanced real-world workflows. GPT-5.4 finished the job where previous models gave up - the most persistent model to date.”
    — Wade, CEO at Zapier

    Steerability

    Similarly to how Codex outlines its approach when it starts working, GPT‑5.4 Thinking in ChatGPT will now outline its work with a preamble for longer, more complex queries. You can also add instructions or adjust its direction mid-response. This makes it easier to guide the model toward the exact outcome you want without starting over or requiring multiple additional turns. This feature is available now on chatgpt.com⁠ (opens in a new window) and the Android app, coming soon to the iOS app.

    The model can also think longer on difficult tasks while maintaining stronger awareness of earlier steps in the conversation. This allows it to handle longer workflows and more complex prompts while keeping answers coherent and relevant throughout.

    Safety

    Over recent months, we’ve continued improving the safeguards we introduced with GPT‑5.3‑Codex while preparing GPT‑5.4 for deployment. Similar to GPT‑5.3‑Codex, we are treating GPT‑5.4 as High cyber capability under our Preparedness Framework, and we are deploying it with the corresponding protections as documented in the system card⁠ (opens in a new window). These include an expanded cyber safety stack, including monitoring systems, trusted access controls, and asynchronous blocking for higher-risk requests for customers on Zero Data Retention (ZDR) surfaces, alongside ongoing investment in the broader security ecosystem.

    Because cybersecurity capabilities are inherently dual-use, we maintain a precautionary approach to deployment while continuing to calibrate our policies and classifiers. For certain customers on ZDR surfaces, request-level blocking remains part of our cyber risk mitigation stack; because classifiers are still improving, some false positives may occur as we continue refining these safeguards. These updates aim to improve how safeguards operate in practice, including reducing unnecessary refusals and overly caveated responses, while preserving strong protections against misuse.

    We’ve continued our safety research on Chain-of-Thought (CoT) monitorability to better understand how models reason and help detect potential misbehavior. As part of this work, we introduce a new open-source evaluation, CoT controllability⁠, measuring whether models can deliberately obfuscate their reasoning to evade monitoring. We find that GPT‑5.4 Thinking’s ability to control its CoT is low, which is a positive property for safety, suggesting that the model lacks the ability to hide its reasoning and that CoT monitoring remains an effective safety tool.

    Availability and pricing

    GPT‑5.4 is rolling out gradually today across ChatGPT and Codex. In the API, GPT‑5.4 is available now as gpt-5.4. GPT‑5.4 Pro is also available in the API as gpt-5.4-pro for developers who need maximum performance on the most complex tasks.

    In ChatGPT, GPT‑5.4 Thinking is available starting today to ChatGPT Plus, Team, and Pro users, replacing GPT‑5.2 Thinking. GPT‑5.2 Thinking will remain available for three months for paid users in the model picker under the Legacy Models section, after which it will be retired on June 5, 2026. Those on Enterprise and Edu plans can enable early access via admin settings. GPT‑5.4 Pro is available to Pro and Enterprise plans. Context windows⁠ (opens in a new window) in ChatGPT for GPT‑5.4 Thinking remain unchanged from GPT‑5.2 Thinking.

    GPT‑5.4 is our first mainline reasoning model that incorporates the frontier coding capabilities of GPT‑5.3‑Codex, and it is rolling out across ChatGPT, the API, and Codex. We're calling it GPT‑5.4 to reflect that jump, and to simplify the choice between models when using Codex. Over time, you can expect our Instant models and Thinking models to evolve at different speeds.

    GPT‑5.4 in Codex includes experimental support for the 1M context window. Developers can try this by configuring model_context_window and model_auto_compact_token_limit. Requests that exceed the standard 272K context window count against usage limits at 2x the normal rate.
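    Assuming these keys live in Codex's `config.toml` (their placement and value units are assumptions; only the key names come from the note above), opting in might look like:

```toml
# Hypothetical config.toml fragment; key names are from the note above,
# but exact placement and units are assumptions.
model = "gpt-5.4"
model_context_window = 1000000          # opt in to the experimental 1M window
model_auto_compact_token_limit = 900000 # compact history before hitting the cap
```

    Keep the 2x usage-limit multiplier in mind when sizing the window: only requests that exceed the standard 272K context are billed at the higher rate.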

    In the API, GPT‑5.4 is priced higher per token than GPT‑5.2 to reflect its improved capabilities, while its greater token efficiency helps reduce the total number of tokens required for many tasks. Batch and Flex pricing are available at half the standard API rate, while Priority processing is available at twice the standard API rate.

  • Mar 3, 2026
    • Date parsed from source:
      Mar 3, 2026
    • First seen by Releasebot:
      Mar 3, 2026
    OpenAI logo

    OpenAI

    GPT‑5.3 Instant: Smoother, more useful everyday conversations

    ChatGPT debuts GPT‑5.3 Instant, delivering more accurate web‑sourced answers, smoother tone, and fewer refusals for everyday chats. It balances web findings with known info for faster, relevance‑driven responses and stronger writing. Available today in ChatGPT and API; 5.2 remains for 3 months.

    GPT‑5.3 Instant Release Notes

    Today, we’re releasing an update to ChatGPT’s most-used model that makes everyday conversations more consistently helpful and fluid. GPT‑5.3 Instant delivers more accurate answers and richer, better-contextualized results when searching the web, and it reduces the unnecessary dead ends, caveats, and overly declarative phrasing that can interrupt the flow of conversation.

    This update focuses on the parts of the ChatGPT experience people feel every day: tone, relevance, and conversational flow. These are nuanced problems that don’t always show up in benchmarks, but shape whether ChatGPT feels helpful or frustrating. GPT‑5.3 Instant directly reflects user feedback in these areas.

    Better judgment around refusals and fewer disclaimers

    We heard feedback that GPT‑5.2 Instant would sometimes refuse questions it should be able to answer safely, or respond in ways that feel overly cautious or preachy, particularly around sensitive topics.

    GPT‑5.3 Instant significantly reduces unnecessary refusals, while toning down overly defensive or moralizing preambles before answering the question. When a useful answer is appropriate, the model should now provide one directly, staying focused on your question without unnecessary caveats. In practice, this means fewer dead ends and more directly helpful answers.

    More useful, well-synthesized answers when using the web

    GPT‑5.3 Instant also improves the quality of answers when information comes from the web. It more effectively balances what it finds online with its own knowledge and reasoning—for example, using its existing understanding to contextualize recent news rather than simply summarizing search results.

    More broadly, GPT‑5.3 Instant is less likely to overindex on web results, which previously could lead to long lists of links or loosely connected information. It does a stronger job of recognizing the subtext of questions and surfacing the most important information, especially upfront, resulting in answers that are more relevant and immediately usable, without sacrificing speed or tone.

    A smoother, more to-the-point conversational style

    GPT‑5.2 Instant’s tone could sometimes feel “cringe,” coming across as overbearing or making unwarranted assumptions about user intent or emotions.

    This update has a more focused yet natural conversational style, cutting back on unnecessary proclamations and phrases like “Stop. Take a breath.” We’re also working to keep ChatGPT’s personality more consistent across conversations and updates, so improvements feel like upgrades in capability while preserving a familiar and stable experience.

    As always, you can adjust the model’s response tone, like its warmth and enthusiasm, within settings.

    More reliably accurate responses

    GPT‑5.3 Instant delivers more factual responses than previous models, with reduced hallucinations across a wide range of topics. To measure accuracy, we used two internal evaluations: one focused on higher-stakes domains such as medicine, law, and finance, and another measuring hallucination rates on de-identified ChatGPT conversations that users flagged as factual errors—cases that tend to be especially hallucination-prone.

    On the higher-stakes evaluation, GPT‑5.3 Instant reduces hallucination rates by 26.8% when using the web and 19.7% when relying only on its internal knowledge, compared to prior models. On the user-feedback evaluation, hallucinations decrease by 22.5% with web use and 9.6% without web access.

    Stronger writing, with more range and texture

    GPT‑5.3 Instant is also a stronger writing partner. It’s better at helping you write resonant, imaginative, and immersive prose, whether you’re drafting fiction, refining a passage, or exploring new ideas. These changes help the model move more fluidly between practical tasks and expressive writing without losing clarity or coherence.

    Limitations

    While GPT‑5.3 Instant makes meaningful progress on everyday usability, there’s more work ahead:

    • Non-English languages: The response style of ChatGPT in some languages—such as Japanese and Korean—can sound stilted or overly literal. Improving tone and naturalness across languages remains an ongoing focus.
    • Tone: While GPT‑5.3 Instant’s response tone should feel smoother, we’re continuing to monitor feedback and improve while expanding customization options.

    Availability

    GPT‑5.3 Instant is available starting today to all users in ChatGPT, as well as to developers in the API as ‘gpt-5.3-chat-latest’. Updates to Thinking and Pro will follow soon. GPT‑5.2 Instant will remain available for three months for paid users in the model picker under the Legacy Models section, after which it will be retired on June 3, 2026.

    We did comprehensive safety training and evaluations for GPT‑5.3 Instant and detail that work in our system card.

  • Feb 13, 2026
    • Date parsed from source:
      Feb 13, 2026
    • First seen by Releasebot:
      Feb 14, 2026
    OpenAI logo

    OpenAI

    Introducing Lockdown Mode and Elevated Risk labels in ChatGPT | OpenAI

    OpenAI rolls out Lockdown Mode for high-security users and introduces Elevated Risk labels across ChatGPT, Atlas, and Codex to flag features with higher risk. These protections curb data exfiltration, boost admin oversight, and plan a consumer rollout in coming months.

    Two new protections designed to help users and organizations mitigate prompt injection attacks

    • Lockdown Mode in ChatGPT, an advanced, optional security setting for higher-risk users
    • “Elevated Risk” labels for certain capabilities in ChatGPT, ChatGPT Atlas, and Codex that may introduce additional risk

    These additions build on our existing protections across the model, product, and system levels. This includes sandboxing, protections against URL-based data exfiltration, monitoring and enforcement, and enterprise controls like role-based access and audit logs.

    Helping organizations protect employees most at risk of cyberattacks

    Lockdown Mode is an optional, advanced security setting designed for a small set of highly security-conscious users—such as executives or security teams at prominent organizations—who require increased protection against advanced threats. It is not necessary for most users. Lockdown Mode tightly constrains how ChatGPT can interact with external systems to reduce the risk of prompt injection–based data exfiltration.

    Lockdown Mode deterministically disables certain tools and capabilities in ChatGPT that an adversary could attempt to exploit to exfiltrate sensitive data from users’ conversations or connected apps via attacks such as prompt injections.

    For example, web browsing in Lockdown Mode is limited to cached content, so no live network requests leave OpenAI’s controlled network. This restriction is designed to prevent sensitive data from being exfiltrated to an attacker through browsing. Some features are disabled entirely when we can’t provide strong deterministic guarantees of data safety.
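    A hypothetical sketch of the idea (not OpenAI’s implementation): “deterministic” here means a fixed allowlist check runs before any tool call is dispatched, rather than relying on a model-mediated decision that an injected prompt might sway. The tool names below are invented for illustration.

```python
# Invented tool names for illustration only. The point is that the gate is a
# plain set-membership check, so an adversarial prompt cannot talk its way
# past it.
LOCKDOWN_ALLOWED = {"cached_web_browse", "code_interpreter"}

def dispatch_tool(name: str, lockdown: bool) -> str:
    """Refuse any tool outside the allowlist when Lockdown Mode is on."""
    if lockdown and name not in LOCKDOWN_ALLOWED:
        raise PermissionError(f"'{name}' is disabled in Lockdown Mode")
    return f"dispatched {name}"
```

    Because the check happens outside the model, its behavior is the same on every request regardless of conversation content.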

    Lockdown Mode is a new deterministic setting that helps guard data from being inadvertently shared with third parties by tightly constraining how ChatGPT can interact with certain external systems.

    ChatGPT business plans already provide enterprise-grade data security. Lockdown Mode builds on those protections and is available for ChatGPT Enterprise, ChatGPT Edu, ChatGPT for Healthcare, and ChatGPT for Teachers. Admins can enable it in Workspace Settings (opens in a new window) by creating a new role (opens in a new window). When enabled, Lockdown Mode layers additional restrictions on top of existing admin settings.

    Learn more about Lockdown Mode in our Help Center (opens in a new window).

    Because some critical workflows rely on apps, Workspace Admins retain more granular controls. They can choose exactly which apps—and which specific actions within those apps—are available to users in Lockdown Mode. Additionally, and separate from Lockdown Mode, the Compliance API Logs Platform (opens in a new window) provides detailed visibility into app usage, shared data, and connected sources, helping admins maintain oversight.

    We plan to make Lockdown Mode available to consumers in the coming months.

    Helping users make informed choices about risk

    AI products can be more helpful when connected to your apps and the web, and we’ve invested heavily in keeping connected data secure. At the same time, some network-related capabilities introduce new risks that aren’t yet fully addressed by the industry’s safety and security mitigations. Some users may be comfortable taking on these risks, and we believe it’s important for users to have the ability to decide whether and how to use them, especially while working with their private data.

    Our approach has been to provide in-product guidance for features that may introduce additional risk. To make this clearer and more consistent, we’re standardizing how we label a short list of existing capabilities. These features will now use a consistent “Elevated Risk” label across ChatGPT, ChatGPT Atlas, and Codex, so users receive the same guidance wherever they encounter them.

    For example, in Codex, our coding assistant, developers can grant Codex network access so it can take actions on the web like looking up documentation. The relevant settings screen includes the “Elevated Risk” label, along with a clear explanation of what changes, what risks may be introduced, and when that access is appropriate.

    A screenshot of the Codex settings screen where users can configure what network access Codex has.

    What’s next

    We continue to invest in strengthening our safety and security safeguards, especially for novel, emerging, or growing risks. As we strengthen the safeguards for these features, we will remove the “Elevated Risk” label once we determine that security advances have sufficiently mitigated those risks for general use. We will also continue to update which features carry this label over time to best communicate risk to users.

  • Feb 12, 2026
    • Date parsed from source:
      Feb 12, 2026
    • First seen by Releasebot:
      Feb 12, 2026

    OpenAI

    Introducing GPT‑5.3‑Codex‑Spark | OpenAI

    OpenAI unveils GPT‑5.3‑Codex‑Spark, a real‑time coding model optimized for ultra‑low latency on Cerebras hardware. Rolling out as a research preview for ChatGPT Pro with a 128k context, faster response pipelines, and real‑time editing capabilities for developers.

    Today, we’re releasing a research preview of GPT‑5.3-Codex-Spark, a smaller version of GPT‑5.3-Codex, and our first model designed for real-time coding. Codex-Spark marks the first milestone in our partnership with Cerebras, which we announced in January. Codex-Spark is optimized to feel near-instant when served on ultra-low latency hardware—delivering more than 1000 tokens per second while remaining highly capable for real-world coding tasks.

    We’re sharing Codex-Spark on Cerebras as a research preview to ChatGPT Pro users so that developers can start experimenting early while we work with Cerebras to ramp up datacenter capacity, harden the end-to-end user experience, and deploy our larger frontier models.

    Our latest frontier models have shown particular strengths in long-running tasks, working autonomously for hours, days, or weeks without intervention. Codex-Spark is our first model designed specifically for working with Codex in real time—making targeted edits, reshaping logic, or refining interfaces and seeing results immediately. With Codex-Spark, Codex now supports both long-running, ambitious tasks and getting work done in the moment. We hope to learn from how developers use it and incorporate feedback as we continue to expand access.

    At launch, Codex-Spark has a 128k context window and is text-only. During the research preview, Codex-Spark will have its own rate limits and usage will not count towards standard rate limits. However, when demand is high, you may see limited access or temporary queuing as we balance reliability across users.

    Speed and intelligence

    Codex-Spark is optimized for interactive work where latency matters as much as intelligence. You can collaborate with the model in real time, interrupting or redirecting it as it works, and rapidly iterate with near-instant responses. Because it’s tuned for speed, Codex-Spark keeps its default working style lightweight: it makes minimal, targeted edits and doesn’t automatically run tests unless you ask it to.

    Coding

    Codex-Spark is a highly capable small model optimized for fast inference. On SWE-Bench Pro and Terminal-Bench 2.0, two benchmarks evaluating agentic software engineering capability, GPT‑5.3-Codex-Spark demonstrates strong performance while accomplishing the tasks in a fraction of the time compared to GPT‑5.3-Codex.

    Latency improvements for all models

    As we trained Codex-Spark, it became apparent that model speed was just part of the equation for real-time collaboration—we also needed to reduce latency across the full request-response pipeline. We implemented end-to-end latency improvements in our harness that will benefit all models. Under the hood, we streamlined how responses stream from client to server and back, rewrote key pieces of our inference stack, and reworked how sessions are initialized so that the first visible token appears sooner and Codex stays responsive as you iterate. Through the introduction of a persistent WebSocket connection and targeted optimizations inside of Responses API, we reduced overhead per client/server roundtrip by 80%, per-token overhead by 30%, and time-to-first-token by 50%. The WebSocket path is enabled for Codex-Spark by default and will become the default for all models soon.
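    A back-of-envelope sketch of what those reductions mean for a single streamed response. The baseline millisecond figures below are illustrative assumptions, not measured values; only the percentage reductions (80% per-roundtrip, 30% per-token, 50% time-to-first-token) come from the text above.

```python
# Assumed baseline overheads in milliseconds (illustrative, not measured).
baseline_ms = {"roundtrip": 200.0, "per_token": 10.0, "ttft": 800.0}

# Reported reductions: 80% per client/server roundtrip, 30% per token,
# 50% time-to-first-token.
reduction = {"roundtrip": 0.80, "per_token": 0.30, "ttft": 0.50}

after_ms = {k: v * (1 - reduction[k]) for k, v in baseline_ms.items()}

def stream_time_ms(t: dict, n_tokens: int = 500) -> float:
    # One roundtrip, then the first visible token, then per-token streaming.
    return t["roundtrip"] + t["ttft"] + n_tokens * t["per_token"]
```

    Under these assumed baselines, the overhead on a 500-token response drops from roughly 6,000 ms to about 3,940 ms, with most of the remaining time now spent on token generation rather than transport.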

    Powered by Cerebras

    Codex-Spark runs on Cerebras’ Wafer Scale Engine 3 (opens in a new window)—a purpose-built AI accelerator for high-speed inference, giving Codex a latency-first serving tier. We partnered with Cerebras to add this low-latency path to the same production serving stack as the rest of our fleet, so it works seamlessly across Codex and sets us up to support future models.

    “What excites us most about GPT-5.3-Codex-Spark is partnering with OpenAI and the developer community to discover what fast inference makes possible—new interaction patterns, new use cases, and a fundamentally different model experience. This preview is just the beginning.”

    — Sean Lie, CTO and Co-Founder of Cerebras

    GPUs remain foundational across our training and inference pipelines and deliver the most cost-effective tokens for broad usage. Cerebras complements that foundation by excelling at workflows that demand extremely low latency, tightening the end-to-end loop so Codex feels more responsive as you iterate. GPUs and Cerebras can be combined for single workloads to reach the best performance.

    Availability & details

    Codex-Spark is rolling out today as a research preview for ChatGPT Pro users in the latest versions of the Codex app, CLI, and VS Code extension. Because it runs on specialized low-latency hardware, usage is governed by a separate rate limit that may adjust based on demand during the research preview. In addition, we are making Codex-Spark available in the API for a small set of design partners to understand how developers want to integrate Codex-Spark into their products. We’ll expand access over the coming weeks as we continue tuning our integration under real workloads.

    Codex-Spark is currently text-only at a 128k context window and is the first in a family of ultra-fast models. As we learn more with the developer community about where fast models shine for coding, we’ll introduce even more capabilities, including larger models, longer context lengths, and multimodal input.

    Codex-Spark includes the same safety training as our mainline models, including cyber-relevant training. We evaluated Codex-Spark as part of our standard deployment process, which includes baseline evaluations for cyber and other capabilities, and determined that it does not have a plausible chance of reaching our Preparedness Framework threshold for high capability in cybersecurity or biology.

    What’s next

    Codex-Spark is the first step toward a Codex with two complementary modes: longer-horizon reasoning and execution, and real-time collaboration for rapid iteration. Over time, the modes will blend—Codex can keep you in a tight interactive loop while delegating longer-running work to sub-agents in the background, or fanning out tasks to many models in parallel when you want breadth and speed, so you don’t have to choose a single mode up front.

    As models become more capable, interaction speed becomes a clear bottleneck. Ultra-fast inference tightens that loop, making Codex feel more natural to use and expanding what’s possible for anyone turning an idea into working software.

