Feature Flagging and Experimentation Release Notes

Release notes for feature flagging, A/B testing and experimentation platforms

Latest Feature Flagging and Experimentation Updates

  • Jun 2, 2026
    • Date parsed from source:
      Jun 2, 2026
    • First seen by Releasebot:
      Jun 2, 2026
    Statsig

    🎰 Create Autotunes via the Statsig MCP

    Statsig adds direct Autotune experiment creation through Statsig MCP, letting teams define arms, success events, windows, and winner thresholds without the console. Autotunes are created as drafts by default, with confirmation built in before anything is written.

    Now, you can create an Autotune (multi-armed bandit) experiment directly through the Statsig MCP, no console required.

    What you can do now

    • Create an Autotune by describing the arms, success event, exploration and attribution windows, and winner threshold.
    • Autotunes are created as drafts by default, so no traffic is reallocated until you start the Autotune from the console.
    • The agent prompts for confirmation before anything is created.

    One new tool is available on api.statsig.com/v1/mcp:

    • Create_autotune

    Why this matters

    Teams running Autotune for live AI agent experiments can now manage the full setup through agents. Instead of manually configuring a multi-armed bandit in the console, you can describe what you want and let the agent build it, keeping your agentic workflows end-to-end.

    Try it out

    If you have the Statsig MCP set up, try a prompt like:

    • "Using the Statsig MCP, create an Autotune called checkout-button-color with a control variant {color: blue} and a treatment variant {color: green}, optimizing for the event 'checkout', with 24hr exploration and attribution windows and a 95% winner threshold."

    Learn more in the docs: Statsig MCP Overview

  • Jun 2, 2026
    • Date parsed from source:
      Jun 2, 2026
    • First seen by Releasebot:
      Jun 2, 2026
    Statsig

    🗑️ Single-Override DELETE Endpoints in the Console API

    Statsig adds direct Console API support for deleting individual experiment and layer overrides, including conditional and userID overrides. The new idempotent DELETE endpoints simplify cleanup, support optional environment targeting, and work cleanly with OpenAPI-generated and SDK-based clients.

    You can now delete individual experiment and layer overrides directly via the Console API, no GET-mutate-POST workaround needed.

    What you can do now

    • Delete a single conditional or userID override from an experiment or layer via path-param DELETEs.
    • Target a specific environment with an optional environment query param, or omit it for the all-environments bucket.
    • Call these endpoints cleanly from OpenAPI-generated or SDK-based clients, since they use path and query params only with no DELETE body.

    Four new endpoints are available on statsigapi.net/console/v1/:

    • DELETE /experiments/:id/overrides/conditional/:type/:name
    • DELETE /experiments/:id/overrides/userID/:userID
    • DELETE /layers/:id/overrides/conditional/:type/:name
    • DELETE /layers/:id/overrides/userID/:userID

    :type is gate or segment.

    :name is the gate or segment name. All four endpoints are idempotent, returning 200 even when no matching override exists.
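    As a rough illustration of how the four routes fit together, the sketch below only builds the request URLs. The helper name `override_delete_url`, its parameters, and the `https://` scheme are assumptions made for this example; the host, path shapes, and the optional `environment` query param come from the note above. Authentication is not shown, so check the Console API docs for the required headers.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Host and prefix as given in the release note; the https scheme is assumed.
BASE_URL = "https://statsigapi.net/console/v1"


def override_delete_url(
    entity: str,                        # "experiments" or "layers"
    entity_id: str,
    *,
    user_id: Optional[str] = None,      # set this for a userID override
    cond_type: Optional[str] = None,    # "gate" or "segment" for a conditional override
    cond_name: Optional[str] = None,    # the gate or segment name
    environment: Optional[str] = None,  # omit for the all-environments bucket
) -> str:
    """Build the URL for one single-override DELETE call (hypothetical helper)."""
    if entity not in ("experiments", "layers"):
        raise ValueError("entity must be 'experiments' or 'layers'")

    prefix = f"/{entity}/{quote(entity_id, safe='')}/overrides"
    if user_id is not None:
        if cond_type is not None or cond_name is not None:
            raise ValueError("pass either user_id or cond_type/cond_name, not both")
        path = f"{prefix}/userID/{quote(user_id, safe='')}"
    else:
        if cond_type not in ("gate", "segment") or not cond_name:
            raise ValueError("conditional overrides need cond_type and cond_name")
        path = f"{prefix}/conditional/{cond_type}/{quote(cond_name, safe='')}"

    # Path and query params only; these endpoints take no DELETE body.
    query = f"?{urlencode({'environment': environment})}" if environment else ""
    return BASE_URL + path + query
```

    The resulting URL can be sent with any HTTP client using the DELETE method. Because the endpoints are described as idempotent, a test teardown step can call them without first checking whether the override exists.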

    Why this matters

    SDK-based E2E testing frameworks often need to clean up individual overrides between test runs. The previous approach required fetching the full overrides object, mutating it locally, and re-posting it, which is fragile and hard to parallelize. These endpoints make override teardown a single, safe, idempotent call.

    Try it out

    Review the full API reference in the Statsig Console API docs.

  • Jun 1, 2026
    • Date parsed from source:
      Jun 1, 2026
    • First seen by Releasebot:
      Jun 2, 2026
    Statsig

    🕰️ Version History Tools in the Statsig MCP

    Statsig adds MCP tools for full version history on feature gates, experiments, and dynamic configs, exposing who changed what and when through api.statsig.com/v1/mcp for debugging, incident reviews, and agentic workflows.

    Statsig MCP now lets you pull the full edit history of any Feature Gate, Experiment, or Dynamic Config without opening the console.

    What you can do now

    • Retrieve the complete version timeline for a feature gate, experiment, or dynamic config.
    • See who made each change, when, and exactly what was modified: rules, ID type, enabled state, allocation, variants, and values.
    • Access config history programmatically to power agentic workflows that reason about how configs have changed over time.

    Three new tools are available on api.statsig.com/v1/mcp:

    • Get_Gate_Version_History
    • Get_Experiment_Version_History
    • Get_Dynamic_Config_Version_History

    Why this matters

    Config history is critical for debugging, incident reviews, and agentic reasoning, but it's been locked behind manual console navigation. Now, you can understand when behavior changed, reconstruct a timeline for an incident post-mortem, or feed agents that need to detect or reason about config drift over time.

    Try it out

    If you have the Statsig MCP, try a prompt like:

    • "Using the Statsig MCP, pull the version history for gate feature-gate-name and summarize what changed across versions."

    Learn more in the docs: Statsig MCP Overview.

  • Jun 1, 2026
    • Date parsed from source:
      Jun 1, 2026
    • First seen by Releasebot:
      Jun 1, 2026
    Convert.com

    Monthly Release Notes - May 2026

    Convert.com releases a PHP SDK for fullstack experimentation, expands Heatmaps and Signals controls, adds Revenue per Visitor to the experience overview, and introduces a new scheduled end date action for experiences. It also delivers major MCP updates and ongoing tracking script performance improvements.

    PHP SDK Now Available for Fullstack Experimentation

    We released the PHP SDK, making it easier for teams using PHP as their primary stack to adopt Convert for fullstack testing.

    This gives PHP-based teams a more comfortable and native path to server-side experimentation and personalization with Convert.

    Major MCP Update Expands Platform Capabilities

    This month’s MCP update significantly expands platform capabilities.

    The release includes regenerated specs and tools supporting 16 top-level MCP tools, 113 callable actions, new backend operations, and 19 MCP-native workflow prompts. It also improves search and retrieval with OpenAI-compatible search and fetch, a refreshed knowledge base, and a rebuilt semantic hybrid vector search. In addition, the update strengthens reliability through schema and data correctness fixes, stronger guardrails, resilient uploads and retries, expanded test coverage, drift detection, and improved MCP protocol compliance.

    Overall, this release makes the platform more powerful, more reliable, and better equipped to support advanced workflows.

    Heatmaps Released for Easier Visualization

    We released Heatmaps, giving users an easier way to visualize areas of concentration and interaction.

    This makes it simpler to understand where visitors focus their attention and helps teams uncover behavior patterns more quickly during analysis.

    Signals Now Includes Sampling Rate Control

    We added sampling rate exposure for Signals, giving users more control over the percentage of recordings they receive.

    This improvement makes it easier to fine-tune session capture based on analysis needs, volume preferences, and project scope.

    Revenue per Visitor Added to the Experience Overview Table

    While Revenue per Visitor (RPV) is not a new metric, it is now available directly in the experience overview table.

    Users can now toggle it in the same way as Conversions per Visitor, making it easier to review primary-goal revenue performance at a glance from the overview screen.

    New Scheduled End Date Action for Experiences

    We added a new option to the end date settings that lets users choose what should happen when an experience reaches its scheduled end date.

    Previously, reaching the end date would complete the experience by default. Now, users can choose whether the experience should be completed or paused, offering more flexibility in how scheduled experiments are managed.

    Ongoing Tracking Script Performance Improvements

    We also continued improving the efficiency of our tracking script so it loads faster and has less impact on website performance.

    While this may not always be directly visible, it reflects our continued investment in optimization and supports our broader Monthly Performance Improvements initiative in the in-app dashboard.

  • Jun 1, 2026
    • Date parsed from source:
      Jun 1, 2026
    • First seen by Releasebot:
      Jun 1, 2026
    Flagsmith

    v2.238.0

    Flagsmith releases experimentation and UX improvements, adds unique event names from ClickHouse and backend type filtering for multivariate features, and tightens MCP support. It also fixes feature versioning, GitLab link status, and webhook SSRF issues while updating dependencies and docs.

    2.238.0 (2026-06-01)

    Features

    • experimentation: add unique event names from ClickHouse (#7660) (1ee9b7b)
    • experiments ux improvements (#7644) (755df22)
    • experiment: use backend type filter for multivariate features (#7630) (164d4bc)
    • MCP: Use native OpenAPI tool fields and consolidate private deps (#7656) (2c8cf0f)

    Bug Fixes

    • feature_segment missing in versioning (#7618) (f5584c9)
    • GitLab: issue/MR status in the feature Links panel goes stale after state changes (#7545) (77f742e)
    • webhooks: Prevent SSRF in webhooks and webhook tests (#7550) (85b92fa)

    Dependency Updates

    • node: upgrade ws transitive dependency to fix CVE-2026-45736 (#7634) (4de821d)

    CI

    • renovate: Fix renovate config json & add linter to pre-commit hooks (#7657) (69c6dbf)
    • Replace Dependabot with Renovate (#7645) (5b4dff0)

    Docs

    • add vulnerability response policy to support page (#7423) (d39aae7)
    • Consolidate PR collaboration guide to Flagsmith/AGENTS.md (#7646) (c1f40d8)
    • CVE vulnerability guidance (#7655) (89135fe)
  • May 29, 2026
    • Date parsed from source:
      May 29, 2026
    • First seen by Releasebot:
      May 29, 2026
    Flagsmith

    v2.237.0

    Flagsmith adds new experimentation capabilities, including experiment CRUD endpoints, a creation wizard, a list page with filtering and pagination, search and status counts, plus pagination and deletion guard improvements. It also fixes project-scoped custom fields for admins and updates dependencies and docs.

    2.237.0 (2026-05-29)

    Features

    • added pagination and deletion guard (#7615) (588fa17)
    • added search query to get experiment endpoints (#7617) (91efcff)
    • experiment: add experiments list page with filtering, pagination, and actions (#7628) (7ea1158)
    • experimentation: add Experiment base model and CRUD endpoints (#7591) (d118da7)
    • experimentation: add experiment creation wizard frontend (#7596) (cdda679)
    • experimentation: return feature object along with experiment entity (#7609) (e5f2f7c)
    • return experiment status counts along paginated list (#7625) (aa85699)
    • track custom event feature creation (#7603) (cac20ff)

    Bug Fixes

    • allow project admins to create and manage project-scoped custom fields (#7518) (53b93ba)

    Dependency Updates

    • bump flagsmith-sql-flag-engine to 0.1.1 (#7616) (482f0ff)
    • Use uv supported by dependabot (#7633) (d844e1a)

    Docs

    • Sizing: rewrite as workload-driven guide (#7592) (b34ef1f)
  • May 27, 2026
    • Date parsed from source:
      May 27, 2026
    • First seen by Releasebot:
      May 27, 2026
    Flagsmith

    v2.236.0

    Flagsmith adds a type filter on the get feature endpoint and ships several bug fixes for experimentation and segment membership.

    2.236.0 (2026-05-27)

    Features

    • added type filter on get feature endpoint (#7598) (18230bf)

    Bug Fixes

    • code references count returns 0 after unchanged rescan (#7599) (3fa3d9a)
    • experimentation: always insert new WarehouseConnection on create (#7605) (38e4a09)
    • Segment membership: Read counts off segment.membership_counts (#7601) (b000f47)
    • Segment membership: Zero out (segment, env) pairs that stopped matching (#7600) (2a13539)
  • May 27, 2026
    • Date parsed from source:
      May 27, 2026
    • First seen by Releasebot:
      May 27, 2026
    GrowthBook

    v4.4.0

    GrowthBook releases its biggest update yet, adding Cmd+K universal search, Product Analytics explorers, AI Data Analyst beta, multi-environment feature rules, revamped approval flows, feature flag ramp schedules, and a major REST API expansion.

    Highlights

    This is our biggest release yet! Over 400 PRs closed.

    • Cmd+K command palette for universal search
    • Product Analytics explorers and AI Data Analyst (beta)
    • Multi-environment feature rules
    • Overhaul of feature approval flows and revisions
    • Feature flag ramp schedules
    • Bandits without sticky bucketing
    • Pre-Exposure bias checks
    • Huge REST API refactor with tons of new endpoints for metric groups, teams, experiments, features, namespaces, and more

    Other Changes

    • Holdout scheduling
    • Feature revision comparison tool
    • Attribute/Identifier type mapping support
    • Lookback override option for experiment analysis
    • Improved warehouse metadata and tagging for SQL query cost attribution
    • Improved stale feature algorithm and UX
    • Support for BigQuery reservations
    • API keys with arbitrary roles (not just admin or readonly)
    • New Saved Group approval flow
    • Option to include additional metadata in SDK payloads
    • Support for incremental refresh of quantile metrics using KLL sketches
    • Ability to disable API keys and track last usage date
    • Huge SQL generation refactor (internal, no user-facing changes)
    • Better filtering on Insight dashboards
    • Namespaces overhaul
    • Updated Presentations tool
    • Plus tons of UX improvements, bug fixes, dependency updates, security patches, and performance improvements.

    Thanks to all of the existing contributors: @Auz, @Kevin-Chant, @ahdriel, @bryce-fitzsimons, @fsarachu, @gazzdingo, @itsgrimetime, @jdorn, @jrnold, @lukebrawleysmith, @lukesonnet, @madhuchavva, @mknowlton89, @msamper, @natasha-growthbook, @nhat-growthbook, @nodirnasirov, @oelshaikh, @royalfig, @tzjames and a big thanks to all of the first time contributors:

    • @estrattonbailey made their first contribution in #5283
    • @teresayung made their first contribution in #5251
    • @nadapzy made their first contribution in #5396
    • @dannylin-ant made their first contribution in #5504
    • @HampusPoppius made their first contribution in #5572
    • @saurabhkashyap-ui made their first contribution in #5564
    • @olu-an made their first contribution in #5604
    • @csbailey5t made their first contribution in #5614
    • @discorev made their first contribution in #5600
    • @aiSynergy37 made their first contribution in #5639
    • @lillialjackson made their first contribution in #5721
    • @nielskaspers made their first contribution in #5629
    • @johnham-ant made their first contribution in #5793
    • @adittya-upstart made their first contribution in #5778
    • @jakemainwaring22 made their first contribution in #5698
    • @anna-yn made their first contribution in #5920
  • May 27, 2026
    • Date parsed from source:
      May 27, 2026
    • First seen by Releasebot:
      May 27, 2026
    GrowthBook

    GrowthBook 4.4: Safe and standardized feature flag management at scale

    GrowthBook releases 4.4 with automated release plans and ramp schedules, configurable approval workflows, stronger stale flag detection, and expanded REST API and MCP server support. It also adds SDK cache improvements, SDK payload metadata, and a namespace overhaul for safer AI-era feature flagging.

    The safety net for rapid AI development

    AI has empowered developers to build new features faster than ever. What used to take a week can now be shipped in a day. The agentic era has transformed the way engineers work, but shipping fast without the proper guardrails can quickly lead to incidents. GrowthBook Feature Flags set your whole team up for the pace and volume of today’s development lifecycle.

    Feature flags allow modern teams to ship fast while maintaining control. The best practice is straightforward: wrap every feature behind a flag so you can ship at scale and roll back the moment something breaks. But as teams scale and AI agents become more deeply embedded in workflows, ad-hoc flag management breaks down.

    Without the right metrics monitored on every release, something that breaks in production could go undetected for weeks or even months before you realize the problem. By that point, dozens of other features may have shipped, making it difficult to identify which change is actually the culprit. Smaller failures compound the problem. Someone forgets to increment a rollout. A flag that should have been deleted six months ago is still sitting in production. Each rollout follows a different process depending on who is running it.

    Consistency and guardrails are what enable teams to safely keep pace with the compressed development cycles that come with AI coding. Feature flags should be baked into the agent's coding workflow for every new feature, creating a safety net where guardrail metrics monitor performance and auto roll back if something degrades.

    With that net in place, teams can move quickly, confident that bad releases will get caught early and rolled back. Humans can then focus on where judgment is actually needed, such as deciding which guardrail metrics matter, reviewing whether the rollout plan aligns to the risk, and approving changes that warrant human review.

    Modern teams need enterprise-class flag management with the controls and governance to ship safely at AI speed. That’s where GrowthBook comes in.

    GrowthBook 4.4 extends our feature flag platform with 3 major new capabilities: release plans with automated ramp schedules, configurable approval workflows, and enhanced stale feature flag detection through expanded REST API and MCP server endpoints. 4.4 also includes SDK cache improvements, metadata in SDK payloads, a namespace overhaul, and more. Together, these controls turn feature flag management into a repeatable, scalable practice that lets you move quickly while de-risking every release.

    Release plans with ramp schedules: Standardize and automate your rollout

    In 4.4, we’re introducing release plans with ramp schedules: automated, staged rollout plans attached to a feature flag. Release plans make it fast and easy to turn a standardized schedule and rollout process into a reusable template that everyone on your team can follow, with guardrails built in so safety is a standard part of how features are released.

    You define the stages, set the percentages, time intervals, and guardrails, and GrowthBook executes the plan automatically. Choose from preset templates or build your own to target specific user groups and attributes. You can also gate individual stages of your release plan by prerequisite features or by the specific feature value a user is currently assigned.

    With manual feature rollouts, you can run into two types of problems. Either someone forgets to increment the percentage, and a feature sits at a given stage indefinitely. Or someone moves too fast, and a problem that should have been caught at 10% instead hits 50% of users. Release plans keep rollouts from stalling at an early stage or accelerating past the point where a problem could have been caught. Build in approval requirements at specific stages based on your risk tolerance, requiring the rollout to pause for review and manual approval before advancing. You can also attach guardrail metrics so the feature auto-rolls back if any of them degrade, catching problems automatically between approval gates.

    Best practices for designing a release plan

    Guardrails help your team feel confident about shipping, and combining them with human approval gates ensures there are no gaps.

    1. Start simple and don't over-engineer your first release plan.
    2. Build out your process as you go, learning what works and what doesn't with each release.
    3. Choose guardrail metrics your team is aligned on.
    4. Pair guardrail metrics with approval gates where it makes sense.

    Once you have a process that works, standardize it as a default template with versions for different risk profiles or product needs (high risk, low risk, internal-only, etc.). Treat these templates as living artifacts and evolve them as your team learns what works and what areas need improvement.

    Sample release plans

    Rollout processes vary by team, product, and risk tolerance. Release plans are flexible enough to fit whatever process your team uses and ensure every rollout follows that process consistently, no matter who is running it. Below are two of the most common patterns we see, and how to structure a release plan for each.

    Example 1: Simple percentage rollout

    A simple percentage rollout is where you expose a small percentage of users before gradually going wider. Set approval gates at key checkpoints to enforce a metrics review and manual approval before committing to broader rollout.
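    To make the pattern concrete, a simple percentage rollout could be modeled roughly as follows. This is only a sketch: the `Stage` fields, the `next_action` function, and the action names are hypothetical and are not GrowthBook's data model or API. It only shows how stage percentages, hold times, approval gates, and guardrail checks might interact.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    percent: int                  # share of users exposed at this stage
    hold_hours: int               # minimum time to stay at this stage
    needs_approval: bool = False  # pause for manual review before advancing


def next_action(
    stages: List[Stage],
    index: int,            # position of the current stage in the plan
    hours_elapsed: float,  # time spent at the current stage
    guardrails_ok: bool,   # False if any guardrail metric has degraded
    approved: bool,        # True once a reviewer has approved this stage
) -> str:
    """Return 'rollback', 'wait', 'await_approval', 'advance', or 'done'."""
    if not guardrails_ok:
        return "rollback"          # degraded guardrails win over everything else
    stage = stages[index]
    if hours_elapsed < stage.hold_hours:
        return "wait"              # stage has not been held long enough yet
    if stage.needs_approval and not approved:
        return "await_approval"    # approval gate: pause until a human signs off
    if index == len(stages) - 1:
        return "done"              # final stage reached, rollout complete
    return "advance"


# Illustrative plan: 10% for a day, 50% for two days behind an approval gate, then 100%.
PLAN = [Stage(10, 24), Stage(50, 48, needs_approval=True), Stage(100, 0)]
```

    In this sketch the guardrail check comes first, so a degraded metric triggers a rollback even while a stage is waiting on approval.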

    Example 2: Segmented rollout

    A segmented rollout lets you control who gets access first, reducing risk to your highest-value users. For instance, you may opt to roll out a feature to free users before paid. Free users are valuable, but the risk of churn or revenue impact is lower than with paying customers. By the time you're ramping up paid users, you've already caught the obvious issues.

    You may segment by any attribute available in GrowthBook, including location, device type, browser, user group, and more. Start simple or build in as much complexity as you need. You can also apply guardrails to specific stages; for example, you might be okay rolling the feature out to free users, but want it to auto-rollback if metrics degrade when it hits paid users.

    The release plan with ramp schedules feature is available in the GrowthBook Pro and Enterprise plans.

    Flag revisions, approvals, audit logs: maintain control and visibility

    Feature flag revisions

    When someone changes a feature flag, a draft revision can be submitted for review to approve or request changes. On approval, the change is published to the SDK. Nothing goes out unreviewed unless you want it to.

    Flag revisions provide a complete audit trail, capturing who changed a flag, what they changed, and when they changed it. When something breaks, you get instant visibility into what recently changed. As multiple people manage flags over time, revisions also preserve the intent and context behind each change.

    GrowthBook keeps a version history of all previous revisions of a feature flag, so you always have a clear picture of how the flag has evolved over time.

    The revision feature is available for feature flags in all GrowthBook plans.

    Feature flag approval workflows

    Approval requirements are configurable per project and per environment, giving you the flexibility to require review gates where they make sense. For example, you can require approvals for production while letting staging updates flow freely.

    GrowthBook 4.4 expands the scope of approval requirements. Previously, approval workflows applied only to changes to rules and values. Now, you can also require approvals for environment kill switches, prerequisites, saved groups, and metadata changes. This gives more granular control over what can happen without a review.

    These approval workflows help teams move fast with the right safety checks at the right time, making governance tunable to your team’s needs. This level of control is especially important when agents are acting on your behalf. Agents can create drafts for all these change types and require human approval. The separation is clear and enforced, so your governance is applied to the full scope of what an agent can touch.

    The feature flag approval workflow feature is available in the GrowthBook Enterprise plan.

    Audit logs

    For teams with compliance requirements, the audit log provides the paper trail without any extra process. It provides a timestamped record of every feature flag event, including changes, approvals, publishes, creation, and more.

    You can also expand each event for a more detailed view of the specific changes made.

    The audit log feature is available in all GrowthBook plans.

    Stale feature flag cleanup with agents

    GrowthBook categorizes a feature flag as stale if it hasn't been updated in 2 weeks and is not active in any environment, or if there is a one-sided rule that sends 100% of traffic to a single variation.
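    Read literally, that definition is a two-part predicate. The sketch below is one interpretation of the sentence above, not GrowthBook's implementation: the `Flag` shape, its field names, and the way rules are represented (a mapping from variation to traffic share) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List

STALE_AFTER = timedelta(weeks=2)  # "hasn't been updated in 2 weeks"


@dataclass
class Flag:
    last_updated: datetime
    enabled_by_env: Dict[str, bool]  # environment name -> flag active there?
    # Each rule maps variation name -> share of traffic (0.0 to 1.0). Hypothetical shape.
    rules: List[Dict[str, float]] = field(default_factory=list)


def is_stale(flag: Flag, now: datetime) -> bool:
    """(idle for 2 weeks AND inactive everywhere) OR any one-sided rule."""
    idle = now - flag.last_updated >= STALE_AFTER
    inactive_everywhere = not any(flag.enabled_by_env.values())
    # A rule is one-sided when a single variation receives 100% of its traffic.
    one_sided = any(max(rule.values(), default=0.0) >= 1.0 for rule in flag.rules)
    return (idle and inactive_everywhere) or one_sided
```

    Under this reading, a flag that is still active but fully rolled out to one variation counts as stale, while an old flag that remains active with a split rule does not.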

    GrowthBook 4.4 also introduces support for teams using AI coding tools like Claude Code or Cursor to detect and remove stale flags that have outlived their purpose. Using GrowthBook's MCP server or REST API, you can prompt an agent to surface every stale flag in your environment, returning it as a reviewable table with additional context on why the flag is being surfaced as stale. Once you've decided which flags to remove, write a follow-up prompt asking the agent to locate and remove those flag references from your codebase to eliminate technical debt in one pass.

    The same endpoints can also surface ambiguous flags that don't definitively meet the stale definition, but show signs they may no longer be in use. Examples include: flags with no rules defined, abandoned drafts, or disabled environments. Results come back as a table you can review to decide which flags genuinely need cleanup and which are still doing real work.

    Stale flags create real risk: technical debt, performance overhead, and accidental production changes. Stale and ambiguous feature flag detection together help simplify flag hygiene by surfacing the flags worth reviewing, with enough context to tell what's truly orphaned and what's still needed in production.

    The stale feature flag cleanup feature is available in all GrowthBook plans.

    REST API and MCP server endpoints

    In 4.4, we expanded our REST API and MCP server endpoints across the feature flagging surface. Anything your team can do in the GrowthBook app, AI agents can now do through the API or MCP server: create flags, manage revisions, configure release plans, set targeting rules, locate stale flags, and more.

    The same permissions, approval workflows, and audit logs apply to every API- or MCP-prompted action. Whether a human triggers a change from the UI or an agent runs it from your editor, it goes through the same review gate and shows up in the same audit log. Agents get the same platform and the same accountability as your team, with the scoping and guardrails calibrated for how they work.

    The REST API & MCP Server endpoints are available in all GrowthBook plans.

    Built for the way modern teams ship

    The pace of development has fundamentally changed with AI, and the controls around how teams ship need to evolve with it. Teams need to move fast, test everything, ship safely, and roll back at the first sign of trouble.

    GrowthBook 4.4 gives engineering and product teams the building blocks to do exactly that: a repeatable rollout process accessible through the app, REST API, or the MCP server. Whether the change comes from an engineer, a product manager, or an AI agent, the process holds.

  • May 27, 2026
    • Date parsed from source:
      May 27, 2026
    • First seen by Releasebot:
      May 27, 2026
    GrowthBook

    GrowthBook 4.4: Product development at AI speed

    GrowthBook releases 4.4 with a rebuilt API for programmatic workflows, a beta AI Data Analyst for self-serve product analytics, and safer feature flag rollouts with ramp schedules, approvals, and stale flag cleanup. It also adds API-accessible explorations and bandits without sticky bucketing.

    Experimentation: An API-first lifecycle

    We got a little carried away. 417 pull requests and 376,798 lines later, GrowthBook 4.4 delivers a rebuilt API with broad coverage to support programmatic use of GrowthBook, a conversational AI Data Analyst for self-serve product analytics, and ramp schedules that make targeted, time-based rollouts repeatable and safe.

    Writing software with AI agents is the new standard, and this release fully positions teams to take advantage of new approaches to developing and shipping products with AI. We’ve rebuilt the foundations of our API and extended its coverage so that teams can work with GrowthBook however they choose, whether through the UI or agentic tools. We work with the leading AI companies, and we’re excited to keep building out features that make it easier to ship faster, more confidently.

    Version 4.4 is available immediately to both our cloud and self-hosted users.

    Experimentation has assumed a human is in the loop at every step:

    • Writing the hypothesis
    • Configuring the test
    • Reading the results
    • Deciding what to ship

    That assumption is breaking down.

    Teams are pushing experimentation into agent-assisted and automated workflows. With GrowthBook 4.4, you can work programmatically at any stage of the lifecycle you choose, from setup and operation to decision.

    In 4.4, the REST API surface was rebuilt on Zod-driven endpoints. Zod is a schema library: define the shape of an API once, and the TypeScript types, runtime validation, and OpenAPI docs all generate from the same definition. They can't drift apart. That makes the experiment lifecycle a real programmable workflow instead of a collection of endpoints with a spec that lags behind reality.

    Set up and launch experiments from a template

    Templates encode your organization's standards: statistical methods, metric selection, guardrails, and the defaults a practitioner would otherwise select manually. Agents create experiments against those templates over REST. You get consistent experiment design at agent speed, with your rigor baked in.

    In 4.4, an agent can pull the pre-launch checklist, see what passes and fails, and mark manual items complete. That covers automated checks for metrics defined, variations configured, and targeting set, plus any manual checks your team has added.

    Observe experiments at the scale of your program

    Instead of clicking through the UI to view every running experiment, your agents make a single endpoint round-trip. That call returns estimated completion times, metric direction, and any guardrail issues for every experiment at once. Snapshots, status, and results work the same way: one call across the program, not one per experiment.

    Analyze, decide, and ship

    Reports are a new first-class API resource in 4.4. An agent can build a persistent analysis from an experiment, customizing metric overrides, analysis settings, or dimension breakdowns. The report stays put, and doesn't get overwritten by the next snapshot. An agent can hand a report URL to a stakeholder, refresh it on demand, and link back to it from a write-up.

    From the report, the agent reads results against your team's decision criteria and concludes the experiment programmatically, setting the winner, results, and analysis summary in one call.
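
    A decision step like this can be sketched as a small function that applies a rule and emits the three values set in the concluding call. The rule used here (a chance-to-win threshold plus no guardrail failures) and the payload keys `winner`, `results`, and `analysis` are assumptions for illustration; your team's criteria and the actual request body may look different.

```python
def build_conclusion(variations: list[dict], min_chance_to_win: float = 0.95) -> dict:
    """Pick a winner only if a variation clears the threshold with no guardrail failures."""
    eligible = [
        v for v in variations
        if v["chance_to_win"] >= min_chance_to_win and not v["guardrail_failed"]
    ]
    if not eligible:
        return {
            "winner": None,
            "results": "inconclusive",
            "analysis": "No variation met the decision criteria.",
        }
    best = max(eligible, key=lambda v: v["chance_to_win"])
    return {
        "winner": best["name"],
        "results": "won",
        "analysis": f"{best['name']} met the chance-to-win threshold with no guardrail failures.",
    }
```

    Keeping the rule in one function means the same criteria apply to every experiment an agent concludes, which is the point of encoding decision criteria up front.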

    Bandits without sticky bucketing

    Multi-armed bandits fit when you want continuous learning and traffic reallocation without the time cost of a full A/B test. That profile shows up frequently in AI application work: model selection, prompt comparison, response ranking. Until now, adopting bandits in GrowthBook meant setting up sticky bucketing first. Now, bandits are easier to set up, and we expect that teams building AI-powered products will use them extensively.

    Product Analytics: Structured exploration with an AI-native interface

    Analytics has three different bottlenecks, and traditional tools built around SQL and a dashboarding UI don't address any of them well.

    • Non-data teams need to self-serve analysis on product metrics without queuing behind the data team.
    • Repetitive review workflows run the same explorations over and over.
    • Agents in the loop need programmatic access to metric data, fact tables, and experiment results.

    All three converge at the same place: the data team becomes a bottleneck for work that structured, repeatable tooling should handle. GrowthBook 4.4 introduces a new AI Data Analyst to improve self-serve analysis and makes it possible to explore data without SQL.

    AI Data Analyst (beta)

    The AI Data Analyst is a conversational AI assistant for Product Analytics. Ask a question in plain language, like “What’s our DAU trend by country” or “How is our model playground engagement rate compared to API key creation,” and get back charts and insights built with tooling specific to GrowthBook’s metrics, fact tables, and data sources. Product managers, marketers, and others can self-serve their questions without writing SQL or filing tickets with the data team.

    It's in beta, and we want to hear how you're using it. Ask it real questions, try out your actual workflows, and let us know what we can improve.

    Explorations

    Suppose you want to understand the change in daily active users of a new search feature, broken out by plan tier. GrowthBook’s Explorer lets you create charts (Explorations) from your warehouse data through a visual interface. Select your metric, dimensions, and date range to generate a chart that you can share or save. Or run the same exploration over the API and get the chart data back, plus a deep link to open it in GrowthBook.

    Product Analytics over MCP and API

    Every exploration is now addressable through the MCP server and REST API. Queries run the same way whether they come from a person clicking through the UI or an automated workflow calling the endpoint.

    Feature flagging: Safer flags, for whoever's shipping them

    AI-assisted software development has rapidly increased the amount of code written and features deployed. Teams need flag infrastructure and governance that scale with the automation, so that problems like premature rollouts and flag sprawl don’t scale right alongside feature development.

    GrowthBook 4.4 includes safety and governance features to prevent these failure types, and automated stale flag cleanup to mitigate flag sprawl in existing codebases.

    Ramp schedules

    Targeted releases now include automated, time-based rollouts with fine-grained controls and optional approval steps. Each step supports its own targeting in addition to a traffic percentage, so a canonical rollout might look like:

    1. Internal employees for 24 hours
    2. Free tier at 25%
    3. Free tier at 100%
    4. Paid tier at 25%
    5. Paid tier at 100%

    This is particularly useful for AI model and prompt deployments, where you want time between exposures before widening the blast radius. With monitored ramp schedules, you can attach guardrail metrics and automatically hold, advance, or roll back releases. With reusable templates, teams can codify the standard rollout shapes they want to apply to new features quickly.
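
    The five-step rollout and the hold, advance, or roll back behavior can be sketched as data plus one decision function. This is an illustration under stated assumptions, not GrowthBook's configuration format: only the first step's 24 hours comes from the list above, the 100% exposure for internal employees and the other hold times are placeholders, and the policy (roll back on any guardrail failure) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RampStep:
    audience: str    # targeting for this step
    percent: int     # share of that audience exposed
    hold_hours: int  # time to wait before advancing


# Steps 2-5 use placeholder hold times; the source only gives 24 hours for step 1.
SCHEDULE = [
    RampStep("internal employees", 100, 24),
    RampStep("free tier", 25, 24),
    RampStep("free tier", 100, 24),
    RampStep("paid tier", 25, 24),
    RampStep("paid tier", 100, 0),
]


def next_action(step_index: int, hours_in_step: float, guardrail_ok: bool) -> str:
    """Decide whether to hold, advance, or roll back at the current step."""
    if not guardrail_ok:
        return "rollback"  # assumed policy: any guardrail failure rolls back
    step = SCHEDULE[step_index]
    is_last = step_index == len(SCHEDULE) - 1
    if is_last or hours_in_step < step.hold_hours:
        return "hold"
    return "advance"
```

    For example, `next_action(0, 12, True)` holds at the internal-employee step, while the same call after 24 hours advances to the free tier at 25%.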

    Revisions and approval controls

    Approval flows are built directly into the flag change lifecycle, with REST coverage (in beta) for revisions, reviews, and approvals. When a change comes in, it goes through the same review gate regardless of origin. The same guardrails apply whether the change comes from a teammate or an automated process.

    Fully automated stale flag cleanup

    Engineering teams often find their codebases bloated with quietly accumulated stale flags, which drive significant technical debt. Cleanup is handled in occasional tech debt sprints or just perpetually deferred.

    This release adds an endpoint for stale flag detection, which lets an AI agent run an entire cleanup process. An agent or tool can pull stale flags by API, provide a reviewable list, then find and clean up flag references in your codebase as you decide what to remove. Automating the cleanup process removes a major source of compounding debt, especially for large enterprise teams.
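
    The "find references" half of that workflow can be sketched without any API at all. The snippet below assumes the stale flag keys have already been fetched (the list is hypothetical) and scans in-memory file contents for quoted occurrences of each key, producing a reviewable list of locations. The file names and code lines in `codebase` are illustrative.

```python
import re


def find_flag_references(
    files: dict[str, str], stale_keys: list[str]
) -> dict[str, list[tuple[str, int]]]:
    """Map each stale flag key to the (path, line number) pairs that mention it."""
    # Match the key only as a quoted string literal to avoid partial matches.
    patterns = {
        key: re.compile("[\"']" + re.escape(key) + "[\"']") for key in stale_keys
    }
    hits: dict[str, list[tuple[str, int]]] = {key: [] for key in stale_keys}
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for key, pattern in patterns.items():
                if pattern.search(line):
                    hits[key].append((path, lineno))
    return hits


# Illustrative file contents; in practice these would be read from the repository.
codebase = {
    "checkout.py": 'if gb.is_on("new-checkout"):\n    render_new()\n',
    "search.py": "value = gb.get_feature_value('search-v2', False)\n",
}
```

    A key with no hits, such as a flag that was already removed from code, is a safe candidate to archive; keys with hits give the reviewer exact lines to inspect before anything is deleted.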

    Read more on what's changed, including feature flag change comparisons, audit logs, and approval workflows in our feature flag deep dive.

    Ship, test, measure, and decide at a new pace

    GrowthBook was built for teams that ship product with discipline: engineering teams with release processes, data teams defining metrics and guardrails, and product teams trying to move fast without losing rigor. Our 4.4 release extends the same principles to the agents now working alongside all of them.

    The changes across experimentation, feature flags, and product analytics connect. A programmable experiment lifecycle, analytics you can query by agent, and flag infrastructure with real approval controls form one continuous workflow: ship, test, measure, decide. Whether a product team runs it manually or an automated pipeline runs it continuously, the underlying system works the same way.

    Faster iteration. Same rigor and safety.

    What's next

    • See the full 4.4 release notes →
    • Read the feature flagging deep dive →
    • Talk to our team →
    Original source