Baseten Release Notes

Last updated: Mar 20, 2026

  • Mar 19, 2026
    • Date parsed from source:
      Mar 19, 2026
    • First seen by Releasebot:
      Mar 20, 2026

    Introducing the Baseten Delivery Network (BDN)

    Baseten launches the Baseten Delivery Network, making cold starts 2-3x faster for large models with smarter weight delivery, multi-tier caching, and fewer upstream dependencies.

    We just launched the Baseten Delivery Network (BDN), designed to make cold starts 2-3x faster for large models.

    BDN solves three root causes of slow cold starts: slow weight pulls from upstream storage, replica stampedes under load, and upstream availability dependencies. On first deployment, BDN mirrors your weights to secure storage. From there, a multi-tier cache (node → cluster → mirrored origin) serves weights with consistent hashing and single-flight semantics: each file is fetched once per cluster, not once per pod. Fine-tunes sharing weights with a base model only pull the delta.
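
    As a rough sketch of the single-flight idea (illustrative only; the class below is hypothetical, not Baseten's implementation), concurrent requests for the same file can be coalesced so the upstream fetch runs exactly once:

```python
import threading

class SingleFlightCache:
    """Illustrative single-flight cache: each key is fetched at most once,
    even when many callers request it concurrently."""

    def __init__(self, fetch):
        self._fetch = fetch              # slow upstream fetch, called once per key
        self._lock = threading.Lock()
        self._results = {}               # key -> fetched value
        self._inflight = {}              # key -> Event set when the fetch finishes

    def get(self, key):
        with self._lock:
            if key in self._results:
                return self._results[key]
            if key in self._inflight:    # someone else is already fetching
                event, leader = self._inflight[key], False
            else:                        # we become the single fetcher
                event = self._inflight[key] = threading.Event()
                leader = True
        if leader:
            self._results[key] = self._fetch(key)
            event.set()                  # wake any waiting callers
        else:
            event.wait()                 # block until the leader finishes
        return self._results[key]
```

    The same coalescing idea, applied cluster-wide with consistent hashing, is what turns "once per pod" into "once per cluster".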

    Check out the launch blog to learn more, or see the docs to get started.

  • Mar 16, 2026

    Regional environments

    Baseten adds regional environments to keep inference traffic in-region for data residency and GDPR compliance.

    Route inference traffic exclusively within a designated geographic region to meet data residency and compliance requirements like GDPR.

    Regional environments use a dedicated endpoint format that guarantees traffic stays in-region:

    https://model-{model_id}-{env_name}.api.baseten.co/predict
    

    Contact [email protected] to set up regional environments. For more information, see Regional environments.
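
    As a quick illustration (the helper and values below are hypothetical, not from Baseten's SDK), the regional endpoint can be constructed from a model ID and environment name:

```python
# Hypothetical helper: builds the regional endpoint format shown above.
# model_id and env_name here are placeholder values, not real resources.
def regional_endpoint(model_id: str, env_name: str) -> str:
    return f"https://model-{model_id}-{env_name}.api.baseten.co/predict"

url = regional_endpoint("abcd1234", "production")
```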

  • Mar 13, 2026

    CI/CD for model deployments

    Baseten adds the Truss Push Action to automate deployments, validate pull requests, and stream deploy logs in GitHub Actions.

    Automate Truss deployments with the Truss Push Action. Deploy on merge, validate on pull request, or deploy multiple models in parallel.

    The action streams deployment logs directly into GitHub Actions, validates models, and writes a summary of deploy time metrics.

    To get started, see the CI/CD docs.
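
    A hedged sketch of what a deploy-on-merge workflow might look like (the action slug, input names, and secret name below are assumptions for illustration, not confirmed identifiers; consult the CI/CD docs for the real ones):

```yaml
# Hypothetical workflow: the action slug and inputs are illustrative assumptions.
name: Deploy model
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: basetenlabs/truss-push-action@v1   # hypothetical slug
        with:
          api-key: ${{ secrets.BASETEN_API_KEY }}  # hypothetical input/secret names
          truss-directory: ./my-model              # hypothetical input
```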

  • Mar 7, 2026

    Truss 0.15.2

    Baseten adds a --no-cache flag to truss push for full rebuilds without cached Docker layers.

    Added --no-cache flag to truss push to force a full rebuild without using cached Docker layers. This is useful when debugging build issues or ensuring a clean image. The flag is CLI-only and cannot be set in config.yaml.

    For more information, see Truss or the documentation.

  • Mar 6, 2026

    Environment-scoped API keys

    Baseten adds API key restrictions for specific environments and models to tighten team access control.

    You can now restrict API keys to specific environments and models, giving you more control over how your team accesses Baseten resources.

    When creating a team key with Manage permissions, use the new Environment access dropdown to limit which environments the key can reach. This works for both "call all team models" and "call certain models" permission levels.

    To enable this feature for your workspace, reach out to [email protected].

    For more information, see the API keys documentation.

  • Mar 4, 2026

    Retrieve billing usage via API

    Baseten adds a new billing usage summary API that lets users query cost breakdowns programmatically across Dedicated Inference, Training, and Model APIs. It includes aggregate totals, daily granularity, and historical attribution for deleted resources.

    You can now query your billing usage programmatically using the new GET /v1/billing/usage_summary endpoint. Pass a date range of up to 31 days to get a breakdown of costs across Dedicated Inference, Training, and Model APIs.

    The response includes aggregate totals and a per-resource or per-model breakdown[] array, with daily granularity on each entry. Deleted resources are still returned in breakdown[] with is_deleted: true, so historical cost attribution is preserved.

    Example output:

    {
      "dedicated_usage": {
        "subtotal": 123,
        "credits_used": 123,
        "total": 123,
        "minutes": 123,
        "breakdown": [{
          "billable_resource": {
            "id": "<string>",
            "kind": "MODEL_DEPLOYMENT",
            "name": "<string>",
            "is_deleted": true,
            "instance_type": "<string>",
            "environment_name": "<string>"
          },
          "subtotal": 123,
          "minutes": 123,
          "inference_requests": 123,
          "daily": [{
            "date": "2023-12-25",
            "subtotal": 123,
            "minutes": 123,
            "inference_requests": 123
          }]
        }]
      },
      "training_usage": {
        "subtotal": 123,
        "credits_used": 123,
        "total": 123,
        "minutes": 123,
        "breakdown": [{ ... }]
      },
      "model_apis_usage": {
        "subtotal": 123,
        "credits_used": 123,
        "total": 123,
        "breakdown": [{
          "model_name": "<string>",
          "model_family": "<string>",
          "subtotal": 123,
          "input_tokens": 123,
          "output_tokens": 123,
          "cached_input_tokens": 123,
          "daily": [{
            "date": "2023-12-25",
            "subtotal": 123,
            "input_tokens": 123,
            "output_tokens": 123
          }]
        }]
      }
    }
    

    Check out the billing API reference for the full schema and parameters.
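
    As a sketch of how the response might be consumed (the payload below is an abridged, made-up sample in the shape of the example above, not real billing data):

```python
# Abridged sample mirroring the shape of the usage_summary response above.
sample = {
    "dedicated_usage": {
        "subtotal": 30,
        "breakdown": [
            {"billable_resource": {"name": "llama-70b", "is_deleted": False},
             "subtotal": 20},
            {"billable_resource": {"name": "old-model", "is_deleted": True},
             "subtotal": 10},
        ],
    }
}

def dedicated_costs(payload: dict, include_deleted: bool = True) -> int:
    """Sum per-resource subtotals, optionally excluding deleted resources."""
    rows = payload["dedicated_usage"]["breakdown"]
    return sum(
        row["subtotal"]
        for row in rows
        if include_deleted or not row["billable_resource"]["is_deleted"]
    )
```

    Because deleted resources remain in breakdown[] with is_deleted: true, summing over the full array reconciles with the top-level subtotal.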

  • Mar 4, 2026

    Truss support for pyproject.toml and uv.lock

    Baseten now supports pyproject.toml and uv.lock as dependency formats in Truss and Chains configs.

    Truss now supports pyproject.toml and uv.lock as dependency formats in addition to requirements.txt. You can use any of these formats as the requirements_file in your Truss and Chains config. For example:

    model_name: My Model
    resources:
      accelerator: A10G
      cpu: "4"
      memory: 16Gi
    requirements_file: ./pyproject.toml
    

    For more information, see the Truss configuration documentation.

  • Mar 2, 2026

    Deployment labels on push

    Baseten adds deployment labels at push time for easier search and filtering in the UI and API.

    You can now attach labels to deployments at push time using the --labels flag. Labels are key-value pairs passed as a JSON string that are stored with the deployment.

    Sample usage

    truss push --labels '{
      "env": "staging",
      "team": "ml-platform",
      "version": "1.2.0"
    }'
    

    Labeled deployments can be searched and filtered in the Baseten UI and API. Check out our CLI docs for the full list of flags.
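
    When labels are generated programmatically, building the JSON with a serializer avoids quoting mistakes (a minimal sketch; the label keys and values mirror the example above):

```python
import json
import shlex

# Build the --labels payload as a dict, then serialize and shell-quote it
# so the JSON survives the shell intact.
labels = {"env": "staging", "team": "ml-platform", "version": "1.2.0"}
labels_json = json.dumps(labels)
command = f"truss push --labels {shlex.quote(labels_json)}"
```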

  • Feb 26, 2026

    Truss upgrades and rollbacks

    Baseten's Truss adds CLI self-upgrades with a new truss upgrade command, package manager detection, and support for version-specific upgrades or rollbacks. It also now alerts users when a new version is available during normal commands like truss push.

    Truss can now upgrade itself directly from the CLI. Use the new truss upgrade command to update to the latest version. Truss will detect your package manager (supports uv, pip, pipx, and anaconda) and ask for confirmation before proceeding.

    You can also upgrade or roll back to a specific version by passing it as an argument: truss upgrade 0.14.0.

    Truss will also now notify you when a new version is available, displayed right at the start of normal commands like truss push.

    Version checks run once daily and can be disabled by setting check_for_updates = false under [preferences] in $HOME/.config/truss/settings.toml.
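
    The opt-out described above would look like this in the settings file:

```toml
# $HOME/.config/truss/settings.toml
[preferences]
check_for_updates = false
```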

  • Feb 25, 2026

    Monitor concurrent inference requests

    Baseten adds a Concurrent Requests graph to track in-progress inference requests and autoscaling signals across deployments.

    Track the number of in-progress inference requests across your deployments, including both requests currently being serviced and those waiting in the queue. This is the key indicator used to drive autoscaling decisions, and is now visible in the metrics dashboard and available through metrics export. For more information, see the supported metrics docs and the autoscaling documentation.

    [Image: Concurrent Requests graph]

