Baseten Updates & Release Notes

35 updates curated from 37 sources by the Releasebot Team. Last updated: Jun 17, 2026

  • Jun 16, 2026
    • Date parsed from source:
      Jun 16, 2026
    • First seen by Releasebot:
      Jun 17, 2026

    Baseten

    GLM 5.2 available on Baseten

    Baseten adds GLM 5.2 to its Model APIs, giving users OpenAI-compatible access to Z.ai’s flagship agentic engineering model for long-horizon coding tasks, with dedicated deployments available for larger workloads.

    You can start sending requests to GLM 5.2 today through our Model APIs by calling the OpenAI-compatible endpoint with your Baseten API key. For larger workloads, dedicated deployments are available.

    GLM-5.2 is Z.ai's flagship model for agentic engineering and is built to perform well on long-horizon coding tasks. GLM-5.2 runs on the Baseten Inference Stack.

    curl -X POST https://inference.baseten.co/v1/chat/completions \
    -H "Authorization: Api-Key $BASETEN_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "zai-org/GLM-5.2",
      "messages": [{
        "role": "user",
        "content": "Refactor this function for readability"
      }]
    }'
    

    For more information and to get started, see our docs.
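
The curl call above can also be expressed in Python. The sketch below uses only the standard library and builds the request without sending it. The helper name build_chat_request is hypothetical; the endpoint, header format, and model ID are copied from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
BASETEN_CHAT_URL = "https://inference.baseten.co/v1/chat/completions"


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request. Hypothetical helper."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        BASETEN_CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "example-key" is a stand-in; a real call would read BASETEN_API_KEY instead.
request = build_chat_request(
    "zai-org/GLM-5.2",
    "Refactor this function for readability",
    "example-key",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it, which needs network access and a valid API key.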

    Original source
  • Jun 16, 2026
    • Date parsed from source:
      Jun 16, 2026
    • First seen by Releasebot:
      Jun 17, 2026

    Baseten

    Kimi K2.7 Code available on Baseten

    Baseten adds Kimi K2.7 Code to its Model APIs, making Moonshot AI’s coding-focused model available through an OpenAI-compatible endpoint. It also offers dedicated deployments for larger workloads and highlights support for long-horizon engineering tasks with a 262K-token context window.

    You can start sending requests to Kimi-K2.7-Code today through our Model APIs by calling the OpenAI-compatible endpoint with your Baseten API key. For larger workloads, dedicated deployments are available.

    Kimi K2.7 Code is Moonshot AI's coding-focused model, built for long-horizon agentic engineering tasks with a 262K-token context window. Kimi-K2.7-Code runs on the Baseten Inference Stack.

    curl -X POST https://inference.baseten.co/v1/chat/completions \
    -H "Authorization: Api-Key $BASETEN_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "moonshotai/Kimi-K2.7-Code",
      "messages": [{
        "role": "user",
        "content": "Write a binary search in Rust"
      }]
    }'
    

    For more information and to get started, see our docs.

    Original source
  • Jun 12, 2026
    • Date parsed from source:
      Jun 12, 2026
    • First seen by Releasebot:
      Jun 17, 2026

    Baseten

    New sidebar navigation

    Baseten rolls out a new collapsible sidebar nav for Models, Chains, Model APIs, and Training to speed up navigation and debugging.

    We just rolled out a new sidebar nav across Models, Chains, Model APIs, and Training, making it easier to navigate between resources, access environments and deployments, and quickly jump to actions like API Endpoints and Playground. It's also collapsible, so you can maximize screen space when you're viewing logs, metrics, or debugging.

    Check it out by signing in and navigating to a deployed model.

    Original source
  • Jun 11, 2026
    • Date parsed from source:
      Jun 11, 2026
    • First seen by Releasebot:
      Jun 11, 2026

    Baseten

    Container restart tracking

    Baseten adds a deployment metrics graph for model container restarts to help spot crashes, OOM kills, and failed health checks.

    Each deployment's metrics dashboard now includes a graph showing model container restarts over time. Spikes point to crashes in your model code, out-of-memory kills, or failed health checks.

    Total restarts

    For more information, see the metrics docs.

    Original source
  • Jun 10, 2026
    • Date parsed from source:
      Jun 10, 2026
    • First seen by Releasebot:
      Jun 10, 2026

    Baseten

    vLLM and SGLang metrics

    Baseten adds engine-native metrics for vLLM and SGLang models in Metrics, with export to external observability stacks.

    Baseten now surfaces engine-native metrics for models served with vLLM or SGLang directly in the Metrics tab.

    Baseten automatically detects the engine through your container's /metrics endpoint, then graphs metrics such as tokens per second, time to first token, KV cache usage, and requests running or queued, with no configuration or redeploy required.

    vLLM metrics

    You can also export these metrics to your own observability stack alongside Baseten's standard metrics.

    For more information, see our docs.
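
vLLM serves its /metrics endpoint in the Prometheus text format. As a rough illustration of what such an endpoint returns, the sketch below parses a small hand-written sample. The sample values are made up, the parser handles only simple lines, and the two metric names are ones vLLM is known to expose; which series Baseten actually graphs is described only loosely in this note.

```python
import re

# Hand-written sample in Prometheus text format; values are illustrative only.
SAMPLE = """\
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="demo"} 3.0
# TYPE vllm:num_requests_waiting gauge
vllm:num_requests_waiting{model_name="demo"} 1.0
"""

# Metric name, optional {labels}, then the sample value.
LINE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)")


def parse_metrics(text: str) -> dict:
    """Return {metric_name: value}, summing samples that share a name."""
    totals = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = LINE.match(line)
        if match:
            name, value = match.group(1), float(match.group(2))
            totals[name] = totals.get(name, 0.0) + value
    return totals


metrics = parse_metrics(SAMPLE)
print(metrics["vllm:num_requests_running"], metrics["vllm:num_requests_waiting"])
```

Summing across label sets is a simplification that suits counts like running or queued requests; a real exporter would keep labels separate.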

    Original source
  • Jun 8, 2026
    • Date parsed from source:
      Jun 8, 2026
    • First seen by Releasebot:
      Jun 10, 2026

    Baseten

    Log export to OTLP endpoints

    Baseten adds OTLP/HTTP log streaming to Datadog and Grafana Cloud with near real-time build, deploy and request logs.

    You can now stream your Baseten logs to any OTLP/HTTP backend, including Datadog and Grafana Cloud.

    Add a connection under Settings > General, and build logs, deploy and promotion events, and per-request serving logs are forwarded to your endpoint in near real time.

    For more information, see our docs.

    Original source
  • Jun 4, 2026
    • Date parsed from source:
      Jun 4, 2026
    • First seen by Releasebot:
      Jun 5, 2026

    Baseten

    Nemotron Ultra on Baseten

    Baseten adds one-click deployment for NVIDIA Nemotron Ultra in Model APIs, with dedicated deployments for larger workloads. The model now supports a 202K-token context window plus OpenAI and Anthropic SDKs, tool calling, structured outputs, and opt-in reasoning.

    You can deploy Nemotron Ultra in one click with our Model APIs. Dedicated deployments are available for larger workloads. NVIDIA Nemotron Ultra, a 550B-parameter mixture-of-experts model with 55B active parameters, is now available through Model APIs with a 202K-token context window.

    Call it with the OpenAI or Anthropic SDK, with tool calling, structured outputs, and opt-in reasoning.

    curl -X POST https://inference.baseten.co/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Api-Key $BASETEN_API_KEY" \
    -d '{
      "model": "nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B",
      "messages": [
        {
          "role": "user",
          "content": "Implement Hello World in Python"
        }
      ],
      "stream": true,
      "stream_options": {
        "include_usage": true,
        "continuous_usage_stats": true
      },
      "top_p": 1,
      "max_tokens": 1000,
      "temperature": 1,
      "presence_penalty": 0,
      "frequency_penalty": 0
    }' \
    --no-buffer
    

    For more information, see our docs or get started by talking to us.
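
With "stream": true, the response arrives as a series of events rather than one JSON body. The sketch below shows one way to reassemble the text and pick up the usage block, assuming the stream follows the usual OpenAI-style convention of "data:" lines ending with "data: [DONE]". The source shows only the request flags, so the event shape here is an assumption, and the sample events are hand-written.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join streamed text deltas and keep the last usage block seen."""
    parts = []
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage


# Hand-written sample events; real chunks carry more fields.
sample = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    'data: {"choices": [], "usage": {"prompt_tokens": 12, "completion_tokens": 5, "total_tokens": 17}}',
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text)
print(usage["total_tokens"])
```

Keeping only the last usage block means the function works whether usage is sent once at the end or repeated on every chunk.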

    Original source
  • Jun 4, 2026
    • Date parsed from source:
      Jun 4, 2026
    • First seen by Releasebot:
      Jun 5, 2026

    Baseten

    Activity tab filters

    Baseten adds Activity tab filters by API key, deployment, team member, and event type for faster audit log checks.

    Filter a model or Chain's Activity tab by API key, deployment, team member, or event type to quickly check its audit log.

    For more, read the audit logs docs.

    Original source
  • Jun 1, 2026
    • Date parsed from source:
      Jun 1, 2026
    • First seen by Releasebot:
      Jun 5, 2026

    Baseten

    Compliance policy visibility

    Baseten now lets customers view compliance policies and restricted regions directly in the app.

    Customers with compliance-related restrictions, such as HIPAA or GDPR, can now view their compliance policies directly in the Baseten application. The policy appears in your Organization and Team settings under the General tab and in the model environment detail view, showing the compliance-related restrictions and regions to which your inference workloads are restricted. Policies are read-only and managed by Baseten.

    For more information, see the docs.

    Original source
  • May 28, 2026
    • Date parsed from source:
      May 28, 2026
    • First seen by Releasebot:
      May 28, 2026

    Baseten

    Model API deprecation notice (DeepSeek v3.1, MiniMax M2.5)

    Baseten deprecates the DeepSeek v3.1 and MiniMax M2.5 Model APIs on June 17 and points users to DeepSeek v4 Pro as the recommended replacement for stronger agentic and coding performance.

    The DeepSeek v3.1 and MiniMax M2.5 Model APIs will be deprecated at 5pm PT on June 17.

    At that time, the model IDs will become inactive and return an error for all requests. As open source models advance rapidly, we prioritize serving the highest quality models and deprecate models when stronger alternatives are available.

    We recommend DeepSeek v4 Pro as an alternative. DeepSeek v4 Pro offers strong agentic and coding capabilities. Just swap in the new Model ID deepseek-ai/DeepSeek-V4-Pro prior to the deprecation date. If you’d like to continue using the previous weights, please contact us about a dedicated deployment of the model.
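
If model IDs are set in many places, the swap can be centralized in one helper. The sketch below is one possible approach: the replacement ID comes from the notice, but the two deprecated IDs are placeholders, since the notice does not spell them out. Check them against the IDs your requests actually use.

```python
# Replacement ID is quoted in the notice above.
REPLACEMENT = "deepseek-ai/DeepSeek-V4-Pro"

# Placeholder IDs (assumptions): the notice names the models but not their IDs.
DEPRECATED = {"deepseek-ai/DeepSeek-V3.1", "MiniMaxAI/MiniMax-M2.5"}


def migrate_model_id(payload: dict) -> dict:
    """Return a copy of a request payload with a deprecated model ID swapped."""
    if payload.get("model") in DEPRECATED:
        return {**payload, "model": REPLACEMENT}
    return dict(payload)


old = {
    "model": "deepseek-ai/DeepSeek-V3.1",
    "messages": [{"role": "user", "content": "hi"}],
}
new = migrate_model_id(old)
print(new["model"])
```

The helper returns a copy, so the original payload stays unchanged and can be logged for comparison.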

    Original source
  • May 27, 2026
    • Date parsed from source:
      May 27, 2026
    • First seen by Releasebot:
      May 27, 2026

    Baseten

    GLM 5.1 available on Baseten

    Baseten adds GLM-5.1 on Model APIs for one-click deployment, dedicated workloads, and OpenAI-compatible access.

    GLM 5.1 is now available on Baseten's Model APIs.

    You can deploy GLM-5.1 in one click with our Model APIs; dedicated deployments are available for larger workloads. GLM-5.1 is Z.ai's flagship model for agentic engineering and is built for performance on long-horizon coding tasks. GLM-5.1 utilizes the Baseten Inference Stack and is accessible via an OpenAI-compatible API endpoint.

    Original source
  • May 14, 2026
    • Date parsed from source:
      May 14, 2026
    • First seen by Releasebot:
      May 15, 2026

    Baseten

    SSO and SCIM

    Baseten adds SAML 2.0 sign-in and SCIM 2.0 sync for IdP-based role management.

    Connect Baseten to your identity provider for SAML 2.0 sign-in and SCIM 2.0 directory sync, then assign organization, team, and restricted-environment roles to synced directory groups so permissions follow your IdP.

    Available on the Enterprise plan with just-in-time provisioning, automatic deprovisioning, and optional group-gated admin access.

    For more information, see the docs.

    Original source
  • May 7, 2026
    • Date parsed from source:
      May 7, 2026
    • First seen by Releasebot:
      May 8, 2026

    Baseten

    View configs for deployed models

    Baseten adds truss model-config to print deployed model YAML and support JSON output.

    truss model-config --model-id <model-id> --deployment-id <deployment-id> prints the YAML config of a deployed model, returning the original config.yaml when available. Add --output json for the full structured response.

    For more information, see the truss model-config reference.

    Original source
  • May 7, 2026
    • Date parsed from source:
      May 7, 2026
    • First seen by Releasebot:
      May 7, 2026

    Baseten

    Browser-based login for Truss

    Baseten adds browser login for truss and new auth subcommands.

    truss login now provides the option to authenticate in your browser using your Baseten login. The truss auth group adds login, logout, and status subcommands; use --remote <name> to specify a remote name at login.

    For more information, see the truss auth reference.

    Original source
  • May 5, 2026
    • Date parsed from source:
      May 5, 2026
    • First seen by Releasebot:
      May 6, 2026

    Baseten

    Environment-scoped logs and metrics

    Baseten adds environment-scoped logs and metrics for viewing deployment telemetry in one place.

    View logs and metrics for every deployment in an environment from one place. Select an environment in the Logs or Metrics tab to scope telemetry to that environment instead of a single deployment.

    Env scoped logs and metrics

    For more information, see Logs.

    Original source
Releasebot

Curated by the Releasebot team

Releasebot is an aggregator of official product update announcements from hundreds of software vendors and thousands of sources.

Our editorial process involves the manual review and audit of release notes procured with the help of automated systems.
