AI/ML Infrastructure Release Notes

Release notes for AI compute platforms, inference clouds, and ML tooling

Products (14)

Latest AI/ML Infrastructure Updates

  • Apr 11, 2026
    • Date parsed from source:
      Apr 11, 2026
    • First seen by Releasebot:
      Mar 20, 2026
    • Modified by Releasebot:
      Apr 12, 2026

    Together AI

    Apr 11

Together AI's Serverless Model Bring Ups now include MiniMaxAI/MiniMax-M2.7.

    Serverless Model Bring Ups

    The following models have been added:

    • MiniMaxAI/MiniMax-M2.7
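Once a bring up completes, the model is typically selected by its slug in a chat-completions request. A minimal sketch of such a request body, assuming Together's OpenAI-compatible endpoint (the path and auth details below are assumptions, not taken from this note):

```python
import json

# Sketch only: the newly added model's slug dropped into a
# chat-completions request body.
payload = {
    "model": "MiniMaxAI/MiniMax-M2.7",  # slug from this release note
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

body = json.dumps(payload)
# A real request would POST `body` to
# https://api.together.xyz/v1/chat/completions with an Authorization header.
```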
  • Apr 10, 2026
    • Date parsed from source:
      Apr 10, 2026
    • First seen by Releasebot:
      Apr 11, 2026

    Vertex AI by Google

    April 10, 2026

    Vertex AI adds generally available SQL cells in Colab Enterprise notebooks for writing and running queries directly.

    Feature

    Colab Enterprise

    SQL cells

    Generally available: You can use SQL cells to write, edit, and run SQL queries directly from your Colab Enterprise notebooks. For more information, see Use SQL cells.
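For illustration, a SQL cell could hold an ordinary BigQuery query like the one below (kept here as a Python string; the public dataset referenced is only an example, not from this note):

```python
# Example of the kind of query a SQL cell can hold; the table is a
# BigQuery public dataset used purely for illustration.
query = """
SELECT start_station_name, COUNT(*) AS trips
FROM `bigquery-public-data.san_francisco.bikeshare_trips`
GROUP BY start_station_name
ORDER BY trips DESC
LIMIT 10
"""
```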

  • Apr 10, 2026
    • Date parsed from source:
      Apr 10, 2026
    • First seen by Releasebot:
      Apr 11, 2026

    Hugging Face

    ZeroGPU overquota

    Hugging Face lets PRO users keep using ZeroGPU Spaces beyond quota with prepaid credits at $1 per 10 minutes.

    PRO users can now continue using ZeroGPU Spaces above their daily included quota.

Over-quota usage requires purchasing prepaid credits, priced at $1 per 10 minutes of over-quota ZeroGPU usage.
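At $1 per 10 minutes, the cost of a given amount of over-quota usage is easy to estimate. This sketch assumes linear proration, which the note does not specify:

```python
def overquota_cost(minutes: float) -> float:
    """Dollar cost of over-quota ZeroGPU usage at $1 per 10 minutes,
    assuming linear proration (not confirmed by the release note)."""
    return minutes / 10.0

print(overquota_cost(25.0))  # 25 over-quota minutes -> 2.5 dollars
```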

  • Apr 9, 2026
    • Date parsed from source:
      Apr 9, 2026
    • First seen by Releasebot:
      Apr 10, 2026

    transformers by Hugging Face

    Patch release: v5.5.3

    transformers fixes Gemma4 device_map auto support in a small patch release.

    Small patch release to fix device_map support for Gemma4! It contains the following commit:

    [gemma4] Fix device map auto (#45347) by @Cyrilvallez

  • Apr 9, 2026
    • Date parsed from source:
      Apr 9, 2026
    • First seen by Releasebot:
      Apr 10, 2026

    transformers by Hugging Face

    Patch release: v5.5.2

    transformers fixes Gemma4 inference with use_cache=False and improves weight conversion mappings in a small patch.

A small patch dedicated to Gemma4: it fixes inference with use_cache=False (previously broken by k/v state sharing between layers) and corrects conversion mappings for models that serialized their weight names inconsistently. It contains the following PRs:

    • Add MoE to Gemma4 TP plan (#45219) by @sywangyi and @Cyrilvallez
    • [gemma4] Dissociate kv states sharing from the Cache (#45312) by @Cyrilvallez
    • [gemma4] Remove all shared weights, and silently skip them during loading (#45336) by @Cyrilvallez
    • Fix conversion mappings for vlms (#45340) by @Cyrilvallez
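As a hedged sketch of the path the k/v-sharing fix restores, generation with use_cache=False should now work for Gemma4. The model id is assumed (borrowed from elsewhere in this feed), and running the function would download weights, so the call is wrapped rather than executed:

```python
def generate_without_cache(prompt: str) -> str:
    # Sketch only: exercises the use_cache=False path fixed in v5.5.2.
    # The model id is an assumption; calling this downloads the weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "google/gemma-4-31B-it"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=32, use_cache=False)
    return tok.decode(out[0], skip_special_tokens=True)
```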
  • Apr 9, 2026
    • Date parsed from source:
      Apr 9, 2026
    • First seen by Releasebot:
      Apr 9, 2026

    transformers by Hugging Face

Patch release: v5.5.1

    transformers ships patch v5.5.1 with Gemma4 and vLLM fixes plus export and integration test improvements.


    This patch is very small and focuses on vLLM and Gemma4!

    • Fix export for gemma4 and add Integration tests (#45285) by @Cyrilvallez
    • Fix vllm cis (#45139) by @ArthurZucker
  • Apr 8, 2026
    • Date parsed from source:
      Apr 8, 2026
    • First seen by Releasebot:
      Apr 9, 2026

    Together AI

    Apr 8

    Together AI adds serverless model bring ups for google/gemma-4-31B-it and zai-org/GLM-5.1.

    Serverless Model Bring Ups

    The following models have been added:

    • google/gemma-4-31B-it
    • zai-org/GLM-5.1
  • Apr 7, 2026
    • Date parsed from source:
      Apr 7, 2026
    • First seen by Releasebot:
      Apr 7, 2026
    • Modified by Releasebot:
      Apr 11, 2026

    Hugging Face

    Agent Traces on the Hub

    Hugging Face now supports uploading agent traces to Datasets with auto-detection and a dedicated viewer for sessions, turns and tool calls.

    You can now upload traces from your agents (Claude Code, Codex, Pi) directly to Hugging Face Datasets. The Hub auto-detects trace formats and tags your dataset as Traces, with a dedicated viewer for browsing sessions, turns, tool calls, and model responses.

    No preprocessing needed, just upload the JSONL files from your local session directories as-is:

• Claude Code: ~/.claude/projects
    • Codex: ~/.codex/sessions
    • Pi: ~/.pi/agent/sessions

    Useful for sharing debugging workflows, benchmarking agent behavior across models, or building training data from real coding sessions.
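The directory listing above translates directly into a small lookup table; collecting the JSONL files to upload might look like this (the upload call in the trailing comment uses huggingface_hub with a hypothetical repo id):

```python
from pathlib import Path

# Local session directories listed in this release note.
SESSION_DIRS = {
    "Claude Code": Path.home() / ".claude/projects",
    "Codex": Path.home() / ".codex/sessions",
    "Pi": Path.home() / ".pi/agent/sessions",
}

def trace_files(agent: str) -> list:
    """Raw JSONL session files for one agent, to be uploaded as-is."""
    return sorted(SESSION_DIRS[agent].glob("**/*.jsonl"))

# A real upload could then use huggingface_hub, e.g.:
#   upload_folder(repo_id="you/agent-traces", repo_type="dataset",
#                 folder_path=SESSION_DIRS["Codex"])
```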

  • Apr 6, 2026
    • Date parsed from source:
      Apr 6, 2026
    • First seen by Releasebot:
      Apr 7, 2026
    • Modified by Releasebot:
      Apr 11, 2026

    Vertex AI by Google

    April 06, 2026

    Vertex AI adds schema-based metadata search in RAG Engine to filter retrieval contexts for corpora and files.

    Feature

    Generative AI on Vertex AI v1

    Metadata search for RAG Engine

    Use schema-based metadata search in Vertex AI RAG Engine. You can define a metadata schema for a corpus, attach metadata to files within that corpus, and use this metadata to filter contexts during retrieval. For more information, see Filter with metadata search.
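Conceptually the flow is: define a schema on the corpus, attach matching metadata to each file, then filter at retrieval time. The sketch below illustrates that shape only; the field names and filter syntax are hypothetical, not the Vertex AI RAG Engine API:

```python
# Hypothetical shapes only -- consult the official docs for real field names.
corpus_schema = {
    "fields": [
        {"name": "department", "type": "string"},
        {"name": "year", "type": "integer"},
    ]
}
file_metadata = {"department": "legal", "year": 2025}        # attached per file
retrieval_filter = 'department = "legal" AND year >= 2024'   # applied at query time
```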

  • Apr 6, 2026
    • Date parsed from source:
      Apr 6, 2026
    • First seen by Releasebot:
      Apr 6, 2026

    Baseten

    Named entity recognition on BEI-Bert

    Baseten adds token-classification support for BEI-Bert with /predict_tokens NER and low-latency inference.

BEI-Bert now supports token-classification models for named-entity recognition. Deploy any ForTokenClassification model with the /predict_tokens endpoint and get structured entity predictions with configurable aggregation strategies. NER on BEI-Bert runs with sub-three-millisecond client-side latency on L4 GPUs.

    For more information, see our blog.
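A hedged sketch of what a /predict_tokens request body might contain for an NER model; the field names below, including the aggregation value, are assumptions rather than Baseten's documented schema:

```python
# Assumed request shape for the /predict_tokens endpoint (not confirmed).
payload = {
    "inputs": "Hugging Face is based in New York City.",
    # The note says aggregation strategies are configurable; "simple" is a
    # guess borrowed from common token-classification conventions.
    "aggregation_strategy": "simple",
}
# The response would carry structured entity predictions
# (spans, labels, scores) per the release note.
```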
