AI/ML Infrastructure Release Notes
Release notes for AI compute platforms, inference clouds and ML tooling
Products (14)
Latest AI/ML Infrastructure Updates
- Apr 11, 2026
- Date parsed from source:Apr 11, 2026
- First seen by Releasebot:Mar 20, 2026
- Modified by Releasebot:Apr 12, 2026
Apr 11
Together AI adds Serverless Model Bring Ups with MiniMaxAI/MiniMax-M2.7 now available.
Serverless Model Bring Ups
The following models have been added:
- MiniMaxAI/MiniMax-M2.7
- Apr 10, 2026
- Date parsed from source:Apr 10, 2026
- First seen by Releasebot:Apr 11, 2026
April 10, 2026
Vertex AI adds generally available SQL cells in Colab Enterprise notebooks for writing and running queries directly.
Feature
Colab Enterprise
SQL cellsGenerally available: You can use SQL cells to write, edit, and run SQL queries directly from your Colab Enterprise notebooks. For more information, see Use SQL cells.
Original source Report a problem All of your release notes in one feed
Join Releasebot and get updates from Together AI and hundreds of other software products.
- Apr 10, 2026
- Date parsed from source:Apr 10, 2026
- First seen by Releasebot:Apr 11, 2026
ZeroGPU overquota
Hugging Face lets PRO users keep using ZeroGPU Spaces beyond quota with prepaid credits at $1 per 10 minutes.
PRO users can now continue using ZeroGPU Spaces above their daily included quota.
Over-quota usage requires purchasing pre-paid credits. The price is $1 per 10 minutes of over-quota ZeroGPU usage.
Original source Report a problem - Apr 9, 2026
- Date parsed from source:Apr 9, 2026
- First seen by Releasebot:Apr 10, 2026
Patch release: v5.5.3
transformers fixes Gemma4 device_map auto support in a small patch release.
Small patch release to fix device_map support for Gemma4! It contains the following commit:
[gemma4] Fix device map auto (#45347) by @Cyrilvallez
Original source Report a problem - Apr 9, 2026
- Date parsed from source:Apr 9, 2026
- First seen by Releasebot:Apr 10, 2026
Patch release: v5.5.2
transformers fixes Gemma4 inference with use_cache=False and improves weight conversion mappings in a small patch.
Small patch dedicated to optimizing gemma4, fixing inference with use_cache=False due to k/v states sharing between layers, as well as conversion mappings for some models that would inconsistently serialize their weight names. It contains the following PRs:
- Add MoE to Gemma4 TP plan (#45219) by @sywangyi and @Cyrilvallez
- [gemma4] Dissociate kv states sharing from the Cache (#45312) by @Cyrilvallez
- [gemma4] Remove all shared weights, and silently skip them during loading (#45336) by @Cyrilvallez
- Fix conversion mappings for vlms (#45340) by @Cyrilvallez
- Apr 9, 2026
- Date parsed from source:Apr 9, 2026
- First seen by Releasebot:Apr 9, 2026
Patch release v5.5.1
transformers ships patch v5.5.1 with Gemma4 and vLLM fixes plus export and integration test improvements.
Patch release v5.5.1
This patch is very small and focuses on vLLM and Gemma4!
- Fix export for gemma4 and add Integration tests (#45285) by @Cyrilvallez
- Fix vllm cis (#45139) by @ArthurZucker
- Apr 8, 2026
- Date parsed from source:Apr 8, 2026
- First seen by Releasebot:Apr 9, 2026
Apr 8
Together AI adds serverless model bring ups for google/gemma-4-31B-it and zai-org/GLM-5.1.
Serverless Model Bring Ups
The following models have been added:
- google/gemma-4-31B-it
- zai-org/GLM-5.1
- Apr 7, 2026
- Date parsed from source:Apr 7, 2026
- First seen by Releasebot:Apr 7, 2026
- Modified by Releasebot:Apr 11, 2026
Agent Traces on the Hub
Hugging Face now supports uploading agent traces to Datasets with auto-detection and a dedicated viewer for sessions, turns and tool calls.
You can now upload traces from your agents (Claude Code, Codex, Pi) directly to Hugging Face Datasets. The Hub auto-detects trace formats and tags your dataset as Traces, with a dedicated viewer for browsing sessions, turns, tool calls, and model responses.
No preprocessing needed, just upload the JSONL files from your local session directories as-is:
Agent
Local session directory
- Claude Code
~/.claude/projects - Codex
~/.codex/sessions - Pi
~/.pi/agent/sessions
Useful for sharing debugging workflows, benchmarking agent behavior across models, or building training data from real coding sessions.
Original source Report a problem - Apr 6, 2026
- Date parsed from source:Apr 6, 2026
- First seen by Releasebot:Apr 7, 2026
- Modified by Releasebot:Apr 11, 2026
April 06, 2026
Vertex AI adds schema-based metadata search in RAG Engine to filter retrieval contexts for corpora and files.
Feature
Generative AI on Vertex AI v1
Metadata search for RAG Engine
Use schema-based metadata search in Vertex AI RAG Engine. You can define a metadata schema for a corpus, attach metadata to files within that corpus, and use this metadata to filter contexts during retrieval. For more information, see Filter with metadata search.
Original source Report a problem - Apr 6, 2026
- Date parsed from source:Apr 6, 2026
- First seen by Releasebot:Apr 6, 2026
Named entity recognition on BEI-Bert
Baseten adds token-classification support for BEI-Bert with /predict_tokens NER and low-latency inference.
BEI-Bert now supports token-classification models for named-entity recognition. Deploy any ForTokenClassification model with the /predict_tokens endpoint and get structured entity predictions with configurable aggregation strategies. NER on BEI-Bert runs with sub-three millisecond client-side latency on L4 GPUs.
For more information, see our blog.
Original source Report a problem