Ollama Release Notes

26 release notes curated from 1 source by the Releasebot Team. Last updated: Jun 8, 2026

  • Jun 8, 2026
    • Date parsed from source:
      Jun 8, 2026
    • First seen by Releasebot:
      Jun 8, 2026
    • Modified by Releasebot:
      Jun 9, 2026
    Ollama

    v0.30.7

    Ollama launches Hermes Desktop support, bringing a native desktop interface for the Hermes agent. The update also adds Windows config path support, aligns the OpenAI-compatible API model list with available tags, and refreshes documentation and Zod schema examples.

    Ollama Launch now supports Hermes Desktop, a native desktop interface for the Hermes agent. Run it alongside your Hermes agent to get a visual interface for managing conversations, integrations, and messaging apps.

    ollama launch hermes-desktop
    

    What's Changed

    Hermes Desktop is now available via ollama launch hermes-desktop with native Windows configuration path support

    OpenAI-compatible API models list now aligns with available model tags

    Added documentation describing the llama.cpp update process

    Updated Zod schema examples to use the native toJSONSchema helper

    Full Changelog: v0.30.6...v0.30.7
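
The alignment between the OpenAI-compatible models list and the available tags can be spot-checked locally. The following is a minimal sketch, not part of the release: it assumes a server on the default address http://localhost:11434, that GET /v1/models returns entries under data[].id, and that GET /api/tags returns entries under models[].name. The helper names diff_model_lists, fetch_json, and check_local_server are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: default address of a locally running Ollama server.
BASE_URL = "http://localhost:11434"


def diff_model_lists(openai_payload: dict, tags_payload: dict) -> dict:
    """Compare model IDs from /v1/models with model names from /api/tags."""
    openai_ids = {m["id"] for m in openai_payload.get("data", [])}
    tag_names = {m["name"] for m in tags_payload.get("models", [])}
    return {
        "only_in_openai": sorted(openai_ids - tag_names),
        "only_in_tags": sorted(tag_names - openai_ids),
    }


def fetch_json(path: str) -> dict:
    """Fetch and decode one JSON document from the local server."""
    with urlopen(BASE_URL + path, timeout=10) as resp:
        return json.load(resp)


def check_local_server() -> dict:
    """Query both endpoints and report any names that appear in only one list."""
    return diff_model_lists(fetch_json("/v1/models"), fetch_json("/api/tags"))
```

Calling check_local_server() against a running server returns two lists. If the two endpoints agree, both lists should be empty.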

  • Jun 6, 2026
    • Date parsed from source:
      Jun 6, 2026
    • First seen by Releasebot:
      Jun 7, 2026
    Ollama

    v0.30.7-rc1

    Ollama aligns the OpenAI models list with tags.

    openai: align models list with tags (#16556)

  • Jun 7, 2026
    • Date parsed from source:
      Jun 7, 2026
    • First seen by Releasebot:
      Jun 7, 2026
    Ollama

    v0.30.4

    Ollama releases Nemotron-3-Ultra and improves performance and reliability across Apple Silicon, Windows, Codex, and Pi workflows, with updated llama.cpp backend support and better handling for MLX-based models and package migration.

    New models

    Nemotron-3-Ultra: NVIDIA Nemotron 3 Ultra is built for high-throughput reasoning and long-running agent workflows.

    What's Changed

    Fixed multimodal models not using the GPU on the llama.cpp backend; they can now use Metal GPU offload on Apple Silicon, improving multimodal performance on supported Macs.

    ollama create --experimental now respects REQUIRES in Modelfiles for MLX-based models.

    ollama launch codex now cleans up old conflicting Codex profile config before launching.

    ollama launch pi now migrates users from the legacy Pi package to the official package and preserves the correct npm install prefix.

    Pi web search setup now updates only when a newer package is available.

    Windows cleanup now terminates the llama.cpp backend more reliably.

    Updated the llama.cpp backend.

    Known Issues

    gemma4:12b crashes with floating point exception

    Full Changelog: v0.30.3...v0.30.4

  • Jun 6, 2026
    • Date parsed from source:
      Jun 6, 2026
    • First seen by Releasebot:
      Jun 6, 2026
    Ollama

    v0.30.7-rc0

    Ollama launch now uses the native Windows config path for Hermes.

    launch: use native Windows Hermes config path (#16558)

  • Jun 7, 2026
    • Date parsed from source:
      Jun 7, 2026
    • First seen by Releasebot:
      Jun 5, 2026
    • Modified by Releasebot:
      Jun 7, 2026
    Ollama

    v0.30.6

    Ollama adds Gemma 4 QAT weights and improves Apple Silicon quantization with MLX embedding layer updates.

    New models

    Gemma 4 QAT weights: the Gemma 4 family is now optimized with Quantization-Aware Training (QAT) to dramatically reduce memory requirements and maximize on-device performance. Look for the tags ending in -qat:

    • gemma4:e2b-it-qat
    • gemma4:e4b-it-qat
    • gemma4:12b-it-qat
    • gemma4:26b-a4b-it-qat
    • gemma4:31b-it-qat

    What's Changed

    ollama launch omp now integrates with Oh My Pi, an AI coding agent with IDE integration

    MLX embedding layers now use NVFP4 global scale for improved quantization on Apple Silicon

    Full Changelog: v0.30.5...v0.30.6

  • Jun 5, 2026
    • Date parsed from source:
      Jun 5, 2026
    • First seen by Releasebot:
      Jun 5, 2026
    Ollama

    v0.30.6-rc0

    Ollama adds an oh-my-pi integration to ollama launch.

    launch: oh-my-pi (#16410)

  • Jun 7, 2026
    • Date parsed from source:
      Jun 7, 2026
    • First seen by Releasebot:
      Jun 5, 2026
    • Modified by Releasebot:
      Jun 7, 2026
    Ollama

    v0.30.5

    Ollama fixes a gemma4:12b crash, improves launch support for Hermes Desktop and Windows installs, and adds Cline CLI docs.

    What's Changed

    Fixed the gemma4:12b floating point exception crash on x86, CUDA, Linux, and Windows systems.

    ollama launch hermes-desktop now launches Hermes Desktop and can skip rebuilding when a packaged desktop app is already installed.

    ollama launch hermes now supports native Windows installs through the Hermes PowerShell installer.

    Added Cline CLI integration docs.

    Full Changelog: v0.30.4...v0.30.5

  • Jun 4, 2026
    • Date parsed from source:
      Jun 4, 2026
    • First seen by Releasebot:
      Jun 5, 2026
    Ollama

    v0.30.5-rc0: llama.cpp version update (#16511)

    Ollama bumps llama.cpp to b9509, fixing Gemma 4 12B multimodal crashes across x86, CUDA, Linux, and Windows.

    Bump llama.cpp to b9509, which includes the upstream Gemma 4 12B multimodal projector fixes for the n_head=0 divide-by-zero crash seen on x86/CUDA/Linux/Windows.

    Fixes #16479

    Fixes #16489

    Fixes #16491

    Fixes #16492

    Fixes #16495

  • Jun 4, 2026
    • Date parsed from source:
      Jun 4, 2026
    • First seen by Releasebot:
      Jun 4, 2026
    Ollama

    v0.30.4

    Ollama ships a small update with a llama.cpp version bump and a Windows cleanup fix for llama-server.

    What's Changed

    llama.cpp version update by @dhiltgen in #16463

    Kill llama-server during Windows cleanup by @dhiltgen in #16458

    Known Issues

    gemma4:12b crash with floating point exception

    Full Changelog: v0.30.3...v0.30.4

  • Jun 3, 2026
    • Date parsed from source:
      Jun 3, 2026
    • First seen by Releasebot:
      Jun 4, 2026
    Ollama

    v0.30.4-rc1: llama-server: fix gemma4 patch wiring (#16477)

    Ollama fixes the clip.cpp:4399 unknown projector type crash.

    This will fix the "clip.cpp:4399: Unknown projector type" crash.

  • Jun 3, 2026
    • Date parsed from source:
      Jun 3, 2026
    • First seen by Releasebot:
      Jun 4, 2026
    Ollama

    v0.30.4-rc0: Kill llama-server during Windows cleanup (#16458)

    Ollama fixes Windows cleanup so llama-server.exe no longer stays running after ollama.exe is killed directly.

    Windows installer and app cleanup could leave llama-server.exe running when ollama.exe was killed directly, so cleanup now includes llama-server.exe and taskkill /T.
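
The effect of adding llama-server.exe and taskkill /T to cleanup can be illustrated with a small sketch. This is not the installer's actual script: the function names are hypothetical, and the use of /F alongside /T is an assumption. In taskkill, /IM selects processes by image name, /T also terminates their child processes, and /F forces termination.

```python
import subprocess
import sys
from typing import List, Sequence

# Assumption: image names taken from the release note; the real cleanup
# may target other processes or use different flags.
DEFAULT_IMAGES = ("ollama.exe", "llama-server.exe")


def build_cleanup_commands(images: Sequence[str] = DEFAULT_IMAGES) -> List[List[str]]:
    """Return one taskkill argument list per image name."""
    return [["taskkill", "/F", "/T", "/IM", image] for image in images]


def run_cleanup() -> None:
    """Run the cleanup commands; only meaningful on Windows."""
    if sys.platform != "win32":
        raise RuntimeError("taskkill is only available on Windows")
    for argv in build_cleanup_commands():
        # check=False because taskkill exits non-zero when nothing matches.
        subprocess.run(argv, check=False)
```

Listing llama-server.exe by name covers the case described above, where ollama.exe has already been killed directly and the backend process is left running on its own.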

  • Jun 7, 2026
    • Date parsed from source:
      Jun 7, 2026
    • First seen by Releasebot:
      Jun 4, 2026
    • Modified by Releasebot:
      Jun 7, 2026
    Ollama

    v0.30.3

    Ollama adds gemma4:12b support for high-performance multimodal AI that runs directly on laptops.

    New models

    Gemma 4 12B: high-performance multimodal intelligence that runs directly on laptops, combining efficiency with advanced reasoning.

    What's Changed

    Added support for gemma4:12b.

    Full Changelog: v0.30.2...v0.30.3

  • Jun 3, 2026
    • Date parsed from source:
      Jun 3, 2026
    • First seen by Releasebot:
      Jun 3, 2026
    Ollama

    v0.30.2

    Ollama ships launch and model-serving improvements, including Cline CLI auto-install, Qwen code integration, better local model limits, stronger troubleshooting logs, and fixes for markdown URL handling, server stalls, and llama.cpp compatibility.

    What's Changed

    • feat(launch): show and auto-install Cline CLI by @hoyyeva in #16402
    • log template details to aid troubleshooting by @dhiltgen in #16403
    • cmd/launch: add Qwen code integration by @hoyyeva in #15900
    • launch: fix opencode local model limits by @dhiltgen in #16425
    • llm: include cached prompt tokens in llama-server counts by @dhiltgen in #16428
    • Harden app markdown URL handling by @dhiltgen in #16380
    • discover: allow Radeon 8060S iGPU by default by @dhiltgen in #16429
    • llm: detect llama-server load stalls from output by @dhiltgen in #16427
    • More harden app markdown URL handling by @dhiltgen in #16436
    • llama.cpp version update by @dhiltgen in #16426
    • launch: isolate Codex launch configuration by @ParthSareen in #16437
    • llama: add laguna (poolside) arch via a llama.cpp patch under llama/c… by @dhiltgen in #16396
    • docs: configure hermes desktop app by @BruceMacD in #16440
    • llm: ignore llama-server SSE ping comments by @dhiltgen in #16443
    • fix laguna patch build breakage by @dhiltgen in #16445

    Full Changelog: v0.30.0...v0.30.2-rc0

  • Jun 2, 2026
    • Date parsed from source:
      Jun 2, 2026
    • First seen by Releasebot:
      Jun 3, 2026
    Ollama

    v0.30.2-rc0: fix laguna patch build breakage (#16445)

    Ollama fixes kernel template instantiation so library symbols are exported correctly.

    Follow up to #16396

    Fix kernel template instantiation so the symbols are exported in the library.

  • Jun 2, 2026
    • Date parsed from source:
      Jun 2, 2026
    • First seen by Releasebot:
      Jun 3, 2026
    Ollama

    v0.30.1: llm: ignore llama-server SSE ping comments (#16443)

    Ollama fixes SSE streaming by skipping idle comment frames in completion and chat requests.

    llama.cpp b9478 added a default 30s SSE ping that emits colon-only comment frames (":\n\n") while streamed requests are idle; Ollama treated non-data SSE lines as JSON, so skip SSE comments in completion and chat streams.
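
The failure mode and the fix can be shown with a small SSE line reader. This is a sketch of the general technique, not Ollama's Go implementation: the function name iter_sse_json is hypothetical, the "[DONE]" terminator is an assumption borrowed from OpenAI-style streams, and multi-line data fields are not handled.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from SSE lines, skipping comment frames."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # blank line marks the end of an event
        if line.startswith(":"):
            continue  # SSE comment, such as an idle keep-alive ping
        if not line.startswith("data:"):
            continue  # event:, id:, and retry: fields are ignored in this sketch
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumption: OpenAI-style end-of-stream marker
            break
        yield json.loads(payload)


stream = [
    'data: {"content": "Hi"}\n', "\n",
    ":\n", "\n",  # colon-only ping frame emitted while the stream is idle
    'data: {"content": "!"}\n', "\n",
    "data: [DONE]\n",
]
print([chunk["content"] for chunk in iter_sse_json(stream)])  # ['Hi', '!']
```

Without the comment check, the ":" line would reach json.loads and raise a decode error, which mirrors the bug described above.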

Releasebot

Curated by the Releasebot team

Releasebot is an aggregator of official release notes from hundreds of software vendors and thousands of sources.

Our editorial process involves the manual review and audit of release notes procured with the help of automated systems.
