macOS menu bar · local inference

Local models,
one click away.

Dakodeon lives in your menu bar. Pick a curated model, start the local server, and point any agent at an OpenAI-compatible API — no runtime to bundle, no config to babysit.

$ brew install --cask emin93/tap/dakodeon
􀙇 100% Mon 9:41
Dakodeon Running
MODEL
Gemma4 12B Coder
Settings Logs Quit
Model manager

Download, switch, delete.

A built-in Settings window tracks every curated model. Watch downloads in real time, free up disk in a click, and switch models without touching a terminal.

Live download status

Every model shows progress, size, and state — pulled straight from the shared Hugging Face cache.

Instant switching

Choose a model and the server stops and relaunches with its parameters — drafts, MTP, and all.

Reclaim space

Delete weights you're done with. One confirm and the backing files leave the cache.

Settings
Models
Gemma4 12B CoderActive
12B · Q8_0 · MTP draft · 13.1 GB
Downloading 61%7.8 GB / 13.1 GB
Cancel
Why Dakodeon

Small app. Real server.

OpenAI-compatible API

Serves llama-server at 127.0.0.1:8080/v1 with a stable local alias for agents like OpenCode.

No bundled runtime

Uses the system llama.cpp and hf you already have. The app bundle stays tiny.

Curated profiles

Each profile ships ready to run — weights, draft model, and tuned flags. No knobs to get wrong.

Get started

Install in one line.

$ brew install --cask emin93/tap/dakodeon

Requires llama.cpp and hf on your PATH · macOS 14+