Question 1

Does Fono work offline?

Accepted Answer

Yes. The default configuration runs whisper.cpp on your machine. Audio never leaves the box unless you explicitly opt into a cloud provider.

Question 2

Can Fono answer questions, or is it dictation-only?

Accepted Answer

Both. F7 is dictation — your words land in the focused window. F8 is the voice assistant — Fono transcribes your question, runs it through its built-in local LLM (llama.cpp compiled in, no extra installs) or a cloud LLM, and speaks the reply through Kokoro, Piper, or Supertonic on-device, or a cloud voice. Same daemon, same hotkey muscle memory, different brain.

Question 3

Does Fono work on Wayland?

Accepted Answer

Yes. Both X11 and Wayland are supported as first-class targets. Paste injection is universal on X11; on Wayland Fono uses the standard portals.

Question 4

Which Linux distributions are supported?

Accepted Answer

Any glibc Linux. Native packages exist for Debian/Ubuntu, Slackware/NimbleX, and Arch/Manjaro. Everything else uses the bare static binary from GitHub Releases.

Question 5

Is there telemetry or analytics?

Accepted Answer

No. None. The daemon makes no analytics calls. The only outbound traffic is to the cloud STT, LLM, or TTS provider that you explicitly configured.

Question 6

Can it start hands-free, without a hotkey?

Accepted Answer

Yes. Fono can idle and listen for a spoken wake phrase; detection runs locally and is off until you enable it. There's also a realtime mode for back-and-forth conversation — just keep talking.

Question 7

Can Fono tell who is speaking?

Accepted Answer

Yes. Enrol a voice once and Fono recognises it on-device, tagging each history entry with the speaker's name. The voiceprint never leaves your machine — only the recognised name is saved.

Question 8

Can I drive coding agents by voice?

Accepted Answer

Yes — early preview. Fono exposes an MCP voice loop: Claude Code, Cursor, Forge, and other MCP-capable agents can take your spoken instructions and talk back when they need input or finish a task.

Question 9

Can other apps and machines use Fono?

Accepted Answer

Yes. Fono serves the Wyoming protocol (Home Assistant discovers it over mDNS as a speech-to-text, text-to-speech, and wake-word provider) and an OpenAI- and Ollama-compatible API on port 11434 for editors, Open WebUI, and anything that speaks those APIs. Inbound API keys gate LAN access.

Question 10

How fast is local, really?

Accepted Answer

On a laptop CPU the local assistant's first spoken word lands in about a third of a second, and the engine runs 2-4x ahead of Ollama on identical weights. The model picker is backed by 900+ benchmark runs, not guesses.

Question 11

How do I switch from local Whisper to a cloud provider?

Accepted Answer

Run 'fono use cloud groq' — one key covers STT, polish, assistant, and TTS — or swap a single stage with 'fono use stt deepgram'. Add your API key with 'fono keys add PROVIDER_API_KEY'. No restart needed; 'fono use local' brings everything back home.

Question 12

Are macOS and Windows supported?

Accepted Answer

Yes, as experimental ports. Each release ships a Metal-accelerated Apple Silicon binary and a Windows .exe that uses the GPU when a driver is present. Linux remains the daily-driven primary; issue reports on the new ports are genuinely useful.

Question 13

Can I self-host Fono on my LAN?

Accepted Answer

Yes. Run Fono in server mode on the fastest machine on your network. Other clients, including Home Assistant, discover the server automatically over mDNS.

Question 14

Does Fono use the GPU?

Accepted Answer

Yes. The GPU build uses Vulkan and works on nearly all NVIDIA, AMD, and Intel GPUs from the last few years — the installer picks it automatically when it detects one. With a mid-range card, larger Whisper models transcribe your speech almost instantly.

Question 15

What's the φ in the favicon?

Accepted Answer

The Greek letter phi (φ). 'Fono' comes from the Greek root φων- meaning voice or sound — the same root in phonetics, phonograph, telephone, and symphony.

Speak.
It types.It answers.

Ten things, done well.

Local first.

Cloud fast.

It answers.

It knows your voice.

Agents, by voice.

Hardware aware.

It serves.

Various visualizations.

Web settings.

Open source.

Sixty seconds.

Four small pieces.

Capture

Transcribe

Cleanup

Think

Type

Speak

Runs on your machine.

Things people ask.

Stop typing.
Start saying.

Speak.It types.It answers.