ZenVox
Voice to text. Cleaned by AI. Yours to keep.
Voice-to-text shouldn't cost $16/month
Wispr Flow charges you $16/month to wrap Whisper in a pretty UI and clean your text with an API call. Your audio goes to their cloud. You can't choose your own LLM. You can't go offline. And you definitely can't inspect the source code.
ZenVox does the same thing for $0. Your audio never leaves your machine.
Press a hotkey. Talk. It types.
ZenVox sits in your system tray. When you press the hotkey, it starts recording. When you stop talking, it auto-stops via voice activity detection. Whisper transcribes locally, your chosen LLM cleans the output, and the result is pasted wherever your cursor is.
# 1. Press Ctrl+Alt+F12 Recording... listening via VAD # 2. Speak naturally — ZenVox detects silence and auto-stops Transcribing... Faster-Whisper (large-v3-turbo) # 3. AI cleans the output Cleaning... Gemini Flash Lite (free tier) # 4. Cleaned text auto-pasted at your cursor Done! 247 words cleaned and pasted in 3.2s
ZenVox vs the paid alternatives
| ZenVox | Wispr Flow | |
|---|---|---|
| Price | Free | $8-16/month |
| Audio leaves your machine | Never | Yes (cloud STT) |
| Choose your own LLM | 5 providers | Their API only |
| Fully offline option | Yes (Ollama) | No |
| Cleaning presets | 4 | 1 |
| Bilingual FR/EN | Native | English-centric |
| Searchable history | SQLite | No |
| Open source | Yes | No |
The $0/month voice-to-text stack
No paid API required. Here's what most people use:
| Component | What | Cost |
|---|---|---|
| Faster-Whisper | Speech-to-text, runs locally | Free |
| Gemini Flash Lite | AI text cleaning (1500 req/day free) | Free tier |
| ZenVox | Glues it together | Free, forever |
Want fully offline? Use Ollama as your cleaning provider. Zero API calls, everything stays on your machine.
Everything you need, nothing you don't
Transcription
- Faster-Whisper with 4 model sizes (tiny to large-v3-turbo)
- CUDA GPU acceleration (auto-detected)
- VAD silence detection — auto-stops when you stop talking
- Audio feedback beeps for start/stop
AI Cleaning
- 5 providers: Gemini, OpenAI, Anthropic, Groq, Ollama
- 4 presets: General, Technical, Minimal, Structured
- Per-provider API key storage (secure keyring)
- Fully offline with Ollama — zero external calls
Desktop App
- System tray with colored status dots
- Configurable hotkeys (record + re-paste)
- 3 output modes: auto-paste, clipboard, append to file
- Searchable SQLite transcription history
Built for Quebec
- Native bilingual FR/EN — handles Franglais naturally
- Accent-aware cleaning that preserves Quebec expressions
- One-command install (Python + GPU + dependencies)
- MIT licensed — yours to modify, extend, sell
Up and running in one command
git clone https://github.com/ZenSystemAI/ZenVox.git cd ZenVox python install.py # That's it. Installs venv, dependencies, detects GPU, creates shortcuts. # Launch from Start Menu or run: python zenvox.py
The installer detects your GPU, sets up a virtual environment, installs all dependencies, and creates desktop shortcuts. First-run experience guides you through provider setup.