Guide
Whisper on Mac: From CLI to One-Click Dictation
Whisper is one of the most accurate open speech recognition models available. Here's how to run it on your Mac, the easy way and the hard way.
What Is Whisper?
Whisper is a speech recognition model originally developed by OpenAI. It's open source, remarkably accurate across dozens of languages, and can run entirely on local hardware — no cloud required. It quickly became the go-to model for anyone who cares about accuracy and privacy.
On Mac, Whisper is most commonly used through whisper.cpp, a high-performance C/C++ implementation that runs natively on Apple Silicon. It's fast and efficient: on M1/M2/M3/M4 chips it runs inference on the GPU through Metal, and it can optionally run the encoder on the Neural Engine through Core ML.
The CLI Way (whisper.cpp)
If you're technical, you can install whisper.cpp via Homebrew and run transcriptions from the command line:
- Install: brew install whisper-cpp
- Download a model (e.g., large-v3-turbo)
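- Record audio to a file (16 kHz, 16-bit WAV is the safest input format)
- Run: whisper-cli -m model.bin -f audio.wav (older Homebrew builds name the binary whisper-cpp)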
- Copy the output text and paste it where you need it
This works, and the transcription quality is excellent. But it's a multi-step process: record, transcribe, copy, paste. There's no real-time dictation, no cursor insertion, and no vocabulary learning. It's a tool for developers, not a product for everyone.
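The steps above can be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not a reference script: it assumes ffmpeg and whisper-cpp are installed through Homebrew, that the transcription binary is named whisper-cli (or whisper-cpp on older builds), and that WHISPER_MODEL, a variable name made up for this example, points at a downloaded ggml model file. The function names are hypothetical too.

```shell
# Sketch only. WHISPER_MODEL and both function names are made up for this example.

# Pick the transcription binary: try whisper-cli first, then fall back to
# whisper-cpp, the name used by older Homebrew builds.
whisper_bin() {
  command -v whisper-cli || command -v whisper-cpp
}

# transcribe_to_clipboard <audio-file>
# Converts the recording to 16 kHz mono 16-bit WAV, transcribes it,
# copies the text to the macOS clipboard, and prints it.
transcribe_to_clipboard() {
  input="$1"
  bin="$(whisper_bin)" || { echo "whisper.cpp binary not found" >&2; return 1; }
  workdir="$(mktemp -d)" || return 1
  status=0

  # -otxt writes a plain-text transcript; -of sets its path without the extension.
  ffmpeg -loglevel error -y -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le "$workdir/audio.wav" &&
    "$bin" -m "$WHISPER_MODEL" -f "$workdir/audio.wav" -otxt -of "$workdir/transcript" >/dev/null &&
    pbcopy < "$workdir/transcript.txt" &&
    cat "$workdir/transcript.txt" || status=$?

  rm -rf "$workdir"
  return "$status"
}
```

After sourcing the file and setting WHISPER_MODEL, running transcribe_to_clipboard memo.m4a should leave the transcript on the clipboard, ready to paste. The workflow is still file-based: nothing here provides hold-to-talk or cursor insertion.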
The App Way (Arugula)
Arugula wraps the same whisper.cpp engine in a native Mac app. The difference is the experience:
- Hold a key (Right Option by default)
- Speak
- Release — text appears at your cursor
That's it. No recording files, no terminal, no copy-paste. Arugula lives in your menu bar and works in every app — email, Slack, Google Docs, your code editor, anywhere you can place a cursor.
whisper.cpp (CLI)
- Requires Homebrew + terminal
- File-based transcription
- Manual copy-paste workflow
- No real-time dictation
- No vocabulary learning
- Free and open source
Arugula (Mac app)
- Install and go — no setup
- Real-time hold-to-talk
- Inserts text at cursor
- Works in every app
- Learns your vocabulary
- Free
Why Run Whisper Locally?
There are cloud services that use Whisper (or similar models) to transcribe audio. They're accurate, but they require sending your voice to someone else's servers. Running Whisper locally on your Mac gives you:
- Complete privacy — Your audio never leaves your machine. No servers, no data retention policies, no breach risk.
- No network latency — There is no round-trip to a server, so transcription starts the moment you stop speaking.
- Offline capability — Works on airplanes, in basements, at cabins with no Wi-Fi.
- No subscription — Cloud transcription APIs charge per minute. Local is free forever.
Performance on Apple Silicon
Whisper runs exceptionally well on Apple Silicon Macs. The M1 chip was good; the M2, M3, and M4 are even faster. Here's what to expect:
- Transcription speed — Under 1 second for typical dictation (10-30 second clips) on any Apple Silicon Mac
- Memory usage — The large-v3-turbo model uses about 1.5 GB of RAM when loaded
- Battery impact — Minimal when idle. A brief burst of CPU and GPU work during transcription, then back to idle.
- Model loading — First transcription after launch takes a few seconds to load the model. Subsequent transcriptions are near-instant.
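One way to check these numbers on your own machine is the real-time factor: processing time divided by audio length. The helper below is a sketch, and its name and the sample figures are illustrative rather than measurements. A value below 1 means transcription runs faster than real time.

```shell
# realtime_factor <audio_seconds> <elapsed_seconds>
# Prints elapsed / audio rounded to two decimals.
# Fails (non-zero exit, no output) when the audio length is not positive.
realtime_factor() {
  awk -v audio="$1" -v elapsed="$2" 'BEGIN {
    if (audio <= 0) { exit 1 }
    printf "%.2f\n", elapsed / audio
  }'
}

# Example with made-up numbers: a 20-second clip transcribed in 1 second.
#   realtime_factor 20 1    # prints 0.05
```

To get the elapsed time, prefix the transcription command with the shell's time keyword and read the "real" figure. Time the second run rather than the first, since the first one includes loading the model.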
Same model, better experience
Arugula runs the same Whisper models through the same whisper.cpp engine, so you get the same accuracy, the same privacy, and the same local processing. The difference is that Arugula turns it into a dictation app: hold a key, speak, release. Your words appear wherever you're typing. No terminal required.
What About Other Whisper Mac Apps?
Several apps have built on Whisper for Mac transcription. What makes Arugula different:
- Vocabulary learning — Arugula watches your corrections and builds a personal vocabulary over time. Other Whisper wrappers treat every session as the first.
- Hold-to-talk — Most Whisper apps use file-based or toggle-based recording. Arugula's hold-to-talk model is faster and more precise.
- Completely free — No subscription, no in-app purchases, no premium tier.
- Zero telemetry — Some Whisper apps collect usage data or require accounts. Arugula collects nothing.
For a detailed comparison across all dictation options, see our full comparison page.
System Requirements
- macOS 15 (Sequoia) or later
- Apple Silicon (M1, M2, M3, or M4)
Intel Macs can run whisper.cpp from the CLI but won't run Arugula, which requires Apple Silicon. Learn more about how speech recognition works on Mac.
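The two requirements above can be checked from a terminal. This is a minimal sketch with a hypothetical function name. It takes the architecture and macOS version as arguments so the logic is easy to follow. On a real Mac those values would come from uname -m and sw_vers -productVersion.

```shell
# meets_requirements <arch> <macos_version>
# Succeeds when the architecture is arm64 (Apple Silicon) and the
# macOS major version is 15 (Sequoia) or later.
meets_requirements() {
  arch="$1"
  major="${2%%.*}"   # keep only the part before the first dot, e.g. 15.1 -> 15
  [ "$arch" = "arm64" ] && [ "$major" -ge 15 ]
}

# Typical use on the Mac being checked:
#   if meets_requirements "$(uname -m)" "$(sw_vers -productVersion)"; then
#     echo "This Mac meets the listed requirements"
#   fi
```

One caveat: a terminal running under Rosetta reports x86_64 from uname -m even on Apple Silicon, so run the check from a native shell.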