Piper TTS Integration for Offline Speech Synthesis
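Piper is a fast, open-source neural TTS engine from the Home Assistant team. It runs fully offline, its voice models range from tens to hundreds of megabytes, and inference on a CPU runs in real time or faster. It supports Ukrainian and more than 40 other languages and is distributed under the Apache 2.0 license.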
Characteristics
- Speed: real-time factor (RTF) of 0.1–0.5 on a modern CPU, meaning audio is generated 2–10 times faster than it plays back
- Quality: mean opinion score (MOS) of about 3.8 out of 5 for the best voices (below ElevenLabs, but acceptable for most use cases)
- Model size by quality level: low (about 30 MB), medium (about 60 MB), high (250 MB or more)
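The real-time factor quoted above is the time spent synthesizing divided by the duration of the audio produced, so values below 1.0 mean generation outpaces playback. A minimal sketch of that calculation follows, using only the Python standard library; the function names are illustrative and not part of Piper.

```python
import wave


def wav_duration_seconds(path):
    """Return the playback duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesizing / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds
```

For example, if producing a 10-second clip takes 1.5 seconds of wall-clock time, `real_time_factor(1.5, 10.0)` gives 0.15, which falls inside the 0.1–0.5 range listed above.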
Usage
echo "Привіт, це офлайн синтез мовлення." | piper --model uk_UA-tanya-medium.onnx --output_file speech.wav
A Python API is also available; it builds on piper-phonemize for phonemization and onnxruntime for model inference.
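When the goal is simply to call Piper from a Python program, one option is to wrap the command-line invocation shown above with `subprocess`. This is a sketch of that approach, not Piper's own Python API. It assumes a `piper` executable is on the PATH and uses only the `--model` and `--output_file` flags from the command above; `build_piper_command` and `synthesize` are hypothetical helper names.

```python
import subprocess


def build_piper_command(model_path, output_path, executable="piper"):
    """Build the argument list for the Piper CLI call shown in the text."""
    return [executable, "--model", model_path, "--output_file", output_path]


def synthesize(text, model_path, output_path, executable="piper"):
    """Pipe `text` to Piper on standard input and write a WAV file.

    Assumes the `piper` executable is installed and the model file exists.
    Raises subprocess.CalledProcessError if Piper exits with an error.
    """
    command = build_piper_command(model_path, output_path, executable)
    # Piper reads the text to speak from stdin, as in the echo pipeline.
    subprocess.run(command, input=text.encode("utf-8"), check=True)
    return output_path
```

Calling `synthesize("Привіт, це офлайн синтез мовлення.", "uk_UA-tanya-medium.onnx", "speech.wav")` would then be equivalent to the shell pipeline above. Passing the arguments as a list avoids shell quoting problems with arbitrary input text.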
Voices for Ukrainian Language
- uk_UA-tanya-medium: female voice, good quality
- uk_UA-pavlo-medium: male voice with a different timbre
Adding a custom voice requires 1–3 hours of recordings plus model training (Coqui VITS).
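Before starting to train a custom voice, it can help to check whether the collected recordings actually reach the 1–3 hour range mentioned above. The sketch below sums the duration of the WAV files in a folder using only the standard library. The flat directory of `.wav` files and the function names are assumptions made for illustration, not a dataset layout required by any training tool.

```python
import wave
from pathlib import Path


def total_recording_hours(directory):
    """Sum the duration of all .wav files directly inside `directory`, in hours."""
    total_seconds = 0.0
    for wav_path in sorted(Path(directory).glob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav:
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 3600.0


def meets_recording_target(hours, minimum_hours=1.0):
    """Return True if the dataset reaches the lower bound of the 1-3 hour range."""
    return hours >= minimum_hours
```

A dataset that reports, say, 0.4 hours would fail `meets_recording_target` and would need more recordings before training.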
Application
Typical uses include smart home voice notifications, offline chatbot TTS, industrial HMI panels, embedded systems (Raspberry Pi, Jetson Nano), and corporate systems with data residency requirements.
Comparison with Alternatives
| Feature | Piper | Coqui XTTS | ElevenLabs |
|---|---|---|---|
| Offline | Yes | Yes | No |
| Quality | Good | Excellent | Superior |
| Latency | <100 ms | 200–500 ms | 100–300 ms (API) |
| Custom voice | Difficult | Easy | Easy |







