STT Configuration

File: configs/stt.yaml
Command: kenzy-stt [config_path]

The STT service accepts POST requests with base64-encoded PCM audio and returns a transcript. It is built on faster-whisper, a CTranslate2-optimized implementation of OpenAI Whisper.

Pulled from the server

kenzy-stt pulls this config from the server at boot. It discovers the server via mDNS (or KENZY_SERVER_URL) and blocks until the server answers, so start the server first. To change the config, edit it from the dashboard's Services tab, which writes configs/services/stt.yaml on the server and restarts the service. Passing an explicit path (kenzy-stt configs/stt.yaml) loads the file locally instead, as a dev/offline escape hatch. See central config for backend services.

Full reference

Key	Default	Description
`host`	`"127.0.0.1"`	Bind address
`port`	`8767`	HTTP port
`log_level`	`"info"`	What the service prints to its console
`log_capture_level`	`"debug"`	How deep the dashboard log viewer can see (`trace`/`debug`/…), independent of `log_level`
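
Because the two logging keys are independent, the console can stay quiet while the dashboard log viewer still captures more detail. The fragment below is a small sketch of that combination; it uses only level names that appear in the table above.

```yaml
# Logging sketch: modest console output, detailed dashboard capture.
log_level: "info"           # what the service prints to its console
log_capture_level: "trace"  # depth available to the dashboard log viewer
```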

Whisper model

Key	Default	Description
`whisper.model`	`"tiny"`	Model size: `tiny`, `base`, `small`, `medium`, `large-v2`, `large-v3`. Larger models are more accurate but slower and need more RAM.
`whisper.device`	`"cpu"`	Inference device: `cpu` or `cuda`
`whisper.compute_type`	`"int8"`	Quantisation: `int8` (fastest on CPU), `float16` (GPU), `float32` (highest quality)
`whisper.language`	`"en"`	Language code (e.g. `"en"`, `"fr"`), or `null` for auto-detect
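
On a machine with an NVIDIA GPU, the keys above might be combined as follows. This is a sketch rather than a tested recommendation: the choice of `medium` is an assumption made for illustration, and the fragment presumes the host already has a working CUDA setup for faster-whisper.

```yaml
# GPU sketch: all values are taken from the options listed in the table above.
whisper:
  model: "medium"          # larger model; see the size guide for trade-offs
  device: "cuda"           # run inference on the GPU
  compute_type: "float16"  # the option the table lists for GPU use
  language: null           # null enables language auto-detection
```

Setting `language` to a fixed code such as `"en"` skips auto-detection.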

Model size guide

Model	Size	Relative speed	Notes
`tiny`	~75 MB	Fastest	Good for low-powered hardware or simple commands
`base`	~145 MB	Fast	Better accuracy, still CPU-friendly
`small`	~460 MB	Moderate	Good balance for a dedicated CPU server
`medium`	~1.5 GB	Slow on CPU	Recommended with a GPU
`large-v3`	~3 GB	Slow	Best accuracy; GPU strongly recommended

Run STT off the node

Don't run STT on a room-node board (Orange Pi Zero 3 / Raspberry Pi 3–5). Run it on a more powerful server and point stt.url in server.yaml at it. The tiny or base model on a modern x86 CPU gives acceptable latency.
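
A remote setup along these lines involves two files, sketched below as two YAML documents. Several details are assumptions: the address `192.168.1.50` is a hypothetical LAN IP for the STT machine, the nested `stt:` / `url:` layout is inferred from the dotted key name `stt.url`, and the exact URL form the server expects (scheme, trailing path) is not confirmed here. One point is general networking behaviour rather than anything specific to this project: a service bound to `127.0.0.1` only accepts connections from its own machine, so the STT host needs a non-loopback bind address to be reachable from elsewhere.

```yaml
# STT service config (configs/services/stt.yaml on the server when managed centrally)
host: "0.0.0.0"   # listen on all interfaces instead of loopback only
port: 8767

whisper:
  model: "base"
  device: "cpu"
  compute_type: "int8"
  language: "en"
---
# server.yaml (assumed layout for the stt.url key)
stt:
  url: "http://192.168.1.50:8767"   # hypothetical address of the STT machine
```

Binding to `0.0.0.0` exposes the port to the whole network, so it is worth restricting access with a firewall or binding to a specific interface address instead.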

Example

host: "127.0.0.1"
port: 8767

whisper:
  model: "base"
  device: "cpu"
  compute_type: "int8"
  language: "en"