TEN VAD is a native C library from the TEN Framework that provides frame-level speech probability scores with a fixed 256-sample hop size (16 ms at 16 kHz).

Provider identifier: `ten_vad`
Source Location
How It Works
- Incoming LINEAR16 bytes are converted to `int16` samples
- Samples are processed in fixed 256-sample frames (16 ms each)
- Each frame produces a speech probability score
- Speech onset/offset is tracked with the same hysteresis logic as Silero: speech ends only when the probability drops below `threshold - 0.15`
- Same packet emission pattern: `InterruptionPacket` on onset, `VadSpeechActivityPacket` heartbeats during speech
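
The steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the provider's implementation: the `score_frame` callback stands in for the native TEN VAD call, and `bytes_to_int16`, `frames`, `HysteresisTracker`, and `run` are hypothetical names. It assumes little-endian LINEAR16 input and leaves out the minimum speech/silence frame counters and packet emission.

```python
import struct
from typing import Callable, Iterator, List

FRAME_SAMPLES = 256   # fixed hop size from the text (16 ms at 16 kHz)
OFFSET_MARGIN = 0.15  # speech ends below threshold - 0.15


def bytes_to_int16(pcm: bytes) -> List[int]:
    """Decode LINEAR16 bytes into signed 16-bit samples (little-endian assumed)."""
    count = len(pcm) // 2
    return list(struct.unpack(f"<{count}h", pcm[: count * 2]))


def frames(samples: List[int]) -> Iterator[List[int]]:
    """Yield complete 256-sample frames; a trailing partial frame is dropped."""
    for start in range(0, len(samples) - FRAME_SAMPLES + 1, FRAME_SAMPLES):
        yield samples[start : start + FRAME_SAMPLES]


class HysteresisTracker:
    """Hypothetical onset/offset tracker with separate on and off thresholds."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.on_threshold = threshold
        self.off_threshold = threshold - OFFSET_MARGIN
        self.speaking = False

    def update(self, probability: float) -> str:
        """Return "onset", "offset", "speech", or "silence" for one frame."""
        if not self.speaking and probability >= self.on_threshold:
            self.speaking = True
            return "onset"
        if self.speaking and probability < self.off_threshold:
            self.speaking = False
            return "offset"
        return "speech" if self.speaking else "silence"


def run(
    pcm: bytes,
    score_frame: Callable[[List[int]], float],
    threshold: float = 0.5,
) -> List[str]:
    """Score every complete frame and report the tracker state per frame."""
    tracker = HysteresisTracker(threshold)
    return [tracker.update(score_frame(f)) for f in frames(bytes_to_int16(pcm))]
```

Using a lower offset threshold than the onset threshold keeps a score that hovers near `threshold` from toggling the speech state on every frame.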
Parameters
| Option Key | Default | Range | Description |
|---|---|---|---|
| `microphone.vad.threshold` | 0.5 | 0.3–1.0 | Speech probability threshold |
| `microphone.vad.min_silence_frame` | 20 | 1–30 | Silence frames before segment end (× 10 ms) |
| `microphone.vad.min_speech_frame` | 8 | 1–20 | Speech frames before segment start (× 10 ms) |
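
As a rough illustration of how these options might be checked before use, the sketch below applies the defaults and ranges from the table. The `resolve_vad_options` helper and the `VAD_OPTIONS` mapping are hypothetical. Whether the actual provider rejects, clamps, or ignores out-of-range values is not stated here, so raising an error is an assumption of this example.

```python
from typing import Dict, Mapping, Tuple

# (default, minimum, maximum) copied from the parameter table
VAD_OPTIONS: Dict[str, Tuple[float, float, float]] = {
    "microphone.vad.threshold": (0.5, 0.3, 1.0),
    "microphone.vad.min_silence_frame": (20, 1, 30),
    "microphone.vad.min_speech_frame": (8, 1, 20),
}


def resolve_vad_options(overrides: Mapping[str, float]) -> Dict[str, float]:
    """Merge overrides with table defaults; unknown keys are ignored."""
    resolved: Dict[str, float] = {}
    for key, (default, low, high) in VAD_OPTIONS.items():
        value = overrides.get(key, default)
        if not low <= value <= high:
            # Assumption: out-of-range values are treated as errors in this sketch.
            raise ValueError(f"{key}={value} is outside the range {low}-{high}")
        resolved[key] = value
    return resolved
```

For example, `resolve_vad_options({"microphone.vad.threshold": 0.7})` keeps the default frame counts and only changes the threshold.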
Shared Library
TEN VAD does not use an ONNX model. It requires the `libten_vad` shared library at both build time and runtime.
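
One way to check whether the library is visible on a host before starting the service is Python's standard `ctypes.util.find_library`. This is only a quick diagnostic sketch: the `find_ten_vad_library` helper is hypothetical, and a library installed outside the default linker search paths may not be reported even if the service itself is configured to find it.

```python
import ctypes.util
from typing import Optional


def find_ten_vad_library() -> Optional[str]:
    """Return what the system resolves for libten_vad, or None if not found."""
    # find_library takes the name without the "lib" prefix or file suffix.
    return ctypes.util.find_library("ten_vad")
```

A `None` result suggests the library is missing from the default search locations.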
Docker: The Dockerfile copies the library from the source tree: