pipecat_smart_turn_eos
Source Location
How It Works
- `UserAudioPacket` audio is accumulated in a rolling float32 buffer (max ~5 seconds at 16 kHz)
- When a final `SpeechToTextPacket` arrives, the model runs inference on the buffered audio
- The model outputs a turn-completion probability (0.0 – 1.0)
- If `probability >= threshold` → set timer to `quick_timeout` (caller is likely done)
- If `probability < threshold` → set timer to `silence_timeout` (caller is likely still speaking)
- Interim transcripts reset the timer to `fallback_timeout`
- When the timer fires, `EndOfSpeechPacket` is emitted
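The steps above can be sketched in Python as follows. This is an illustrative sketch only, not the project's implementation: the names `RollingAudioBuffer`, `EosTimeouts`, `timeout_after_final`, and `timeout_after_interim` are hypothetical, the defaults are taken from the Parameters table, and the buffer stores plain Python floats where a real implementation would hold float32 samples.

```python
from collections import deque
from dataclasses import dataclass
from typing import Iterable, Optional

SAMPLE_RATE_HZ = 16_000
MAX_BUFFER_SECONDS = 5
MAX_SAMPLES = SAMPLE_RATE_HZ * MAX_BUFFER_SECONDS  # 80,000 samples


class RollingAudioBuffer:
    """Hypothetical rolling buffer: keeps only the most recent samples."""

    def __init__(self, max_samples: int = MAX_SAMPLES) -> None:
        # A bounded deque silently drops the oldest samples once it is full.
        self._samples = deque(maxlen=max_samples)

    def append(self, chunk: Iterable[float]) -> None:
        self._samples.extend(chunk)

    def __len__(self) -> int:
        return len(self._samples)


@dataclass
class EosTimeouts:
    """Defaults copied from the Parameters table (milliseconds)."""

    threshold: float = 0.5
    quick_timeout_ms: int = 250
    silence_timeout_ms: int = 2000
    fallback_timeout_ms: int = 500


def timeout_after_final(probability: Optional[float], cfg: EosTimeouts) -> int:
    """Pick the timer duration after a final transcript.

    `probability` is the model's turn-completion score; None stands in
    for an inference failure, which uses the fallback timeout.
    """
    if probability is None:
        return cfg.fallback_timeout_ms
    if probability >= cfg.threshold:
        return cfg.quick_timeout_ms  # caller is likely done
    return cfg.silence_timeout_ms  # caller is likely still speaking


def timeout_after_interim(cfg: EosTimeouts) -> int:
    """Interim transcripts reset the timer to the fallback timeout."""
    return cfg.fallback_timeout_ms
```

With the defaults, a score of 0.8 selects the 250 ms timer and a score of 0.2 selects the 2000 ms timer; whichever timer is running when it fires would then trigger the `EndOfSpeechPacket`.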
Parameters
| Option Key | Default | Range | Description |
|---|---|---|---|
| `microphone.eos.threshold` | 0.5 | 0.1–0.9 | Turn completion probability threshold |
| `microphone.eos.quick_timeout` | 250 ms | 50–1000 ms | Silence buffer when model says “done” |
| `microphone.eos.silence_timeout` | 2000 ms | 500–5000 ms | Silence duration when model says “still speaking” |
| `microphone.eos.timeout` | 500 ms | 500–4000 ms | Fallback timeout for interim transcripts and inference failures |
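As a rough illustration of how these options might be resolved, the sketch below applies the documented defaults and rejects values outside the documented ranges. The helper `resolve_options` is hypothetical, and rejecting out-of-range values is an assumption: the actual component may clamp them or fall back to defaults instead.

```python
# Option key -> (default, minimum, maximum), copied from the table above.
# Timeouts are in milliseconds.
DEFAULTS_AND_RANGES = {
    "microphone.eos.threshold": (0.5, 0.1, 0.9),
    "microphone.eos.quick_timeout": (250, 50, 1000),
    "microphone.eos.silence_timeout": (2000, 500, 5000),
    "microphone.eos.timeout": (500, 500, 4000),
}


def resolve_options(options: dict) -> dict:
    """Hypothetical helper: fill in defaults and validate ranges."""
    resolved = {}
    for key, (default, low, high) in DEFAULTS_AND_RANGES.items():
        value = options.get(key, default)
        if not low <= value <= high:
            # Assumption: out-of-range values are treated as an error.
            raise ValueError(
                f"{key}={value} is outside the documented range {low}-{high}"
            )
        resolved[key] = value
    return resolved
```

For example, `resolve_options({"microphone.eos.threshold": 0.7})` keeps the custom threshold and fills the three timeouts with their defaults.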
Model Setup
Docker
The model is downloaded from Hugging Face during the Docker build and patched for ONNX Runtime compatibility. No manual action is required.
From Source
Download the model manually.
Requires the ONNX Runtime shared library (libonnxruntime), the same dependency as Silero/FireRed VAD.