

The voice pipeline controls how Rapida captures user audio, prepares it for transcription, detects when the user has finished speaking, and speaks the assistant response back to the user. Use this overview to understand the full flow. Use the dedicated pages in this section when you need to tune one part of the pipeline.
Voice input and voice output are configured per deployment. The same assistant can use different voice settings for Phone Call, Web Widget, and Web App / SDK deployments.

Configure it

Open your assistant, select Configure Assistant, then open Deployments. Voice settings appear inside each deployment that supports audio.
| Deployment | Voice input | Voice output | Notes |
| --- | --- | --- | --- |
| Phone Call | Required | Required | Caller audio and assistant speech are both required for live calls. |
| Web Widget | Optional | Optional | Can run as text-only, voice-input-only, voice-output-only, or full voice. |
| Web App / SDK | Optional | Optional | Your application controls the UI while Rapida handles the audio pipeline. |
| WhatsApp | Not used | Not used | WhatsApp uses text messages, not the voice pipeline. |
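
The rules in the table above can be written down as a small validation check, which can help when you keep deployment settings in your own tooling. This is a minimal sketch, not Rapida's API: the `DeploymentVoice` class, the `validate` function, and the deployment keys are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List

# Requirement levels copied from the table above.
# The identifiers themselves are illustrative, not Rapida settings.
REQUIRED, OPTIONAL, NOT_USED = "required", "optional", "not_used"

# deployment -> (voice input rule, voice output rule)
VOICE_RULES = {
    "phone_call": (REQUIRED, REQUIRED),
    "web_widget": (OPTIONAL, OPTIONAL),
    "web_app_sdk": (OPTIONAL, OPTIONAL),
    "whatsapp": (NOT_USED, NOT_USED),
}


@dataclass
class DeploymentVoice:
    """Hypothetical record of which voice directions a deployment enables."""
    deployment: str
    voice_input: bool
    voice_output: bool


def validate(cfg: DeploymentVoice) -> List[str]:
    """Return a list of problems; an empty list means the combination is allowed."""
    input_rule, output_rule = VOICE_RULES[cfg.deployment]
    checks = (
        ("voice_input", input_rule, cfg.voice_input),
        ("voice_output", output_rule, cfg.voice_output),
    )
    problems = []
    for name, rule, enabled in checks:
        if rule == REQUIRED and not enabled:
            problems.append(f"{name} is required for {cfg.deployment}")
        elif rule == NOT_USED and enabled:
            problems.append(f"{name} is not used for {cfg.deployment}")
    return problems
```

For example, `validate(DeploymentVoice("phone_call", True, False))` reports that voice output is missing, while any combination passes for `web_widget`.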

Pipeline components

Noise Cancellation

Clean background noise before VAD and STT process the user’s audio.

Speech-to-Text

Choose the provider, credential, model, and language used to transcribe user speech.

Text-to-Speech

Choose the provider, voice, model, language, pronunciation, and speech delivery settings.

Voice Activity Detection

Tune speech detection, silence frames, and barge-in sensitivity.

End of Speech Detection

Decide when the user has finished a turn and the assistant should respond.
| Area | Start with |
| --- | --- |
| STT | A streaming provider and model that matches your channel audio. |
| Noise cancellation | RNNoise enabled for phone calls and noisy browser environments. |
| VAD | Silero VAD. |
| EOS | Pipecat Smart Turn for natural conversations, or Silence-Based for simple IVR-style flows. |
| TTS | A low-latency streaming voice that supports the assistant’s primary language. |
| Prompt | Short spoken responses, usually one or two sentences. |
Tune the pipeline from real conversation logs. If a caller gets cut off, start with EOS and VAD. If transcription is wrong, check language, audio quality, noise cancellation, and STT model. If the assistant feels slow, check EOS timeout, LLM latency, and TTS latency.
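
To see why the EOS timeout trades cut-offs against responsiveness, it can help to look at how a generic silence-based end-of-speech rule behaves. The sketch below is an illustration of the general technique only and may differ from how Rapida's Silence-Based mode works internally. The function name, the 20 ms frame size, and the 600 ms timeout are assumptions made for the example.

```python
from typing import Iterable, Optional


def end_of_speech_index(
    frames: Iterable[bool],
    frame_ms: int = 20,             # assumed frame duration
    silence_timeout_ms: int = 600,  # assumed EOS timeout
) -> Optional[int]:
    """Return the index of the frame at which the turn is considered finished.

    `frames` holds one VAD decision per audio frame: True for speech,
    False for silence. The turn ends once speech has been heard and is
    followed by enough consecutive silent frames to cover the timeout.
    Returns None if that never happens.
    """
    needed = -(-silence_timeout_ms // frame_ms)  # ceiling division
    heard_speech = False
    silent_run = 0
    for i, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silent_run = 0  # any speech resets the silence counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= needed:
                return i
    return None


# A caller speaks, pauses for 3 frames (60 ms), speaks again, then stops.
frames = [False] * 5 + [True] * 10 + [False] * 3 + [True] * 5 + [False] * 40

patient = end_of_speech_index(frames, silence_timeout_ms=600)
hasty = end_of_speech_index(frames, silence_timeout_ms=40)
```

With the 600 ms timeout the short pause is ignored and the turn ends only after the final silence. With the 40 ms timeout the turn ends inside the mid-sentence pause, which is the "caller gets cut off" symptom. A longer timeout avoids that but adds the same amount of delay before the assistant can start responding.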

Troubleshooting map

| Symptom | First place to look |
| --- | --- |
| Assistant responds before the user is done | End of Speech Detection |
| Assistant interrupts on coughs or background noise | Voice Activity Detection and Noise Cancellation |
| Transcript is wrong or incomplete | Speech-to-Text |
| Assistant voice is slow, unnatural, or mispronounces terms | Text-to-Speech |
| Phone calls behave differently from web sessions | Deployment-level voice input/output settings |

Phone Call Deployment

Configure required voice input and output for phone calls.

Web Widget Deployment

Add optional microphone input and spoken responses to the website widget.

Web App / SDK Deployment

Build a custom voice interface while Rapida handles the audio pipeline.

Create an Assistant

Create the assistant before configuring deployment voice settings.