An assistant is the core unit of Rapida. It packages everything needed to run a production voice AI conversation — the LLM and prompt, the voice pipeline (STT, VAD, TTS), knowledge sources, tools, and deployment channels — into a single versioned object. One assistant configuration drives every channel. The same prompt, model, and voice settings that handle your inbound phone calls also power your web widget and WhatsApp deployment. Change something once and it propagates everywhere.
Assistants are version-controlled. Every prompt or model change creates a new draft version. Versions must be explicitly released — live deployments are never changed automatically.

Anatomy of an assistant

Prompt & Model

The system prompt defines persona, scope, and behaviour. The LLM provider and model (OpenAI, Anthropic, Gemini, Mistral, Bedrock, or a custom AgentKit backend) power the reasoning. Model parameters — temperature, max tokens, stop sequences — are tunable per version.
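The per-version settings described above can be pictured as a single configuration object. The field names below are illustrative assumptions, not Rapida's documented schema:

```python
# Hypothetical shape of one assistant version (keys are assumptions,
# not Rapida's actual API).
assistant_version = {
    "version": "v2",
    "state": "draft",               # drafts must be released explicitly
    "system_prompt": (
        "You are a support agent for Acme Telecom. "
        "Answer only billing and account questions."
    ),
    "llm": {
        "provider": "anthropic",
        "model": "claude-sonnet",
        "temperature": 0.3,         # lower = more deterministic replies
        "max_tokens": 512,          # cap per-turn response length
        "stop_sequences": ["\nCaller:"],
    },
}
```

Because the version is a single object, releasing it updates every channel at once, per the versioning rules above.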

Voice Pipeline

Converts caller audio to text and assistant text back to speech. Configurable STT provider, VAD sensitivity, noise cancellation, end-of-speech detection, TTS provider, voice model, and pronunciation rules — independently tunable per deployment channel.
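Since each setting is tunable per channel, a deployment's pipeline is effectively the assistant-wide defaults plus channel overrides. A minimal sketch of that merge, with assumed key names:

```python
# Assistant-wide voice pipeline defaults (field names are illustrative).
default_pipeline = {
    "stt": {"provider": "deepgram", "language": "en-US"},
    "vad_sensitivity": 0.6,
    "noise_cancellation": True,
    "eos_timeout_ms": 700,          # end-of-speech detection window
    "tts": {"provider": "elevenlabs", "voice": "alloy", "speed": 1.0},
}

def pipeline_for(channel_overrides):
    """Merge per-channel overrides onto the assistant-wide defaults."""
    return {**default_pipeline, **channel_overrides}

# A phone deployment might shorten the EOS window for a snappier feel:
phone_pipeline = pipeline_for({"eos_timeout_ms": 500})
```

The shallow merge keeps unspecified settings (noise cancellation, TTS voice) shared across channels while letting each deployment override only what it needs.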

Knowledge Bases

One or more knowledge bases attached for retrieval-augmented generation. At call time, the assistant retrieves the most relevant document chunks and injects them as context before calling the LLM.
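The retrieve-then-inject step can be sketched as below. Real knowledge bases rank chunks with vector embeddings; simple keyword overlap stands in here for brevity:

```python
# Minimal sketch of retrieval-augmented prompting. Keyword overlap is a
# stand-in for embedding similarity; the chunks are invented examples.
def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks sharing the most words with the query."""
    q = set(query.lower().split())
    return sorted(chunks, key=lambda c: -len(q & set(c.lower().split())))[:top_k]

chunks = [
    "Refunds are processed within 5 business days.",
    "Our support line is open 9am to 5pm on weekdays.",
    "Premium plans include priority routing.",
]

context = retrieve("when will my refund be processed", chunks)
prompt = "Context:\n" + "\n".join(context) + "\n\nCaller: when will my refund arrive?"
```

The retrieved chunks are prepended as context, so the LLM answers from the knowledge base rather than from its parametric memory alone.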

Tools

Functions the LLM can invoke mid-conversation: query knowledge, call external APIs, invoke endpoint LLM prompts, hold a call, or end the session. The LLM decides when to call each tool based on its description.
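A tool is essentially a name, a description the LLM reads to decide when to call it, and a parameter schema. The sketch below mirrors common LLM function-calling formats; Rapida's exact schema may differ, and both tools here are invented examples:

```python
# Hedged sketch of tool definitions plus a dispatch step (stubbed
# implementations; names and schema shape are assumptions).
tools = {
    "lookup_order": {
        "description": "Fetch order status from the order API by order id.",
        "parameters": {"order_id": "string"},
    },
    "end_session": {
        "description": "End the call once the caller's issue is resolved.",
        "parameters": {"reason": "string"},
    },
}

def dispatch(tool_name, args):
    """Route an LLM-issued tool call to its implementation."""
    if tool_name == "lookup_order":
        return {"order_id": args["order_id"], "status": "shipped"}
    if tool_name == "end_session":
        return {"ended": True, "reason": args["reason"]}
    raise KeyError(f"unknown tool: {tool_name}")

result = dispatch("lookup_order", {"order_id": "A-1001"})
```

Because the LLM chooses tools from their descriptions, a precise one-line description is usually the highest-leverage part of a tool definition.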

Deployments

Each channel — phone, web widget, web app, WhatsApp — is a separate deployment attached to the assistant. Deployments share the assistant’s brain but can have per-channel voice and experience settings.

Webhooks

Webhooks fire at conversation start, completion, and failure. Deliver transcripts, metadata, and tool call results to your CRM, data warehouse, or alerting system in real time.
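On the receiving side, a consumer typically branches on the event type. The event names and payload fields below are illustrative assumptions; consult the actual webhook payload reference for the real schema:

```python
import json

# Sketch of a webhook consumer (event names and fields are invented).
def handle_event(raw_body):
    """Decode one webhook delivery and decide what to do with it."""
    event = json.loads(raw_body)
    if event["type"] == "conversation.completed":
        return ("store_transcript", event["transcript"])
    if event["type"] == "conversation.failed":
        return ("page_oncall", event.get("error", "unknown"))
    return ("ignore", None)

action, payload = handle_event(json.dumps({
    "type": "conversation.completed",
    "transcript": "Caller: hi\nAssistant: hello",
}))
```

Returning an action rather than performing the side effect inline keeps the handler easy to test and retry-safe.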

Post-call Analysis

Analysis pipelines run LLM prompts against completed transcripts to produce sentiment scores, intent labels, CSAT predictions, compliance flags, and custom metrics.
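One step of such a pipeline is simply a prompt template applied to the completed transcript. The metric names below are examples from the list above, not a fixed output schema:

```python
# Illustrative post-call analysis step: wrap a transcript in an
# analysis prompt that asks the LLM for structured JSON.
def build_analysis_prompt(transcript):
    return (
        "Rate the following call transcript.\n"
        "Return JSON with keys: sentiment (pos/neg/neutral), "
        "csat_prediction (1-5), compliance_ok (true/false).\n\n"
        + transcript
    )

prompt = build_analysis_prompt(
    "Caller: my bill is wrong\nAssistant: let me fix that"
)
```

The LLM's JSON response would then be parsed and stored alongside the conversation's logs as custom metrics.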

How a voice conversation works

Every conversation follows the same pipeline from audio-in to audio-out. Understanding this flow helps you tune latency, accuracy, and behaviour at each stage.
The EOS timeout (default 700ms) is the primary latency control between the caller finishing speaking and the assistant beginning to respond. Reduce it for snappy IVR-style interactions; increase it for conversational use cases where callers pause mid-thought.
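The EOS mechanic can be illustrated with a toy detector: the turn ends once accumulated silence exceeds the timeout. Timings here are simulated frame flags, not real audio:

```python
# Toy end-of-speech (EOS) detector. Real detection runs on audio frames
# from the VAD; here each frame is just a speech/silence boolean.
def detect_eos(frame_is_speech, eos_timeout_ms, frame_ms=20):
    """Return the frame index at which EOS fires, or None if it never does."""
    silence_ms = 0
    for i, speech in enumerate(frame_is_speech):
        silence_ms = 0 if speech else silence_ms + frame_ms
        if silence_ms >= eos_timeout_ms:
            return i
    return None

# 10 speech frames followed by sustained silence:
frames = [True] * 10 + [False] * 60
fast = detect_eos(frames, eos_timeout_ms=500)  # IVR-style: responds sooner
slow = detect_eos(frames, eos_timeout_ms=700)  # tolerates mid-thought pauses
```

A shorter timeout fires earlier after the caller stops, cutting response latency, but risks interrupting a caller who is merely pausing.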

Deployment channels

The same assistant is deployable across every channel. Each deployment is independently configured for voice settings and conversation experience while sharing the assistant’s prompt, model, and tools.

The assistant lifecycle

1. Create — Define the assistant with an LLM provider and initial system prompt. The first version (v1) is created automatically in draft state.
2. Configure — Attach knowledge bases, add tools, tune the voice pipeline, and set up deployments. Each deployment channel has its own voice settings and experience configuration.
3. Test — Use the built-in Debugger deployment to run live conversations before any real traffic hits the assistant. Inspect transcripts, latency breakdowns, and tool invocations.
4. Release — Promote a version from draft to live. All active deployments switch to the released version immediately.
5. Monitor — Every conversation generates structured logs: full transcript, per-turn latency, tool call results, LLM token usage, and EOS timing. Webhook events and analysis pipeline outputs flow to your downstream systems.
6. Iterate — Create a new version with updated prompt or model parameters. Test in the Debugger. Release when confident. Previous versions are preserved and can be re-released for instant rollback.
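The release and rollback flow in the lifecycle above amounts to a small state machine over versions. The class and method names below are illustrative, not Rapida's SDK:

```python
# Sketch of the version lifecycle: draft -> live, with previous
# versions preserved so any of them can be re-released for rollback.
class Assistant:
    def __init__(self):
        self.versions = {"v1": "draft"}  # v1 created automatically
        self.live = None

    def release(self, v):
        """Promote a version to live; the old live version is preserved."""
        if self.live is not None:
            self.versions[self.live] = "released"
        self.versions[v] = "live"
        self.live = v

    def new_version(self):
        """Create the next draft version for iteration."""
        v = f"v{len(self.versions) + 1}"
        self.versions[v] = "draft"
        return v

a = Assistant()
a.release("v1")
v2 = a.new_version()   # iterate on the prompt or model
a.release(v2)
a.release("v1")        # instant rollback: re-release a previous version
```

Because no version is ever deleted, rollback is just another release of an older version, which is what makes step 6 safe.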
Use separate assistants for distinct products or personas rather than a single assistant with complex conditional logic in the prompt. Assistants are cheap to create — isolation keeps prompts focused and version history clean.
