AgentKit lets you run your own LLM logic on your own server. Instead of using Rapida’s built-in LLM endpoint routing, Rapida calls your gRPC service in real time: it streams user speech transcripts to your server, and your server streams back assistant responses, which are synthesized to audio and played to the caller.

This gives you complete control over how each response is generated.
User speaks
  ↓ (Rapida transcribes via STT)
Your AgentKit server receives TalkInput (user text)
  ↓ (your code calls OpenAI / Anthropic / custom model)
Your server streams back TalkOutput (assistant text chunks)
  ↓ (Rapida synthesizes via TTS and plays to user)
User hears the response
The protocol is a bidirectional gRPC stream. Rapida manages the audio pipeline (VAD, STT, TTS, telephony) — your server only handles text in / text out.
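The text-in / text-out loop can be sketched in plain Python without any gRPC dependency. This is only an illustration of the streaming shape: the `TalkInput` and `TalkOutput` dataclasses below are hypothetical stand-ins for the real protobuf messages (their fields, including `final`, are assumptions), and `fake_model` is a placeholder for whatever streaming LLM call your server makes.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class TalkInput:  # hypothetical stand-in for the real protobuf message
    message_id: str
    text: str


@dataclass
class TalkOutput:  # hypothetical stand-in for the real protobuf message
    message_id: str
    text: str
    final: bool = False  # assumed end-of-turn flag, not part of the source


def fake_model(prompt: str) -> Iterator[str]:
    """Placeholder for a streaming LLM call; yields the reply word by word."""
    for word in f"You said: {prompt}".split():
        yield word + " "


def talk(request_iterator: Iterable[TalkInput]) -> Iterator[TalkOutput]:
    """Consume user text as it arrives and stream assistant text chunks back."""
    for request in request_iterator:
        for chunk in fake_model(request.text):
            yield TalkOutput(request.message_id, chunk)
        # Emit an empty chunk flagged as final to close out this turn.
        yield TalkOutput(request.message_id, "", final=True)


replies = list(talk([TalkInput("m1", "hello there")]))
reply_text = "".join(r.text for r in replies)
```

Because `talk` is a generator, each chunk is handed to the caller as soon as it is produced, which mirrors how a streaming servicer method would let speech synthesis start before the full reply is ready.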
In the Rapida dashboard, when configuring your assistant’s LLM provider, select AgentKit and enter the address of your server (e.g. my-server.example.com:50051).
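Before entering the address, it can help to check that it is well formed and that the port accepts connections. The helpers below are a minimal sketch using only the standard library; `split_address` and `is_reachable` are hypothetical names, and a successful TCP connection only shows that something is listening on the port, not that the gRPC service behind it works.

```python
import socket
from typing import Tuple


def split_address(address: str) -> Tuple[str, int]:
    """Split 'host:port' into (host, port); raise ValueError if malformed."""
    host, sep, port_text = address.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port, got {address!r}")
    port = int(port_text)  # raises ValueError if the port is not a number
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return host, port


def is_reachable(address: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the address can be opened."""
    host, port = split_address(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `split_address("my-server.example.com:50051")` yields the host and the numeric port, and `is_reachable` can be run from a machine outside your network to confirm the port is exposed.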
Rapida can execute tools on your behalf (knowledge retrieval, endpoint invocation) and send results back to your server. Here’s how to request a tool call and receive its result:
for request in request_iterator:
    if self.is_configuration_request(request):
        yield self.configuration_response(request.configuration)
        continue

    user_text = self.get_user_text(request)
    msg_id = self.get_message_id(request)

    # Request a tool call from Rapida
    yield self.tool_call(
        msg_id=msg_id,
        tool_id="tool_001",
        name="get_weather",
        args={"location": "London"},
    )

    # The next message will contain the tool result.
    # Pass a default so a closed stream does not raise StopIteration
    # inside this generator (which Python turns into a RuntimeError).
    tool_result_request = next(request_iterator, None)
    if tool_result_request is None:
        break  # stream closed before the tool result arrived

    # Process tool result and continue...
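The request/result round trip can be tried out in isolation with a small simulation. This is a sketch under stated assumptions: the dictionary message shapes (`type`, `id`, `result`, and so on) and the `handle` function are hypothetical and stand in for the real protobuf messages and servicer method. The point it illustrates is that the loop and the explicit `next()` call must share one iterator.

```python
from typing import Any, Dict, Iterable, Iterator


def handle(requests: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Ask for a tool call, read its result from the same stream, then answer."""
    request_iterator = iter(requests)  # one shared iterator for both reads
    for request in request_iterator:
        msg_id = request["id"]
        # Ask the platform to run the tool on our behalf.
        yield {
            "type": "tool_call",
            "id": msg_id,
            "tool_id": "tool_001",
            "name": "get_weather",
            "args": {"location": "London"},
        }
        # The next inbound message is expected to carry the tool result.
        tool_result = next(request_iterator, None)
        if tool_result is None:
            return  # stream closed before the result arrived
        yield {
            "type": "text",
            "id": msg_id,
            "text": f"Weather in London: {tool_result['result']}",
        }


inbound = [
    {"id": "m1", "text": "What's the weather in London?"},
    {"id": "m1", "result": "12°C, cloudy"},  # simulated tool result
]
outbound = list(handle(inbound))
truncated = list(handle(inbound[:1]))  # stream ends before the tool result
```

Here `outbound` holds the tool call followed by the final text reply, while `truncated` holds only the tool call, because the handler exits cleanly when no result arrives.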