Port: 9007
Technology: Go + gRPC
Language: Go 1.25+
Primary Database: PostgreSQL (assistant_db)
Search Engine: OpenSearch
Purpose
The Assistant API handles:
- Voice assistant creation and management
- Real-time conversation streaming and management
- Audio processing (VAD, encoding/decoding)
- LLM integration and inference
- Speech-to-Text (STT) orchestration
- Text-to-Speech (TTS) orchestration
- Conversation persistence and searching
- Real-time WebSocket communication
Key Features
Assistant Management
- Create and configure voice assistants
- Version control for assistant configurations
- Tool and knowledge base integration
- Provider credential selection
- Testing and validation
Real-time Conversations
- WebSocket-based audio streaming
- Low-latency bidirectional communication
- Voice activity detection (silence handling)
- Concurrent conversation handling (1000+ simultaneous conversations)
- Session management and recovery
Audio Processing
- Audio encoding/decoding (PCM, WAV, MP3)
- Sample rate conversion (8kHz, 16kHz, 48kHz)
- Voice activity detection (VAD)
- Noise filtering and normalization
- Audio quality metrics
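As an illustration of the voice activity detection item above, here is a minimal energy-based sketch. It assumes 16-bit PCM frames and a fixed RMS threshold; the names `frameRMS` and `isSpeech` and the threshold value are hypothetical and not taken from the service's code, which may use a different detector entirely.

```go
package main

import (
	"fmt"
	"math"
)

// frameRMS returns the root-mean-square amplitude of a frame of 16-bit PCM
// samples, normalized to the range [0, 1].
func frameRMS(frame []int16) float64 {
	if len(frame) == 0 {
		return 0
	}
	var sum float64
	for _, s := range frame {
		v := float64(s) / 32768.0
		sum += v * v
	}
	return math.Sqrt(sum / float64(len(frame)))
}

// isSpeech reports whether the frame's energy exceeds the threshold.
// The threshold is an illustrative tuning knob, not a documented setting.
func isSpeech(frame []int16, threshold float64) bool {
	return frameRMS(frame) > threshold
}

func main() {
	// 20 ms of audio at 8 kHz is 160 samples.
	silence := make([]int16, 160)
	tone := make([]int16, 160)
	for i := range tone {
		tone[i] = int16(8000 * math.Sin(2*math.Pi*440*float64(i)/8000))
	}
	fmt.Println(isSpeech(silence, 0.02), isSpeech(tone, 0.02))
}
```

A lower threshold catches quiet speech but lets more background noise through, which is why noise filtering and normalization usually run before the detector.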
LLM Integration
- Support for multiple LLM providers
- Token counting and cost estimation
- Response streaming
- Tool/function calling
- Context window management
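Context window management can be sketched as trimming the oldest turns until the history fits a token budget. The example below is an assumption-laden illustration: `Message`, `estimateTokens`, and `trimToBudget` are hypothetical names, and the word-count estimate stands in for a real provider tokenizer.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical chat turn.
type Message struct {
	Role    string
	Content string
}

// estimateTokens is a crude stand-in for a real tokenizer:
// it counts one token per whitespace-separated word.
func estimateTokens(m Message) int {
	return len(strings.Fields(m.Content))
}

// trimToBudget always keeps a leading system message, then keeps as many
// of the most recent messages as fit in the remaining budget.
func trimToBudget(history []Message, budget int) []Message {
	var system []Message
	rest := history
	if len(history) > 0 && history[0].Role == "system" {
		system = history[:1]
		rest = history[1:]
		budget -= estimateTokens(history[0])
	}
	start := len(rest)
	for i := len(rest) - 1; i >= 0; i-- {
		cost := estimateTokens(rest[i])
		if cost > budget {
			break
		}
		budget -= cost
		start = i
	}
	out := make([]Message, 0, len(system)+len(rest)-start)
	out = append(out, system...)
	out = append(out, rest[start:]...)
	return out
}

func main() {
	history := []Message{
		{"system", "be brief"},
		{"user", "one two three"},
		{"assistant", "four five"},
		{"user", "six"},
	}
	for _, m := range trimToBudget(history, 5) {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```

Dropping whole turns from the oldest end keeps the most recent exchange intact; summarizing dropped turns is a common alternative that this sketch does not attempt.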
STT/TTS Orchestration
- Support for multiple STT providers
- Support for multiple TTS providers
- Provider fallback mechanisms
- Latency optimization
- Language and dialect support
Configuration
Environment Variables
Source Code Structure
Performance Optimization
Concurrency Handling
- Supports 1000+ concurrent conversations
- Non-blocking I/O with Go goroutines
- Connection pooling for the database
- Message queue for heavy operations
Latency Optimization
- Streaming responses from LLM
- Parallel STT/TTS processing
- Local caching of provider responses
- Audio buffer optimization
Resource Management
- Memory pooling for buffers
- Garbage collection tuning
- Database connection limits
- Redis connection pooling
Monitoring and Observability
Metrics
Track per conversation:
- Duration
- Token usage
- LLM latency
- STT/TTS latency
- Error rates
- Audio quality
Logging
Structured logs with:
- Conversation ID
- Message ID
- Provider latencies
- Error details
- Audio metrics
Health Checks
Troubleshooting
WebSocket Connection Refused
LLM Provider Timeout
- Check that integration-api is accessible
- Verify API keys are stored correctly
- Increase timeout in configuration
- Check network latency to provider
Audio Quality Issues
- Verify that the sample rate matches the expected value
- Check VAD threshold settings
- Monitor CPU usage for encoding
- Verify audio buffer sizes