Getting Started
To integrate Cartesia with your Rapida application for speech-to-text (STT) and text-to-speech (TTS), review the supported models and languages, complete the prerequisites, and then set up your provider credentials.
Supported Models
Speech-to-Text Models
| Model Name | Language | Description |
|---|---|---|
| Sonic English | English | High-accuracy English speech recognition |
| Sonic Multilingual | Multilingual | Support for multiple languages |
Text-to-Speech Models
| Model Name | Features | Best For |
|---|---|---|
| Bark | Expressive speech synthesis | Natural-sounding voices with emotion |
| Tortoise | High-quality TTS | Professional voice applications |
| Sonic XL | Ultra-realistic voices | High-fidelity voice synthesis |
Supported Languages for STT
- English (US, UK, Australian variants)
- Spanish, French, German
- Mandarin, Japanese, Korean
- And more
Supported Languages for TTS
Cartesia supports 20+ languages for text-to-speech synthesis, with multiple voice options.
Prerequisites
- Create a Cartesia account (sign up at https://cartesia.ai)
- Sign in and navigate to your API dashboard
- Generate an API key
- Copy the API key and store it securely; you will need it when creating the provider credential
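Before pasting the key into Rapida, it can help to confirm that the key is stored where you expect and is well-formed. The following is a minimal sketch, not part of Rapida or an official Cartesia client. The environment variable name `CARTESIA_API_KEY`, the helper names, the `https://api.cartesia.ai/voices` endpoint, and the `X-API-Key` / `Cartesia-Version` header values are assumptions for illustration; check them against Cartesia's API reference before relying on them. The sketch only builds the request and does not send it.

```python
import os
import urllib.request

# Assumed values: verify against Cartesia's API reference.
CARTESIA_VOICES_URL = "https://api.cartesia.ai/voices"
CARTESIA_VERSION = "2024-06-10"


def load_api_key(env_var: str = "CARTESIA_API_KEY") -> str:
    """Read the API key from an environment variable (hypothetical name)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set or is empty")
    return key


def mask_key(key: str, visible: int = 4) -> str:
    """Return the key with all but the last few characters hidden, for logging."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


def build_check_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that would exercise the key."""
    return urllib.request.Request(
        CARTESIA_VOICES_URL,
        headers={
            "X-API-Key": api_key,                  # assumed auth header
            "Cartesia-Version": CARTESIA_VERSION,  # assumed version header
        },
        method="GET",
    )
```

To use it, export the key in your shell, call `load_api_key()`, and pass the result to `build_check_request()`. Sending the request with `urllib.request.urlopen` and getting a successful response would suggest the key is valid, assuming the endpoint and headers above are correct. Keeping the key in an environment variable or secret manager, and logging only `mask_key(key)`, avoids leaving the full key in source files or logs.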
Setting Up Provider Credentials
Access the Integrations Page

Select Cartesia
On the Integrations page, find the Cartesia provider card. Click the “Setup Credential” button for Cartesia.
Create Provider Credential
A modal window titled “Create provider credential” appears. Follow these steps:
- Select “Cartesia” from the dropdown (if not already selected)
- Enter a Key Name: Assign a unique name to this provider key for easy identification
- Enter the API Key: Input your Cartesia API key
- Click “Configure” to save the credential
Verify Credential Setup
After setting up the credential, verify that it has been added:
- The Cartesia provider card should now show “Connected”
- If you click on the provider, you’ll see a “View provider credential” modal
- This modal displays the credential name, when it was last updated, and options to delete or close
Integration Features
- Unified Platform: Both STT and TTS from a single provider
- High-Quality Audio: Professional-grade voice synthesis
- Real-time Processing: Low-latency speech processing
- Multiple Languages: Comprehensive language support
- Voice Customization: Create custom voices and speaking styles
- Streaming Support: Real-time streaming for both STT and TTS