Getting Started
To integrate Google Speech Service with your Rapida application for speech-to-text (STT), review the supported models and languages, complete the prerequisites, and then set up your provider credentials.
Supported Models
Google Speech Service offers several models optimized for different use cases.
Speech-to-Text Models
| Model Name | Description | Best For |
|---|---|---|
| default | Google’s standard speech recognition model | General purpose speech recognition |
| enhanced | Enhanced model with better accuracy | Noisy environments and accents |
| phone_call | Optimized for phone call audio | Telephony applications |
| video | Optimized for video audio | Video content transcription |
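As a rough illustration of how the table might be used in application code, the sketch below maps a use case to one of the model names listed above. The `pick_model` helper and the use-case keys are hypothetical and are not part of any Rapida or Google API; only the model names come from the table.

```python
# Hypothetical helper: the use-case keys are illustrative. Only the model
# names ("default", "enhanced", "phone_call", "video") come from the table.
MODEL_BY_USE_CASE = {
    "general": "default",
    "noisy": "enhanced",
    "telephony": "phone_call",
    "video": "video",
}


def pick_model(use_case: str) -> str:
    """Return the model name for a use case, falling back to "default"."""
    return MODEL_BY_USE_CASE.get(use_case.strip().lower(), "default")


print(pick_model("telephony"))  # phone_call
print(pick_model("podcast"))    # default (unknown use case)
```

Falling back to `default` for unknown use cases keeps the choice safe when the audio source is not known in advance.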
Languages Supported
Google Speech Service supports 125+ languages. Some popular ones include:
- English (US, UK, Australia, etc.)
- Spanish, French, German, Italian
- Mandarin, Cantonese, Japanese, Korean
- Hindi, Portuguese, Russian, Arabic
Prerequisites
- Have a Google Cloud account
- Create a new Google Cloud Project or use an existing one
- Enable the Speech-to-Text API in your Google Cloud Project
- Create a service account with Speech-to-Text permissions
- Download the service account key (JSON file)
Setting Up Provider Credentials
Access the Integrations Page

Select Google Speech Service
On the Integrations page, find the Google Speech Service provider card. Click the “Setup Credential” button for Google Speech Service.
Create Provider Credential
A modal window will appear titled “Create provider credential”. Follow these steps:
- Select “Google Speech Service” from the dropdown (if not already selected)
- Enter a Key Name: Assign a unique name to this provider key for easy identification
- Enter Project ID: Input your Google Cloud Project ID
- Enter Service Account Key: Paste the entire JSON content of your service account key
- Click “Configure” to save the credential
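Before pasting the key into the modal, it can help to check that the JSON is well formed and that its `project_id` matches the Project ID you plan to enter. The following is a minimal sketch using only the Python standard library; the `check_service_account_key` function is a hypothetical helper, not something Rapida provides, and the field list covers only the commonly present fields of a Google Cloud service account key file.

```python
import json
from typing import List

# Fields normally present in a Google Cloud service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json: str, expected_project_id: str) -> List[str]:
    """Return a list of problems found in the key; an empty list means none were found."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    # Report any required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]

    key_type = key.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type!r}")

    project_id = key.get("project_id")
    if project_id and project_id != expected_project_id:
        problems.append("project_id in key does not match the Project ID entered")
    return problems


# Placeholder values for demonstration only; a real key file has more fields.
sample = json.dumps({
    "type": "service_account",
    "project_id": "demo-project",
    "private_key": "PLACEHOLDER",
    "client_email": "stt@demo-project.iam.gserviceaccount.com",
})
print(check_service_account_key(sample, "demo-project"))  # []
print(check_service_account_key(sample, "other-project"))
```

In practice you would read the downloaded key file and pass its contents as `raw_json`. The check is local and does not contact Google Cloud, so it cannot confirm that the key is active or has Speech-to-Text permissions.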
Verify Credential Setup
After setting up the credential, you can verify it’s been added:
- The Google Speech Service provider card should now show “Connected”
- If you click on the provider, you’ll see a “View provider credential” modal
- This modal displays the credential name, when it was last updated, and options to delete or close
Integration Features
- High Accuracy: Industry-leading speech recognition accuracy
- Multiple Languages: Support for 125+ languages and variants
- Real-time Streaming: Stream audio for real-time transcription
- Speaker Diarization: Identify different speakers in audio
- Word-level Confidence: Get confidence scores for transcribed words
- Custom Vocabularies: Define custom words and phrases
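To show how word-level confidence and speaker diarization might be consumed downstream, here is a small sketch. The result shape (a list of dictionaries with `word`, `confidence`, and `speaker` keys) is an assumption made for illustration; the actual format returned through Rapida may differ, so adapt the field names accordingly.

```python
from collections import defaultdict

# Assumed, illustrative result shape: one entry per transcribed word.
words = [
    {"word": "hello", "confidence": 0.97, "speaker": 1},
    {"word": "there", "confidence": 0.62, "speaker": 1},
    {"word": "hi", "confidence": 0.91, "speaker": 2},
    {"word": "rapida", "confidence": 0.55, "speaker": 2},
]


def low_confidence_words(words, threshold=0.8):
    """Return the words whose confidence score is below the threshold."""
    return [w["word"] for w in words if w["confidence"] < threshold]


def transcript_by_speaker(words):
    """Group words by speaker label and join them into one string per speaker."""
    by_speaker = defaultdict(list)
    for w in words:
        by_speaker[w["speaker"]].append(w["word"])
    return {speaker: " ".join(tokens) for speaker, tokens in by_speaker.items()}


print(low_confidence_words(words))   # ['there', 'rapida']
print(transcript_by_speaker(words))  # {1: 'hello there', 2: 'hi rapida'}
```

Flagging low-confidence words like this is one way to decide which parts of a transcript need review or a custom vocabulary entry.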