Google Speech Service

Supported Models

Google Speech Service offers various models optimized for different use cases:

Speech-to-Text Models

Model Name	Description	Best For
default	Google’s standard speech recognition model	General purpose speech recognition
enhanced	Enhanced model with better accuracy	Noisy environments and accents
phone_call	Optimized for phone call audio	Telephony applications
video	Optimized for video audio	Video content transcription
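
As a rough illustration of how the table above might drive a configuration choice, the sketch below maps a kind of audio source to one of the listed model names. The `choose_model` helper and the source labels are hypothetical and not part of the platform; only the four model names come from the table.

```python
# Hypothetical mapping from an audio source label to a model name from the table.
MODEL_BY_SOURCE = {
    "telephony": "phone_call",  # phone call audio
    "video": "video",           # audio extracted from video content
    "noisy": "enhanced",        # noisy environments and accents
}


def choose_model(source: str) -> str:
    """Return the model name for a source label, falling back to "default"."""
    return MODEL_BY_SOURCE.get(source, "default")
```

Any source label that is not in the mapping falls back to the general-purpose `default` model.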

Languages Supported

Google Speech Service supports 125+ languages. Some popular ones include:

English (US, UK, Australia, etc.)

Spanish, French, German, Italian

Mandarin, Cantonese, Japanese, Korean

Hindi, Portuguese, Russian, Arabic

Setting Up Provider Credentials

Access the Integrations Page

Navigate to the “Integration > Models” page. Here you’ll see a grid of various speech service providers.

Select Google Speech Service

On the Integrations page, find the Google Speech Service provider card and click its “Setup Credential” button.

Create Provider Credential

A modal window will appear titled “Create provider credential”. Follow these steps:

Select “Google Speech Service” from the dropdown (if not already selected)
Enter a Key Name: Assign a unique name to this provider key for easy identification
Enter Project ID: Input your Google Cloud Project ID
Enter Service Account Key: Paste the entire JSON content of your service account key
Click “Configure” to save the credential
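
Before pasting the key, it can help to confirm that the JSON is a complete service account key and that it belongs to the project you entered. The check below is a minimal sketch: `validate_service_account_key` is a hypothetical helper, and it only inspects fields that Google service account key files normally contain (`type`, `project_id`, `private_key`, `client_email`). It does not contact Google or the platform.

```python
import json
from typing import List

# Fields a Google service account key file is expected to contain.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def validate_service_account_key(raw_json: str, expected_project_id: str) -> List[str]:
    """Return a list of problems found in the key; an empty list means it looks usable."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value must be an object"]

    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    if key.get("project_id") and key["project_id"] != expected_project_id:
        problems.append("project_id does not match the Project ID entered in the form")
    return problems
```

Run it on the contents of the downloaded key file together with the Project ID you plan to enter. An empty list suggests the key is structurally sound, although it does not prove that the service account has the permissions the integration needs.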

Verify Credential Setup

After setting up the credential, you can verify it’s been added:

The Google Speech Service provider card should now show “Connected”
If you click on the provider, you’ll see a “View provider credential” modal
This modal displays the credential name, when it was last updated, and options to delete or close

Your Google Speech Service provider credential is now set up for speech-to-text integration.

Integration Features

High Accuracy: Industry-leading speech recognition accuracy

Multiple Languages: Support for 125+ languages and variants

Real-time Streaming: Stream audio for real-time transcription

Speaker Diarization: Identify different speakers in audio

Word-level Confidence: Get confidence scores for transcribed words

Custom Vocabularies: Define custom words and phrases
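
For orientation, several of the features above correspond to fields of the `RecognitionConfig` object in Google Cloud Speech-to-Text's v1 REST API. The sketch below builds such a config as a plain dictionary. Whether and how this platform exposes these options is not stated here, so treat `build_recognition_config` and its parameters as hypothetical; only the JSON field names follow Google's API. Real-time streaming uses a separate streaming request and is not shown.

```python
from typing import Any, Dict, List, Optional


def build_recognition_config(
    language_code: str = "en-US",
    model: str = "default",
    phrases: Optional[List[str]] = None,
    max_speakers: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a RecognitionConfig-style dictionary (hypothetical helper)."""
    config: Dict[str, Any] = {
        "languageCode": language_code,   # Multiple Languages
        "model": model,
        "enableWordConfidence": True,    # Word-level Confidence
    }
    if phrases:
        # Custom Vocabularies: hint words and phrases to the recognizer.
        config["speechContexts"] = [{"phrases": list(phrases)}]
    if max_speakers:
        # Speaker Diarization: label which speaker said each word.
        config["diarizationConfig"] = {
            "enableSpeakerDiarization": True,
            "maxSpeakerCount": max_speakers,
        }
    return config
```

For example, `build_recognition_config(model="phone_call", phrases=["Acme"], max_speakers=2)` produces a config with phrase hints and diarization enabled, while calling it with no arguments leaves both out.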
