integration-api abstracts all LLM providers behind a single LargeLanguageCaller interface. Provider credentials are encrypted at rest using INTEGRATION_CRYPTO_KEY. The assistant-api calls integration-api via gRPC to execute LLM inference.
## LargeLanguageCaller Interface
Every LLM provider implements the same LargeLanguageCaller contract, so callers never depend on provider-specific SDKs.
## ChatCompletionOptions
Model behaviour is controlled via the ModelParameter map, a string-keyed map whose values are parsed according to the declared type:
| Key | Type | Description |
|---|---|---|
| model.name | string | Model ID (e.g. gpt-4o, claude-sonnet-4-6) |
| model.temperature | float | Sampling temperature (0.0–2.0) |
| model.max_tokens | int | Maximum tokens in the response |
| model.top_p | float | Nucleus sampling threshold |
| model.stop | []string | Stop sequences |
| model.tool_choice | string | Tool usage: auto, none, required |
| model.frequency_penalty | float | Repeat penalty (OpenAI / Gemini) |
| model.presence_penalty | float | Topic penalty (OpenAI / Gemini) |
| model.seed | int | Deterministic sampling seed |
| model.response_format | string | Output format (e.g. json_object) |
## Supported LLM Providers
| Provider | Directory | Notes |
|---|---|---|
| OpenAI | caller/openai/ | GPT-4o, GPT-4o-mini, GPT-4-turbo |
| Anthropic | caller/anthropic/ | Claude Opus/Sonnet/Haiku |
| Google Gemini | caller/gemini/ | Gemini 2.0, 1.5 Pro |
| Azure OpenAI | caller/azure/ | Azure-hosted OpenAI deployments |
| Mistral | caller/mistral/ | Mistral Large, Small, Nemo |
| Cohere | caller/cohere/ | Command R, Command R+ |
| Vertex AI | caller/vertexai/ | Google Vertex AI models |
| HuggingFace | caller/huggingface/ | HuggingFace Inference API |
| Replicate | caller/replicate/ | Replicate hosted models |
| Voyage AI | caller/voyageai/ | Embedding and reranking |
## Credential Encryption
All vault credentials are encrypted with INTEGRATION_CRYPTO_KEY before storage. Never commit this key or expose it in logs.