Custom LLM - rapida.ai documentation

Setup

Create the credential

Open Integrations > Models, select Custom LLM, and create a credential.

Field	Required	Description
`apiCompatibility`	Yes	Request format. Use `openai_chat_completions`, `openai_compatible`, or `openai_responses` for active chat support.
`baseUrl`	Yes	API root for your model server, for example `https://llm.example.com/v1`.
`headers`	No	Headers sent with each request, such as `Authorization: Bearer ...`.

Select the model

Open the assistant model settings, select Custom LLM, and enter the model ID your server expects.

Configure model parameters

Use Model Parameters for provider-specific JSON such as temperature, token limits, or extra server options.

Supported compatibility values

Value	Use for
`openai_chat_completions`	Servers compatible with `/v1/chat/completions`.
`openai_compatible`	OpenAI-compatible local/model servers such as vLLM, Ollama, LM Studio, or TGI.
`openai_responses`	Servers compatible with `/v1/responses`.
`anthropic_messages`	Credential option only; chat execution is not implemented yet.
`gemini_generate_content`	Credential option only; chat execution is not implemented yet.

Example

{
  "apiCompatibility": "openai_compatible",
  "baseUrl": "http://localhost:8000/v1",
  "headers": {
    "Authorization": "Bearer local-dev-key"
  }
}

{
  "model.id": "meta-llama/Llama-3.1-8B-Instruct",
  "model.name": "meta-llama/Llama-3.1-8B-Instruct",
  "model.parameters": {
    "temperature": 0.4,
    "max_tokens": 512,
    "top_p": 0.9
  }
}

Full Custom LLM reference

Credential fields, model arguments, compatibility notes, and backend mapping.

​Setup

​Supported compatibility values

​Example

Full Custom LLM reference

Setup

Supported compatibility values

Example