Skip to main content

Documentation Index

Fetch the complete documentation index at: https://doc.rapida.ai/llms.txt

Use this file to discover all available pages before exploring further.

Use Custom LLM when your model is served from your own endpoint or from a provider that follows an OpenAI-compatible API shape.

Setup

1

Create the credential

Open Integrations > Models, select Custom LLM, and create a credential.
FieldRequiredDescription
apiCompatibilityYesRequest format. Use openai_chat_completions, openai_compatible, or openai_responses for active chat support.
baseUrlYesAPI root for your model server, for example https://llm.example.com/v1.
headersNoHeaders sent with each request, such as Authorization: Bearer ....
2

Select the model

Open the assistant model settings, select Custom LLM, and enter the model ID your server expects.
3

Configure model parameters

Use Model Parameters for provider-specific JSON such as temperature, token limits, or extra server options.

Supported compatibility values

ValueUse for
openai_chat_completionsServers compatible with /v1/chat/completions.
openai_compatibleOpenAI-compatible local/model servers such as vLLM, Ollama, LM Studio, or TGI.
openai_responsesServers compatible with /v1/responses.
anthropic_messagesCredential option only; chat execution is not implemented yet.
gemini_generate_contentCredential option only; chat execution is not implemented yet.

Example

{
  "apiCompatibility": "openai_compatible",
  "baseUrl": "http://localhost:8000/v1",
  "headers": {
    "Authorization": "Bearer local-dev-key"
  }
}
{
  "model.id": "meta-llama/Llama-3.1-8B-Instruct",
  "model.name": "meta-llama/Llama-3.1-8B-Instruct",
  "model.parameters": {
    "temperature": 0.4,
    "max_tokens": 512,
    "top_p": 0.9
  }
}

Full Custom LLM reference

Credential fields, model arguments, compatibility notes, and backend mapping.