LLM
For instructions on how to authenticate to use this endpoint, see API overview.
Create a chat completion using OpenAI or compatible models. Follows OpenAI's Chat Completions API format.
Endpoints
POST /api/projects/:project_id/llm_gateway/v1/chat/completions
POST /api/projects/:project_id/llm_gateway/v1/messages
Create LLM gateway v1 chat completions
Required API key scopes
task:write
Path parameters
- project_id (string)
Project ID of the project you're trying to access. To find the ID of the project, make a call to /api/projects/ (see the sketch below).
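If you don't yet know the project ID, you can list your projects as the description above suggests. A minimal sketch in Python, assuming bearer-token authentication (see API overview); the host, key, and paginated response shape are placeholders and assumptions, not confirmed by this reference:

```python
import requests

BASE_URL = "https://your-instance.example.com"  # hypothetical host; substitute your own
API_KEY = "your-personal-api-key"               # assumption: bearer auth per the API overview

# List projects to find the ID to use as :project_id in the endpoints below.
resp = requests.get(
    f"{BASE_URL}/api/projects/",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
resp.raise_for_status()
# Assumption: the endpoint returns a paginated list under "results".
for project in resp.json().get("results", []):
    print(project["id"], project.get("name"))
```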
Query parameters
- format (string): one of "json", "txt"
Request parameters
- model (string)
The model to use for completion (e.g., 'gpt-4', 'gpt-3.5-turbo')
- messages (array)
List of message objects with 'role' and 'content'
- temperature (number)
Sampling temperature between 0 and 2
- top_p (number)
Nucleus sampling parameter
- n (integer)
Number of completions to generate
- stream (boolean, default: false)
Whether to stream the response
- stream_options
Additional options for streaming
- stop (array)
Stop sequences
- max_tokens (integer)
Maximum number of tokens to generate
- max_completion_tokens (integer)
Maximum number of completion tokens (alternative to max_tokens)
- presence_penalty (number)
Presence penalty between -2.0 and 2.0
- frequency_penalty (number)
Frequency penalty between -2.0 and 2.0
- logit_bias
Logit bias mapping
- user (string)
Unique user identifier
- tools (array)
List of tools available to the model
- tool_choice
Controls which tool is called
- parallel_tool_calls (boolean)
Whether to allow parallel tool calls
- response_format
Format for the model output
- seed (integer)
Random seed for deterministic sampling
- logprobs (boolean)
Whether to return log probabilities
- top_logprobs (integer)
Number of most likely tokens to return at each position
- modalities (array)
Output modalities
- prediction
Prediction content for speculative decoding
- audio
Audio input parameters
- reasoning_effort
Reasoning effort level for o-series models. One of: none, minimal, low, medium, high, default
- verbosity
Controls the verbosity level of the model's output. One of: concise, standard, verbose
- store (boolean)
Whether to store the output for model distillation or evals
- web_search_options
Web search tool configuration
- functions (array)
Deprecated in favor of tools. List of functions the model may call
- function_call
Deprecated in favor of tool_choice. Controls which function is called
Response
Example request
POST /api/projects/:project_id/llm_gateway/v1/chat/completions
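A minimal, non-streaming sketch of this request in Python. The host and authentication header are assumptions (see API overview for how to authenticate); the body and the choices[0].message.content response shape follow OpenAI's Chat Completions format, which this endpoint mirrors:

```python
import requests

BASE_URL = "https://your-instance.example.com"  # hypothetical host
PROJECT_ID = "12345"                            # your project ID
API_KEY = "your-personal-api-key"               # key with the task:write scope (assumed bearer auth)

resp = requests.post(
    f"{BASE_URL}/api/projects/{PROJECT_ID}/llm_gateway/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Say hello."},
        ],
        "temperature": 0.7,
        "max_tokens": 100,
    },
)
resp.raise_for_status()
# OpenAI-format responses put the generated text under choices[0].message.content.
print(resp.json()["choices"][0]["message"]["content"])
```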
Example response
Status 200 Successful response with chat completion
Status 400 Invalid request parameters
Status 500 Internal server error
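Because the endpoint follows OpenAI's Chat Completions API format, an OpenAI-compatible client can typically be pointed at the gateway path directly. A sketch using the openai Python SDK with streaming enabled; the base URL is hypothetical, and it is an assumption (not confirmed above) that the gateway accepts your API key as the SDK's api_key:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://your-instance.example.com/api/projects/12345/llm_gateway/v1",  # hypothetical
    api_key="your-personal-api-key",  # assumption: the gateway accepts this as the bearer token
)

# With stream=True the gateway should emit incremental chunks in OpenAI's streaming format.
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a one-line greeting."}],
    stream=True,
)
for chunk in stream:
    # Guard against chunks with no choices (e.g., trailing usage chunks).
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```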
Create LLM gateway v1 messages
Create a message using Anthropic's Claude models. Compatible with Anthropic's Messages API format.
Required API key scopes
task:write
Path parameters
- project_id (string)
Project ID of the project you're trying to access. To find the ID of the project, make a call to /api/projects/.
Query parameters
- format (string): one of "json", "txt"
Request parameters
- model (string)
The model to use for completion (e.g., 'claude-3-5-sonnet-20241022')
- messages (array)
List of message objects with 'role' and 'content'
- max_tokens (integer, default: 4096)
Maximum number of tokens to generate
- temperature (number)
Sampling temperature between 0 and 1
- top_p (number)
Nucleus sampling parameter
- top_k (integer)
Top-k sampling parameter
- stream (boolean, default: false)
Whether to stream the response
- stop_sequences (array)
Custom stop sequences
- system
System prompt (string or array of content blocks)
- metadata
Metadata to attach to the request
- thinking
Thinking configuration for extended thinking
- tools (array)
List of tools available to the model
- tool_choice
Controls which tool is called
- service_tier
Service tier for the request. One of: auto, standard_only
Response
Example request
POST /api/projects/:project_id/llm_gateway/v1/messages
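A minimal sketch of this request in Python, mirroring Anthropic's Messages API format as described above. Host and authentication are the same assumptions as in the chat completions example; the content-blocks response shape is Anthropic's:

```python
import requests

BASE_URL = "https://your-instance.example.com"  # hypothetical host
PROJECT_ID = "12345"                            # your project ID
API_KEY = "your-personal-api-key"               # key with the task:write scope (assumed bearer auth)

resp = requests.post(
    f"{BASE_URL}/api/projects/{PROJECT_ID}/llm_gateway/v1/messages",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,  # defaults to 4096 if omitted
        "system": "You are a terse assistant.",
        "messages": [{"role": "user", "content": "Say hello."}],
    },
)
resp.raise_for_status()
# Anthropic-format responses return content as a list of blocks.
print(resp.json()["content"][0]["text"])
```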