Chat Completions
curl --request POST \
  --url https://api.example.com/v1/chat/completions \
  --header 'Authorization: Bearer YOUR_ASYMMETRIC_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "openai/gpt-4o-mini",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 256,
  "nightly": ["my_agent", "A helpful assistant"],
  "guardrail_policy": "Flag any inappropriate content",
  "chunk_size": 10,
  "sliding_window": 5,
  "finetune_thresh": 50,
  "min_finetune_group": 5,
  "lora_name": "my_agent_lora",
  "adaption_inference": false
}
'
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 9,
    "total_tokens": 18
  }
}

POST /v1/chat/completions

Route chat completion requests to OpenRouter with optional Nightly memory, Policies guardrails, and Adaption fine-tuning support. Compatible with the OpenAI SDK.

Authentication

Authorization
string
required
Bearer token with your API key. Format: Bearer YOUR_ASYMMETRIC_API_KEY
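
As a sketch, the required headers for a raw HTTP request can be built like this (the key value is the documentation placeholder, not a real credential):

```python
API_KEY = "YOUR_ASYMMETRIC_API_KEY"  # placeholder; substitute your real key

# Headers required by every request to this endpoint.
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```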

Request Body

model
string
required
The model to use (e.g., openai/gpt-4o-mini or anthropic/claude-3-sonnet)
messages
array
required
Array of message objects with role and content fields
stream
boolean
default:"false"
Whether to stream the response
temperature
number
Sampling temperature (0-2)
max_tokens
integer
Maximum tokens to generate

Nightly (Memory) Parameters

nightly
array
Enable agent memory. Format: ["memory_group", "user_goal"]
  • memory_group: Unique identifier for the memory pool
  • user_goal: Description to help curate relevant memories
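
A minimal sketch of passing nightly through the OpenAI SDK's extra_body; the memory_group and user_goal values here are illustrative:

```python
def build_nightly_body(memory_group: str, user_goal: str) -> dict:
    """Build the extra_body that enables agent memory for a request."""
    return {"nightly": [memory_group, user_goal]}

nightly_body = build_nightly_body("support_bot", "Resolve billing questions")
# Pass it alongside the standard parameters:
# client.chat.completions.create(model=..., messages=..., extra_body=nightly_body)
```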

Policies (Guardrails) Parameters

guardrail_policy
string
Natural-language policy to check outputs against. When provided, outputs are filtered in real time.
chunk_size
integer
default:"10"
For streaming: how often to check guardrails (every N chunks)
sliding_window
integer
default:"5"
For streaming: number of recent chunks to check together
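
Put together, a streaming request with guardrails might look like the sketch below. The policy text and check intervals are illustrative (not the defaults), and the client call is defined but not executed here:

```python
guardrail_body = {
    "guardrail_policy": "Flag any inappropriate content",
    "chunk_size": 4,      # check guardrails every 4 streamed chunks
    "sliding_window": 3,  # evaluate the 3 most recent chunks together
}

def stream_with_guardrails(client, model, messages):
    """Stream a completion; the server checks guardrails as chunks arrive."""
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,
        extra_body=guardrail_body,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
```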

Adaption (Fine-tuning) Parameters

finetune_thresh
integer
Memory-call count threshold that queues a training run
min_finetune_group
integer
default:"5"
Minimum memories needed before triggering training
lora_name
string
Name of the LoRA adapter to use or create
adaption_inference
boolean
default:"false"
Set to true to serve inference with your fine-tuned LoRA adapter
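
A sketch of the Adaption parameters via extra_body; the threshold and adapter name are illustrative, not defaults:

```python
adaption_body = {
    "finetune_thresh": 50,       # illustrative: queue training after 50 memory calls
    "min_finetune_group": 5,     # documented default
    "lora_name": "support_bot_lora",
    "adaption_inference": True,  # serve responses with the fine-tuned adapter
}
# client.chat.completions.create(model=..., messages=..., extra_body=adaption_body)
```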

Response

Returns an OpenAI-compatible chat completion response.
id
string
Unique identifier for the completion
object
string
Always chat.completion
choices
array
Array of completion choices
usage
object
Token usage statistics

Example

from openai import OpenAI

client = OpenAI(
    base_url="https://rkdune--symmetry.modal.run/v1/",
    api_key="YOUR_ASYMMETRIC_API_KEY",
)

# Basic completion with memory and guardrails
completion = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={
        "nightly": ["my_agent", "A helpful assistant"],
        "guardrail_policy": "Flag any inappropriate content"
    }
)

print(completion.choices[0].message.content)

Errors

Status  Description
400     Guardrail violation detected or invalid parameters
401     Invalid API key
402     Insufficient credits
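
In client code, the documented status codes can be mapped to their meanings; this helper is a hypothetical sketch, not part of the SDK:

```python
# Documented error statuses for this endpoint.
ERROR_MEANINGS = {
    400: "Guardrail violation detected or invalid parameters",
    401: "Invalid API key",
    402: "Insufficient credits",
}

def describe_status(status_code: int) -> str:
    """Map a documented error status to its meaning."""
    return ERROR_MEANINGS.get(status_code, "Unexpected error")
```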