Reference documentation for inference endpoints
POST /v1/chat/completions/stable — chat completion endpoint. Accepts an OpenAI-compatible `messages` array and returns the full completion object.
Request headers:

| Header | Type | Default | Description |
|---|---|---|---|
| Authorization* | string | — | Must be present and follow the format `Bearer <token>`. Requests without a valid authorization header are rejected with 401. |
Request body fields (`*` = required, all others optional):

| Field | Type | Default | Description |
|---|---|---|---|
| messages* | array | — | Array of message objects (role + content). Supports multimodal content arrays. |
| model | string | "Auto" | Optional. Model name. |
| temperature | float | 0.0 | Optional. Sampling temperature (0–2). |
| top_p | float | 0.95 | Optional. Nucleus sampling probability. |
| max_tokens | integer | null | Optional. Maximum number of tokens to generate. |
| stream | boolean | false | Optional. Enable streaming responses. |
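The `messages` entries can carry multimodal content arrays. A sketch of one possible shape, assuming the OpenAI-style content-part format (`type`/`text`/`image_url` keys are an assumption, not confirmed by this document):

```json
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this picture?"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,<base64>"}}
      ]
    }
  ]
}
```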
```bash
curl -X POST https://inference.zenith-ai.one/v1/chat/completions/stable \
  -H "Authorization: Bearer <your-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain black holes in one sentence."}
    ],
    "temperature": 0.7
  }'
```
Standard OpenAI chat completion object.
```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "..."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 20, "completion_tokens": 18, "total_tokens": 38}
}
```
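The completion object can be unpacked in a couple of lines of Python; a minimal sketch (the `extract_reply` helper and the sample values are ours, not part of the API):

```python
def extract_reply(completion: dict) -> str:
    """Pull the assistant's text out of a chat completion object."""
    return completion["choices"][0]["message"]["content"]

# Sample dict mirroring the response shape shown above.
sample = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "A black hole is a region of spacetime..."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 20, "completion_tokens": 18, "total_tokens": 38},
}

print(extract_reply(sample))
```

In a real call, `sample` would be the parsed JSON body returned by the endpoint.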
POST /v1/image/generation/stable — image generation endpoint. Supports text-to-image and image-to-image (editing/variation) via the optional `images` field. Returns the raw GenerateContentResponse with inline image bytes base64-encoded.
Request headers:

| Header | Type | Default | Description |
|---|---|---|---|
| Authorization* | string | — | Must be present and follow the format `Bearer <token>`. Requests without a valid authorization header are rejected with 401. |
Request body fields (`*` = required, all others optional):

| Field | Type | Default | Description |
|---|---|---|---|
| prompt* | string | — | Text description of the image to generate. |
| model | string | gemini-2.5-flash-image | Optional. Must be one of: gemini-2.5-flash-image, gemini-3-pro-image-preview. |
| response_modalities | array | ["IMAGE","TEXT"] | Optional. Modalities to return. Use ["IMAGE"] to suppress text. |
| temperature | float | — | Optional. Sampling temperature. |
| top_p | float | — | Optional. Nucleus sampling probability. |
| top_k | float | — | Optional. Top-k sampling cutoff. |
| seed | integer | — | Optional. Fixed seed for reproducible outputs. |
| max_output_tokens | integer | — | Optional. Maximum tokens in the response. |
| safety_settings | array | — | Optional. List of SafetySetting objects to override default filters. |
| system_instruction | string | — | Optional. System-level instruction prepended to the conversation. |
| candidate_count | integer | — | Optional. Number of response candidates to generate. |
| stop_sequences | array | — | Optional. Strings that stop generation when encountered. |
| images | array | [] | Optional. Input images for image-to-image generation. Each item is either an object `{"data": "<base64>", "mime_type": "image/png"}` or a data-URI string `"data:image/png;base64,<base64>"`. Supported MIME types: image/png, image/jpeg, image/gif, image/webp. Invalid base64 returns 400. |
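A request body combining some of the tuning fields above might look like the following sketch. The SafetySetting shape shown (`category`/`threshold`) follows the Google GenAI convention and is an assumption, not confirmed by this document:

```json
{
  "prompt": "a castle in the fog",
  "seed": 42,
  "safety_settings": [
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH"}
  ]
}
```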
Text-to-image:

```bash
curl -X POST https://inference.zenith-ai.one/v1/image/generation/stable \
  -H "Authorization: Bearer <your-token>" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a futuristic city at sunset, photorealistic"}'
```
Image-to-image:

```bash
curl -X POST https://inference.zenith-ai.one/v1/image/generation/stable \
  -H "Authorization: Bearer <your-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Change the sky to a dramatic purple sunset",
    "images": [
      {"data": "<base64-encoded PNG>", "mime_type": "image/png"}
    ]
  }'
```
Or using a data-URI:

```json
{
  "prompt": "Make the background a beach",
  "images": ["data:image/jpeg;base64,<base64-encoded JPEG>"]
}
```
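Both accepted `images` item forms can be built with a few lines of Python; a sketch (the helper names are ours, and the dummy bytes stand in for real image data):

```python
import base64

def image_object(raw: bytes, mime_type: str = "image/png") -> dict:
    """Object form: {"data": "<base64>", "mime_type": "..."}."""
    return {"data": base64.b64encode(raw).decode("ascii"), "mime_type": mime_type}

def image_data_uri(raw: bytes, mime_type: str = "image/jpeg") -> str:
    """Data-URI string form: "data:<mime>;base64,<base64>"."""
    return f"data:{mime_type};base64," + base64.b64encode(raw).decode("ascii")

# Dummy payloads standing in for real PNG/JPEG bytes.
obj = image_object(b"\x89PNG fake bytes", "image/png")
uri = image_data_uri(b"\xff\xd8 fake jpeg bytes", "image/jpeg")
```

Either `obj` or `uri` can then be placed in the request's `images` array.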
Raw Google GenerateContentResponse. Image bytes are base64-encoded inside candidates[].content.parts[].inline_data.data.
```json
{
  "candidates": [{
    "content": {
      "role": "model",
      "parts": [
        {
          "inline_data": {
            "data": "<base64-encoded image bytes>",
            "mime_type": "image/png"
          }
        },
        {
          "text": "Here is the generated image."
        }
      ]
    },
    "finish_reason": "STOP"
  }],
  "usage_metadata": { "prompt_token_count": 9, "candidates_token_count": 0 }
}
```
```python
import base64

import requests

BASE_URL = "https://inference.zenith-ai.one/v1/image/generation/stable"
HEADERS = {"Authorization": "Bearer <your-token>"}

# Text-to-image
resp = requests.post(
    BASE_URL,
    headers=HEADERS,
    json={"prompt": "a cat sitting on a cloud"},
).json()

# Image-to-image (pass the base64-encoded source image)
with open("source.png", "rb") as f:
    b64_image = base64.b64encode(f.read()).decode()

resp = requests.post(
    BASE_URL,
    headers=HEADERS,
    json={
        "prompt": "make the background a snowy mountain",
        "images": [{"data": b64_image, "mime_type": "image/png"}],
    },
).json()

# Decode every inline image in the response and print any text parts.
# Files are saved as .png here; check inline_data["mime_type"] if you
# need the real format.
for i, candidate in enumerate(resp.get("candidates", [])):
    for j, part in enumerate(candidate["content"]["parts"]):
        if part.get("inline_data"):
            img_bytes = base64.b64decode(part["inline_data"]["data"])
            with open(f"image_{i}_{j}.png", "wb") as out:
                out.write(img_bytes)
        elif part.get("text"):
            print("Text:", part["text"])
```