OpenAI Responses API

The Poe API supports the OpenAI Responses API format, providing advanced capabilities beyond the standard Chat Completions API. Use the same OpenAI SDK you already know, just point it at Poe.

Key benefits over Chat Completions:

  • Built-in reasoning and extended thinking support via the reasoning parameter
  • Web search as a built-in tool (web_search_preview)
  • Structured outputs with a JSON schema via text.format
  • Multi-turn conversations via previous_response_id (no need to resend full message history)
  • Access hundreds of AI models through a single Poe API key

If you don't need these advanced features, the OpenAI Compatible API (Chat Completions) is simpler and works with all models.

Using the OpenAI SDK

Python:

# pip install openai
import os, openai

client = openai.OpenAI(
    api_key=os.getenv("POE_API_KEY"),  # https://poe.com/api_key
    base_url="https://api.poe.com/v1",
)

response = client.responses.create(
    model="Claude-Sonnet-4.5",  # or other models (GPT-5.2, Gemini-3-Pro, Grok-4..)
    input="What are the top 3 things to do in NYC?",
)

print(response.output_text)

TypeScript:

// npm install openai
import OpenAI from "openai";

const client = new OpenAI({
    apiKey: process.env.POE_API_KEY,  // https://poe.com/api_key
    baseURL: "https://api.poe.com/v1",
});

const response = await client.responses.create({
    model: "Claude-Sonnet-4.5",  // or other models (GPT-5.2, Gemini-3-Pro, Grok-4..)
    input: "What are the top 3 things to do in NYC?",
});

console.log(response.output_text);
curl "https://api.poe.com/v1/responses" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $POE_API_KEY" \
    -d '{
        "model": "Claude-Sonnet-4.5",
        "input": "What are the top 3 things to do in NYC?"
    }'

Streaming

Stream responses in real time using server-sent events (SSE):

Python:

import os, openai

client = openai.OpenAI(
    api_key=os.getenv("POE_API_KEY"),
    base_url="https://api.poe.com/v1",
)

stream = client.responses.create(
    model="Claude-Sonnet-4.5",
    input="Tell me about San Francisco",
    stream=True,
)

for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)

TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
    apiKey: process.env.POE_API_KEY,
    baseURL: "https://api.poe.com/v1",
});

const stream = await client.responses.create({
    model: "Claude-Sonnet-4.5",
    input: "Tell me about San Francisco",
    stream: true,
});

for await (const event of stream) {
    if (event.type === "response.output_text.delta") {
        process.stdout.write(event.delta);
    }
}
curl "https://api.poe.com/v1/responses" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $POE_API_KEY" \
    -d '{
        "model": "Claude-Sonnet-4.5",
        "input": "Tell me about San Francisco",
        "stream": true
    }' \
    --no-buffer

Streaming event types:

Event                                   Description
response.created                        Response object created
response.in_progress                    Model is generating
response.output_text.delta              Text chunk received
response.output_text.done               Text output complete
response.function_call_arguments.delta  Tool call argument chunk
response.function_call_arguments.done   Tool call arguments complete
response.web_search_call.completed      Web search finished
response.reasoning_summary_text.delta   Reasoning summary chunk
response.completed                      Full response complete, with usage
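A small dispatcher keeps event handling tidy as you support more event types. This is a sketch that assumes each event exposes `type` plus the delta fields listed above; the `SimpleNamespace` objects below are stand-ins for real stream events:

```python
from types import SimpleNamespace

def handle_event(event, out):
    """Route one SSE event into a simple accumulator dict."""
    if event.type == "response.output_text.delta":
        out["text"] += event.delta
    elif event.type == "response.reasoning_summary_text.delta":
        out["reasoning"] += event.delta
    elif event.type == "response.completed":
        out["done"] = True

# Stand-in events; in practice they come from client.responses.create(..., stream=True).
events = [
    SimpleNamespace(type="response.output_text.delta", delta="Hello, "),
    SimpleNamespace(type="response.output_text.delta", delta="world!"),
    SimpleNamespace(type="response.completed"),
]
out = {"text": "", "reasoning": "", "done": False}
for event in events:
    handle_event(event, out)
print(out["text"])  # Hello, world!
```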

Reasoning

Enable extended thinking for complex tasks that benefit from step-by-step reasoning:

Python:

import os, openai

client = openai.OpenAI(
    api_key=os.getenv("POE_API_KEY"),
    base_url="https://api.poe.com/v1",
)

response = client.responses.create(
    model="Claude-Sonnet-4.5",
    input="Solve step by step: if a train leaves at 3pm going 60mph and another at 4pm going 90mph, when do they meet?",
    reasoning={
        "effort": "high",    # Options: "low", "medium", "high"
        "summary": "auto",   # Options: "auto", "concise", "detailed"
    },
)

print(response.output_text)

TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
    apiKey: process.env.POE_API_KEY,
    baseURL: "https://api.poe.com/v1",
});

const response = await client.responses.create({
    model: "Claude-Sonnet-4.5",
    input: "Solve step by step: if a train leaves at 3pm going 60mph and another at 4pm going 90mph, when do they meet?",
    reasoning: {
        effort: "high",
        summary: "auto",
    },
});

console.log(response.output_text);
curl "https://api.poe.com/v1/responses" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $POE_API_KEY" \
    -d '{
        "model": "Claude-Sonnet-4.5",
        "input": "Solve step by step: if a train leaves at 3pm going 60mph and another at 4pm going 90mph, when do they meet?",
        "reasoning": {
            "effort": "high",
            "summary": "auto"
        }
    }'

Note: Not all models support reasoning. Models with built-in reasoning capabilities (such as Claude Sonnet 4.5, o3, o4-mini) work best with this parameter.
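When a reasoning-capable model returns a summary, it typically arrives as a separate reasoning item in `response.output`. The exact item shape can vary by model, so treat this helper as a sketch; the dicts below stand in for the SDK objects:

```python
def reasoning_summaries(output_items):
    """Collect summary text from any reasoning items in a response's output list."""
    texts = []
    for item in output_items:
        if item.get("type") == "reasoning":
            for part in item.get("summary", []):
                if part.get("type") == "summary_text":
                    texts.append(part["text"])
    return texts

# Stand-in for response.output; real items are SDK objects with the same fields.
sample = [
    {"type": "reasoning",
     "summary": [{"type": "summary_text", "text": "Set up a relative-speed equation."}]},
    {"type": "message",
     "content": [{"type": "output_text", "text": "They meet at 6pm."}]},
]
print(reasoning_summaries(sample))  # ['Set up a relative-speed equation.']
```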

Web Search

Use the built-in web_search_preview tool to let the model search the web for up-to-date information:

Python:

import os, openai

client = openai.OpenAI(
    api_key=os.getenv("POE_API_KEY"),
    base_url="https://api.poe.com/v1",
)

response = client.responses.create(
    model="GPT-5.2",
    input="What are the latest AI news today?",
    tools=[{"type": "web_search_preview"}],
)

print(response.output_text)

TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
    apiKey: process.env.POE_API_KEY,
    baseURL: "https://api.poe.com/v1",
});

const response = await client.responses.create({
    model: "GPT-5.2",
    input: "What are the latest AI news today?",
    tools: [{ type: "web_search_preview" }],
});

console.log(response.output_text);
curl "https://api.poe.com/v1/responses" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $POE_API_KEY" \
    -d '{
        "model": "GPT-5.2",
        "input": "What are the latest AI news today?",
        "tools": [{"type": "web_search_preview"}]
    }'

To include web search source URLs in the response, add "web_search_call.action.sources" to the include parameter:

response = client.responses.create(
    model="GPT-5.2",
    input="What are the latest AI news today?",
    tools=[{"type": "web_search_preview"}],
    include=["web_search_call.action.sources"],
)
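In the Responses format, search sources also typically surface as `url_citation` annotations on the output text. A helper like the one below can collect them; the shapes are a sketch, with plain dicts standing in for SDK objects:

```python
def extract_citations(output_items):
    """Pull (title, url) pairs from url_citation annotations on message output."""
    cites = []
    for item in output_items:
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            for ann in part.get("annotations", []):
                if ann.get("type") == "url_citation":
                    cites.append((ann.get("title"), ann.get("url")))
    return cites

# Stand-in for response.output after a web_search_preview call.
sample = [{
    "type": "message",
    "content": [{
        "type": "output_text",
        "text": "Today's AI news roundup...",
        "annotations": [{"type": "url_citation",
                         "title": "Example story",
                         "url": "https://example.com/story"}],
    }],
}]
print(extract_citations(sample))  # [('Example story', 'https://example.com/story')]
```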

Structured Outputs

Get responses in a specific JSON schema format using the text parameter:

Python:

import os, openai

client = openai.OpenAI(
    api_key=os.getenv("POE_API_KEY"),
    base_url="https://api.poe.com/v1",
)

response = client.responses.create(
    model="GPT-5.2",
    input="List the top 3 programming languages in 2025",
    text={
        "format": {
            "type": "json_schema",
            "name": "languages",
            "schema": {
                "type": "object",
                "properties": {
                    "languages": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "reason": {"type": "string"},
                            },
                            "required": ["name", "reason"],
                        },
                    }
                },
                "required": ["languages"],
            },
        }
    },
)

print(response.output_text)

TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
    apiKey: process.env.POE_API_KEY,
    baseURL: "https://api.poe.com/v1",
});

const response = await client.responses.create({
    model: "GPT-5.2",
    input: "List the top 3 programming languages in 2025",
    text: {
        format: {
            type: "json_schema",
            name: "languages",
            schema: {
                type: "object",
                properties: {
                    languages: {
                        type: "array",
                        items: {
                            type: "object",
                            properties: {
                                name: { type: "string" },
                                reason: { type: "string" },
                            },
                            required: ["name", "reason"],
                        },
                    },
                },
                required: ["languages"],
            },
        },
    },
});

console.log(response.output_text);
curl "https://api.poe.com/v1/responses" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $POE_API_KEY" \
    -d '{
        "model": "GPT-5.2",
        "input": "List the top 3 programming languages in 2025",
        "text": {
            "format": {
                "type": "json_schema",
                "name": "languages",
                "schema": {
                    "type": "object",
                    "properties": {
                        "languages": {
                            "type": "array",
                            "items": {
                                "type": "object",
                                "properties": {
                                    "name": {"type": "string"},
                                    "reason": {"type": "string"}
                                },
                                "required": ["name", "reason"]
                            }
                        }
                    },
                    "required": ["languages"]
                }
            }
        }
    }'
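Because the reply is constrained to the schema, `response.output_text` can be parsed directly with `json.loads`; here with a stand-in string in place of a live response:

```python
import json

# Stand-in for response.output_text from the structured-output call above.
output_text = (
    '{"languages": ['
    '{"name": "Python", "reason": "Ecosystem"}, '
    '{"name": "TypeScript", "reason": "Web tooling"}, '
    '{"name": "Rust", "reason": "Memory safety"}]}'
)

data = json.loads(output_text)
for lang in data["languages"]:
    print(f'{lang["name"]}: {lang["reason"]}')
```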

Multi-turn Conversations

Use previous_response_id to continue a conversation without resending the full message history:

Python:

import os, openai

client = openai.OpenAI(
    api_key=os.getenv("POE_API_KEY"),
    base_url="https://api.poe.com/v1",
)

# First message
response = client.responses.create(
    model="Claude-Sonnet-4.5",
    input="What is the capital of France?",
)
print(response.output_text)  # "The capital of France is Paris."

# Follow-up message using previous_response_id
followup = client.responses.create(
    model="Claude-Sonnet-4.5",
    input="What is its population?",
    previous_response_id=response.id,
)
print(followup.output_text)  # "Paris has a population of approximately 2.1 million..."

TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
    apiKey: process.env.POE_API_KEY,
    baseURL: "https://api.poe.com/v1",
});

// First message
const response = await client.responses.create({
    model: "Claude-Sonnet-4.5",
    input: "What is the capital of France?",
});
console.log(response.output_text);

// Follow-up message using previous_response_id
const followup = await client.responses.create({
    model: "Claude-Sonnet-4.5",
    input: "What is its population?",
    previous_response_id: response.id,
});
console.log(followup.output_text);
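The pattern above can be wrapped in a small helper that threads `previous_response_id` automatically. In this sketch, `create` is any callable with the `responses.create` keyword signature, so a stub stands in for the real client:

```python
from types import SimpleNamespace

class Conversation:
    """Thread previous_response_id through successive responses.create calls."""

    def __init__(self, create, model):
        self._create = create      # any callable with the responses.create signature
        self._model = model
        self._last_id = None

    def send(self, text):
        response = self._create(model=self._model, input=text,
                                previous_response_id=self._last_id)
        self._last_id = response.id
        return response.output_text

# Stub in place of client.responses.create, to show the id threading.
seen_ids = []
def fake_create(model, input, previous_response_id=None):
    seen_ids.append(previous_response_id)
    return SimpleNamespace(id=f"resp_{len(seen_ids)}", output_text=f"echo: {input}")

chat = Conversation(fake_create, "Claude-Sonnet-4.5")
chat.send("What is the capital of France?")
chat.send("What is its population?")
print(seen_ids)  # [None, 'resp_1']
```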

System Instructions

Use the instructions parameter to set system-level instructions:

Python:

response = client.responses.create(
    model="Claude-Sonnet-4.5",
    instructions="You are a helpful travel agent. Always suggest local food recommendations.",
    input="Plan a weekend trip to Tokyo",
)
print(response.output_text)
curl "https://api.poe.com/v1/responses" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $POE_API_KEY" \
    -d '{
        "model": "Claude-Sonnet-4.5",
        "instructions": "You are a helpful travel agent. Always suggest local food recommendations.",
        "input": "Plan a weekend trip to Tokyo"
    }'

Tool Calling

The Responses API supports custom function tools, similar to the Chat Completions API. Define tools and let the model decide when to call them:

Python:

import os, openai

client = openai.OpenAI(
    api_key=os.getenv("POE_API_KEY"),
    base_url="https://api.poe.com/v1",
)

response = client.responses.create(
    model="GPT-5.2",
    input="What's the weather in Boston?",
    tools=[
        {
            # Responses API function tools are flat: name, description, and
            # parameters sit directly on the tool object (no nested "function" key).
            "type": "function",
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    }
                },
                "required": ["location"],
            },
        }
    ],
)

# Check for tool calls in the response output
for item in response.output:
    if item.type == "function_call":
        print(f"Tool call: {item.name}({item.arguments})")

For a complete walkthrough of tool calling patterns including agentic loops, see the Tool Calling guide.

Detailed API Support

Request Fields

Field                 Support Status
model                 Use Poe bot names (e.g. Claude-Sonnet-4.5, GPT-5.2)
input                 Fully supported (string or array of input items)
stream                Fully supported
instructions          Fully supported
reasoning             Fully supported (effort: low/medium/high; summary: auto/concise/detailed)
text                  Fully supported (structured outputs with json_schema)
tools                 Fully supported (function tools and built-in tools like web_search_preview)
tool_choice           Fully supported (auto, required, none, or a specific tool)
parallel_tool_calls   Fully supported
max_output_tokens     Fully supported
temperature           Fully supported (between 0 and 2)
top_p                 Fully supported
truncation            Fully supported (auto or disabled)
previous_response_id  Fully supported
include               Fully supported
metadata              Passed through
store                 Passed through
service_tier          Passed through
top_logprobs          Fully supported (0-20)

Response Fields

Field                    Support Status
id                       Fully supported
object                   Always "response"
created_at               Fully supported
model                    Fully supported
output                   Fully supported (array of output items)
output[].type            "message", "function_call", "web_search_call", etc.
output[].content[].text  Fully supported
status                   completed, failed, in_progress, incomplete
usage.input_tokens       Fully supported
usage.output_tokens      Fully supported
usage.total_tokens       Fully supported

Known Issues & Limitations

Bot Availability

  • Private bots are not currently supported - Only public bots can be accessed through the API
  • The App-Creator and Script-Bot-Creator bots are not available via the API

Feature Support

  • Not all models support all features - Reasoning works best with models that have built-in thinking capabilities (Claude Sonnet 4.5, o3, o4-mini). Web search availability depends on the underlying model provider.
  • Best-effort parameter passing - We make our best attempts to pass down parameters where possible, but some model-specific parameters may not be fully supported across all bots.

Media Bots

  • Image, video, and audio bots should be called with stream=False for optimal performance and reliability.

Error Handling

Error responses follow the same format as the OpenAI Compatible API:

{
  "error": {
    "code": 401,
    "type": "authentication_error",
    "message": "Invalid API key",
    "metadata": {}
  }
}

HTTP Code  Type                   When It Happens
400        invalid_request_error  Malformed JSON, missing required fields
401        authentication_error   Bad or expired API key
402        insufficient_credits   Point balance is zero
429        rate_limit_error       Rate limit exceeded (500 rpm)
500        provider_error         Provider-side issues

Retry tips:

  • Respect the Retry-After header on 429 responses
  • Use exponential backoff starting at 250ms with jitter
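Those two tips reduce to a small delay schedule; `retry_after` stands in for the parsed Retry-After header value in seconds:

```python
import random

def backoff_delay(attempt, retry_after=None, base=0.25, cap=8.0):
    """Seconds to wait before retry N: honor Retry-After, else exponential backoff with jitter."""
    if retry_after is not None:
        return float(retry_after)            # the server's hint on 429 takes priority
    delay = min(cap, base * (2 ** attempt))  # 0.25s, 0.5s, 1s, ... capped at 8s
    return delay * random.uniform(0.5, 1.5)  # jitter to avoid synchronized retries

for attempt in range(4):
    print(f"attempt {attempt}: sleep {backoff_delay(attempt):.2f}s")
```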

Migration from Chat Completions

Already using the OpenAI Compatible API (Chat Completions)? Here's how to migrate:

Chat Completions                                Responses API
client.chat.completions.create()                client.responses.create()
POST /v1/chat/completions                       POST /v1/responses
messages: [{"role": "user", "content": "..."}]  input: "..."
response.choices[0].message.content             response.output_text
extra_body={"reasoning_effort": "high"}         reasoning={"effort": "high"}
N/A                                             tools=[{"type": "web_search_preview"}]
N/A                                             previous_response_id=response.id
N/A                                             text={"format": {"type": "json_schema", ...}}

Quick migration steps:

  1. Change client.chat.completions.create() to client.responses.create()
  2. Replace messages=[...] with input="..." (or an array of input items)
  3. Read results from response.output_text instead of response.choices[0].message.content
  4. Optionally adopt new features: reasoning, tools, previous_response_id, text

Pricing & Availability

All Poe subscribers can use their existing subscription points with the API at no additional cost.

This means you can seamlessly transition between the web interface and API without worrying about separate billing structures or additional fees. Your regular monthly point allocation works exactly the same way whether you're chatting directly on Poe or accessing bots programmatically through the API.

If your Poe subscription is not enough, you can now purchase add-on points to get as much access as your application requires. Our intent in pricing these points is to charge the same amount for model access that underlying model providers charge. Any add-on points you purchase can be used with any model or bot on Poe and work across both the API and Poe chat on web, iOS, Android, Mac, and Windows.

Support

Feel free to reach out to support if you run into unexpected behavior when using our API or have suggestions for future improvements.