Create response
POST
https://api.poe.com/v1/responses
Overview
Creates a model response using the OpenAI Responses API format. This is the most feature-rich endpoint, supporting advanced capabilities beyond basic chat completions.
Key features:
- Reasoning and extended thinking support
- Built-in tool calling with web search
- Structured outputs (JSON schema)
- Streaming support via SSE
- Multi-turn conversations via previous_response_id
- Multi-modal inputs (text, images)
OpenAI Responses API compatible: This endpoint follows the OpenAI Responses API format. You can use the OpenAI SDK with Poe's base URL as a drop-in replacement.
Note: This endpoint supports all models available on Poe, not just OpenAI models.
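As a minimal sketch, this is the shape of a non-streaming request body. The bot name here is purely illustrative; substitute any Poe bot name (you could equally pass the same fields through the OpenAI SDK pointed at Poe's base URL):

```python
import json

# Minimal request body; "Claude-Sonnet-4" is an illustrative bot name,
# not a value taken from these docs.
payload = {
    "model": "Claude-Sonnet-4",
    "input": "What is the capital of France?",
    "stream": False,
}

# This JSON body is what gets POSTed to /v1/responses.
body = json.dumps(payload).encode("utf-8")
```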
Authentication
Send your Poe API key in the Authorization header:
Authorization: Bearer sk_test_51SAMPLEKEY
All requests must be made over HTTPS.
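Using only the standard library, an authenticated request can be assembled like this (the key below is the placeholder from the docs, and the payload values are illustrative):

```python
import json
import urllib.request

API_KEY = "sk_test_51SAMPLEKEY"  # placeholder key from the docs

def build_request(payload: dict) -> urllib.request.Request:
    """Build an authenticated HTTPS POST for the Responses endpoint."""
    return urllib.request.Request(
        "https://api.poe.com/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request({"model": "GPT-4o", "input": "Hello"})
# urllib.request.urlopen(req) would send it over HTTPS.
```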
Parameters
This endpoint does not accept query or path parameters.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Required | ID of the model to use. Use Poe bot names. Note: Poe UI-specific system prompts are skipped. |
| input | string \| object[] | Required | The input to the model. Can be a simple string or an array of input items including text, images, and previous assistant messages. |
| instructions | string \| null | Optional | A system (or developer) message inserted at the beginning of the model's context. |
| stream | boolean | Optional | Whether to stream back partial progress as server-sent events (SSE). Default: false |
| max_output_tokens | integer \| null | Optional | Maximum number of output tokens to generate. Uses the model's default if not specified. |
| temperature | number \| null | Optional | Sampling temperature. Min: 0 · Max: 2 |
| top_p | number \| null | Optional | Nucleus sampling parameter. Min: 0 · Max: 1 |
| reasoning | object \| null | Optional | Configuration for reasoning/thinking models. Not all models support reasoning. |
| text | object \| null | Optional | Configuration for structured text output. |
| tools | array \| null | Optional | List of tools the model may call. Supports function tools and built-in tools like web_search_preview. |
| tool_choice | string \| object | Optional | How the model should select which tool to use. Can be "auto", "required", "none", or a specific tool object. |
| parallel_tool_calls | boolean \| null | Optional | Whether to allow the model to run tool calls in parallel. |
| truncation | string \| null | Optional | The truncation strategy to use. "auto": truncate input to fit the model's context window; "disabled": fail if the input exceeds the context window. Allowed values: auto, disabled |
| previous_response_id | string \| null | Optional | The ID of a previous response for multi-turn conversations. The model will use the previous response's context. |
| include | array \| null | Optional | Additional output data to include in the response. Supported values: "web_search_call.action.sources" (web search sources); "message.output_text.logprobs" (log probabilities); "reasoning.encrypted_content" (encrypted reasoning content) |
| metadata | object \| null | Optional | Set of key-value pairs for storing additional information (keys max 64 chars, values max 512 chars). |
| service_tier | string \| null | Optional | Specifies the processing tier for the request. Allowed values: auto, default, flex, priority |
| store | boolean \| null | Optional | Whether to store the generated response for later retrieval. |
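To see several of the optional fields together, here is an illustrative request body (the bot name and prompt are assumptions, not values from the docs). Fields left as null are dropped before sending so server-side defaults apply:

```python
# Illustrative request body exercising several optional fields.
payload = {
    "model": "GPT-4o",
    "input": [
        {"role": "user", "content": "Summarize today's AI news."}
    ],
    "instructions": "Answer in two sentences.",
    "tools": [{"type": "web_search_preview"}],
    "temperature": 0.2,
    "max_output_tokens": 300,
    "previous_response_id": None,  # set to a prior response's id for multi-turn
}

# Drop null fields so the model's defaults apply server-side.
payload = {k: v for k, v in payload.items() if v is not None}
```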
Responses
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Optional | Unique identifier for the response |
| object | "response" | Optional | Object type, always "response". Allowed values: response |
| created_at | integer | Optional | Unix timestamp of when the response was created |
| model | string | Optional | The model used to generate the response |
| output | object[] | Optional | Array of output items generated by the model |
| output[].type | string | Optional | Type of output item (e.g., "message") |
| output[].role | string | Optional | Role of the output (e.g., "assistant") |
| output[].content | object[] | Optional | Content blocks in the output |
| output[].content[].type | string | Optional | Content block type (e.g., "output_text") |
| output[].content[].text | string | Optional | Text content |
| status | "completed" \| "failed" \| "in_progress" \| "incomplete" | Optional | The status of the response. Allowed values: completed, failed, in_progress, incomplete |
| usage | object | Optional | Token usage information |
| usage.input_tokens | integer | Optional | Number of input tokens |
| usage.output_tokens | integer | Optional | Number of output tokens |
| usage.total_tokens | integer | Optional | Total number of tokens used |
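A response shaped like the fields above can be walked to extract the generated text. The values in this sample body are made up; only the structure follows the table:

```python
import json

# A minimal response body following the field layout above (values made up).
raw = json.dumps({
    "id": "resp_123",
    "object": "response",
    "status": "completed",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Paris."}],
        }
    ],
    "usage": {"input_tokens": 12, "output_tokens": 3, "total_tokens": 15},
})

def output_text(response: dict) -> str:
    """Concatenate all output_text blocks across message output items."""
    return "".join(
        block["text"]
        for item in response.get("output", [])
        if item.get("type") == "message"
        for block in item.get("content", [])
        if block.get("type") == "output_text"
    )

print(output_text(json.loads(raw)))  # → Paris.
```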
Error codes
| Http | Type | Description |
|---|---|---|
| 400 | invalid_request_error | Bad request: malformed JSON or missing required fields |
| 401 | authentication_error | Authentication failed: invalid API key |
| 402 | insufficient_credits | Insufficient credits: point balance is zero or negative |
| 429 | rate_limit_error | Rate limit exceeded (500 requests per minute) |
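A small helper mapping the documented status codes to a coarse next action; the action names here are ours, purely illustrative:

```python
def classify(status: int) -> str:
    """Map the documented HTTP error codes to a suggested action."""
    if status == 400:
        return "fix-request"     # malformed JSON or missing required fields
    if status == 401:
        return "check-api-key"   # invalid API key
    if status == 402:
        return "add-credits"     # point balance is zero or negative
    if status == 429:
        return "retry-later"     # 500 requests/minute limit exceeded
    return "unknown"
```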
Callbacks & webhooks
No callbacks or webhooks are associated with this endpoint.