Create chat completion

POST https://api.poe.com/v1/chat/completions

Overview

Creates a chat completion response for the given conversation.

Features:

  • Streaming support
  • Tool calling (function calling)
  • Multi-modal inputs (text, images)
  • OpenAI-compatible format

Important notes:

  • Private bots are not currently supported
  • Image/video/audio bots should use stream: false for best results
  • Custom parameters require the Poe Python SDK

Authentication

Send your Poe API key in the Authorization header:

Authorization: Bearer sk_test_51SAMPLEKEY

All requests must be made over HTTPS.
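
A minimal sketch of an authenticated request using Node 18+ built-in fetch. The POE_API_KEY environment variable and the bot name "Example-Bot" are placeholders, not values defined by this API:

const response = await fetch('https://api.poe.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.POE_API_KEY}`, // placeholder env var for your Poe API key
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'Example-Bot', // placeholder; substitute any Poe bot name
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});
console.log((await response.json()).choices[0].message.content);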

Parameters

This endpoint does not accept query or path parameters.

Request body

  • model (string, required): ID of the model to use. Use Poe bot names. Note: Poe UI-specific system prompts are skipped.
  • messages (object[], required): A list of messages comprising the conversation so far.
  • messages[].role ("system" | "user" | "assistant" | "tool", required): The role of the message author. Allowed values: system, user, assistant, tool.
  • messages[].content (string | object[], optional): The contents of the message.
  • messages[].name (string, optional): The name of the author of this message.
  • messages[].tool_calls (object[], optional): Tool calls generated by the model.
  • messages[].tool_call_id (string, optional): Tool call that this message is responding to.
  • max_tokens (integer | null, optional): Maximum number of tokens to generate.
  • max_completion_tokens (integer | null, optional): Maximum number of completion tokens to generate.
  • temperature (number | null, optional): Sampling temperature between 0 and 2. Min: 0 · Max: 2.
  • top_p (number | null, optional): Nucleus sampling parameter. Min: 0 · Max: 1.
  • stream (boolean, optional): Whether to stream back partial progress. Default: false.
  • stream_options (object | null, optional): Options for streaming.
  • stop (string | string[], optional): Up to 4 sequences where the API will stop generating.
  • tools (array | null, optional): List of tools the model may call.
  • tool_choice (string | object, optional): Controls which (if any) function is called by the model.
  • parallel_tool_calls (boolean | null, optional): Whether to enable parallel function calling.
  • n (integer, optional): Number of chat completion choices to generate (must be 1). Default: 1 · Allowed values: 1.
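
For illustration, an example request body exercising the common fields above. The bot name is a placeholder; substitute any Poe bot name:

{
  "model": "Example-Bot",
  "messages": [
    { "role": "system", "content": "You are a concise assistant." },
    { "role": "user", "content": "Summarize the plot of Hamlet in two sentences." }
  ],
  "temperature": 0.7,
  "max_tokens": 256,
  "stream": false
}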

Responses

  • id (string, optional): Unique identifier for the chat completion.
  • object ("chat.completion", optional): Allowed values: chat.completion.
  • created (integer, optional): Unix timestamp.
  • model (string, optional): The model used.
  • choices (object[], optional)
  • choices[].index (integer, optional): The index of this choice.
  • choices[].message (object, optional)
  • choices[].message.role ("system" | "user" | "assistant" | "tool", required): The role of the message author. Allowed values: system, user, assistant, tool.
  • choices[].message.content (string | object[], optional): The contents of the message.
  • choices[].message.name (string, optional): The name of the author of this message.
  • choices[].message.tool_calls (object[], optional): Tool calls generated by the model.
  • choices[].message.tool_call_id (string, optional): Tool call that this message is responding to.
  • choices[].finish_reason ("stop" | "length" | "tool_calls" | "content_filter", optional): Reason the model stopped generating. Allowed values: stop, length, tool_calls, content_filter.
  • usage (object, optional)
  • usage.prompt_tokens (integer, optional): Number of tokens in the prompt.
  • usage.completion_tokens (integer, optional): Number of tokens in the completion.
  • usage.total_tokens (integer, optional): Total number of tokens used.
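
An illustrative response built from the fields above; the identifier, timestamp, content, and token counts are placeholder values, not real output:

{
  "id": "example-completion-id",
  "object": "chat.completion",
  "created": 1735689600,
  "model": "Example-Bot",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Prince Hamlet feigns madness while seeking revenge..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 58,
    "total_tokens": 100
  }
}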

❌ Error codes

  • 400 invalid_request_error: Bad request. Malformed JSON or missing required fields.
  • 401 authentication_error: Authentication failed. Invalid API key.
  • 402 insufficient_credits: Insufficient credits. Point balance is zero or negative.
  • 429 rate_limit_error: Rate limit exceeded (500 requests per minute).
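
One way to translate the documented status codes into client-side actions. This is a sketch that branches only on the HTTP status; the exact error body shape is not specified here:

// Map documented HTTP status codes to a suggested client action.
function classifyError(status) {
  switch (status) {
    case 400: return 'fix-request';   // invalid_request_error: correct the payload
    case 401: return 'check-api-key'; // authentication_error: verify the API key
    case 402: return 'add-credits';   // insufficient_credits: top up the point balance
    case 429: return 'retry-later';   // rate_limit_error: back off and retry
    default:  return 'retry-or-report';
  }
}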

Best Practices

Streaming vs Non-Streaming

Use streaming (stream: true) for a better user experience in chat interfaces. Users see responses as they are generated rather than waiting for completion; a consumption sketch follows at the end of this subsection.

For most text-based models, streaming provides a better user experience:

  • Users see responses immediately as they are generated
  • Lower perceived latency
  • Better for long-form content

For image/video/audio generation models, use non-streaming mode:

  • These models typically return complete outputs
  • Streaming may not work as expected
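
A minimal sketch of consuming a streamed completion in Node 18+, assuming the OpenAI-compatible SSE format ("data: {...}" lines terminated by "data: [DONE]") with incremental text at choices[0].delta.content. Verify the exact chunk shape against your own responses; the POE_API_KEY environment variable is a placeholder:

async function streamChat(payload) {
  const response = await fetch('https://api.poe.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.POE_API_KEY}`, // placeholder env var
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ ...payload, stream: true }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep any partial line for the next read
    for (const line of lines) {
      const data = line.replace(/^data: /, '').trim();
      if (!data || data === '[DONE]') continue;
      const parsed = JSON.parse(data);
      // Assumption: incremental text arrives at choices[0].delta.content,
      // as in the OpenAI streaming format.
      const delta = parsed.choices?.[0]?.delta?.content;
      if (delta) process.stdout.write(delta);
    }
  }
}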

Error Handling

Always implement retry logic with exponential backoff for production applications. Rate limits (429) and temporary failures (503) should be retried.

Implement proper error handling for all API calls:

async function chatWithRetry(payload, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch('https://api.poe.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Assumes the API key is stored in the POE_API_KEY environment variable
          'Authorization': `Bearer ${process.env.POE_API_KEY}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify(payload),
      });

      if (!response.ok) {
        // Retry on rate limit or transient server errors
        if ([429, 500, 502, 503].includes(response.status) && attempt < maxRetries) {
          // Exponential backoff: 2s, 4s, 8s, capped at 10s
          const delay = Math.min(1000 * Math.pow(2, attempt), 10000);
          await new Promise(resolve => setTimeout(resolve, delay));
          continue;
        }

        const error = await response.json();
        throw new Error(`API error: ${error.message}`);
      }

      return await response.json();
    } catch (err) {
      // Rethrow once all retries are exhausted; otherwise fall through and retry
      if (attempt === maxRetries) throw err;
    }
  }
}
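
Example usage of the helper above (the bot name is a placeholder):

const result = await chatWithRetry({
  model: 'Example-Bot',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(result.choices[0].message.content);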

Rate Limiting

The API has a rate limit of 500 requests per minute. Monitor the X-RateLimit-Remaining response header to track your usage.

Monitor rate limits in production:

  • Check the X-RateLimit-Remaining header (a sketch follows this list)
  • Implement request queuing when approaching limits
  • Consider caching responses when appropriate
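
A sketch of reading the documented X-RateLimit-Remaining header after a call. requestOptions stands in for your usual request options, and the threshold and pause length are assumptions to tune for your workload:

const response = await fetch('https://api.poe.com/v1/chat/completions', requestOptions);
const remaining = Number(response.headers.get('x-ratelimit-remaining'));
if (Number.isFinite(remaining) && remaining < 10) {
  // Assumption: a brief pause when the remaining budget is low
  await new Promise(resolve => setTimeout(resolve, 2000));
}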

🔁 Callbacks & webhooks

No callbacks or webhooks are associated with this endpoint.