Rate Limits
Overview
The Poe API implements rate limiting to ensure fair usage and maintain service quality for all users. Understanding and properly handling these limits is essential for building robust integrations.
Current Rate Limit: 500 requests per minute (RPM)
Need Higher Limits?
If your application requires higher rate limits, please reach out to our team at [email protected] to discuss your use case.
Rate Limit Headers
Every API response includes headers that help you track your current rate limit status:
| Header | Description | Example Value |
|---|---|---|
x-ratelimit-limit-requests | Maximum number of requests allowed per minute | 500 |
x-ratelimit-remaining-requests | Number of requests remaining in the current window | 499 |
x-ratelimit-reset-requests | Time in seconds until the rate limit window resets | 1 |
Example Response Headers:
< x-ratelimit-remaining-requests: 499
< x-ratelimit-limit-requests: 500
< x-ratelimit-reset-requests: 1Monitor Your Usage
Use these headers to track your consumption and implement proactive throttling before hitting the limit.
Handling Rate Limits Gracefully
When your application exceeds the rate limit, the API will respond with a 429 Too Many Requests status code. Implementing proper retry logic is essential for a robust integration.
Watch for 429 Status Codes
Your application should always check for HTTP 429 responses and handle them appropriately rather than treating them as fatal errors.
Build Retry Logic
Never ignore 429 responses. Always implement a retry mechanism to ensure your application continues working once the rate limit window resets.
Exponential Backoff Strategy
A basic technique for handling rate limits is to implement exponential backoff when you receive a 429 response:
- Initial retry delay: Start with a short delay (e.g., 1 second)
- Increase exponentially: Double the delay with each subsequent 429 (1s → 2s → 4s → 8s)
- Add randomness (jitter): Add random variation to prevent thundering herd effects
- Set a maximum: Cap the maximum retry delay to avoid indefinite waits
Why add randomness? If many clients hit the limit simultaneously and all retry at the same intervals, they'll create synchronized waves of traffic. Adding jitter (random delays) spreads out the retry attempts.
Global Traffic Control
For more sophisticated applications, consider implementing rate limiting on the client side:
- Token bucket algorithm: A proven approach for controlling request rates
- Global tracking: Monitor rate limit consumption across all parts of your application
- Proactive throttling: Reduce request volume when you detect you're approaching limits
- Circuit breaker pattern: Temporarily stop requests when rate limits are consistently exceeded
Optimize Request Patterns
While retry logic is essential, the best approach is to design your application to stay within rate limits. Batch operations where possible, cache responses, and avoid unnecessary API calls.
Load Testing
If you're preparing for a major event or want to test your integration under load, follow these best practices:
Mock Out API Requests
Build a configurable system for mocking Poe API requests during load tests. This allows you to:
- Test your application's behavior at scale without consuming your rate limit
- Avoid affecting production API availability
- Simulate various response scenarios (including rate limit errors)
Simulate Realistic Latency
For accurate load test results, simulate network latency in your mocked responses:
- Sample real API calls: Measure actual response times from live Poe API requests
- Apply delays: Use these measurements to add realistic sleep times to your mocked responses
- Vary the delays: Use a distribution of latency values rather than a fixed delay
This approach ensures your load tests reflect real-world performance characteristics.
Don't Load Test Production
Never run load tests against the live Poe API. Always use mocked endpoints to avoid consuming your rate limit and affecting service availability.
Usage Tracking
Usage API
If you need detailed information about your API usage, including a log of all your API calls, use the Usage API.
The Usage API provides:
- Historical usage data
- Request counts and patterns
- Point consumption tracking
- Detailed call logs
Learn more about tracking your API usage in the Usage API documentation.
Best Practices Summary
- ✅ Monitor rate limit headers in every response
- ✅ Implement exponential backoff with jitter for 429 responses
- ✅ Add client-side rate limiting for high-volume applications
- ✅ Cache responses where appropriate to reduce API calls
- ✅ Use mocked endpoints for load testing
- ✅ Track usage with the Usage API to understand your patterns
- ❌ Never ignore 429 errors - always retry with backoff
- ❌ Don't retry immediately - respect the rate limit reset time
- ❌ Don't load test production - use mocks instead
Contact Us
Need help optimizing your integration or require higher limits? Reach out to [email protected] - we're here to help!