speed model for low latency across a
broad range of use cases, to research models (lite, base, core) for deeper research-grade outputs
where you can afford to wait longer for even more comprehensive responses with full research basis support.
Beta Notice: Parallel Chat is in beta. We provide a rate limit of 300 requests
per minute for the Chat API out of the box. Contact us
for production capacity.
Choosing the Right Model
The Chat API supports both the speed model for
low-latency applications and research models for deeper outputs.
Research models (lite, base, core) are Chat API wrappers over our Task API processors,
providing the same research capabilities along with basis in an OpenAI-compatible interface.
| Model | Best For | Basis Support | Latency (TTFT) |
|---|---|---|---|
| speed | Low latency across a broad range of use cases | No | ~3s |
| lite | Simple lookups, basic metadata | Yes | 10-60s |
| base | Standard enrichments, factual queries | Yes | 15-100s |
| core | Complex research, multi-source synthesis | Yes | 60s-5min |
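As a rough illustration of how the table can drive model choice, the sketch below encodes the listed TTFT upper bounds and basis support, then picks the most thorough model that fits a latency budget. The `ModelInfo` class and `pick_model` helper are hypothetical names introduced here, not part of the API, and the 3-second figure for speed is the approximate value from the table rather than a guarantee.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelInfo:
    name: str
    basis: bool          # whether the model returns a research basis
    max_ttft_s: float    # upper end of the TTFT range listed in the table


# Values copied from the table, ordered from most to least thorough.
MODELS = [
    ModelInfo("core", True, 300.0),
    ModelInfo("base", True, 100.0),
    ModelInfo("lite", True, 60.0),
    ModelInfo("speed", False, 3.0),
]


def pick_model(latency_budget_s: float, need_basis: bool) -> Optional[str]:
    """Return the most thorough model whose listed TTFT fits the budget."""
    for model in MODELS:
        if need_basis and not model.basis:
            continue
        if model.max_ttft_s <= latency_budget_s:
            return model.name
    return None  # nothing in the table satisfies both constraints
```

A caller with a two-minute budget that needs citations would get `base`; a caller that needs basis support in under five seconds gets `None`, because speed does not return a basis.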
Getting Started with the OpenAI SDK
To use the OpenAI SDK compatibility feature, you’ll need to:
- Use an official OpenAI SDK
- Make these changes:
  - Update your base URL to point to Parallel’s beta API endpoint
  - Replace your API key with a Parallel API key
  - Update your model name to speed, lite, base, or core
- Review the documentation below for supported features
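The changes listed above can be sketched with the official OpenAI Python SDK, whose `OpenAI` constructor accepts `api_key` and `base_url`. The environment variable names `PARALLEL_API_KEY` and `PARALLEL_BASE_URL` and the helper functions are assumptions made for this example; the endpoint URL itself is not stated in this section, so it is read from configuration rather than hard-coded.

```python
import os
from typing import Dict, Mapping

SUPPORTED_MODELS = ("speed", "lite", "base", "core")


def client_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the two client settings that change when targeting Parallel."""
    return {
        "api_key": env["PARALLEL_API_KEY"],    # hypothetical variable name
        "base_url": env["PARALLEL_BASE_URL"],  # hypothetical variable name
    }


def check_model(name: str) -> str:
    """Reject model names other than the four listed in this section."""
    if name not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model {name!r}; expected one of {SUPPORTED_MODELS}")
    return name


def make_client(env: Mapping[str, str] = os.environ):
    # Imported lazily so the helpers above work without the SDK installed.
    from openai import OpenAI

    return OpenAI(**client_kwargs(env))
```

With a client built this way, a request would go through `client.chat.completions.create(model=check_model("speed"), messages=[...])`, the same call used against OpenAI.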
Performance and Rate Limits
Speed is optimized for interactive applications requiring low-latency responses:
- Performance: With stream=true, achieves a 3-second p50 TTFT (median time to first token)
- Default Rate Limit: 300 requests per minute
- Use Cases: Chat interfaces, interactive tools
Example Execution
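A minimal sketch of a streaming call that also measures time to first token on the client side. It assumes the streamed chunks follow the OpenAI SDK's chat completion chunk shape (`chunk.choices[0].delta.content`), which the OpenAI-compatible interface suggests but this section does not spell out. `text_deltas` and `measure_ttft` are hypothetical helper names.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def text_deltas(chunks: Iterable) -> Iterator[str]:
    """Yield non-empty text pieces from OpenAI-style streaming chunks."""
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks carry no choices
        piece = chunk.choices[0].delta.content
        if piece:
            yield piece


def measure_ttft(
    start_stream: Callable[[], Iterable],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[float], str]:
    """Return (seconds until the first text piece, full response text).

    The timer starts before the request is issued, so the measurement
    includes network and queueing time, not only generation time.
    """
    start = clock()
    ttft = None
    parts = []
    for piece in text_deltas(start_stream()):
        if ttft is None:
            ttft = clock() - start
        parts.append(piece)
    return ttft, "".join(parts)


# Usage with an OpenAI SDK client configured for Parallel (not run here):
# ttft, text = measure_ttft(lambda: client.chat.completions.create(
#     model="speed",
#     messages=[{"role": "user", "content": "What does Parallel do?"}],
#     stream=True,
# ))
```

Passing the request as a callable keeps the timing honest: measuring only from the first loop iteration would leave out the time spent establishing the stream.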
System Prompt
You can provide a custom system prompt to control the AI’s behavior and response style by including it in the messages array with "role": "system" as the first message in your request.
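To make the message ordering concrete, here is a small sketch that builds the messages array with an optional system prompt in first position. The `build_messages` helper is a hypothetical name; the role/content dictionary shape is the standard OpenAI chat format.

```python
from typing import Dict, List, Optional


def build_messages(
    user_prompt: str, system_prompt: Optional[str] = None
) -> List[Dict[str, str]]:
    """Return a messages array with the system prompt first, if one is given."""
    messages: List[Dict[str, str]] = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages
```

The result can be passed directly as the `messages` argument of `client.chat.completions.create`.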
Using Research Models
When you use research models (lite, base, or core) instead of speed, the
Chat API provides research-grade outputs with
full research basis support.
The basis includes citations, reasoning, and confidence levels for each response.
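This section does not show the exact shape of the basis, so the sketch below assumes one: a list of entries, each with a `citations` list of objects carrying a `url`, plus `reasoning` and `confidence` fields. Those field names are assumptions for illustration only and should be checked against the API reference. Under that assumption, gathering the distinct sources behind a response might look like this:

```python
from typing import Any, Dict, List


def citation_urls(basis: List[Dict[str, Any]]) -> List[str]:
    """Return deduplicated citation URLs in first-seen order.

    Assumes the hypothetical basis shape described above; entries
    without citations are skipped rather than treated as errors.
    """
    seen: List[str] = []
    for entry in basis:
        for citation in entry.get("citations") or []:
            url = citation.get("url")
            if url and url not in seen:
                seen.append(url)
    return seen


def entries_with_confidence(
    basis: List[Dict[str, Any]], level: str
) -> List[Dict[str, Any]]:
    """Return basis entries whose (assumed) confidence field equals `level`."""
    return [entry for entry in basis if entry.get("confidence") == level]
```

Both helpers work on plain dictionaries, so they are independent of how the basis is extracted from the response object.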