Documentation Index
Fetch the complete documentation index at: https://docs.parallel.ai/llms.txt
Use this file to discover all available pages before exploring further.
For AI agents: a documentation index is available at https://docs.parallel.ai/llms.txt. The full text of all docs is at https://docs.parallel.ai/llms-full.txt. You may also fetch any page as Markdown by appending .md to its URL or by sending an Accept: text/markdown header.
Parallel Chat is a web research API that returns OpenAI ChatCompletions-compatible streaming text and JSON.
The Chat API supports multiple models, from the speed model for low latency across a
broad range of use cases to the research models (lite, base, core) for deeper, research-grade outputs
with full research basis support, where you can afford to wait longer for a more comprehensive response.
Beta Notice: Parallel Chat is in beta. We provide a rate limit of 300 requests
per minute for the Chat API out of the box. Contact us
for production capacity.
Choosing the Right Model
The Chat API supports both the speed model for
low-latency applications and research models for deeper outputs.
Research models (lite, base, core) are Chat API wrappers over our Task API processors,
providing the same research capabilities along with basis in an OpenAI-compatible interface.
| Model | Best For | Basis Support | Latency (TTFT) |
|---|---|---|---|
| speed | Low latency across a broad range of use cases | No | ~3s |
| lite | Simple lookups, basic metadata | Yes | 10-60s |
| base | Standard enrichments, factual queries | Yes | 15-100s |
| core | Complex research, multi-source synthesis | Yes | 60s-5min |
1. Set up Prerequisites
The Chat API is fully compatible with the OpenAI SDK: just swap the base URL and API key. Generate your API key on the Platform, then install the OpenAI SDK.
Performance and Rate Limits
The speed model is optimized for interactive applications requiring low-latency responses:
- Performance: With stream=true, achieves a 3-second p50 TTFT (median time to first token)
- Default Rate Limit: 300 requests per minute
- Use Cases: Chat interfaces, interactive tools
2. Make Your First Request
System Prompt
You can provide a custom system prompt to control the AI’s behavior and response style by including it in the messages array as the first message in your request, with "role": "system".
Using Research Models
When you use research models (lite, base, or core) instead of speed, the
Chat API provides research-grade outputs with
full research basis support.
The basis includes citations, reasoning, and confidence levels for each response.
Example with Research Model
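Since basis is a Parallel extension that the OpenAI SDK's response types don't model, a small helper can read it defensively. A minimal sketch; the stub object below merely stands in for a real research-model response:

```python
from types import SimpleNamespace

def extract_basis(response):
    # basis is a Parallel-specific extension; it is absent on the speed
    # model, so fall back to None rather than raising AttributeError.
    return getattr(response, "basis", None)

# Stub standing in for a ChatCompletion response from a research model
stub = SimpleNamespace(basis=[{"citations": ["https://example.com"], "confidence": "high"}])
assert extract_basis(stub) is not None

# A speed-model response has no basis field at all
assert extract_basis(SimpleNamespace()) is None
```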
OpenAI SDK Compatibility
Research Basis via OpenAI SDK: When using task processors (lite, base, core) with the Chat API, the response includes a basis field with citations, reasoning, and confidence levels. Access it via response.basis in Python or (response as any).basis in TypeScript. See the Basis documentation for details.
Important OpenAI Compatibility Limitations
API Behavior
Here are the most substantial differences from using OpenAI:
- Multimodal input (images/audio) is not supported and will be ignored.
- Prompt caching is not supported.
- Most unsupported fields are silently ignored rather than producing errors. These are all documented below.
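Because unsupported fields are dropped silently rather than rejected, it can be worth linting a request payload before sending it. A minimal sketch using a subset of the ignored fields from the tables below:

```python
# Subset of request fields the compatibility tables mark as ignored
IGNORED_FIELDS = {
    "max_tokens", "max_completion_tokens", "temperature", "top_p",
    "stop", "n", "seed", "logprobs", "tools",
}

def find_ignored(request: dict) -> list[str]:
    """Return any fields in a request payload that Parallel will silently drop."""
    return sorted(k for k in request if k in IGNORED_FIELDS)

find_ignored({"model": "speed", "messages": [], "temperature": 0.2, "seed": 7})
# → ["seed", "temperature"]
```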
Detailed OpenAI Compatible API Support
Request Fields
Simple Fields
| Field | Support Status |
|---|---|
| model | Use speed, lite, base, or core |
| response_format | Fully supported |
| stream | Fully supported |
| max_tokens | Ignored |
| max_completion_tokens | Ignored |
| stream_options | Ignored |
| top_p | Ignored |
| parallel_tool_calls | Ignored |
| stop | Ignored |
| temperature | Ignored |
| n | Ignored |
| logprobs | Ignored |
| metadata | Ignored |
| prediction | Ignored |
| presence_penalty | Ignored |
| frequency_penalty | Ignored |
| seed | Ignored |
| service_tier | Ignored |
| audio | Ignored |
| logit_bias | Ignored |
| store | Ignored |
| user | Ignored |
| modalities | Ignored |
| top_logprobs | Ignored |
| reasoning_effort | Ignored |
Tools / Functions Fields
Tools are ignored.
Messages Array Fields
| Field | Support Status |
|---|---|
| messages[].role | Fully supported |
| messages[].content | String only |
| messages[].name | Fully supported |
| messages[].tool_calls | Ignored |
| messages[].tool_call_id | Ignored |
| messages[].function_call | Ignored |
| messages[].audio | Ignored |
| messages[].modalities | Ignored |
The content field only supports string values. Structured content arrays (e.g., for multimodal inputs with text and image parts) are not supported.
Response Fields
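If an existing application builds OpenAI-style structured content arrays, they need flattening to a plain string before sending. A hypothetical helper (not part of any SDK) might look like:

```python
def flatten_content(content):
    """Collapse an OpenAI-style structured content array into the plain
    string Parallel's Chat API requires; image/audio parts are dropped,
    since multimodal input is not supported anyway."""
    if isinstance(content, str):
        return content
    return " ".join(
        part.get("text", "") for part in content if part.get("type") == "text"
    ).strip()

structured = [
    {"type": "text", "text": "Describe"},
    {"type": "image_url", "image_url": {"url": "https://example.com/x.png"}},
    {"type": "text", "text": "this."},
]
assert flatten_content(structured) == "Describe this."
```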
| Field | Support Status |
|---|---|
| id | Always empty |
| choices[] | Will always have a length of 1 |
| choices[].finish_reason | Always empty |
| choices[].index | Fully supported |
| choices[].message.role | Fully supported |
| choices[].message.content | Fully supported |
| choices[].message.tool_calls | Always empty |
| object | Always empty |
| created | Fully supported |
| model | Always empty |
| finish_reason | Always empty |
| content | Fully supported |
| usage.completion_tokens | Always empty |
| usage.prompt_tokens | Always empty |
| usage.total_tokens | Always empty |
| usage.completion_tokens_details | Always empty |
| usage.prompt_tokens_details | Always empty |
| choices[].message.refusal | Always empty |
| choices[].message.audio | Always empty |
| logprobs | Always empty |
| service_tier | Always empty |
| system_fingerprint | Always empty |
Parallel-Specific Response Fields
The following fields are Parallel extensions not present in the OpenAI API:
| Field | Support Status |
|---|---|
| basis | Supported with task processors (lite, base, core) |
Error Message Compatibility
The compatibility layer maintains approximately the same error formats as the OpenAI API.
Header Compatibility
While the OpenAI SDK automatically manages headers, here is the complete list of headers supported by Parallel’s API for developers who need to work with them directly.
| Field | Support Status |
|---|---|
| authorization | Fully supported |
| x-ratelimit-limit-requests | Ignored |
| x-ratelimit-limit-tokens | Ignored |
| x-ratelimit-remaining-requests | Ignored |
| x-ratelimit-remaining-tokens | Ignored |
| x-ratelimit-reset-requests | Ignored |
| x-ratelimit-reset-tokens | Ignored |
| retry-after | Ignored |
| x-request-id | Ignored |
| openai-version | Ignored |
| openai-processing-ms | Ignored |