Parallel provides a native DuckDB integration with two approaches: batch processing for efficiency, and SQL UDFs for flexibility. This integration is ideal for data engineers and analysts who work with DuckDB and need to enrich data with web intelligence directly in their SQL or Python workflows.

Features
- Batch Processing: Process all rows in parallel with a single API call (recommended)
- SQL UDF: Use `parallel_enrich()` directly in SQL queries
- Progress Callbacks: Track enrichment progress for large datasets
- Permanent Tables: Optionally save results to a new table
Installation
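The original install snippet was not captured here. A plausible sketch, assuming the integration ships with Parallel's Python SDK (the package name is an assumption; check the official docs for the actual distribution):

```shell
# Package name is an assumption -- verify against the official Parallel docs.
pip install parallel-web duckdb
```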
Basic Usage - Batch Processing
Batch processing is the recommended approach for enriching multiple rows efficiently.

| name | website | ceo_name | founding_year | headquarters_city |
|---|---|---|---|---|
| Google | google.com | Sundar Pichai | 1998 | Mountain View |
| Microsoft | microsoft.com | Satya Nadella | 1975 | Redmond |
| Apple | apple.com | Tim Cook | 1976 | Cupertino |
Function Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| conn | DuckDBPyConnection | required | DuckDB connection |
| source_table | str | required | Table name or SQL query |
| input_columns | dict[str, str] | required | Mapping of input descriptions to column names |
| output_columns | list[str] | required | List of output column descriptions |
| result_table | str \| None | None | Optional permanent table to create |
| api_key | str \| None | None | API key (uses PARALLEL_API_KEY env var if not provided) |
| processor | str | "lite-fast" | Parallel processor to use |
| timeout | int | 600 | Timeout in seconds |
| include_basis | bool | False | Include citations in results |
| progress_callback | Callable | None | Callback for progress updates |
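The optional parameters from the table can be passed as keyword arguments. A sketch (the call itself is hypothetical since the import path is not shown in this doc; the values are illustrative):

```python
# Optional parameters from the table above, as keyword arguments.
kwargs = dict(
    result_table="companies_enriched",  # persist output to a new table
    processor="base-fast",              # default is "lite-fast"
    timeout=900,                        # seconds; default is 600
    include_basis=True,                 # attach citations to results
)
# Hypothetical call:
# enrich_table(conn, "companies", {"company website": "website"},
#              ["CEO name"], **kwargs)
```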
Return Value
The function returns an EnrichmentResult dataclass:
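The actual field definitions were not captured in this doc. A purely illustrative sketch of what such a dataclass could look like (every field name here is an assumption):

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class EnrichmentResult:
    """Illustrative sketch only -- the real class's fields may differ."""
    row_count: int                                     # rows processed (assumed field)
    columns: list[str] = field(default_factory=list)   # generated column names (assumed)
    result_table: str | None = None                    # permanent table, if created (assumed)

res = EnrichmentResult(row_count=3, columns=["ceo_name"])
```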
Column Name Mapping
Output column descriptions are automatically converted to valid SQL identifiers. Field names are converted to snake_case:

| Description | Column Name |
|---|---|
| "CEO name" | ceo_name |
| "Founding year (YYYY)" | founding_year |
| "Annual revenue [USD]" | annual_revenue |
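The library's exact conversion rules aren't spelled out here; this stdlib sketch reproduces the mappings shown in the table above:

```python
import re

def description_to_column(description: str) -> str:
    """Sketch of the snake_case conversion; the library's exact rules may differ."""
    # Drop parenthesized or bracketed qualifiers such as "(YYYY)" or "[USD]".
    cleaned = re.sub(r"[\(\[].*?[\)\]]", "", description)
    # Collapse runs of non-alphanumeric characters into single underscores.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", cleaned).strip("_")
    return cleaned.lower()
```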
SQL Query as Source
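A sketch of the query-as-source pattern (the enrichment call is commented out because its import path is an assumption, not from this doc):

```python
# The source can be any SELECT statement, not just a table name.
source = "SELECT name, website FROM companies WHERE website IS NOT NULL"

# Hypothetical call (import path assumed):
# result = enrich_table(conn, source_table=source,
#                       input_columns={"company website": "website"},
#                       output_columns=["CEO name"])
```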
You can pass a SQL query instead of a table name:

Creating Permanent Tables
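A sketch of persisting results via the result_table parameter (the table name is illustrative and the enrichment call is hypothetical, so it is left commented):

```python
# result_table persists the enriched output as a regular table.
kwargs = {"result_table": "companies_enriched"}

# Hypothetical call (import path assumed):
# enrich_table(conn, "companies", {"company website": "website"},
#              ["CEO name"], **kwargs)
# Afterwards, query it like any other table:
# conn.execute("SELECT * FROM companies_enriched")
```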
Save enriched results to a permanent table:

Progress Tracking
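A sketch of a progress callback; the `(completed, total)` signature is an assumption, so check the library's actual callback contract:

```python
# Collect progress updates as (completed, total) pairs -- signature assumed.
progress_log = []

def on_progress(completed: int, total: int) -> None:
    progress_log.append((completed, total))

# Hypothetical: enrich_table(conn, "companies", ..., progress_callback=on_progress)
on_progress(50, 100)  # simulate one update
```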
Track progress for large enrichment jobs:

SQL UDF Usage
For flexibility in SQL queries, you can register a parallel_enrich() function:
The SQL UDF processes rows individually. For better performance with multiple rows, use batch processing with enrich_table().

Including Citations
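Per the parameter table above, citations are enabled via include_basis; a minimal sketch (the call itself is hypothetical):

```python
# include_basis=True attaches citations ("basis") to each enriched value.
options = {"include_basis": True}
# Hypothetical: enrich_table(conn, "companies", inputs, outputs, **options)
```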
Processor Selection
Choose a processor based on your speed vs. thoroughness requirements. See Choose a Processor for detailed guidance and Pricing for cost information.

| Processor | Speed | Best For |
|---|---|---|
| lite-fast | Fastest | Basic metadata, high volume |
| base-fast | Fast | Standard enrichments |
| core-fast | Medium | Cross-referenced data |
| pro-fast | Slower | Deep research |
Best Practices
Use batch processing for multiple rows
Batch processing is significantly faster (4-5x or more) than the SQL UDF for multiple rows:
Use specific descriptions
Be specific in your output column descriptions for better results:
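For instance (the descriptions below are illustrative, not from this doc), a vague description leaves format and units ambiguous, while a specific one pins them down:

```python
# Vague descriptions leave format and units ambiguous.
vague = ["revenue", "founded"]

# Specific descriptions pin down units and format.
specific = [
    "Annual revenue in USD for the most recent fiscal year",
    "Founding year (YYYY)",
]
```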
Handle errors gracefully
Errors don't stop processing; partial results are returned:
Cost management
- Use lite-fast for high-volume, basic enrichments
- Test with small batches before processing large tables
- Store results in permanent tables to avoid re-enriching