Features
- DataFrame-Native: Enriched columns added directly to your Polars DataFrame
- Batch Processing: All rows processed in a single API call for efficiency
- LazyFrame Support: Works with both eager and lazy DataFrames
- Partial Results: Failed rows return `None` without stopping the entire batch
Installation
Basic Usage
| company | website | ceo_name | founding_year | headquarters_city |
|---|---|---|---|---|
| Google | google.com | Sundar Pichai | 1998 | Mountain View |
| Microsoft | microsoft.com | Satya Nadella | 1975 | Redmond |
| Apple | apple.com | Tim Cook | 1976 | Cupertino |
Function Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| df | pl.DataFrame | required | DataFrame to enrich |
| input_columns | dict[str, str] | required | Mapping of input descriptions to column names |
| output_columns | list[str] | required | List of output column descriptions |
| api_key | str \| None | None | API key (uses PARALLEL_API_KEY env var if not provided) |
| processor | str | "lite-fast" | Parallel processor to use |
| timeout | int | 600 | Timeout in seconds |
| include_basis | bool | False | Include citations in results |
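As a rough sketch of how the parameters in the table fit together, the arguments can be assembled as plain Python objects before the call. The column descriptions are illustrative and chosen to match the Basic Usage table. The eager function name `parallel_enrich` in the commented-out call is an assumption, since only `parallel_enrich_lazy()` is named in this section.

```python
# Input mapping: description -> existing column name (illustrative values).
input_columns = {
    "company name": "company",
    "company website": "website",
}

# Output descriptions; each one becomes a new column in the result.
output_columns = [
    "CEO name",
    "Founding year (YYYY)",
    "Headquarters city",
]

# Keyword arguments using the defaults documented in the table.
enrich_kwargs = {
    "input_columns": input_columns,
    "output_columns": output_columns,
    "api_key": None,          # falls back to the PARALLEL_API_KEY env var
    "processor": "lite-fast",
    "timeout": 600,           # seconds
    "include_basis": False,   # set to True to include citations
}

# Hypothetical call; the function name and its import path are assumptions:
# result = parallel_enrich(df, **enrich_kwargs)
```

Keeping the arguments in one dictionary makes it easy to reuse the same settings across several DataFrames.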
Return Value
The function returns an `EnrichmentResult` dataclass.
Column Name Mapping
Output column descriptions are automatically converted to valid Python identifiers in snake_case:

| Description | Column Name |
|---|---|
| "CEO name" | ceo_name |
| "Founding year (YYYY)" | founding_year |
| "Annual revenue [USD]" | annual_revenue |
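To predict a column name before running an enrichment, the mapping can be approximated locally. The helper below is a minimal sketch that reproduces the three examples in the table. `to_column_name` is a hypothetical name, and the rules it applies (dropping qualifiers in parentheses or brackets, then lowercasing and joining with underscores) are inferred from those examples rather than taken from the library's implementation.

```python
import re


def to_column_name(description: str) -> str:
    """Approximate the description -> column name mapping shown in the table."""
    # Drop qualifiers in parentheses or brackets, e.g. "(YYYY)" or "[USD]".
    text = re.sub(r"\([^)]*\)|\[[^\]]*\]", " ", description)
    # Lowercase, then collapse every run of non-alphanumerics into "_".
    name = re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")
    # A Python identifier cannot be empty or start with a digit.
    if not name or name[0].isdigit():
        name = f"col_{name}".rstrip("_")
    return name


for description in ("CEO name", "Founding year (YYYY)", "Annual revenue [USD]"):
    print(description, "->", to_column_name(description))
```

Descriptions outside the documented examples may be converted differently by the library, so check the returned DataFrame's columns when the exact name matters.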
LazyFrame Support
Use `parallel_enrich_lazy()` to work with LazyFrames.
Including Citations
Set include_basis to True to include citations in the results (it defaults to False).
Processor Selection
Choose a processor based on your speed vs. thoroughness requirements. See Choose a Processor for detailed guidance and Pricing for cost information.

| Processor | Speed | Best For |
|---|---|---|
| lite-fast | Fastest | Basic metadata, high volume |
| base-fast | Fast | Standard enrichments |
| core-fast | Medium | Cross-referenced data |
| pro-fast | Slower | Deep research |
Best Practices
Use specific descriptions
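Be specific in your output column descriptions for better results. For example, "Founding year (YYYY)" states the expected format, whereas a bare "Year" leaves it open.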
Handle errors gracefully
Errors do not stop processing. Rows that fail come back as None, and the rest of the batch is still returned.
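One way to act on partial results is to separate failed rows from enriched ones so that only the failures are retried. The sketch below works on plain dictionaries so it runs without an API key. `split_failed` is a hypothetical helper, the sample rows are made up, and it assumes a failed row has None in every output column.

```python
def split_failed(rows, output_columns):
    """Separate rows whose output columns are all None (failed) from the rest."""
    ok, failed = [], []
    for row in rows:
        if all(row.get(col) is None for col in output_columns):
            failed.append(row)
        else:
            ok.append(row)
    return ok, failed


# Made-up sample: one enriched row and one row where enrichment failed.
rows = [
    {"company": "Apple", "ceo_name": "Tim Cook", "founding_year": "1976"},
    {"company": "Unknown Co", "ceo_name": None, "founding_year": None},
]

ok, failed = split_failed(rows, ["ceo_name", "founding_year"])
print(len(ok), "enriched,", len(failed), "to retry")
```

On a Polars DataFrame the same check can be expressed with a filter such as `df.filter(pl.col("ceo_name").is_null())`.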
Batch large datasets
For very large datasets (1000+ rows), consider processing in batches so that each call stays within the timeout.
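The batching idea can be sketched independently of the enrichment call itself. In the example below, `iter_batches`, `enrich_in_batches`, and `fake_enrich` are hypothetical names. `fake_enrich` stands in for the real enrichment function so the sketch runs offline, and the batch size of 500 is an arbitrary choice.

```python
from typing import Callable, Iterator, Sequence


def iter_batches(rows: Sequence[dict], batch_size: int) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of at most batch_size rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def enrich_in_batches(rows: Sequence[dict], enrich_batch: Callable, batch_size: int = 500) -> list:
    """Run enrich_batch on each slice and concatenate the results in order."""
    enriched = []
    for batch in iter_batches(rows, batch_size):
        enriched.extend(enrich_batch(batch))
    return enriched


def fake_enrich(batch):
    # Stand-in for the real enrichment call; adds an empty output column.
    return [{**row, "ceo_name": None} for row in batch]


rows = [{"company": f"Company {i}"} for i in range(1200)]
result = enrich_in_batches(rows, fake_enrich, batch_size=500)
print(len(result), "rows processed in", len(list(iter_batches(rows, 500))), "batches")
```

With Polars, `DataFrame.iter_slices(n_rows=...)` and `pl.concat` can play the roles of `iter_batches` and the final concatenation.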
Cost management
- Use `lite-fast` for high-volume, basic enrichments
- Test with small batches before processing large DataFrames
- Store results to avoid re-enriching the same data
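The last point, storing results, can be sketched as a small on-disk cache keyed by an input value such as the website. Everything here is illustrative: `enrich_with_cache` and `fake_enrich` are hypothetical names, the JSON file layout is an assumption, and `fake_enrich` replaces the real enrichment call so the example runs offline.

```python
import json
import tempfile
from pathlib import Path


def enrich_with_cache(keys, enrich_missing, cache_path):
    """Enrich only the keys that are not already stored in the JSON cache."""
    path = Path(cache_path)
    cache = json.loads(path.read_text()) if path.exists() else {}
    # Preserve order and drop duplicates among the keys still to be enriched.
    missing = list(dict.fromkeys(k for k in keys if k not in cache))
    if missing:
        cache.update(enrich_missing(missing))
        path.write_text(json.dumps(cache, indent=2))
    return {k: cache.get(k) for k in keys}


calls = []


def fake_enrich(websites):
    # Stand-in for the real enrichment call; records what it was asked for.
    calls.append(list(websites))
    return {w: {"ceo_name": None} for w in websites}


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "enrichment_cache.json"
    enrich_with_cache(["apple.com", "google.com"], fake_enrich, cache_file)
    # Second run: apple.com is served from the cache, only microsoft.com is new.
    enrich_with_cache(["apple.com", "microsoft.com"], fake_enrich, cache_file)

print(calls)
```

A cache keyed by input value pays off mainly when the same companies recur across runs. For one-off jobs, writing the enriched DataFrame to disk once is simpler.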