This integration is ideal for data engineers who need to enrich large datasets with web intelligence directly in their Snowflake pipelines—without leaving SQL or building custom API integrations.
Parallel provides a SQL-native User Defined Table Function (UDTF) for Snowflake that enables data enrichment directly in your SQL queries. The integration uses Snowflake’s External Access feature to securely connect to the Parallel API, and batches all rows in a partition into a single API call for efficient processing.
Features
- SQL-Native: Use `parallel_enrich()` directly in Snowflake SQL queries
- Batched Processing: All rows in a partition are sent in a single API call using `end_partition()`
- Secure: API key stored as a Snowflake Secret, accessed via an External Access Integration
- Configurable Processors: Choose from `lite-fast` to `pro` for speed vs. thoroughness tradeoffs
- Structured Output: Returns VARIANT columns for input and enriched data
Installation
The standalone `parallel-cli` binary does not include deployment commands. To deploy the Snowflake integration, you must install via pip with the `[snowflake]` extra.

Deployment
The Snowflake integration requires a one-time deployment step to set up the External Access Integration, secrets, and UDTF in your Snowflake account.

Prerequisites
- Snowflake Account: A paid account is required (trial accounts don't support External Access)
- ACCOUNTADMIN Role: Required for creating External Access Integrations
- Parallel API Key from Parallel
Finding Your Account Identifier
Your Snowflake account identifier appears in your Snowsight URL.

Deploy with CLI
Deployment creates the following objects:

- Database: `PARALLEL_INTEGRATION`
- Schema: `ENRICHMENT`
- Network rule for `api.parallel.ai`
- Secret with your API key
- External Access Integration
- `parallel_enrich()` UDTF (batched table function)
- Roles: `PARALLEL_DEVELOPER` and `PARALLEL_USER`
For manual deployment options (useful if you don’t have ACCOUNTADMIN), troubleshooting, MFA setup, and cleanup instructions, see the complete Snowflake setup guide.
Basic Usage
The `parallel_enrich()` function is a table function (UDTF) that requires the `TABLE(...) OVER (PARTITION BY ...)` syntax:
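A minimal query might look like the following sketch. The `companies` table, its columns, and the exact description strings are illustrative assumptions, not from the original docs:

```sql
-- Illustrative sketch: enrich a table of companies in a single batch.
SELECT
    e.input:company_name::STRING     AS company_name,
    e.input:website::STRING          AS website,
    e.enriched:ceo_name::STRING      AS ceo_name,
    e.enriched:founding_year::STRING AS founding_year
FROM companies c,
    TABLE(parallel_enrich(
        TO_JSON(OBJECT_CONSTRUCT('company_name', c.company_name, 'website', c.website)),
        ARRAY_CONSTRUCT('ceo_name', 'founding_year')
    ) OVER (PARTITION BY 1)) e;
```

A query along these lines produces results like the table below.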
| company_name | website | ceo_name | founding_year |
|---|---|---|---|
| Google | google.com | Sundar Pichai | 1998 |
| Anthropic | anthropic.com | Dario Amodei | 2021 |
| Apple | apple.com | Tim Cook | 1976 |
Function Parameters
| Parameter | Type | Description |
|---|---|---|
| `input_json` | VARCHAR | JSON string, e.g. via `TO_JSON(OBJECT_CONSTRUCT(...))` |
| `output_columns` | ARRAY | Array of descriptions for the columns you want to enrich |
| `processor` | VARCHAR | (Optional) Processor to use (default: `lite-fast`) |
Return Values
The function returns a table with two VARIANT columns:

| Column | Description |
|---|---|
| `input` | Original input data as VARIANT |
| `enriched` | Enrichment results, including basis citations |
The `enriched` column contains one key per requested output column (e.g. `ceo_name`), along with the basis citations supporting each value.
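Typed fields can be pulled out of the VARIANT columns with Snowflake's path-and-cast syntax. A sketch, assuming results were stored in a hypothetical `enrichment_results` table:

```sql
-- Illustrative: extract typed fields from the VARIANT columns of stored results
SELECT
    input:company_name::STRING AS company_name,
    enriched:ceo_name::STRING  AS ceo_name,
    enriched                   AS full_result  -- complete VARIANT, including basis citations
FROM enrichment_results;
```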
Custom Processor
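As a sketch (the `companies` table and description strings are illustrative), the optional third argument selects the processor:

```sql
-- Illustrative: select a heavier processor via the optional third argument
SELECT e.*
FROM companies c,
    TABLE(parallel_enrich(
        TO_JSON(OBJECT_CONSTRUCT('company_name', c.company_name)),
        ARRAY_CONSTRUCT('ceo_name'),
        'core-fast'  -- overrides the default 'lite-fast'
    ) OVER (PARTITION BY 1)) e;
```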
Override the default processor by passing a processor name as the third argument.

Batching with PARTITION BY
The `PARTITION BY` clause controls how rows are batched into API calls. All rows in the same partition are sent together in a single API request.
All Rows in One Batch
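A sketch using a constant partition key, with an illustrative `companies` table:

```sql
-- Illustrative: a constant partition key sends every row in a single API call
SELECT e.*
FROM companies c,
    TABLE(parallel_enrich(
        TO_JSON(OBJECT_CONSTRUCT('company_name', c.company_name)),
        ARRAY_CONSTRUCT('ceo_name')
    ) OVER (PARTITION BY 1)) e;
```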
Batch by Column
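Partitioning on a real column groups rows naturally; the `industry` column here is an assumption:

```sql
-- Illustrative: one API call per distinct industry value
SELECT e.*
FROM companies c,
    TABLE(parallel_enrich(
        TO_JSON(OBJECT_CONSTRUCT('company_name', c.company_name)),
        ARRAY_CONSTRUCT('ceo_name')
    ) OVER (PARTITION BY c.industry)) e;
```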
Fixed Batch Sizes
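For fixed batch sizes, a synthetic `batch_id` can be derived with a window function; the batch size of 500 and table names are illustrative:

```sql
-- Illustrative: derive a batch_id so each API call carries at most 500 rows
WITH numbered AS (
    SELECT c.*,
           CEIL(ROW_NUMBER() OVER (ORDER BY c.company_name) / 500) AS batch_id
    FROM companies c
)
SELECT e.*
FROM numbered n,
    TABLE(parallel_enrich(
        TO_JSON(OBJECT_CONSTRUCT('company_name', n.company_name)),
        ARRAY_CONSTRUCT('ceo_name')
    ) OVER (PARTITION BY n.batch_id)) e;
```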
Choosing a Partition Strategy
| Pattern | Use Case |
|---|---|
| `PARTITION BY 1` | Small datasets (under 1,000 rows); fastest for few rows |
| `PARTITION BY column` | Large datasets, natural groupings, incremental processing |
| `PARTITION BY batch_id` | Fixed batch sizes for very large datasets |
Processor Selection
Choose a processor based on your speed vs. thoroughness requirements. See Choose a Processor for detailed guidance and Pricing for cost information.

| Processor | Speed | Best For |
|---|---|---|
| `lite-fast` | Fastest | Basic metadata, high volume |
| `base-fast` | Fast | Standard enrichments |
| `core-fast` | Medium | Cross-referenced data |
| `pro-fast` | Slower | Deep research |
Best Practices
Use PARTITION BY 1 for small datasets
For smaller datasets, batch all rows together for maximum efficiency:
Use specific descriptions
Be specific in your output column descriptions for better results:
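A sketch of the contrast, with illustrative table and description strings:

```sql
-- Illustrative: descriptive output columns give the enrichment more to work with
SELECT e.*
FROM companies c,
    TABLE(parallel_enrich(
        TO_JSON(OBJECT_CONSTRUCT('company_name', c.company_name)),
        ARRAY_CONSTRUCT(
            'Full name of the current CEO',                      -- rather than just 'ceo'
            'Year the company was founded, as a 4-digit number'  -- rather than just 'founded'
        )
    ) OVER (PARTITION BY 1)) e;
```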
Cache results
Store enriched results in a table to avoid re-processing:
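One way to sketch this, with an assumed `company_enrichment` cache table:

```sql
-- Illustrative: persist results once, then query the table instead of re-calling the API
CREATE TABLE IF NOT EXISTS company_enrichment AS
SELECT e.input, e.enriched, CURRENT_TIMESTAMP() AS enriched_at
FROM companies c,
    TABLE(parallel_enrich(
        TO_JSON(OBJECT_CONSTRUCT('company_name', c.company_name)),
        ARRAY_CONSTRUCT('ceo_name')
    ) OVER (PARTITION BY 1)) e;
```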
Incremental processing
Process new records daily using date partitioning:
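A sketch that restricts the input before the UDTF runs, so only new rows are sent to the API; the `created_date` column is an assumption:

```sql
-- Illustrative: enrich only rows added today
WITH new_rows AS (
    SELECT * FROM companies WHERE created_date = CURRENT_DATE()
)
SELECT e.input, e.enriched
FROM new_rows n,
    TABLE(parallel_enrich(
        TO_JSON(OBJECT_CONSTRUCT('company_name', n.company_name)),
        ARRAY_CONSTRUCT('ceo_name')
    ) OVER (PARTITION BY 1)) e;
```

Filtering inside the CTE (rather than in an outer WHERE) ensures rows are excluded before the table function sees them.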
Security
The integration uses Snowflake's security features:

- Network Rule: Only allows egress to `api.parallel.ai:443`
- Secret: API key stored encrypted (not visible in SQL)
- External Access Integration: Combines the network rule and secret
- Roles: `PARALLEL_USER` for query access, `PARALLEL_DEVELOPER` for UDF management