Documentation Index
Fetch the complete documentation index at https://docs.parallel.ai/llms.txt to discover all available pages before exploring further. The full text of all docs is at https://docs.parallel.ai/llms-full.txt, and any page can be fetched as Markdown by appending .md to its URL or sending Accept: text/markdown.
Parallel provides SQL-native remote functions for Google BigQuery that enable data enrichment directly in your SQL queries. The integration uses Cloud Functions to securely connect BigQuery to the Parallel API.
This integration is ideal for data engineers who need to enrich large datasets with web intelligence directly in their BigQuery pipelines, without leaving SQL or building custom API integrations.
Features
- SQL-Native: Use `parallel_enrich()` directly in BigQuery SQL queries
- Secure: API key stored in Secret Manager, accessed via Cloud Functions
- Configurable Processors: Choose from `lite-fast` to `ultra` for speed vs. thoroughness tradeoffs
- Structured Output: Returns JSON that can be parsed with BigQuery's `JSON_EXTRACT_SCALAR()`
Installation
The standalone `parallel-cli` binary does not include deployment commands. You must install via pip to deploy the BigQuery integration.
Deployment
Unlike Spark, the BigQuery integration requires a one-time deployment step to set up Cloud Functions and remote function definitions in your GCP project.
Prerequisites
- Google Cloud Project with billing enabled
- Parallel API Key from Parallel
- Google Cloud SDK installed and authenticated
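The exact setup commands are not shown in this guide; a typical authentication sequence with the standard `gcloud` CLI looks like the following (the project ID is a placeholder):

```shell
# Authenticate your user account with Google Cloud.
gcloud auth login

# Point the SDK at the project that will host the integration.
gcloud config set project YOUR_PROJECT_ID

# Provide application-default credentials for tools that need them.
gcloud auth application-default login
```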
Deploy with CLI
Deployment provisions the following resources in your project:
- Secret in Secret Manager for your API key
- Cloud Function (Gen2) that handles enrichment requests
- BigQuery Connection for remote function calls
- BigQuery Dataset (`parallel_functions`)
- Remote functions: `parallel_enrich()` and `parallel_enrich_company()`
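To confirm the remote functions were created, you can list the routines in the new dataset with a standard BigQuery `INFORMATION_SCHEMA` query (the dataset name comes from the deployment above):

```sql
SELECT routine_name, routine_type
FROM parallel_functions.INFORMATION_SCHEMA.ROUTINES;
```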
For manual deployment options, troubleshooting, and cleanup instructions, see the complete BigQuery setup guide.
Basic Usage
Once deployed, use `parallel_enrich()` in any BigQuery SQL query:
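The original example query is not shown here; a sketch of a typical call, with hypothetical dataset and table names, following the call shape in the parameter table below (`input_data` as a JSON object, `output_columns` as a JSON array):

```sql
-- my_dataset.companies and its columns are placeholders.
SELECT
  company_name,
  parallel_functions.parallel_enrich(
    TO_JSON(STRUCT(company_name AS company, website AS url)),
    JSON '["CEO name", "Year founded", "Employee count"]'
  ) AS enrichment
FROM my_dataset.companies
LIMIT 10;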
Function Parameters
| Parameter | Type | Description |
|---|---|---|
| `input_data` | JSON | JSON object with key-value pairs of input data for enrichment |
| `output_columns` | JSON | JSON array of descriptions for the columns you want to enrich |
Parsing Results
The function returns JSON strings. Field names are converted to snake_case (e.g., "CEO name" → `ceo_name`).
Use `JSON_EXTRACT_SCALAR()` to extract individual fields:
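A sketch of field extraction, assuming the enrichment results were stored in a hypothetical `my_dataset.enriched_companies` table with the JSON string in an `enrichment` column:

```sql
SELECT
  company_name,
  JSON_EXTRACT_SCALAR(enrichment, '$.ceo_name') AS ceo_name,
  JSON_EXTRACT_SCALAR(enrichment, '$.year_founded') AS year_founded
FROM my_dataset.enriched_companies;
```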
Company Convenience Function
For common company enrichment use cases:
Processor Selection
Choose a processor based on your speed vs. thoroughness requirements. See Choose a Processor for detailed guidance and Pricing for cost information. To use a different processor, create a custom remote function with the desired processor in the `user_defined_context`:
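A sketch of such a definition using BigQuery's standard remote-function syntax; the function name, connection name, and endpoint URL are placeholders, and the `"processor"` context key is an assumption about how the Cloud Function reads its configuration:

```sql
CREATE OR REPLACE FUNCTION parallel_functions.parallel_enrich_ultra(
  input_data JSON,
  output_columns JSON
)
RETURNS STRING
REMOTE WITH CONNECTION `my-project.us.my-parallel-connection`
OPTIONS (
  -- Endpoint of the Cloud Function created during deployment (placeholder URL).
  endpoint = 'https://REGION-PROJECT.cloudfunctions.net/parallel-enrich',
  -- Passed to the function on every call; selects the processor.
  user_defined_context = [("processor", "ultra")]
);
```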
Best Practices
Batch sizing
Process data in batches to manage costs and avoid timeouts:
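One way to batch, sketched with hypothetical table names: assign row-number buckets of 100 rows, then enrich one bucket per run.

```sql
-- Assign each row to a batch of 100.
CREATE OR REPLACE TABLE my_dataset.companies_batched AS
SELECT
  *,
  DIV(ROW_NUMBER() OVER (ORDER BY company_name) - 1, 100) AS batch_id
FROM my_dataset.companies;

-- Enrich one batch at a time; increment batch_id on each run.
SELECT
  company_name,
  parallel_functions.parallel_enrich(
    TO_JSON(STRUCT(company_name AS company)),
    JSON '["CEO name"]'
  ) AS enrichment
FROM my_dataset.companies_batched
WHERE batch_id = 0;
```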
Error handling
Failed enrichments return JSON with an `error` field. Filter these in your downstream processing.
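For example, assuming results were stored in a hypothetical `my_dataset.enriched_companies` table, successful rows can be kept by checking for the absence of the `error` field:

```sql
SELECT *
FROM my_dataset.enriched_companies
-- Rows where the error field is absent are successful enrichments.
WHERE JSON_EXTRACT_SCALAR(enrichment, '$.error') IS NULL;
```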
Cost management
- Use `lite-fast` for high-volume, basic enrichments
- Test with small batches before processing large tables
- Store results to avoid re-enriching the same data
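One way to store results, sketched with hypothetical names: materialize enrichments into a cache table once, then join against it rather than calling the function again.

```sql
-- Build the cache once; later queries join on company_name instead
-- of re-invoking parallel_enrich() for rows already processed.
CREATE TABLE IF NOT EXISTS my_dataset.enrichment_cache AS
SELECT
  company_name,
  parallel_functions.parallel_enrich(
    TO_JSON(STRUCT(company_name AS company)),
    JSON '["CEO name"]'
  ) AS enrichment
FROM my_dataset.companies;
```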