The Parallel Search API is available in Google Vertex AI as an external grounding provider. Use it to ground Gemini model responses with up-to-date context from the public web. There are two ways to get started:
               | Google Cloud Marketplace                 | Bring Your Own Key (BYOK)
Setup          | Subscribe via Google Cloud Marketplace   | Get an API key from Parallel Platform
Authentication | Automatic (no API key needed)            | API key passed in each request
Billing        | Consolidated through Google Cloud        | Billed through Parallel
Quota          | 200 prompts per minute                   | 200 prompts per minute
For more detail, see Google’s official documentation.

Use cases

  • Using web data for information completion or enrichment.
  • Multi-hop agents that require deeper web searches for complex questions.
  • Building APIs that integrate web search data.
  • Employee-facing assistants for up-to-date analysis and reporting.
  • Consumer apps (retail, travel) supporting informed purchase decisions.
  • Automated agents (e.g., news analysis, KYC checks).
  • Vertical agents (sales, coding, finance) fetching the latest context from the web.

Example

Prompt: Who won the 2025 Las Vegas F1 Grand Prix?

Without Grounding: The 2025 Las Vegas Grand Prix has not happened yet. The race is scheduled to take place on the weekend of November 20-22, 2025. Therefore, the winner is currently unknown.

With Grounding: The winner of the 2025 Las Vegas F1 Grand Prix was Max Verstappen of Red Bull Racing. The race took place on November 22, 2025. Sources: domain1.com, domain2.com, …

Supported models

The following models support Grounding with Parallel web search:
  • Gemini 3 Flash
  • Gemini 3 Pro Image
  • Gemini 2.5 Pro
  • Gemini 2.5 Flash
  • Gemini 2.5 Flash-Lite
  • Gemini 2.5 Flash with Live API native audio
  • Gemini 2.0 Flash with Live API
  • Gemini 2.0 Flash

Setup

Vertex AI Studio

You can use Parallel as a grounding source directly in the Vertex AI Studio UI, with no code required. This requires an active Google Cloud Marketplace subscription.
  1. Open Vertex AI Studio in the Google Cloud Console.
  2. Select a supported Gemini model.
  3. In the grounding configuration, select Parallel Web Search as the grounding source.
  4. Enter your prompt and send — the model response will be grounded with web results from Parallel.
Vertex AI Studio is a great way to experiment with grounded responses before integrating via the API.

Make a grounded request

Use the Vertex AI REST API to request grounded responses from Gemini:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent
  • PROJECT_ID: Your Google Cloud project ID.
  • LOCATION: The region that processes the request (e.g., us-central1). To use the global endpoint, omit the LOCATION- prefix from the endpoint hostname.
  • MODEL_ID: The Gemini model to use (e.g., gemini-2.5-flash).
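As a rough illustration of how the three placeholders combine, the sketch below assembles the endpoint URL in Python. The helper name build_endpoint is hypothetical, and the handling of the global endpoint (no region prefix on the hostname, with "global" as the location path segment) is an assumption that should be checked against Google's documentation.

```python
def build_endpoint(project_id: str, model_id: str, location: str = "global") -> str:
    """Return the generateContent URL for a regional or global endpoint."""
    # Assumption: the global endpoint drops the "LOCATION-" hostname prefix
    # and uses "global" as the location segment of the resource path.
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}:generateContent"
    )


if __name__ == "__main__":
    # Hypothetical project ID, for illustration only.
    print(build_endpoint("my-project", "gemini-2.5-flash", "us-central1"))
```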
No api_key field is needed when using the Marketplace subscription:
{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "MODEL_PROMPT_TEXT"
    }]
  }],
  "tools": [{
    "parallelAiSearch": {
      "customConfigs": {
        "source_policy": {
          "exclude_domains": ["EXCLUDE_DOMAINS"],
          "include_domains": ["INCLUDE_DOMAINS"]
        },
        "excerpts": {
          "max_chars_per_result": MAX_CHARS_PER_RESULT,
          "max_chars_total": MAX_CHARS_TOTAL
        },
        "max_results": MAX_RESULTS
      }
    }
  }],
  "model": "projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID"
}
Execute the request:
curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent"
If both a Marketplace subscription and an API key are present in a request, the API key takes precedence.
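The same request can be sketched in Python using only the standard library. The helper names build_request_body and send are hypothetical. The placement of the api_key field next to customConfigs inside parallelAiSearch is an assumption for the BYOK case, as is sending an empty parallelAiSearch object when no custom configuration is supplied.

```python
import json
import subprocess
import urllib.request


def build_request_body(prompt, custom_configs=None, api_key=None):
    """Build a generateContent body that enables Parallel grounding."""
    tool = {}
    if custom_configs:
        tool["customConfigs"] = custom_configs
    if api_key:
        # Assumption: for BYOK, api_key sits beside customConfigs.
        # Leave it out when relying on the Marketplace subscription.
        tool["api_key"] = api_key
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "tools": [{"parallelAiSearch": tool}],
    }


def send(url, body):
    """POST the body with a gcloud access token, mirroring the curl command."""
    token = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    body = build_request_body(
        "Who won the 2025 Las Vegas F1 Grand Prix?",
        custom_configs={"max_results": 5},
    )
    print(json.dumps(body, indent=2))
```

Keeping body construction separate from the network call makes it possible to inspect or log the payload before sending it.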

Configuration options

All customConfigs fields are optional. For best performance, use defaults unless you have specific requirements.
Parameter                     | Default | Range           | Description
max_results                   | 10      | 1–20            | Number of search results used for grounding
excerpts.max_chars_per_result | 30,000  | 1,000–100,000   | Maximum characters per excerpt
excerpts.max_chars_total      | 100,000 | 1,000–1,000,000 | Maximum total excerpt characters
source_policy.include_domains |         | Up to 10        | Only return results from these domains
source_policy.exclude_domains |         | Up to 10        | Exclude results from these domains
For guidance on search queries and configuration, see Search API Best Practices.
For a complete working example, see the Vertex AI demo in the Parallel Cookbook.
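The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch, assuming the limits listed above. The function validate_custom_configs is hypothetical and not part of any Parallel or Google SDK, and the service may apply its own validation differently.

```python
# Documented (min, max) ranges for the numeric customConfigs fields.
LIMITS = {
    "max_results": (1, 20),
    "max_chars_per_result": (1_000, 100_000),
    "max_chars_total": (1_000, 1_000_000),
}
MAX_DOMAINS = 10  # documented cap for include_domains and exclude_domains


def validate_custom_configs(cfg: dict) -> list:
    """Return a list of problems found in cfg; an empty list means it looks valid."""
    errors = []

    def check_range(name, value):
        low, high = LIMITS[name]
        if not isinstance(value, int) or not low <= value <= high:
            errors.append(f"{name} must be an integer in [{low}, {high}]")

    if "max_results" in cfg:
        check_range("max_results", cfg["max_results"])

    excerpts = cfg.get("excerpts", {})
    for key in ("max_chars_per_result", "max_chars_total"):
        if key in excerpts:
            check_range(key, excerpts[key])

    policy = cfg.get("source_policy", {})
    for key in ("include_domains", "exclude_domains"):
        if len(policy.get(key, [])) > MAX_DOMAINS:
            errors.append(f"{key} accepts at most {MAX_DOMAINS} domains")

    return errors


if __name__ == "__main__":
    print(validate_custom_configs({"max_results": 25}))
```

Because every field is optional, an empty dictionary passes validation and leaves all defaults in effect.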

Quota

The default quota is 200 prompts per minute. If you need higher rate limits, contact your Google account team (Marketplace) or support@parallel.ai (BYOK) with your use case and requirements.
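One way to stay under the default quota is to throttle on the client. The sketch below uses a sliding 60-second window. The class MinuteRateLimiter is hypothetical, and how the service actually measures the per-minute quota is not stated here, so treat the window model as an assumption.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Allow at most max_calls acquisitions in any sliding time window."""

    def __init__(self, max_calls=200, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.window:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.window - (now - self._stamps[0]))


if __name__ == "__main__":
    limiter = MinuteRateLimiter()
    limiter.acquire()  # call this before each grounded request
    print("request allowed")
```

The limiter is not thread-safe as written; a multi-threaded client would need to guard acquire with a lock.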

Billing

Using Gemini with Parallel incurs charges from both Gemini token consumption and use of Parallel’s Search API.
  • Google Cloud Marketplace: Search API charges are consolidated into your Google Cloud billing.
  • Bring Your Own Key: Search API charges are billed through Parallel’s pricing.