Give locally-hosted models running under Ollama real-time web search by registering Parallel Search as a tool. This guide uses Ollama’s native Python SDK, which derives the tool schema directly from your function signature and docstring.
Overview
Modern Ollama models (Qwen 3, Gemma 3n, Llama 3.1 and later) support native tool calling: you pass tools=[...] on a chat call, the model emits structured tool_calls, your code executes them, and you feed the results back in a follow-up turn (the Process Tool Calls section below walks through this). By registering Parallel Search as a tool, your local model can:
- Search the web for current information
- Access real-time news, research, and facts
- Cite sources with URLs in responses
This guide uses the native ollama Python SDK. If your application already speaks the OpenAI Chat Completions API (including TypeScript apps using openai), point your existing client at http://localhost:11434/v1 and follow the OpenAI Tool Calling guide unchanged.
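For that compatibility path, the only client-side change is the base URL. A minimal sketch (the api_key value is a required placeholder; the local daemon does not check it):

```python
from openai import OpenAI

# Point an existing OpenAI client at Ollama's OpenAI-compatible endpoint.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
```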
Prerequisites
- Install Ollama and start the daemon (ollama serve)
- Pull a tool-capable model (Qwen 3 currently has the most reliable tool calls)
- Get your Parallel API key from Platform
- Install the Python SDKs
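A quick sanity check for the first two items from Python (a sketch; the model name is only an example):

```python
import ollama

# Lists models known to the local daemon; this raises a connection error
# if `ollama serve` is not running.
models = [m.model for m in ollama.list().models]
print(models)  # expect a tool-capable model, e.g. a qwen3 tag
```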
Define the Search Tool
The Ollama Python SDK accepts plain Python functions as tools. It reads the parameter type hints and the Google-style docstring to build the JSON schema automatically; no separate schema object is required. See Search Tool Definition for the recommended objective + queries shape.
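One way to write that function is to call the Search API over plain HTTP with requests. This is a sketch: the v1beta endpoint, x-api-key header, and processor / max_results fields reflect the Search API beta, so confirm them against the Search Tool Definition page; reading the key from a PARALLEL_API_KEY environment variable is an assumption of this example.

```python
import json
import os

import requests

PARALLEL_API_KEY = os.environ["PARALLEL_API_KEY"]  # assumed env var


def parallel_search(objective: str, search_queries: list[str]) -> str:
    """Search the web with Parallel Search and return results as JSON.

    Args:
        objective: A concise, self-contained search query. Must include the
            key entity or topic being searched for.
        search_queries: Exactly 3 keyword search queries, each 3-6 words.
            Must be diverse: vary entity names, synonyms, and angles.

    Returns:
        A JSON string of results, each with title, url, and excerpts.
    """
    response = requests.post(
        "https://api.parallel.ai/v1beta/search",
        headers={"x-api-key": PARALLEL_API_KEY},
        json={
            "objective": objective,
            "search_queries": search_queries,
            "processor": "base",
            "max_results": 5,
        },
        timeout=60,
    )
    response.raise_for_status()
    results = response.json().get("results", [])
    return json.dumps(
        [
            {"title": r.get("title"), "url": r.get("url"), "excerpts": r.get("excerpts")}
            for r in results
        ]
    )
```

Returning a JSON string rather than a dict keeps the tool result trivially serializable into the role: "tool" message content.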
Process Tool Calls
Pass the function directly into tools, then dispatch any returned calls and append the results as role: "tool" messages:
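A minimal sketch of one round trip, assuming the parallel_search function above and a pulled qwen3 model:

```python
import ollama

messages = [{"role": "user", "content": "What changed in the latest Ollama release?"}]

# The SDK derives the tool schema from the function's signature and docstring.
response = ollama.chat(model="qwen3", messages=messages, tools=[parallel_search])

# Keep the assistant turn, including its tool_calls, in the history.
messages.append(response.message)

for call in response.message.tool_calls or []:
    if call.function.name == "parallel_search":
        # Arguments arrive as a parsed dict, not a JSON-encoded string.
        result = parallel_search(**call.function.arguments)
        messages.append({"role": "tool", "tool_name": call.function.name, "content": result})

# Follow-up turn: the model now answers using the search results.
final = ollama.chat(model="qwen3", messages=messages)
print(final.message.content)
```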
Complete Example
End-to-end: a chat loop that lets the model decide when to search.
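One possible shape for that loop, reusing parallel_search from above; the max_rounds cap is an assumption added here to stop a model that keeps searching:

```python
import ollama

MODEL = "qwen3"  # any tool-capable model you have pulled


def chat(prompt: str, max_rounds: int = 3) -> str:
    """Answer a prompt, letting the model call parallel_search as needed."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_rounds):
        response = ollama.chat(model=MODEL, messages=messages, tools=[parallel_search])
        messages.append(response.message)
        if not response.message.tool_calls:
            return response.message.content  # no tool call: final answer
        for call in response.message.tool_calls:
            result = parallel_search(**call.function.arguments)
            messages.append(
                {"role": "tool", "tool_name": call.function.name, "content": result}
            )
    # Out of rounds: ask for a final answer without offering the tool again.
    return ollama.chat(model=MODEL, messages=messages).message.content


print(chat("What are this week's top AI research headlines? Cite sources."))
```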
Tool Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| objective | string | Yes | A concise, self-contained search query. Must include the key entity or topic being searched for. |
| search_queries | string[] | Yes | Exactly 3 keyword search queries, each 3-6 words. Must be diverse: vary entity names, synonyms, and angles. |
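A well-formed call satisfying both constraints might look like this (hypothetical values):

```python
parallel_search(
    objective="Find the most recent funding round raised by Anthropic",
    search_queries=[
        "Anthropic latest funding round",   # the entity by name
        "Anthropic valuation 2025 raise",   # a different angle
        "Anthropic investors series news",  # synonyms, same topic
    ],
)
```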
Choosing a Model
Tool calling reliability varies sharply by model. From most to least dependable for this workflow:

| Model | Notes |
|---|---|
| qwen3:0.6b – qwen3:32b | Native tool calling across all sizes; the 0.6b runs on a laptop CPU. Recommended starting point. |
| qwen2.5:7b and up | Stable tool calling, slightly older generation. |
| gemma3n:e4b | Native function calling; good quality if already pulled. |
| llama3.1:8b | Works but more prone to malformed arguments at smaller sizes. |
Differences from the OpenAI Client
If you’re porting code from the OpenAI Tool Calling guide, three things change:

| | OpenAI client | Ollama native SDK |
|---|---|---|
| Tool definition | Manual JSON schema object | Pass the Python function directly |
| tool_calls arguments | JSON-encoded string (use json.loads) | Parsed dict |
| Tool result message field | tool_call_id | tool_name |
Pointing an OpenAI-compatible client at http://localhost:11434/v1 follows the OpenAI conventions instead, which is useful if you want to keep one code path across providers. Note that tool_choice is not supported on that endpoint.