Overview
Scheduled jobs allow you to run the same FindAll query on a regular basis to discover newly emerging entities and track changes to existing ones. This is ideal for ongoing monitoring use cases like market intelligence, lead generation, or competitive tracking. Rather than manually re-running queries, you can programmatically create new FindAll runs using a previous run’s schema, while excluding candidates you’ve already discovered.
Use Cases
Scheduled FindAll jobs are particularly useful for:
- Market monitoring: Track new companies entering a market space over time
- Lead generation: Continuously discover new potential customers matching your criteria
- Competitive intelligence: Monitor emerging competitors and new funding announcements
- Investment research: Track new companies meeting specific investment criteria
- Regulatory compliance: Discover new entities that may require compliance review
How It Works
Creating a scheduled FindAll job involves two steps:
- Retrieve the schema from a previous successful run
- Create a new run using that schema, with an exclude list of previously discovered candidates
This approach gives you:
- Consistent criteria: Use the exact same evaluation logic across runs
- No duplicates: Automatically exclude candidates from previous runs
- Cost efficiency: Only pay to evaluate net new candidates
Step 1: Retrieve the Schema
Get the schema from a completed FindAll run to reuse its entity_type, match_conditions, and enrichments:
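The request might look like the following sketch; the base URL, the /schema route, and the Bearer-token header are placeholders, so substitute the values from the API Reference:

```bash
# Placeholder endpoint path and auth header; see the API Reference for the real values.
# Saves the schema locally so it can be reused when creating the next run.
curl -s "https://api.example.com/v1/findall/runs/$PREVIOUS_RUN_ID/schema" \
  -H "Authorization: Bearer $API_KEY" \
  -o schema.json
```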
Step 2: Create a New Run with exclude_list
Use the retrieved schema to create a new FindAll run, adding an exclude_list parameter to skip candidates you’ve already discovered:
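A sketch of the run-creation request, assuming the body accepts the schema fields saved in Step 1 directly alongside exclude_list; the endpoint path, auth header, and the example companies are placeholders:

```bash
# Placeholder endpoint and auth header; schema.json is the file saved in Step 1.
# The exclude_list entries below are fictional examples.
jq '. + {exclude_list: [
      {"name": "Acme AI", "url": "https://acme-ai.example.com"},
      {"name": "Globex Robotics", "url": "https://globex-robotics.example.com"}
    ]}' schema.json \
  | curl -s -X POST "https://api.example.com/v1/findall/runs" \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d @-
```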
Exclude List Parameters
The exclude_list is an array of candidate objects to exclude. Each object contains:
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Name of the candidate to exclude |
| url | string | Yes | URL of the candidate to exclude |
- Candidates matching any entry in the exclude_list will be skipped during generation
- This prevents re-evaluating entities you’ve already processed
- Exclusions are matched by URL, so ensure URLs are normalized consistently across runs
Building Your Exclude List
To construct the exclude_list from previous runs, retrieve the matched candidates and extract their name and url fields:
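The retrieval request might look like this sketch; the results route and auth header are placeholders, so check the API Reference for the exact path:

```bash
# Placeholder endpoint path and auth header; saves the run's results locally.
curl -s "https://api.example.com/v1/findall/runs/$RUN_ID/results" \
  -H "Authorization: Bearer $API_KEY" \
  -o results.json
```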
The results include the name and url fields for each matched candidate. Collect these objects and pass them as the exclude_list array in subsequent runs; one way to do this is sketched below.
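This snippet assumes the matched candidates are exposed under a matches array with name and url fields; adjust the jq path to the actual response shape:

```bash
# Assumes a `matches` array with `name` and `url` fields on each candidate.
jq '[.matches[] | {name, url}]' results.json > exclude_list.json
```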
Example: Weekly Scheduled Job
Here’s a complete example showing how to set up a weekly FindAll job:
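The script below is a sketch of a job that could be run on a weekly schedule (for example via cron). The endpoint paths, auth header, and the matches field are assumptions carried over from the earlier snippets, and the run-creation body is assumed to accept the schema fields alongside exclude_list:

```bash
#!/usr/bin/env bash
# weekly_findall.sh: sketch of a weekly scheduled FindAll job.
# Assumptions: placeholder endpoint paths and auth header, a `matches` array in
# the results payload, and a run-creation body that accepts the schema fields
# alongside exclude_list. Adjust all of these to match the API Reference.
set -euo pipefail

API_KEY="${API_KEY:?set API_KEY}"
BASE_URL="https://api.example.com/v1/findall"   # placeholder base URL
PREVIOUS_RUN_ID="$1"                            # run whose schema and results we reuse
STATE_FILE="exclude_list.json"                  # candidates discovered in past runs

# 1. Retrieve the schema from the previous successful run.
curl -s "$BASE_URL/runs/$PREVIOUS_RUN_ID/schema" \
  -H "Authorization: Bearer $API_KEY" -o schema.json

# 2. Add the previous run's matched candidates to the persisted exclude list.
curl -s "$BASE_URL/runs/$PREVIOUS_RUN_ID/results" \
  -H "Authorization: Bearer $API_KEY" \
  | jq '[.matches[] | {name, url}]' > new_candidates.json
if [ -f "$STATE_FILE" ]; then
  jq -s 'add | unique_by(.url)' "$STATE_FILE" new_candidates.json > merged.json
  mv merged.json "$STATE_FILE" && rm -f new_candidates.json
else
  mv new_candidates.json "$STATE_FILE"
fi

# 3. Create a new run from the schema, excluding everything discovered so far.
jq --slurpfile excl "$STATE_FILE" '. + {exclude_list: $excl[0]}' schema.json \
  | curl -s -X POST "$BASE_URL/runs" \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d @-
```

In practice you would also persist the id of each newly created run so that the next invocation can pass it as PREVIOUS_RUN_ID.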
Best Practices
Schema Modifications
While you should keep match_conditions consistent across runs, you can adjust:
- objective: Update to reflect the current time period (e.g., “founded in 2024” → “founded in 2025”; see the example below)
- enrichments: Add new enrichment fields without affecting matching logic
- match_limit: Adjust based on expected growth rate
- generator: Change generators if needed (though this may affect result quality)
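For instance, the saved schema’s objective could be rewritten in place with jq before creating the next run; schema.json is the file from Step 1 and the year strings are illustrative:

```bash
# Rewrite the objective for the new period while leaving match_conditions untouched.
jq '.objective |= sub("founded in 2024"; "founded in 2025")' schema.json > schema_2025.json
```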
Exclude List Management
- Persist candidates: Store discovered candidate objects (name and URL) in a database or file for long-term tracking
- Normalize URLs: Ensure consistent URL formatting (trailing slashes, protocols, etc.) across runs; a simple pass is sketched after this list
- Periodic resets: Consider occasionally running without exclusions to catch entities that may have changed
- Monitor list size: Very large exclude lists (>10,000 candidates) may impact performance
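As one example of the normalization point above, a pass like this could be applied to the stored list before each run; the specific rules are illustrative:

```bash
# Strip trailing slashes and force https so URLs compare consistently across runs.
jq 'map(.url |= (sub("/+$"; "") | sub("^http://"; "https://")))' exclude_list.json > exclude_list.normalized.json
```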
Scheduling
- Frequency: Choose intervals based on your domain’s update rate (daily, weekly, monthly); an example cron entry follows this list
- Off-peak hours: Schedule jobs during low-traffic periods if possible
- Webhooks: Use webhooks to get notified when jobs complete
- Error handling: Implement retry logic for failed runs
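For example, a crontab entry like the following would run the weekly script from the example above every Monday at 06:00; the paths and the last_run_id file are placeholders:

```bash
# Placeholder paths; last_run_id would hold the id of the most recent run.
0 6 * * 1 /opt/findall/weekly_findall.sh "$(cat /opt/findall/last_run_id)" >> /var/log/findall.log 2>&1
```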
Cost Optimization
- Start small: Use lower match_limit values initially, then extend if needed
- Preview first: Test schema changes with preview before running full jobs
- Monitor metrics: Track generated_candidates_count vs matched_candidates_count to optimize criteria (see the sketch below)
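A quick way to check those counts, assuming the run status payload has been saved to run.json (a hypothetical filename) and exposes the two fields by these names:

```bash
# Compare how many candidates were generated vs. how many actually matched.
jq '{generated: .generated_candidates_count, matched: .matched_candidates_count}' run.json
```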
Related Topics
- Preview: Test queries with ~10 candidates before running full searches
- Generators and Pricing: Understand generator options and pricing
- Enrichments: Extract additional structured data for matched candidates
- Extend Runs: Increase match limits without paying new fixed costs
- Webhooks: Configure HTTP callbacks for run completion and matches
- Streaming Events: Receive real-time updates via Server-Sent Events
- Run Lifecycle: Understand run statuses and how to cancel runs
- API Reference: Complete endpoint documentation