Advanced Usage
Learn advanced features and optimization techniques for Scrapezy
This guide covers cache control, schema-driven extraction, and prompting techniques for getting the most out of Scrapezy's API.
Caching and Performance
Using the Cache
By default, Scrapezy caches extraction results to improve performance and reduce costs. To force a fresh extraction, set bypassCache to true in the request's options object:
POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key

{
  "url": "https://example.com/products",
  "prompt": "Extract all product information",
  "options": {
    "bypassCache": true
  }
}
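The same request can be assembled from a script. The following is a minimal Python sketch using only the standard library; the helper name build_extract_request is hypothetical, and the sketch builds the request without sending it so the payload can be inspected first.

```python
import json
import urllib.request

API_URL = "https://scrapezy.com/api/extract"  # endpoint from the request above


def build_extract_request(api_key, url, prompt, bypass_cache=False):
    """Build (but do not send) an extraction request. Hypothetical helper."""
    payload = {"url": url, "prompt": prompt}
    if bypass_cache:
        # Only add the options object when the cache should be skipped.
        payload["options"] = {"bypassCache": True}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = build_extract_request(
    "your_api_key",
    "https://example.com/products",
    "Extract all product information",
    bypass_cache=True,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```

Keeping bypass_cache off by default means callers only pay for a fresh extraction when they ask for one explicitly.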
Schema-Driven Data Extraction
Defining Data Schemas
For consistent and validated data extraction, define schemas that specify the expected structure:
POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key

{
  "url": "https://example.com/product",
  "prompt": "Extract product information according to the schema",
  "schema": {
    "name": "E-commerce Product Schema",
    "fields": [
      {
        "name": "productName",
        "type": "string",
        "required": true,
        "description": "Full product name"
      },
      {
        "name": "price",
        "type": "object",
        "required": true,
        "description": "Price information with currency",
        "fields": [
          {
            "name": "amount",
            "type": "number",
            "required": true,
            "description": "Price amount"
          },
          {
            "name": "currency",
            "type": "string",
            "required": true,
            "description": "Currency code (e.g., USD, EUR)"
          }
        ]
      },
      {
        "name": "specifications",
        "type": "array",
        "required": false,
        "description": "Array of product specifications"
      },
      {
        "name": "availability",
        "type": "object",
        "required": false,
        "description": "Stock and shipping information"
      }
    ]
  }
}
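Writing every field entry by hand gets repetitive as schemas grow. As a rough sketch, a small Python helper can generate entries in the shape shown above; the helper name field is hypothetical and not part of any Scrapezy client.

```python
def field(name, type_, description, required=False, fields=None):
    """Return one schema field entry in the shape used above. Hypothetical helper."""
    entry = {
        "name": name,
        "type": type_,
        "required": required,
        "description": description,
    }
    if fields is not None:
        entry["fields"] = fields  # nested entries, used above for "object" fields
    return entry


product_schema = {
    "name": "E-commerce Product Schema",
    "fields": [
        field("productName", "string", "Full product name", required=True),
        field(
            "price",
            "object",
            "Price information with currency",
            required=True,
            fields=[
                field("amount", "number", "Price amount", required=True),
                field("currency", "string", "Currency code (e.g., USD, EUR)", required=True),
            ],
        ),
        field("specifications", "array", "Array of product specifications"),
        field("availability", "object", "Stock and shipping information"),
    ],
}
```

The resulting product_schema dictionary can be passed as the "schema" value of the request body.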
Complex Schema Examples
Schemas also keep extraction consistent for nested data. An object field declares its own fields list, as the author field does here:
POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key

{
  "url": "https://example.com/articles",
  "prompt": "Extract article information with author details",
  "schema": {
    "name": "Article with Author Schema",
    "fields": [
      {
        "name": "title",
        "type": "string",
        "required": true,
        "description": "Article title"
      },
      {
        "name": "author",
        "type": "object",
        "required": true,
        "description": "Author information",
        "fields": [
          {
            "name": "name",
            "type": "string",
            "required": true,
            "description": "Author's full name"
          },
          {
            "name": "bio",
            "type": "string",
            "required": false,
            "description": "Author's biography"
          },
          {
            "name": "socialLinks",
            "type": "array",
            "required": false,
            "description": "Array of social media links"
          }
        ]
      },
      {
        "name": "publishDate",
        "type": "date",
        "required": true,
        "description": "Publication date in ISO format"
      },
      {
        "name": "tags",
        "type": "array",
        "required": false,
        "description": "Article tags or categories"
      }
    ]
  }
}
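A schema like this can also be reused on the client side to double-check a returned result. The sketch below is an assumption-laden local check, not Scrapezy's own validation: the mapping from schema type names to Python types, the treatment of "date" as an ISO 8601 string, and the function name find_problems are all illustrative choices.

```python
from datetime import datetime


def _is_iso_date(value):
    """Assumed meaning of the "date" type: a string in ISO 8601 format."""
    if not isinstance(value, str):
        return False
    try:
        # Older Python versions reject a trailing "Z", so normalize it first.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


# Assumed mapping from schema type names to local checks.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
    "date": _is_iso_date,
}


def find_problems(data, fields, path=""):
    """Recursively compare a result dict against a schema's fields list."""
    problems = []
    for spec in fields:
        where = path + spec["name"]
        value = data.get(spec["name"])
        if value is None:
            if spec.get("required"):
                problems.append(f"{where}: missing required field")
            continue
        check = TYPE_CHECKS.get(spec["type"])
        if check and not check(value):
            problems.append(f"{where}: expected {spec['type']}")
        elif spec["type"] == "object" and "fields" in spec:
            problems.extend(find_problems(value, spec["fields"], where + "."))
    return problems


article_fields = [
    {"name": "title", "type": "string", "required": True},
    {"name": "author", "type": "object", "required": True, "fields": [
        {"name": "name", "type": "string", "required": True},
        {"name": "bio", "type": "string", "required": False},
    ]},
    {"name": "publishDate", "type": "date", "required": True},
]
result = {"title": "Example", "author": {"bio": 42}, "publishDate": "2024-02-14T09:00:00Z"}
print(find_problems(result, article_fields))
# ['author.name: missing required field', 'author.bio: expected string']
```

Reporting dotted paths such as author.name makes it easier to see which level of a nested schema a problem belongs to.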
Advanced Prompting Techniques
Structured Data Extraction
For complex data structures, be specific about the format you need:
POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key

{
  "url": "https://example.com/product",
  "prompt": "Extract the following product details:\n- Full product name\n- Base price (without currency symbol)\n- Currency used\n- All available color options\n- Technical specifications in a structured format\n- Shipping information including:\n - Available countries\n - Estimated delivery times\n - Shipping costs"
}
Example response:
{
  "jobId": "job_abc123",
  "status": "completed",
  "result": {
    "productName": "iPhone 15 Pro",
    "basePrice": 999,
    "currency": "USD",
    "colors": ["Natural Titanium", "Blue Titanium", "White Titanium", "Black Titanium"],
    "specifications": {
      "display": "6.1-inch Super Retina XDR",
      "chip": "A17 Pro",
      "camera": "48MP Main",
      "storage": ["128GB", "256GB", "512GB", "1TB"]
    },
    "shipping": {
      "availableCountries": ["US", "UK", "CA", "AU"],
      "deliveryTimes": {
        "US": "1-3 business days",
        "International": "5-7 business days"
      },
      "costs": {
        "US": "Free",
        "International": "From $20"
      }
    }
  }
}
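Long bulleted prompts are easier to maintain when they are generated from a list rather than typed as one escaped string. The following Python sketch rebuilds the prompt from the request above; the helper name bullet_prompt is hypothetical.

```python
def bullet_prompt(intro, items):
    """Join an intro line and bullet items into one newline-separated prompt.

    A plain string becomes a top-level bullet; a (label, sub_items) tuple
    becomes a bullet ending in ":" followed by indented sub-bullets.
    """
    lines = [intro]
    for item in items:
        if isinstance(item, tuple):
            label, sub_items = item
            lines.append(f"- {label}:")
            lines.extend(f" - {sub}" for sub in sub_items)
        else:
            lines.append(f"- {item}")
    return "\n".join(lines)


prompt = bullet_prompt(
    "Extract the following product details:",
    [
        "Full product name",
        "Base price (without currency symbol)",
        "Currency used",
        "All available color options",
        "Technical specifications in a structured format",
        (
            "Shipping information including",
            ["Available countries", "Estimated delivery times", "Shipping costs"],
        ),
    ],
)
print(prompt)
```

Serializing the request body with json.dumps then takes care of escaping the newlines.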
Contextual Extraction
Provide context in your prompts for more accurate results:
POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key

{
  "url": "https://example.com/article",
  "prompt": "This is a news article. Extract:\n1. The main headline\n2. Author's name and their role if mentioned\n3. Publication date in ISO format\n4. The article's main topic or category\n5. Any quoted sources or experts mentioned\n6. Key statistics or numerical data points"
}
Example response:
{
  "jobId": "job_def456",
  "status": "completed",
  "result": {
    "headline": "Global AI Market Expected to Reach $190B by 2025",
    "author": {
      "name": "Jane Smith",
      "role": "Senior Technology Analyst"
    },
    "publicationDate": "2024-02-14T09:00:00Z",
    "category": "Technology",
    "quotedSources": [
      {
        "name": "Dr. John Doe",
        "title": "AI Research Director at Tech Institute",
        "quote": "AI adoption is accelerating faster than predicted"
      }
    ],
    "statistics": [
      {
        "metric": "Market Size",
        "value": 190,
        "unit": "billion USD",
        "year": 2025
      },
      {
        "metric": "Annual Growth Rate",
        "value": 37.3,
        "unit": "percent"
      }
    ]
  }
}
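Once a response like this arrives, the structured fields can be consumed directly. The Python sketch below works on a trimmed copy of the sample response; the function name summarize is hypothetical, and treating any status other than "completed" as an error is an assumption, since other status values are not described here.

```python
from datetime import datetime

# Trimmed copy of the example response above.
response = {
    "jobId": "job_def456",
    "status": "completed",
    "result": {
        "headline": "Global AI Market Expected to Reach $190B by 2025",
        "publicationDate": "2024-02-14T09:00:00Z",
        "statistics": [
            {"metric": "Market Size", "value": 190, "unit": "billion USD", "year": 2025},
            {"metric": "Annual Growth Rate", "value": 37.3, "unit": "percent"},
        ],
    },
}


def summarize(response):
    """Return (headline, publication datetime, stats by metric name)."""
    if response.get("status") != "completed":
        # Assumption: anything other than "completed" has no usable result yet.
        raise ValueError(f"job {response.get('jobId')} is not completed")
    result = response["result"]
    # Normalize the trailing "Z" so older Python versions can parse the timestamp.
    published = datetime.fromisoformat(result["publicationDate"].replace("Z", "+00:00"))
    stats = {
        entry["metric"]: (entry["value"], entry["unit"])
        for entry in result.get("statistics", [])
    }
    return result["headline"], published, stats


headline, published, stats = summarize(response)
print(headline)
print(published.isoformat())
for metric, (value, unit) in stats.items():
    print(f"{metric}: {value} {unit}")
```

Asking for the date in ISO format in the prompt is what makes the one-line datetime parsing possible.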
Next Steps
- Schema Validation Guide - Complete guide to using schemas for data validation
- API Reference - Full API documentation
- Error Codes - Understanding API error responses