Basic Usage
Learn how to use the API for basic data extraction tasks
This guide walks through basic use of the API for data extraction: submitting an extraction job, checking its status, and reading the result.
Quick Example
The following request submits an extraction job for a webpage:
POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key
{
"url": "https://example.com/products",
"prompt": "Extract all product information from this page including names, prices, and descriptions"
}
Response:
{
"jobId": "job_123abc",
"status": "pending",
"estimatedTime": 10
}
Check job status:
GET https://scrapezy.com/api/v1/extract/job_123abc
x-api-key: your_api_key
Response when complete:
{
"status": "completed",
"result": {
"products": [
{
"name": "Example Product 1",
"price": "$99.99",
"description": "This is an example product description"
},
{
"name": "Example Product 2",
"price": "$149.99",
"description": "Another example product description"
}
]
}
}
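The submit-then-poll flow above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library, not an official client. The two URLs, the `x-api-key` header, and the `jobId`, `status`, and `result` fields are taken from the examples above; the function names, the polling interval, and the polling limit are assumptions made for illustration. Only the `pending` and `completed` statuses appear in this guide, so the sketch keeps polling on any other status until the limit is reached.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional

# Endpoints copied from the requests shown above.
SUBMIT_URL = "https://scrapezy.com/api/extract"
STATUS_URL = "https://scrapezy.com/api/v1/extract/{job_id}"

# A transport takes (method, url, headers, body) and returns the decoded JSON object.
Transport = Callable[[str, str, Dict[str, str], Optional[bytes]], Dict[str, Any]]


def http_json(method: str, url: str, headers: Dict[str, str],
              body: Optional[bytes] = None) -> Dict[str, Any]:
    """Send one HTTP request and decode the JSON response body."""
    request = urllib.request.Request(url, data=body, headers=headers, method=method)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def submit_job(api_key: str, url: str, prompt: str,
               transport: Transport = http_json) -> str:
    """Submit an extraction job and return its job ID."""
    payload = json.dumps({"url": url, "prompt": prompt}).encode("utf-8")
    headers = {"Content-Type": "application/json", "x-api-key": api_key}
    return transport("POST", SUBMIT_URL, headers, payload)["jobId"]


def wait_for_result(api_key: str, job_id: str,
                    transport: Transport = http_json,
                    poll_seconds: float = 2.0,   # assumed interval, not documented
                    max_polls: int = 30,         # assumed limit, not documented
                    sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll the status endpoint until the job reports "completed"."""
    headers = {"x-api-key": api_key}
    for _ in range(max_polls):
        job = transport("GET", STATUS_URL.format(job_id=job_id), headers, None)
        if job.get("status") == "completed":
            return job["result"]
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not complete after {max_polls} polls")
```

A call such as `wait_for_result(key, submit_job(key, "https://example.com/products", "Extract all product information"))` would then return the `result` object. The `transport` and `sleep` parameters exist so the flow can be exercised without network access.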
Features Overview
Important Note
Always check the website's robots.txt and terms of service before extracting data.
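The robots.txt part of this check can be automated. The sketch below uses Python's standard `urllib.robotparser` module to test whether a URL is allowed for a given user agent. The helper name `is_allowed` and the sample rules are illustrative assumptions, and a site's terms of service still need to be read by a person.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only to demonstrate the check.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, page_url: str, user_agent: str = "*") -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


if __name__ == "__main__":
    print(is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/products"))
    print(is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/private/data"))
```

For a live site, `RobotFileParser.set_url()` followed by `read()` fetches and parses the file over the network instead of taking it as a string.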
Key Features
- Natural Language Prompts
  - Describe what data you want to extract
  - AI understands and extracts relevant information
  - No need for complex selectors or XPath
- Data Extraction
  - Structured JSON output
  - Automatic data cleaning
  - Smart field detection
- Configuration Options
  - Rate limiting
  - Retry logic
  - Proxy support
Examples
Basic Extraction
Extract specific information from a webpage:
POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key
{
"url": "https://example.com",
"prompt": "Extract the main article title and author name"
}
Response:
{
"jobId": "job_456def",
"status": "completed",
"result": {
"title": "Example Article Title",
"author": "John Doe"
}
}
Extracting Multiple Items
Extract a list of items from a webpage:
POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key
{
"url": "https://example.com/blog",
"prompt": "Extract all blog posts including their titles, dates, and summaries"
}
Response:
{
"jobId": "job_789ghi",
"status": "completed",
"result": {
"posts": [
{
"title": "First Blog Post",
"date": "2024-02-14",
"summary": "This is the first blog post summary"
},
{
"title": "Second Blog Post",
"date": "2024-02-13",
"summary": "This is the second blog post summary"
}
]
}
}
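Once a response like the one above arrives, the list can be converted into typed records. The following is a sketch under the assumption that the result uses the same `posts`, `title`, `date`, and `summary` keys as this sample; the actual keys depend on the prompt and the page, so the `BlogPost` class and `parse_posts` helper are hypothetical.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class BlogPost:
    title: str
    published: date
    summary: str


def parse_posts(response_text: str) -> List[BlogPost]:
    """Convert a completed extraction response into BlogPost records."""
    body = json.loads(response_text)
    if body.get("status") != "completed":
        raise ValueError(f"Job is not complete: status={body.get('status')!r}")
    return [
        BlogPost(
            title=post["title"],
            # The sample dates are ISO 8601 (YYYY-MM-DD); other formats would fail here.
            published=date.fromisoformat(post["date"]),
            summary=post["summary"],
        )
        for post in body["result"].get("posts", [])
    ]
```

Parsing dates into `date` objects makes it possible to sort or filter posts without string comparisons.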
Advanced Options
You can include additional options in your request:
POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key
{
"url": "https://example.com/products",
"prompt": "Extract all product information",
"options": {
"bypassCache": true
}
}
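Building the request body with a JSON serializer rather than by hand avoids syntax slips such as stray commas. Here is a small sketch; `build_extract_body` is a hypothetical helper, and `bypassCache` is the only option this guide names, so any other option key would be an assumption.

```python
import json
from typing import Any, Dict, Optional


def build_extract_body(url: str, prompt: str,
                       options: Optional[Dict[str, Any]] = None) -> str:
    """Serialize an extraction request, adding "options" only when provided."""
    payload: Dict[str, Any] = {"url": url, "prompt": prompt}
    if options:
        payload["options"] = options
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    print(build_extract_body(
        "https://example.com/products",
        "Extract all product information",
        {"bypassCache": True},
    ))
```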
Error Responses
Here are common error responses you might encounter:
Invalid Request
HTTP/1.1 400 Bad Request
Content-Type: application/json
{
"error": {
"code": "INVALID_REQUEST",
"message": "URL is required"
}
}
Authentication Error
HTTP/1.1 401 Unauthorized
Content-Type: application/json
{
"error": {
"code": "INVALID_API_KEY",
"message": "Invalid or missing API key"
}
}
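Client code can turn these error bodies into exceptions so that callers handle them in one place. The sketch below assumes every error response has the `error.code` and `error.message` shape shown above and falls back to placeholders when it does not; the `ApiError` class and `raise_for_error` function are illustrative names, not part of the API.

```python
import json


class ApiError(Exception):
    """An error response from the API, carrying the HTTP status and error code."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def raise_for_error(status: int, body_text: str) -> None:
    """Raise ApiError for 4xx/5xx responses; return None otherwise."""
    if status < 400:
        return
    try:
        parsed = json.loads(body_text)
    except json.JSONDecodeError:
        parsed = None
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        error = {}
    raise ApiError(
        status,
        error.get("code", "UNKNOWN"),        # placeholder when no code is present
        error.get("message", body_text),     # fall back to the raw body
    )
```

With `urllib.request`, failed requests surface as `urllib.error.HTTPError`, whose `code` attribute and `read()` method supply the two arguments.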
Best Practices
- Writing Effective Prompts
  - Be specific about what data you want
  - Include field names in your prompt
  - Specify the format you expect
- Error Handling
  - Always check job status
  - Implement retry logic for rate limits
  - Handle errors gracefully
- Performance
  - Use caching when possible
  - Implement proper rate limiting
  - Monitor API usage
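The retry advice above can be sketched as exponential backoff with jitter. This guide does not say how the API signals a rate limit (HTTP 429 is the usual convention, but that is an assumption here), so the sketch leaves detection to the caller: the request function raises the hypothetical `RateLimited` exception, and `with_retries` decides how long to wait. The attempt count and delays are illustrative defaults.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Raised by the caller's request function when the API reports a rate limit."""


def with_retries(call: Callable[[], T],
                 max_attempts: int = 5,
                 base_delay: float = 1.0,
                 max_delay: float = 30.0,
                 sleep: Callable[[float], None] = time.sleep,
                 jitter: Callable[[], float] = random.random) -> T:
    """Run call(), retrying on RateLimited with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Wait between 50% and 100% of the delay so clients do not retry in lockstep.
            sleep(delay * (0.5 + 0.5 * jitter()))
    raise AssertionError("unreachable")
```

Other exceptions pass through untouched, so errors such as an invalid API key fail immediately instead of being retried.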