Basic Usage

Learn how to use the API for basic data extraction tasks


Basic Usage Guide

This guide walks through the basics of the API: submitting an extraction request, checking the job status, and reading the result.

Quick Example

To extract data from a webpage, submit a job with the page URL and a natural-language prompt describing the data you want:

POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key
 
{
  "url": "https://example.com/products",
  "prompt": "Extract all product information from this page including names, prices, and descriptions"
}

Response:

{
  "jobId": "job_123abc",
  "status": "pending",
  "estimatedTime": 10
}

Check job status:

GET https://scrapezy.com/api/v1/extract/job_123abc
x-api-key: your_api_key

Response when complete:

{
  "status": "completed",
  "result": {
    "products": [
      {
        "name": "Example Product 1",
        "price": "$99.99",
        "description": "This is an example product description"
      },
      {
        "name": "Example Product 2",
        "price": "$149.99",
        "description": "Another example product description"
      }
    ]
  }
}
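
The submit-then-poll flow above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the names http_json and extract, the polling interval, and the poll limit are assumptions, and the two URLs are copied as-is from the examples above. Only the "pending" and "completed" statuses appear in this guide, so any other status is treated as still in progress until the poll limit is reached.

```python
import json
import time
import urllib.request

API_KEY = "your_api_key"
# Both URLs are copied verbatim from the examples in this guide.
SUBMIT_URL = "https://scrapezy.com/api/extract"
STATUS_URL = "https://scrapezy.com/api/v1/extract/{job_id}"


def http_json(method, url, payload=None):
    """Send one request with the x-api-key header and decode the JSON reply."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"x-api-key": API_KEY}
    if data is not None:
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def extract(url, prompt, send=http_json, sleep=time.sleep,
            poll_seconds=2.0, max_polls=30):
    """Submit an extraction job, then poll until its status is "completed"."""
    job = send("POST", SUBMIT_URL, {"url": url, "prompt": prompt})
    job_id = job["jobId"]  # status replies in this guide do not repeat the id
    polls = 0
    while job.get("status") != "completed":
        if polls >= max_polls:
            raise TimeoutError(f"job {job_id} not completed after {max_polls} polls")
        sleep(poll_seconds)
        job = send("GET", STATUS_URL.format(job_id=job_id))
        polls += 1
    return job["result"]
```

Calling extract("https://example.com/products", "Extract all product names and prices") would return the result object once the job completes. The send and sleep parameters exist so the polling logic can be exercised without network access.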

Features Overview

Key Features

  1. Natural Language Prompts

    • Describe what data you want to extract
    • AI understands and extracts relevant information
    • No need for complex selectors or XPath

  2. Data Extraction

    • Structured JSON output
    • Automatic data cleaning
    • Smart field detection

  3. Configuration Options

    • Rate limiting
    • Retry logic
    • Proxy support

Examples

Basic Extraction

Extract specific information from a webpage:

POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key
 
{
  "url": "https://example.com",
  "prompt": "Extract the main article title and author name"
}

Response (shown with the job already completed):

{
  "jobId": "job_456def",
  "status": "completed",
  "result": {
    "title": "Example Article Title",
    "author": "John Doe"
  }
}

Extracting Multiple Items

Extract a list of items from a webpage:

POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key
 
{
  "url": "https://example.com/blog",
  "prompt": "Extract all blog posts including their titles, dates, and summaries"
}

Response (shown with the job already completed):

{
  "jobId": "job_789ghi",
  "status": "completed",
  "result": {
    "posts": [
      {
        "title": "First Blog Post",
        "date": "2024-02-14",
        "summary": "This is the first blog post summary"
      },
      {
        "title": "Second Blog Post",
        "date": "2024-02-13",
        "summary": "This is the second blog post summary"
      }
    ]
  }
}
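
Because the shape of the result follows the prompt, it helps to validate a list result before using it. The sketch below assumes the result matches the example above (a posts list whose entries have title, date, and summary, with ISO-formatted dates) and converts each entry into a typed record. BlogPost and parse_posts are illustrative names, not part of the API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BlogPost:
    title: str
    published: date
    summary: str


def parse_posts(result):
    """Convert result["posts"] into BlogPost records, skipping malformed entries."""
    posts = []
    for item in result.get("posts", []):
        try:
            posts.append(BlogPost(
                title=item["title"],
                published=date.fromisoformat(item["date"]),
                summary=item.get("summary", ""),
            ))
        except (KeyError, TypeError, ValueError):
            # Missing field, non-dict entry, or a date that is not YYYY-MM-DD.
            continue
    return posts
```

Skipping malformed entries is one possible policy; raising an error instead may suit pipelines where partial data is worse than no data.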

Advanced Options

You can pass additional settings in an options object. The example below sets bypassCache:

POST https://scrapezy.com/api/extract
Content-Type: application/json
x-api-key: your_api_key
 
{
  "url": "https://example.com/products",
  "prompt": "Extract all product information",
  "options": {
    "bypassCache": true
  }
}
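
When the request body is built in code, serializing a dictionary avoids hand-written JSON mistakes such as stray commas. A minimal sketch, assuming bypassCache is the only option in use (it is the only one shown in this guide); build_extract_body is an illustrative helper, not part of the API.

```python
import json


def build_extract_body(url, prompt, bypass_cache=False):
    """Return the JSON request body, adding "options" only when needed."""
    body = {"url": url, "prompt": prompt}
    if bypass_cache:
        body["options"] = {"bypassCache": True}
    return json.dumps(body, indent=2)
```

The returned string can be sent as the body of the POST request shown above, with Content-Type set to application/json.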

Error Responses

Errors are returned as JSON with an error object that contains a code and a message. Common cases:

Invalid Request

HTTP/1.1 400 Bad Request
Content-Type: application/json
 
{
  "error": {
    "code": "INVALID_REQUEST",
    "message": "URL is required"
  }
}

Authentication Error

HTTP/1.1 401 Unauthorized
Content-Type: application/json
 
{
  "error": {
    "code": "INVALID_API_KEY",
    "message": "Invalid or missing API key"
  }
}
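
The error bodies above share one shape, so a client can turn them into a single exception type. The following is a sketch under that assumption; ApiError and parse_error are hypothetical names, and the "UNKNOWN" fallback code is an invention for bodies that do not match the documented shape.

```python
import json


class ApiError(Exception):
    """Carries the HTTP status plus the code and message from the error body."""

    def __init__(self, status, code, message):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def parse_error(status, body_text):
    """Build an ApiError from a response body, tolerating unexpected bodies."""
    try:
        error = json.loads(body_text).get("error", {})
    except (ValueError, AttributeError):
        error = {}  # not JSON, or JSON that is not an object
    if not isinstance(error, dict):
        error = {}
    return ApiError(status, error.get("code", "UNKNOWN"), error.get("message", body_text))
```

With urllib, this could be used as: except urllib.error.HTTPError as exc: raise parse_error(exc.code, exc.read().decode("utf-8")).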

Best Practices

  1. Writing Effective Prompts

    • Be specific about what data you want
    • Include field names in your prompt
    • Specify the format you expect

  2. Error Handling

    • Always check the job status before reading the result
    • Implement retry logic for rate limits
    • Handle error responses by checking the error code and message

  3. Performance

    • Use caching when possible
    • Implement proper rate limiting
    • Monitor API usage
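
The retry advice above can be sketched as exponential backoff. This guide does not say which status code signals a rate limit; HTTP 429 (Too Many Requests) is the conventional choice and is assumed here. RateLimited and with_retries are hypothetical names, and the caller's request function is expected to raise RateLimited when it sees that status.

```python
import time


class RateLimited(Exception):
    """Raised by the caller's request function on HTTP 429 (assumed status)."""


def with_retries(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call(); on RateLimited, wait base_delay * 2**attempt and try again."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits between attempts are 1, 2, and 4 seconds. Only rate-limit errors are retried; other failures, such as an invalid API key, propagate immediately because retrying would not help.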
