MCP Server Integration

Connect Scrapezy to Claude Desktop, Cursor, and other MCP-compatible AI tools for seamless data extraction


The Scrapezy MCP (Model Context Protocol) Server lets you use Scrapezy's web scraping and data extraction capabilities directly from AI tools such as Claude Desktop and Cursor. It is a remote server that you connect to, so no local installation is required.

What is MCP?

The Model Context Protocol (MCP) is an open standard that enables AI applications to securely connect to external data sources and services. The Scrapezy MCP Server provides AI tools with direct access to:

  • Data Extraction: Extract structured data from any public website
  • Scraper Management: Trigger and monitor configured scrapers
  • Job Monitoring: Track extraction progress and results
  • Result Retrieval: Get extracted data in structured formats

Quick Setup

1. Authentication Setup

The Scrapezy MCP Server uses OAuth 2.0 for secure authentication. No manual setup is required: the first time you use the MCP tools, you'll be prompted to authenticate in your browser.

2. OAuth Configuration

For clients that support OAuth, add the following to the client's MCP configuration:

{
  "mcpServers": {
    "scrapezy": {
      "url": "https://mcp.scrapezy.com"
    }
  }
}
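
If your client already has other MCP servers configured, the entry can be merged in rather than pasted over the existing file. The following is a minimal sketch, not an official tool: the helper name add_scrapezy_server is hypothetical, the "other-server" entry is made up for illustration, and the location of the configuration file depends on your client, so the sketch works on an in-memory dictionary only. It uses the production URL listed under Server Details.

```python
import json
from copy import deepcopy

# Production URL as listed under "Server Details".
SCRAPEZY_ENTRY = {"url": "https://mcp.scrapezy.com"}


def add_scrapezy_server(config: dict) -> dict:
    """Return a copy of an MCP client config with the "scrapezy" entry added.

    Hypothetical helper: existing servers are preserved, and the input
    dictionary is not modified.
    """
    merged = deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["scrapezy"] = dict(SCRAPEZY_ENTRY)
    return merged


if __name__ == "__main__":
    # Illustrative existing config; "other-server" is an invented example.
    existing = {"mcpServers": {"other-server": {"url": "https://example.com/mcp"}}}
    print(json.dumps(add_scrapezy_server(existing), indent=2))
```

Returning a copy instead of editing in place makes it easy to compare the result with the original before writing anything back to disk.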

3. Restart and Test

  1. Restart Claude Desktop or Cursor
  2. Start a new conversation
  3. Try: "Extract the headlines from https://news.bbc.co.uk"
  4. When prompted, complete the OAuth authentication in your browser

Available Tools

The remote MCP server provides these tools to AI applications:

start_extraction

Extract structured data from any public website using natural language prompts.

Example prompt to AI: "Extract product names and prices from https://example-store.com/products"

trigger_scraper

Run a preconfigured scraper by its ID.

Example prompt to AI: "Run scraper ID 'scraper_abc123' to get the latest data"

get_job_status

Check the progress of any running extraction or scraping job.

Example prompt to AI: "Check the status of job 'job_xyz789'"

get_job_results

Retrieve results from completed extraction or scraping jobs.

Example prompt to AI: "Get the results from job 'job_xyz789'"

get_recent_scraper_results

Get the most recent results from a specific scraper without knowing job IDs.

Example prompt to AI: "Get the 5 most recent results from scraper 'scraper_abc123'"

Usage Examples

Basic Data Extraction

Simply describe what data you want to extract:

You: "Extract the top 10 news headlines from https://techcrunch.com along with their publication dates"

The AI will use the remote MCP server to:

  1. Call start_extraction with the URL and your request
  2. Monitor the job progress with get_job_status
  3. Retrieve and format the results with get_job_results
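
The three steps above amount to a start, poll, fetch loop. The sketch below illustrates that control flow only. The tool names come from this page, but everything else is an assumption made for the example: the call_tool callable stands in for whatever mechanism your MCP client uses to invoke a tool, and the argument names (url, prompt, job_id), the status values ("completed", "failed"), and the response shapes are hypothetical rather than the documented Scrapezy schema.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: (tool_name, arguments) -> response dictionary.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_extraction(
    call_tool: ToolCaller,
    url: str,
    prompt: str,
    poll_seconds: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start an extraction, poll until it finishes, then return its results."""
    # Step 1: start the job. The "job_id" field name is an assumption.
    job = call_tool("start_extraction", {"url": url, "prompt": prompt})
    job_id = job["job_id"]

    # Step 2: poll the job status a bounded number of times.
    for _ in range(max_polls):
        status = call_tool("get_job_status", {"job_id": job_id})["status"]
        if status == "completed":  # assumed status value
            # Step 3: fetch the results of the finished job.
            return call_tool("get_job_results", {"job_id": job_id})
        if status == "failed":  # assumed status value
            raise RuntimeError(f"job {job_id} failed")
        sleep(poll_seconds)

    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

In practice the AI client performs this loop for you. The sketch is mainly useful for understanding why a request may take several tool calls before results appear.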

E-commerce Price Monitoring

You: "Get current prices for all products on https://competitor.com/products and compare them to our pricing"

The AI will extract the pricing data and can help you analyze competitive positioning.

Content Aggregation

You: "Extract all blog post titles, authors, and summaries from https://company-blog.com and create a content calendar"

The AI will gather the content data and help organize it into a useful format.

Using Existing Scrapers

You: "Run my 'daily-news' scraper and summarize today's top stories"

If you have preconfigured scrapers, the AI can trigger them and process the results.

Server Details

Remote Server URL

  • Production: https://mcp.scrapezy.com

Authentication

The MCP server uses OAuth 2.0 for secure authentication. When you first connect, you'll be redirected to Scrapezy's login page to authorize the connection. The OAuth flow will automatically handle token management and refresh.

Troubleshooting

Common Issues

"Authentication failed"

  • Complete the OAuth authorization flow in your browser
  • Ensure you have the required permissions for the requested scopes
  • Check that OAuth tokens haven't expired

"Connection timeout"

  • Check your internet connection
  • Verify the server URL is correct (https://mcp.scrapezy.com)
  • Ensure your firewall allows HTTPS connections
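
A mistyped scheme or hostname in the client configuration is easy to overlook. As a minimal sketch, assuming you can read the configured URL out of your client's settings, the hypothetical helper below checks it against the production URL from this page without making any network request.

```python
from typing import List
from urllib.parse import urlparse

# Hostname of the production server listed under "Server Details".
EXPECTED_HOST = "mcp.scrapezy.com"


def check_server_url(url: str) -> List[str]:
    """Return a list of problems found in a configured server URL.

    Hypothetical helper: an empty list means the URL uses HTTPS and
    points at the expected host.
    """
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append(f"expected scheme 'https', found '{parsed.scheme}'")
    if parsed.hostname != EXPECTED_HOST:
        problems.append(f"expected host '{EXPECTED_HOST}', found '{parsed.hostname}'")
    return problems


if __name__ == "__main__":
    for candidate in ("https://mcp.scrapezy.com", "http://mcp.scrapezy.com"):
        print(candidate, check_server_url(candidate))
```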

"Tool execution failed"

  • Check that you have sufficient credits in your Scrapezy account
  • Verify the target website is publicly accessible
  • Review error messages for specific issues

Getting Help

If you encounter issues:

  1. Check the troubleshooting guide
  2. Contact support through the help center

Best Practices

Security

  • OAuth tokens are automatically managed and refreshed
  • Tokens are short-lived for enhanced security
  • Review and revoke OAuth applications as needed in your dashboard
  • Monitor OAuth usage and active connections

Performance

  • Use specific, clear prompts for better extraction accuracy
  • Leverage existing scrapers for recurring data extraction needs
  • Monitor your credit usage and set up alerts

Integration

  • Start with simple extractions to test your setup
  • Gradually build more complex workflows
  • Document your common extraction patterns for reuse

Next Steps