API Documentation

Complete reference for the FineData Scraper API

Base URL

https://api.finedata.ai

Quick Start

Scrape any page in 30 seconds. Get your API key from the dashboard.

curl

curl -X POST https://api.finedata.ai/api/v1/scrape \
  -H "Content-Type: application/json" \
  -H "x-api-key: fd_your_api_key" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'

Python

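import requests

# Replace fd_your_api_key with the key from your dashboard.
resp = requests.post(
    "https://api.finedata.ai/api/v1/scrape",
    headers={"x-api-key": "fd_your_api_key"},
    json={"url": "https://example.com", "formats": ["markdown"]},
    timeout=60,  # client-side timeout so the call cannot hang indefinitely
)
resp.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
data = resp.json()
print(data["data"]["markdown"])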

JavaScript

const resp = await fetch("https://api.finedata.ai/api/v1/scrape", {
  method: "POST",
  headers: { "Content-Type": "application/json", "x-api-key": "fd_your_api_key" },
  body: JSON.stringify({ url: "https://example.com", formats: ["markdown"] })
});
// fetch does not reject on HTTP errors, so check the status explicitly.
if (!resp.ok) throw new Error(`Scrape failed with HTTP ${resp.status}`);
const data = await resp.json();
console.log(data.data.markdown);

Authentication

Every request must carry one of the following credentials as a request header:

API Key (recommended)

curl -H "x-api-key: fd_your_api_key" https://api.finedata.ai/api/v1/scrape

JWT Bearer Token

curl -H "Authorization: Bearer eyJhbG..." https://api.finedata.ai/api/v1/scrape
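
A small helper can keep the two authentication styles in one place. The sketch below is illustrative only: the function name build_auth_headers is hypothetical, and it does nothing more than produce the x-api-key or Authorization: Bearer header shown in the curl commands above.

```python
from typing import Dict, Optional


def build_auth_headers(api_key: Optional[str] = None,
                       jwt_token: Optional[str] = None) -> Dict[str, str]:
    """Return headers for exactly one of the two documented auth methods."""
    # Reject both "neither given" and "both given".
    if (api_key is None) == (jwt_token is None):
        raise ValueError("Provide exactly one of api_key or jwt_token")
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["x-api-key"] = api_key
    else:
        headers["Authorization"] = f"Bearer {jwt_token}"
    return headers
```

The returned dictionary can be passed as the headers argument of requests.post.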

Token Pricing

Each request costs tokens based on features used. Costs are additive:

Cost  Feature
1     Base request
+2    Antibot (TLS)
+5    JS Render
+7    Stealth Antibot
+15   Stealth New
+25   Stealth Headful
+3    Residential
+4    Mobile
+10   Captcha
+1    Per retry

Examples: Simple HTTP = 1 · With antibot = 3 · JS render = 8 · Stealth + residential = 13 · Stealth Headful = 28
Stealth modes are mutually exclusive. Proxy and captcha stack on top.
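
To budget tokens before sending a request, the additive pricing can be sketched as a small calculator. This is an estimate under stated assumptions, not the billing implementation: the function name estimate_tokens is hypothetical, the keys reuse the request flags documented for the scrape endpoint, and the listed examples for JS render, Stealth + residential, and Stealth Headful only add up if the +2 Antibot (TLS) cost is included, which is how the checks below read them.

```python
from typing import Optional

# Costs copied from the pricing table.
BASE_COST = 1
FEATURE_COSTS = {"use_antibot": 2, "use_js_render": 5, "solve_captcha": 10}
STEALTH_COSTS = {"stealth_antibot": 7, "stealth_new": 15, "stealth_antibot_headful": 25}
PROXY_COSTS = {"use_residential": 3, "use_mobile": 4}


def estimate_tokens(stealth: Optional[str] = None, proxy: Optional[str] = None,
                    retries: int = 0, **flags: bool) -> int:
    """Estimate the token cost of one request.

    stealth and proxy each accept at most one name, which models the
    "mutually exclusive" rule; every retry adds one token.
    """
    total = BASE_COST + retries
    for name, enabled in flags.items():
        if enabled:
            total += FEATURE_COSTS[name]
    if stealth is not None:
        total += STEALTH_COSTS[stealth]
    if proxy is not None:
        total += PROXY_COSTS[proxy]
    return total


# Reproduce the documented examples (assuming antibot is on where noted above).
print(estimate_tokens())                                      # Simple HTTP
print(estimate_tokens(use_antibot=True))                      # With antibot
print(estimate_tokens(use_antibot=True, use_js_render=True))  # JS render
```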

Scrape Endpoint

POST /api/v1/scrape

Request Body

{
  "url": "https://example.com",       // Required
  "method": "GET",                     // Optional, default: GET
  "headers": {},                       // Optional - custom headers
  "tls_profile": "chrome136",          // Optional browser fingerprint, default: chrome136
  "max_retries": 5,                    // 1-10, default: 5
  "timeout": 30,                       // 5-120 sec, default: 30

  // Stealth modes (mutually exclusive)
  "use_antibot": true,                 // TLS fingerprinting (+2)
  "use_js_render": false,              // Playwright JS rendering (+5)
  "stealth_antibot": false,            // Cloudflare, DataDome (+7)
  "stealth_antibot_headful": false,    // Maximum bypass (+25)
  "stealth_new": false,                // Experimental (+15)

  // Proxy (mutually exclusive)
  "use_residential": false,            // Residential (+3)
  "use_mobile": false,                 // Mobile (+4)

  // JS options (require JS rendering or stealth)
  "js_wait_for": "networkidle",        // networkidle, load, domcontentloaded, selector:.css
  "js_scroll": false,                  // Infinite scroll for lazy-load

  // Captcha & output
  "solve_captcha": false,              // Auto-solve (+10)
  "formats": ["markdown"],             // markdown, rawHtml, text, links, screenshot
  "only_main_content": false,          // Strip nav/footer/ads

  // Extraction
  "extract_rules": {"title": "h1"},    // CSS selectors (+0)
  "extract_prompt": "Extract price",   // AI natural language (+5)
  "extract_schema": {},                // AI JSON Schema (+5)
  "ai_content_mode": "full"            // "full" or "main"
}
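
Validating a request body on the client can catch mistakes before they cost a round trip. The following is a minimal sketch, assuming the ranges in the comments above are enforced as written; build_scrape_payload is a hypothetical helper, and treating only the three stealth_* flags as mutually exclusive is an interpretation, since the pricing examples combine antibot and JS rendering with other options.

```python
from typing import Any, Dict

# Assumption: only these three flags exclude each other.
STEALTH_FLAGS = ("stealth_antibot", "stealth_antibot_headful", "stealth_new")
PROXY_FLAGS = ("use_residential", "use_mobile")


def build_scrape_payload(url: str, **options: Any) -> Dict[str, Any]:
    """Check a few documented constraints and return the JSON body."""
    payload: Dict[str, Any] = {"url": url, **options}
    if not 1 <= payload.get("max_retries", 5) <= 10:
        raise ValueError("max_retries must be between 1 and 10")
    if not 5 <= payload.get("timeout", 30) <= 120:
        raise ValueError("timeout must be between 5 and 120 seconds")
    if sum(bool(payload.get(flag)) for flag in STEALTH_FLAGS) > 1:
        raise ValueError("stealth modes are mutually exclusive")
    if sum(bool(payload.get(flag)) for flag in PROXY_FLAGS) > 1:
        raise ValueError("proxy types are mutually exclusive")
    return payload
```

The result can be sent directly as the json argument of requests.post. Fields that are left out are simply omitted, so the server-side defaults apply.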

Response

{
  "success": true,
  "status_code": 200,
  "headers": {"content-type": "text/html"},
  "body": "<html>...</html>",
  "data": {
    "markdown": "# Title\n...",
    "text": "Title...",
    "links": ["https://..."],
    "extract": {"title": "Product", "ai_extract": {"price": 99.99}},
    "metadata": {"title": "Page Title", "description": "...", "language": "en"}
  },
  "meta": {
    "proxy_type": "datacenter",
    "attempts": 1, "response_time_ms": 342,
    "tls_profile": "chrome136",
    "request_id": "uuid...",
    "js_rendered": false,
    "block_reason": null
  },
  "tokens_used": 3,
  "captcha_detected": false,
  "captcha_solved": false
}
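
Reading the response usually comes down to checking success and then pulling fields out of data and meta. The sketch below works on the already-parsed JSON dictionary; ScrapeFailed, extract_markdown, and summarize are hypothetical names, and the shape of a failed response (success false with meta.block_reason set) is an assumption, since only a successful response is shown above.

```python
from typing import Any, Dict


class ScrapeFailed(Exception):
    """Raised when the response body does not report success."""


def extract_markdown(response: Dict[str, Any]) -> str:
    """Return data.markdown, or raise if the scrape did not succeed."""
    if not response.get("success"):
        reason = response.get("meta", {}).get("block_reason")
        raise ScrapeFailed(f"scrape failed (block_reason={reason!r})")
    return response["data"]["markdown"]


def summarize(response: Dict[str, Any]) -> str:
    """One-line cost and timing summary built from tokens_used and meta."""
    meta = response.get("meta", {})
    return (f"{response.get('tokens_used')} tokens, "
            f"{meta.get('attempts')} attempt(s), "
            f"{meta.get('response_time_ms')} ms via {meta.get('proxy_type')}")
```

Logging the summary next to meta.request_id makes it easier to reconcile token usage later.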

Output Formats

Use the formats array to control what appears in data.

Format      Returns         Best for
markdown    Clean Markdown  LLM input, content extraction
text        Plain text      Search indexing, NLP
rawHtml     Cleaned HTML    HTML parsing
links       Array of URLs   Crawling, link analysis
screenshot  Base64 PNG      Visual verification
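
Because the screenshot format is returned as Base64, it has to be decoded before it can be written to a .png file. The sketch below assumes the value appears under data["screenshot"], mirroring how markdown, text, and links appear under their own names; that key, the optional data: URI prefix handling, and the function name decode_screenshot are all assumptions rather than documented behavior.

```python
import base64
from typing import Any, Dict

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_screenshot(data: Dict[str, Any]) -> bytes:
    """Decode data["screenshot"] (assumed key) from Base64 into PNG bytes."""
    encoded = data["screenshot"]
    # Tolerate a "data:image/png;base64," prefix in case one is present.
    if encoded.startswith("data:"):
        encoded = encoded.split(",", 1)[1]
    png = base64.b64decode(encoded)
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("decoded screenshot is not a PNG")
    return png
```

The returned bytes can be written with open("page.png", "wb").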

AI Extraction

Extract structured data using CSS selectors (free) or AI (+5 tokens).

CSS Selectors (free)

{
  "url": "https://example.com/product",
  "extract_rules": {
    "title": "h1",
    "price": ".product-price",
    "images": "img.product-image@src"
  }
}

AI with Natural Language (+5 tokens)

{
  "url": "https://example.com/product",
  "formats": ["markdown"],
  "extract_prompt": "Extract product name, price, rating, and availability"
}

AI with JSON Schema (+5 tokens)

{
  "url": "https://example.com/product",
  "extract_schema": {
    "type": "object",
    "properties": {
      "name": {"type": "string"},
      "price": {"type": "number"},
      "in_stock": {"type": "boolean"}
    }
  }
}
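
AI output is worth a sanity check before it enters a pipeline. The sketch below compares an extraction result against the primitive types declared in extract_schema. It assumes the result is read from data.extract.ai_extract, as in the sample response, and it is not a full JSON Schema validator (the jsonschema package is the usual choice for that); type_mismatches is a hypothetical name.

```python
from typing import Any, Dict, List

# JSON Schema primitive type names mapped to Python types.
JSON_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def type_mismatches(schema: Dict[str, Any], result: Dict[str, Any]) -> List[str]:
    """Return names of properties that are missing or have the wrong type."""
    bad = []
    for name, spec in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(spec.get("type"))
        if expected is None:
            continue  # type not covered by this sketch
        value = result.get(name)
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if isinstance(value, bool) and expected is not bool:
            bad.append(name)
        elif not isinstance(value, expected):
            bad.append(name)
    return bad
```

An empty list means every declared property is present with a matching type.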

Async & Batch Scraping

Background processing for large-scale scraping or slow pages.

POST /api/v1/async/scrape

Submit a job and receive a job_id immediately. Results are delivered to your callback_url webhook, or you can poll for them.

{
  "url": "https://heavy-site.com",
  "use_js_render": true,
  "callback_url": "https://your-server.com/webhook"
}
// Response: {"job_id": "550e8400-...", "status": "pending"}

GET /api/v1/async/jobs/{job_id}

Poll status: pending, processing, completed, failed, cancelled
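
When no webhook is available, polling can be sketched as a loop that stops at the first terminal status. The helper below takes the HTTP call as a parameter so it stays testable offline. wait_for_job and fetch_job are hypothetical names, the interval and poll limit are arbitrary, and the assumption that the job object carries a status field is taken from the submit response shown above.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_job: Callable[[str], Dict[str, Any]], job_id: str,
                 interval: float = 2.0, max_polls: int = 60) -> Dict[str, Any]:
    """Call fetch_job(job_id) until the job reaches a terminal status."""
    for _ in range(max_polls):
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        time.sleep(interval)  # still pending or processing
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```

In practice fetch_job would wrap a GET to /api/v1/async/jobs/{job_id} with the usual authentication header and return the parsed JSON.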

GET /api/v1/async/jobs

List your jobs with pagination

DELETE /api/v1/async/jobs/{job_id}

Cancel a pending or processing job

POST /api/v1/async/batch

Submit up to 100 URLs in one request
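
Lists longer than the 100-URL limit need to be split across several batch requests. The sketch below only does the splitting; the batch request body is not documented in this section, so no payload field names are assumed, and chunk_urls is a hypothetical helper.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 100  # documented limit per /api/v1/async/batch request


def chunk_urls(urls: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of urls, each holding at most size entries."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])
```

Each yielded list can then be submitted as one batch request.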

Captcha Solving

Set solve_captcha: true to auto-detect and solve (+10 tokens).

reCAPTCHA v2/v3 · hCaptcha · Cloudflare Turnstile · Yandex SmartCaptcha · FunCaptcha · GeeTest

TLS Profiles

Set tls_profile to mimic a browser fingerprint.

Chrome

chrome136 (default) · chrome131 · chrome124 · chrome123 · chrome120

Firefox · Safari · Mobile

firefox133 · safari184 · safari18_0 · safari17_0 · chrome131_android · safari18_0_ios

VIP (Premium auto-rotate)

vip · vip:ios · vip:android · vip:windows · vip:mobile

Error Codes

Code  Description                             Action
401   Invalid or missing API key / JWT token  Check your API key
402   Payment required                        Upgrade plan or pay invoices
403   Account suspended                       Contact support
422   Validation error                        Check request parameters
429   Rate limit exceeded                     Reduce concurrency or upgrade
498   Target site blocked (antibot)           Try stealth mode or residential
500   Internal server error                   Retry or contact support
503   Service unavailable                     Retry after a moment
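
The Action column suggests that only some errors are worth retrying unchanged. The sketch below encodes one reading of it: 500 and 503 are retried, 429 is retried after a pause, and everything else (including 498, which calls for a different request rather than a repeat) is not. The exponential backoff values and the name retry_delay are assumptions, not documented behavior.

```python
from typing import Optional

# Assumption: these codes may succeed on a later, identical attempt.
RETRYABLE_CODES = {429, 500, 503}


def retry_delay(status_code: int, attempt: int, base: float = 1.0,
                cap: float = 30.0) -> Optional[float]:
    """Return seconds to wait before the next try, or None if not retryable.

    attempt is zero-based; the delay doubles each time up to cap.
    """
    if status_code not in RETRYABLE_CODES:
        return None
    return min(cap, base * 2 ** attempt)
```

A caller would sleep for the returned delay and resend, stopping when the function returns None or a retry budget is exhausted. Each retry made by the API itself costs one extra token, so client-side retries are best kept few.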

MCP Integration

Connect FineData to Claude Desktop, Cursor IDE, and other AI agents via MCP.

{
  "mcpServers": {
    "finedata": {
      "command": "npx",
      "args": ["-y", "@finedata/mcp-server"],
      "env": { "FINEDATA_API_KEY": "fd_your_api_key" }
    }
  }
}

Rate Limits

Plan        Concurrent req/s  Tokens/month  Overage
Free Trial  5                 10,000        Blocked
Free        2                 1,000         Blocked
PAYG        10                Unlimited     $0.55/1K
Personal    20                100,000       $0.55/1K
Team S      50                1,000,000     $0.55/1K
Team M      100               3,000,000     $0.55/1K
Team L      200               5,000,000     $0.55/1K
Scaling     400               10,000,000    $0.55/1K

Free plans block when exhausted. Paid plans allow overage at $0.55/1K tokens.
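
Overage on a paid plan can be estimated from the table with simple arithmetic. The sketch below assumes overage is billed proportionally per token at $0.55 per 1,000 (rather than per started block of 1,000), which is not stated above. PAYG is left out because its quota is listed as Unlimited, and overage_cost is a hypothetical name.

```python
from decimal import Decimal

OVERAGE_PER_1K = Decimal("0.55")  # USD per 1,000 tokens, from the table
INCLUDED_TOKENS = {
    "Personal": 100_000,
    "Team S": 1_000_000,
    "Team M": 3_000_000,
    "Team L": 5_000_000,
    "Scaling": 10_000_000,
}


def overage_cost(plan: str, tokens_used: int) -> Decimal:
    """Estimated overage in USD for one month on a paid plan."""
    extra = max(0, tokens_used - INCLUDED_TOKENS[plan])
    # Decimal avoids binary floating-point rounding in currency math.
    return OVERAGE_PER_1K * extra / 1000


print(overage_cost("Personal", 150_000))  # 50,000 tokens over the quota
```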