API Documentation

Complete reference for the FineData Scraper API

Base URL

https://api.finedata.ai

Quick Start

Scrape any page in 30 seconds. Get your API key from the dashboard.

curl

curl -X POST https://api.finedata.ai/api/v1/scrape \
  -H "Content-Type: application/json" \
  -H "x-api-key: fd_your_api_key" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'

Python

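import requests

# Send the scrape request; the API key goes in the x-api-key header.
resp = requests.post(
    "https://api.finedata.ai/api/v1/scrape",
    headers={"x-api-key": "fd_your_api_key"},
    json={"url": "https://example.com", "formats": ["markdown"]}
)
resp.raise_for_status()  # raise on 4xx/5xx instead of failing on the lookup below
data = resp.json()
print(data["data"]["markdown"])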

JavaScript

const resp = await fetch("https://api.finedata.ai/api/v1/scrape", {
  method: "POST",
  headers: { "Content-Type": "application/json", "x-api-key": "fd_your_api_key" },
  body: JSON.stringify({ url: "https://example.com", formats: ["markdown"] })
});
const data = await resp.json();
console.log(data.data.markdown);

Authentication

All requests require one of:

API Key (recommended)

curl -H "x-api-key: fd_your_api_key" https://api.finedata.ai/api/v1/scrape

JWT Bearer Token

curl -H "Authorization: Bearer eyJhbG..." https://api.finedata.ai/api/v1/scrape
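
The two schemes differ only in the header they send. As a minimal sketch, a small helper can build the right header for either one. The function name `auth_headers` is hypothetical and not part of any FineData SDK, and rejecting both credentials at once is a choice made here, not documented API behavior.

```python
def auth_headers(api_key=None, jwt=None):
    """Return request headers for exactly one of the two documented auth schemes."""
    # Require exactly one credential so the caller's intent is unambiguous.
    if (api_key is None) == (jwt is None):
        raise ValueError("pass exactly one of api_key or jwt")
    if api_key is not None:
        return {"x-api-key": api_key}
    return {"Authorization": f"Bearer {jwt}"}


print(auth_headers(api_key="fd_your_api_key"))
```

The returned dict can be passed as `headers=` to `requests.post`, as in the Quick Start.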

Token Pricing

Each request costs tokens based on features used. Costs are additive:

Feature	Cost (tokens)
Base request	1
Antibot (TLS)	+2
JS Render	+5
Stealth Antibot	+7
Stealth New	+15
Stealth Headful	+25
Residential	+3
Mobile	+4
Captcha	+10
Per retry	

Examples: Simple HTTP = 1 · With antibot = 3 · JS render = 8 · Stealth + residential = 13 · Stealth Headful = 28
Stealth modes are mutually exclusive. Proxy and captcha stack on top.
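
The additive pricing can be sketched as a small estimator. The per-feature costs are taken from the request-body comments in this reference. The function `estimate_tokens` and its parameter names are hypothetical. The listed examples are reproduced if antibot (+2) is counted alongside JS rendering or a stealth mode. That reading is an assumption, and per-retry charges are not modelled because no value is given for them.

```python
# Per-feature costs as listed in this reference.
BASE_COST = 1
ANTIBOT_COST = 2
JS_RENDER_COST = 5
STEALTH_COSTS = {
    "stealth_antibot": 7,
    "stealth_new": 15,
    "stealth_antibot_headful": 25,
}
PROXY_COSTS = {"use_residential": 3, "use_mobile": 4}
CAPTCHA_COST = 10


def estimate_tokens(use_antibot=False, use_js_render=False, stealth=None,
                    proxy=None, solve_captcha=False):
    """Estimate tokens for a single attempt (retries are not modelled)."""
    total = BASE_COST
    if use_antibot:
        total += ANTIBOT_COST
    if use_js_render:
        total += JS_RENDER_COST
    # A single `stealth` / `proxy` argument keeps each group mutually exclusive.
    if stealth is not None:
        total += STEALTH_COSTS[stealth]  # KeyError on an unknown mode
    if proxy is not None:
        total += PROXY_COSTS[proxy]
    if solve_captcha:
        total += CAPTCHA_COST
    return total


print(estimate_tokens())                                     # 1
print(estimate_tokens(use_antibot=True))                     # 3
print(estimate_tokens(use_antibot=True, use_js_render=True))  # 8
print(estimate_tokens(use_antibot=True, stealth="stealth_antibot",
                      proxy="use_residential"))              # 13
```

The authoritative figure is the `tokens_used` field in each response. Treat this estimator as a budgeting aid only.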

Scrape Endpoint

POST /api/v1/scrape

Request Body

{
  "url": "https://example.com",       // Required
  "method": "GET",                     // Optional, default: GET
  "headers": {},                       // Optional - custom headers
  "tls_profile": "chrome136",          // Browser fingerprint
  "max_retries": 5,                    // 1-10, default: 5
  "timeout": 30,                       // 5-120 sec, default: 30

  // Stealth modes (mutually exclusive)
  "use_antibot": true,                 // TLS fingerprinting (+2)
  "use_js_render": false,              // Playwright JS rendering (+5)
  "stealth_antibot": false,            // Cloudflare, DataDome (+7)
  "stealth_antibot_headful": false,    // Maximum bypass (+25)
  "stealth_new": false,                // Experimental (+15)

  // Proxy (mutually exclusive)
  "use_residential": false,            // Residential (+3)
  "use_mobile": false,                 // Mobile (+4)

  // JS options (require JS rendering or stealth)
  "js_wait_for": "networkidle",        // networkidle, load, domcontentloaded, selector:.css
  "js_scroll": false,                  // Infinite scroll for lazy-load

  // Captcha & output
  "solve_captcha": false,              // Auto-solve (+10)
  "formats": ["markdown"],             // markdown, rawHtml, text, links, screenshot
  "only_main_content": false,          // Strip nav/footer/ads

  // Extraction
  "extract_rules": {"title": "h1"},    // CSS selectors (+0)
  "extract_prompt": "Extract price",   // AI natural language (+5)
  "extract_schema": {},                // AI JSON Schema (+5)
  "ai_content_mode": "full"            // "full" or "main"
}
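
The ranges and exclusivity rules in the comments above can be checked client-side before a request is sent. This is a rough sketch only: `validate_payload` is a hypothetical helper, the server's own validation (reported as `422`) may be stricter or different, and the stealth check covers only the three `stealth_*` flags.

```python
def validate_payload(payload):
    """Return a list of problems found in a scrape request body (empty if none)."""
    problems = []
    if not payload.get("url"):
        problems.append("url is required")

    # Documented ranges: max_retries 1-10 (default 5), timeout 5-120 s (default 30).
    if not 1 <= payload.get("max_retries", 5) <= 10:
        problems.append("max_retries must be between 1 and 10")
    if not 5 <= payload.get("timeout", 30) <= 120:
        problems.append("timeout must be between 5 and 120 seconds")

    # Proxy types are mutually exclusive.
    if payload.get("use_residential") and payload.get("use_mobile"):
        problems.append("use_residential and use_mobile are mutually exclusive")

    # At most one stealth_* flag may be enabled.
    stealth_flags = ("stealth_antibot", "stealth_antibot_headful", "stealth_new")
    if sum(bool(payload.get(flag)) for flag in stealth_flags) > 1:
        problems.append("only one stealth mode may be enabled")
    return problems


print(validate_payload({"url": "https://example.com", "timeout": 300}))
```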

Response

{
  "success": true,
  "status_code": 200,
  "headers": {"content-type": "text/html"},
  "body": "<html>...</html>",
  "data": {
    "markdown": "# Title\n...",
    "text": "Title...",
    "links": ["https://..."],
    "extract": {"title": "Product", "ai_extract": {"price": 99.99}},
    "metadata": {"title": "Page Title", "description": "...", "language": "en"}
  },
  "meta": {
    "proxy_type": "datacenter",
    "attempts": 1, "response_time_ms": 342,
    "tls_profile": "chrome136",
    "request_id": "uuid...",
    "js_rendered": false,
    "block_reason": null
  },
  "tokens_used": 3,
  "captcha_detected": false,
  "captcha_solved": false
}
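
For logging or monitoring, it can help to reduce a response to a few key fields. The sketch below assumes the response has the shape shown above; `summarize` is a hypothetical helper, and fields absent from a real response simply come back as `None`.

```python
def summarize(result):
    """Pull the fields most useful for logging out of a scrape response dict."""
    meta = result.get("meta", {})
    data = result.get("data", {})
    return {
        "ok": bool(result.get("success")),
        "status_code": result.get("status_code"),
        "tokens_used": result.get("tokens_used", 0),
        "attempts": meta.get("attempts"),
        "block_reason": meta.get("block_reason"),
        # Output formats actually present, ignoring extraction and page metadata.
        "formats": sorted(k for k in data if k not in ("extract", "metadata")),
    }


sample = {
    "success": True,
    "status_code": 200,
    "data": {"markdown": "# Title\n...", "text": "Title...", "links": []},
    "meta": {"attempts": 1, "block_reason": None},
    "tokens_used": 3,
}
print(summarize(sample))
```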

Output Formats

Use the formats array to control what appears in data.

Format	Returns	Best for
`markdown`	Clean Markdown	LLM input, content extraction
`text`	Plain text	Search indexing, NLP
`rawHtml`	Cleaned HTML	HTML parsing
`links`	Array of URLs	Crawling, link analysis
`screenshot`	Base64 PNG	Visual verification
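
Since `screenshot` is returned as Base64, it has to be decoded before it can be written to disk. The sketch below assumes the image sits under `data.screenshot` as a plain Base64 string with no `data:` URI prefix. That key name is inferred from the format name and is not confirmed by this reference.

```python
import base64


def decode_screenshot(result):
    """Return PNG bytes from a scrape response, or None if no screenshot is present."""
    encoded = result.get("data", {}).get("screenshot")  # assumed key name
    if not encoded:
        return None
    return base64.b64decode(encoded)


# Demo with a fake response carrying only the 8-byte PNG signature.
png_signature = b"\x89PNG\r\n\x1a\n"
fake = {"data": {"screenshot": base64.b64encode(png_signature).decode("ascii")}}
print(decode_screenshot(fake) == png_signature)
```

The returned bytes can be written with `open("page.png", "wb")`.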

AI Extraction

Extract structured data using CSS selectors (free) or AI (+5 tokens).

CSS Selectors (free)

{
  "url": "https://example.com/product",
  "extract_rules": {
    "title": "h1",
    "price": ".product-price",
    "images": "img.product-image@src"
  }
}

AI with Natural Language (+5 tokens)

{
  "url": "https://example.com/product",
  "formats": ["markdown"],
  "extract_prompt": "Extract product name, price, rating, and availability"
}

AI with JSON Schema (+5 tokens)

{
  "url": "https://example.com/product",
  "extract_schema": {
    "type": "object",
    "properties": {
      "name": {"type": "string"},
      "price": {"type": "number"},
      "in_stock": {"type": "boolean"}
    }
  }
}
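
Judging by the sample response in the Scrape Endpoint section, CSS selector results appear at the top level of `data.extract`, while AI results are nested under `ai_extract`. Under that assumption, which rests on that single sample, the two can be separated as follows. `split_extract` is a hypothetical helper.

```python
def split_extract(result):
    """Split data.extract into (css_fields, ai_fields) without mutating the input."""
    extract = dict(result.get("data", {}).get("extract") or {})
    ai_fields = extract.pop("ai_extract", None) or {}
    return extract, ai_fields


sample = {"data": {"extract": {"title": "Product", "ai_extract": {"price": 99.99}}}}
css_fields, ai_fields = split_extract(sample)
print(css_fields)  # {'title': 'Product'}
print(ai_fields)   # {'price': 99.99}
```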

Async & Batch Scraping

Background processing for large-scale scraping or slow pages.

POST /api/v1/async/scrape

Submit a job and receive a job_id immediately. Use a webhook (callback_url) or poll for the result.

{
  "url": "https://heavy-site.com",
  "use_js_render": true,
  "callback_url": "https://your-server.com/webhook"
}
// Response: {"job_id": "550e8400-...", "status": "pending"}

GET /api/v1/async/jobs/{job_id}

Poll status: pending, processing, completed, failed, cancelled
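
A polling loop over these statuses might look like the sketch below. `poll_job` is hypothetical, and `fetch_job` is a caller-supplied function that performs the `GET /api/v1/async/jobs/{job_id}` call and returns the parsed JSON. Only the `status` field is assumed here, and the interval and timeout values are illustrative defaults.

```python
import time

# Statuses after which the job will not change any more.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_job(fetch_job, job_id, interval=2.0, max_wait=120.0, sleep=time.sleep):
    """Call fetch_job(job_id) until a terminal status is seen or max_wait elapses."""
    waited = 0.0
    while True:
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if waited >= max_wait:
            raise TimeoutError(f"job {job_id} not finished after {max_wait} seconds")
        sleep(interval)
        waited += interval


# Demo with a fake fetcher that walks through three statuses.
states = iter(["pending", "processing", "completed"])
job = poll_job(lambda jid: {"job_id": jid, "status": next(states)},
               "demo-job", interval=0, sleep=lambda seconds: None)
print(job["status"])  # completed
```

Injecting `fetch_job` and `sleep` keeps the loop testable without network access. For long-running jobs, a `callback_url` avoids polling altogether.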

GET /api/v1/async/jobs

List your jobs with pagination

DELETE /api/v1/async/jobs/{job_id}

Cancel a pending or processing job

POST /api/v1/async/batch

Submit up to 100 URLs in one request
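
With more than 100 URLs, the list has to be split across several batch requests. The sketch below covers only the chunking step, because the batch request body is not documented in this section. `chunk_urls` is a hypothetical helper.

```python
def chunk_urls(urls, size=100):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [urls[i:i + size] for i in range(0, len(urls), size)]


urls = [f"https://example.com/page/{n}" for n in range(250)]
print([len(chunk) for chunk in chunk_urls(urls)])  # [100, 100, 50]
```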

Captcha Solving

Set solve_captcha: true to auto-detect and solve captchas (+10 tokens). Supported types:

reCAPTCHA v2/v3 · hCaptcha · Cloudflare Turnstile · Yandex SmartCaptcha · FunCaptcha · GeeTest

TLS Profiles

Set tls_profile to mimic a browser fingerprint.

Chrome

chrome136 (default) · chrome131 · chrome124 · chrome123 · chrome120

Firefox · Safari · Mobile

firefox133 · safari184 · safari18_0 · safari17_0 · chrome131_android · safari18_0_ios

VIP (Premium auto-rotate)

vip · vip:ios · vip:android · vip:windows · vip:mobile

Error Codes

Code	Description	Action
`401`	Invalid or missing API key / JWT token	Check your API key
`402`	Payment required	Upgrade plan or pay invoices
`403`	Account suspended	Contact support
`422`	Validation error	Check request parameters
`429`	Rate limit exceeded	Reduce concurrency or upgrade
`498`	Target site blocked (antibot)	Try stealth mode or residential
`500`	Internal server error	Retry or contact support
`503`	Service unavailable	Retry after a moment
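
The Action column can be turned into a simple client-side decision. This is one possible mapping, not official guidance. The function name and the three labels (`retry`, `escalate`, `stop`) are made up for this sketch.

```python
# Transient errors where the table suggests retrying or reducing concurrency.
RETRYABLE = {429, 500, 503}
# The target site blocked the request; the table suggests stealth or residential.
BLOCKED = {498}


def next_action(status_code):
    """Map an API error code to 'retry', 'escalate', or 'stop'."""
    if status_code in RETRYABLE:
        return "retry"
    if status_code in BLOCKED:
        return "escalate"
    # 401, 402, 403, 422 and anything unlisted need a human or a code fix.
    return "stop"


for code in (401, 429, 498, 503):
    print(code, next_action(code))
```

On `retry`, a backoff delay between attempts is advisable, particularly for `429`.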

MCP Integration

Connect FineData to Claude Desktop, Cursor IDE, and other AI agents via MCP.

{
  "mcpServers": {
    "finedata": {
      "command": "npx",
      "args": ["-y", "@finedata/mcp-server"],
      "env": { "FINEDATA_API_KEY": "fd_your_api_key" }
    }
  }
}

Rate Limits

Plan	Concurrent req/s	Tokens/month	Overage
Free Trial	5	10,000	Blocked
Free	2	1,000	Blocked
PAYG	10	Unlimited	$0.55/1K
Personal	20	100,000	$0.55/1K
Team S	50	1,000,000	$0.55/1K
Team M	100	3,000,000	$0.55/1K
Team L	200	5,000,000	$0.55/1K
Scaling	400	10,000,000	$0.55/1K

Free plans block when exhausted. Paid plans allow overage at $0.55/1K tokens.
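
The overage rate translates into a simple calculation. The sketch below assumes overage is billed proportionally per token at $0.55 per 1,000. Whether billing is instead rounded up to whole blocks of 1,000 is not stated here. `overage_cost` is a hypothetical helper.

```python
OVERAGE_USD_PER_1K = 0.55  # rate listed for paid plans


def overage_cost(tokens_used, tokens_included):
    """Return the overage charge in USD for tokens used beyond the monthly quota."""
    extra = max(0, tokens_used - tokens_included)
    return round(extra / 1000 * OVERAGE_USD_PER_1K, 2)


# Personal plan (100,000 tokens/month) with 120,000 tokens used.
print(overage_cost(120_000, 100_000))  # 11.0
```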