Scrape Endpoint

The main API endpoint for scraping and extracting metadata from web pages.

Endpoint

GET https://pagesight.com/api/scrape

This endpoint accepts GET requests with query parameters to specify the URL to scrape and the categories of data to extract.

Authentication

All requests to the scrape endpoint require authentication using a Bearer token in the Authorization header.

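As a minimal sketch of the header described above, the following Python snippet attaches a Bearer token to a GET request using only the standard library. The token value is a placeholder, and the request is built but not sent.

```python
import urllib.request

API_URL = "https://pagesight.com/api/scrape"


def build_authenticated_request(url: str, token: str) -> urllib.request.Request:
    """Return a GET request carrying the token in the Authorization header."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# "YOUR_API_TOKEN" is a placeholder; substitute your own token.
request = build_authenticated_request(API_URL, "YOUR_API_TOKEN")
print(request.get_header("Authorization"))
```

Passing the resulting object to urllib.request.urlopen would send the request.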

Basic Usage

The simplest request needs only the URL to scrape and the params parameter, which names the category of data to extract.

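A basic request URL might be assembled as in the sketch below. Two details are assumptions, since this page does not spell them out: the query parameter carrying the target page is taken to be named url, and "meta" is a hypothetical category name.

```python
from urllib.parse import urlencode

API_URL = "https://pagesight.com/api/scrape"


def build_scrape_url(target_url: str, category: str) -> str:
    """Build the GET URL for scraping one category from one page."""
    query = urlencode({
        "url": target_url,   # ASSUMPTION: parameter name "url" is not confirmed
        "params": category,  # "params" is the documented parameter name
    })
    return f"{API_URL}?{query}"


# "meta" is a hypothetical category name used for illustration.
print(build_scrape_url("https://example.com", "meta"))
# prints https://pagesight.com/api/scrape?url=https%3A%2F%2Fexample.com&params=meta
```

The target URL is percent-encoded so that its own ":" and "/" characters do not interfere with the query string. The Authorization header is still required when the request is sent.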

Requesting Multiple Categories

You can request multiple categories by passing them as a comma-separated list in the params parameter.

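The comma-separated form can be sketched as follows. The url parameter name and the category names ("meta", "og", "links") are assumptions made for illustration; only params and the comma-separated format come from this page.

```python
from urllib.parse import urlencode

API_URL = "https://pagesight.com/api/scrape"


def build_multi_category_url(target_url: str, categories: list[str]) -> str:
    """Build a scrape URL that requests several categories at once."""
    query = urlencode(
        {
            "url": target_url,               # ASSUMPTION: parameter name "url"
            "params": ",".join(categories),  # comma-separated list, as documented
        },
        safe=",",  # keep commas literal instead of encoding them as %2C
    )
    return f"{API_URL}?{query}"


# Hypothetical category names.
multi_url = build_multi_category_url("https://example.com", ["meta", "og", "links"])
print(multi_url)
```

Leaving the commas unencoded keeps the URL readable; an encoded %2C should decode to the same value on the server.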

Custom Cache Duration

Premium users can specify a custom cache duration using the cacheTime parameter (minimum 5 minutes).

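The sketch below adds cacheTime to a request and enforces the 5-minute minimum on the client side. This page gives the minimum in minutes but does not say which unit the parameter value uses, so the sketch assumes seconds and isolates that assumption in a single constant. The url parameter name and the "meta" category are likewise assumptions.

```python
from datetime import timedelta
from urllib.parse import urlencode

API_URL = "https://pagesight.com/api/scrape"
MIN_CACHE_TIME = timedelta(minutes=5)   # documented minimum
CACHE_TIME_UNIT = timedelta(seconds=1)  # ASSUMPTION: cacheTime is in seconds


def build_cached_scrape_url(target_url: str, category: str,
                            cache_time: timedelta) -> str:
    """Build a scrape URL with a custom cache duration."""
    if cache_time < MIN_CACHE_TIME:
        raise ValueError("cacheTime must be at least 5 minutes")
    query = urlencode({
        "url": target_url,  # ASSUMPTION: parameter name "url"
        "params": category,
        "cacheTime": cache_time // CACHE_TIME_UNIT,  # whole units as an int
    })
    return f"{API_URL}?{query}"


cached_url = build_cached_scrape_url(
    "https://example.com", "meta", timedelta(minutes=30)
)
print(cached_url)
```

If the API turns out to expect minutes, changing CACHE_TIME_UNIT to timedelta(minutes=1) is the only adjustment needed.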

Cache Revalidation

Premium users can force a fresh scrape by adding the revalidate parameter.

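Forcing a fresh scrape might look like the sketch below. The value passed for revalidate is not stated on this page, so "true" is an assumption, as are the url parameter name and the "meta" category.

```python
from urllib.parse import urlencode

API_URL = "https://pagesight.com/api/scrape"


def build_revalidate_url(target_url: str, category: str,
                         revalidate: bool = False) -> str:
    """Build a scrape URL, optionally asking the API to bypass its cache."""
    query = {
        "url": target_url,  # ASSUMPTION: parameter name "url"
        "params": category,
    }
    if revalidate:
        query["revalidate"] = "true"  # ASSUMPTION: accepted value is "true"
    return f"{API_URL}?{urlencode(query)}"


fresh_url = build_revalidate_url("https://example.com", "meta", revalidate=True)
print(fresh_url)
```

The parameter is omitted entirely when revalidate is False, so ordinary requests continue to benefit from the cache.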

Response Headers

The API returns the following headers with its responses:

Headers
  • X-Cache - Indicates if the response was served from cache (HIT) or freshly scraped (MISS)
  • X-Cache-Expires-At - Timestamp when the cache entry expires
  • Cache-Control - Standard HTTP cache control header
  • Retry-After - Present on 429 responses, indicates when to retry
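A client could interpret these headers roughly as in the sketch below. It accepts any mapping of header names to values. Under the HTTP specification, Retry-After may hold either a number of seconds or an HTTP date, so both forms are handled; which form this API sends is not stated here. The format of X-Cache-Expires-At is also unspecified, so it is left unparsed.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def was_cache_hit(headers: Mapping[str, str]) -> bool:
    """Return True when X-Cache reports that the response came from cache."""
    return headers.get("X-Cache", "").strip().upper() == "HIT"


def retry_delay_seconds(headers: Mapping[str, str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Return how long to wait before retrying, or None if no Retry-After."""
    value = headers.get("Retry-After")
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        # Delay given directly as a number of seconds.
        return float(value)
    # Otherwise treat the value as an HTTP date and measure from "now".
    now = now or datetime.now(timezone.utc)
    retry_at = parsedate_to_datetime(value)
    return max(0.0, (retry_at - now).total_seconds())


# Example with hand-written header values.
sample = {"X-Cache": "MISS", "Retry-After": "120"}
print(was_cache_hit(sample), retry_delay_seconds(sample))
```

A plain dict is case-sensitive, so the lookups above use the header names exactly as listed; the header objects returned by most HTTP clients match names case-insensitively.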