Scrape Endpoint

The main API endpoint for scraping and extracting metadata from web pages.

Endpoint

GET https://pagesight.com/api/scrape

This endpoint accepts GET requests with query parameters to specify the URL to scrape and the categories of data to extract.

All requests to the scrape endpoint require authentication using a Bearer token in the Authorization header.
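
As a minimal sketch of the authentication requirement, the Python snippet below attaches a Bearer token to a GET request using only the standard library. The helper name `build_authenticated_request` and the placeholder token `YOUR_API_TOKEN` are illustrative assumptions, not part of the API.

```python
import urllib.request

SCRAPE_ENDPOINT = "https://pagesight.com/api/scrape"


def build_authenticated_request(url: str, api_token: str) -> urllib.request.Request:
    """Return a GET request carrying the token in the Authorization header."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_token}"},
        method="GET",
    )


# "YOUR_API_TOKEN" is a placeholder; substitute a real token before sending.
request = build_authenticated_request(SCRAPE_ENDPOINT, "YOUR_API_TOKEN")
```

The request object can then be sent with `urllib.request.urlopen(request)` once the query parameters described below have been added to the URL.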

The simplest request requires only the URL and the params parameter.
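
A basic request might be assembled as in the following sketch. Only the `params` parameter name is stated above; the query key `url` for the target page and the category name `meta` are assumptions made for illustration, so check them against the parameter reference.

```python
from urllib.parse import urlencode

SCRAPE_ENDPOINT = "https://pagesight.com/api/scrape"


def build_scrape_url(target_url: str, category: str) -> str:
    """Build the scrape URL for one page and one data category."""
    # Assumption: the page to scrape is passed under the key "url".
    query = urlencode({"url": target_url, "params": category})
    return f"{SCRAPE_ENDPOINT}?{query}"


# "meta" is a hypothetical category name used only for this example.
simple_url = build_scrape_url("https://example.com", "meta")
```

`urlencode` percent-encodes the target URL so that its own `:` and `/` characters do not interfere with the query string.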

You can request multiple categories by passing them as a comma-separated list in the params parameter.
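
The comma-separated form could be produced as sketched below. The category names `meta` and `links` and the `url` query key are hypothetical placeholders rather than documented values.

```python
from typing import Sequence
from urllib.parse import urlencode

SCRAPE_ENDPOINT = "https://pagesight.com/api/scrape"


def build_multi_category_url(target_url: str, categories: Sequence[str]) -> str:
    """Build a scrape URL that requests several categories at once."""
    if not categories:
        raise ValueError("at least one category is required")
    # Join the categories with commas; safe="," keeps the commas literal
    # instead of percent-encoding them as %2C.
    query = urlencode(
        {"url": target_url, "params": ",".join(categories)},
        safe=",",
    )
    return f"{SCRAPE_ENDPOINT}?{query}"


# "meta" and "links" are hypothetical category names.
multi_url = build_multi_category_url("https://example.com", ["meta", "links"])
```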

Premium users can specify a custom cache duration using the cacheTime parameter (minimum 5 minutes).
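
The sketch below adds `cacheTime` to a request and enforces the 5-minute minimum on the client side. The unit that `cacheTime` expects is not stated here; this example assumes the value is given in minutes, and the `url` key and `meta` category are likewise assumptions.

```python
from urllib.parse import urlencode

SCRAPE_ENDPOINT = "https://pagesight.com/api/scrape"
MIN_CACHE_MINUTES = 5  # minimum stated in the documentation


def build_cached_scrape_url(target_url: str, category: str, cache_minutes: int) -> str:
    """Build a scrape URL with a custom cache duration (premium feature)."""
    if cache_minutes < MIN_CACHE_MINUTES:
        raise ValueError(f"cacheTime must be at least {MIN_CACHE_MINUTES} minutes")
    # Assumption: cacheTime is expressed in minutes on the wire.
    query = urlencode(
        {"url": target_url, "params": category, "cacheTime": cache_minutes}
    )
    return f"{SCRAPE_ENDPOINT}?{query}"


cached_url = build_cached_scrape_url("https://example.com", "meta", 10)
```

Validating locally avoids spending a request on a value the server would reject.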

Premium users can force a fresh scrape by adding the revalidate parameter.
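
A forced refresh might look like the sketch below. The value the `revalidate` parameter expects is not given here; `true` is an assumption, as are the `url` key and the `meta` category.

```python
from urllib.parse import urlencode

SCRAPE_ENDPOINT = "https://pagesight.com/api/scrape"


def build_revalidate_url(target_url: str, category: str) -> str:
    """Build a scrape URL that asks for a fresh scrape (premium feature)."""
    # Assumption: the parameter is switched on with the literal value "true".
    query = urlencode(
        {"url": target_url, "params": category, "revalidate": "true"}
    )
    return f"{SCRAPE_ENDPOINT}?{query}"


fresh_url = build_revalidate_url("https://example.com", "meta")
```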

The API includes the following headers in its responses:

Headers

X-Cache - Indicates if the response was served from cache (HIT) or freshly scraped (MISS)
X-Cache-Expires-At - Timestamp when the cache entry expires
Cache-Control - Standard HTTP cache control header
Retry-After - Present on 429 (Too Many Requests) responses, indicates when to retry
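
The headers listed above could be interpreted on the client roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and because the format of `Retry-After` is not stated here, the code accepts both forms that HTTP allows (a number of seconds or an HTTP date).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def was_cache_hit(headers: Mapping[str, str]) -> bool:
    """Return True when X-Cache reports that the response came from cache."""
    return headers.get("X-Cache", "").strip().upper() == "HIT"


def retry_delay_seconds(
    headers: Mapping[str, str], now: Optional[datetime] = None
) -> Optional[float]:
    """Return how long to wait before retrying, or None if no Retry-After."""
    value = headers.get("Retry-After")
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        # Delay given directly as a number of seconds.
        return float(value)
    # Otherwise treat the value as an HTTP date.
    retry_at = parsedate_to_datetime(value)
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - current).total_seconds())
```

The functions take a plain mapping, so lookups are case-sensitive when a `dict` is passed; the header object returned by `urllib.request.urlopen(...)` performs case-insensitive lookups and can be passed in directly.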