API Reference
Complete reference for the PageSight API endpoints, parameters, and responses.
GET /api/scrape
Extract webpage metadata and content based on specified categories. Every request must include the params query parameter, which both selects the categories to extract and forms part of the cache key used to serve previously scraped data.
https://pagesight.com/api/scrape
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The URL of the webpage to analyze (must be HTTP/HTTPS) |
| params | string | Yes | Comma-separated list of categories to extract; also used to build the cache key. Requests without it fail. Valid categories: metadata, openGraph, twitterCard, favicon, images, and more... |
| format | string | No | Response format: json (default) or toon |
| cacheTime | number | No | Cache duration in minutes (Pro users only, minimum 5 minutes) |
| revalidate | boolean | No | Force a cache refresh by setting revalidate=true (Pro users only) |
The params parameter is required for all requests. It serves two purposes:
- Cache Key Generation: The combination of url + params + format creates a unique cache key
- Data Extraction: Specifies which categories of data to extract from the webpage
Example: ?url=https://example.com&params=metadata,openGraph
Valid Categories
The following categories can be used in the params parameter (comma-separated):
- metadata
- openGraph
- twitterCard
- favicon
- images
- robots
- sitemap
- content
- structuredData
- technical
- mobileView
- desktopView
- performance
- accessibility
- security
- social
- analytics
- links
- forms
- media
- techStack
- infrastructure
Basic Request
A simple request to extract basic metadata. Note: params is required.
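A minimal sketch of building this request in Python with only the standard library. The helper name `build_scrape_url` is hypothetical; the endpoint and the `url` and `params` parameter names come from the Query Parameters table. The sketch only builds the URL, which can then be passed to any HTTP client.

```python
from urllib.parse import urlencode

BASE_URL = "https://pagesight.com/api/scrape"


def build_scrape_url(target_url: str, categories: list[str]) -> str:
    """Build a GET /api/scrape URL (hypothetical helper, not part of PageSight)."""
    if not categories:
        # The reference states that requests without params fail.
        raise ValueError("params is required: pass at least one category")
    # urlencode percent-encodes the target URL so it survives as a query value.
    query = urlencode({"url": target_url, "params": ",".join(categories)})
    return f"{BASE_URL}?{query}"


request_url = build_scrape_url("https://example.com", ["metadata"])
print(request_url)
# https://pagesight.com/api/scrape?url=https%3A%2F%2Fexample.com&params=metadata
```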
Request with Specific Categories
Extract specific data categories from a webpage. Multiple categories can be specified.
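One way to assemble a multi-category request is sketched below. The helper `build_category_url` is hypothetical, and checking names on the client side is a convenience assumed here; this section does not say how the API itself responds to an unknown category.

```python
from urllib.parse import urlencode

BASE_URL = "https://pagesight.com/api/scrape"

# Category names copied from the Valid Categories list.
VALID_CATEGORIES = {
    "metadata", "openGraph", "twitterCard", "favicon", "images", "robots",
    "sitemap", "content", "structuredData", "technical", "mobileView",
    "desktopView", "performance", "accessibility", "security", "social",
    "analytics", "links", "forms", "media", "techStack", "infrastructure",
}


def build_category_url(target_url: str, categories: list[str]) -> str:
    """Validate category names locally, then build the request URL."""
    if not categories:
        raise ValueError("params is required: pass at least one category")
    unknown = [name for name in categories if name not in VALID_CATEGORIES]
    if unknown:
        raise ValueError(f"unknown categories: {', '.join(unknown)}")
    # safe="," keeps the comma separators readable instead of encoding them as %2C.
    query = urlencode(
        {"url": target_url, "params": ",".join(categories)}, safe=","
    )
    return f"{BASE_URL}?{query}"


multi_url = build_category_url(
    "https://example.com", ["metadata", "openGraph", "images"]
)
print(multi_url)
# https://pagesight.com/api/scrape?url=https%3A%2F%2Fexample.com&params=metadata,openGraph,images
```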
Custom Cache Duration (Pro Only)
Set a custom cache duration for your request. Minimum 5 minutes.
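A short sketch of adding `cacheTime` to a request, with the documented 5-minute minimum checked before the URL is built. The helper name `build_cached_url` is hypothetical. How a Pro account is identified on the request is not covered in this section, so the sketch leaves authentication out.

```python
from urllib.parse import urlencode

BASE_URL = "https://pagesight.com/api/scrape"
MIN_CACHE_MINUTES = 5  # documented minimum for cacheTime


def build_cached_url(
    target_url: str, categories: list[str], cache_minutes: int
) -> str:
    """Build a request URL with a custom cache duration in minutes."""
    if cache_minutes < MIN_CACHE_MINUTES:
        raise ValueError(f"cacheTime must be at least {MIN_CACHE_MINUTES} minutes")
    query = urlencode(
        {
            "url": target_url,
            "params": ",".join(categories),
            "cacheTime": cache_minutes,
        },
        safe=",",
    )
    return f"{BASE_URL}?{query}"


cached_url = build_cached_url("https://example.com", ["metadata"], 60)
print(cached_url)
# https://pagesight.com/api/scrape?url=https%3A%2F%2Fexample.com&params=metadata&cacheTime=60
```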
Cache Revalidation (Pro Only)
Force a fresh scrape by bypassing the cache.
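The sketch below adds `revalidate=true` to a request. The helper name `build_revalidate_url` is hypothetical. The value is written as a literal lowercase string because Python's `str(True)` yields `"True"`, which differs from the documented `revalidate=true` form; whether the API accepts other spellings is not stated here.

```python
from urllib.parse import urlencode

BASE_URL = "https://pagesight.com/api/scrape"


def build_revalidate_url(
    target_url: str, categories: list[str], revalidate: bool = True
) -> str:
    """Build a request URL that asks for a fresh scrape when revalidate is set."""
    query = {"url": target_url, "params": ",".join(categories)}
    if revalidate:
        # Use the documented lowercase form rather than str(True) == "True".
        query["revalidate"] = "true"
    return f"{BASE_URL}?{urlencode(query, safe=',')}"


fresh_url = build_revalidate_url("https://example.com", ["metadata", "openGraph"])
print(fresh_url)
# https://pagesight.com/api/scrape?url=https%3A%2F%2Fexample.com&params=metadata,openGraph&revalidate=true
```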
Response Format
Successful responses return JSON with the following structure.
Response Headers
| Header | Description |
|---|---|
| X-Cache | Cache status: HIT or MISS |
| X-Cache-Expires-At | ISO timestamp when cache expires |
| X-RateLimit-Limit | Your rate limit (requests per minute) |
| X-RateLimit-Remaining | Remaining requests in current window |
| X-RateLimit-Reset | ISO timestamp when rate limit resets |
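The headers above can be read into typed values on the client. The following is a sketch, not an official client: the `CacheAndRateInfo` type and `parse_pagesight_headers` function are hypothetical names, and the sample header values are made up for illustration. It assumes the header names are supplied with the exact casing shown in the table.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Mapping, Optional


@dataclass
class CacheAndRateInfo:
    cache_hit: bool
    cache_expires_at: Optional[datetime]
    rate_limit: Optional[int]
    rate_remaining: Optional[int]
    rate_reset_at: Optional[datetime]


def _parse_iso(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO timestamp, accepting a trailing 'Z' for UTC."""
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def _parse_int(value: Optional[str]) -> Optional[int]:
    return int(value) if value is not None else None


def parse_pagesight_headers(headers: Mapping[str, str]) -> CacheAndRateInfo:
    """Collect the cache and rate-limit headers into one object."""
    return CacheAndRateInfo(
        cache_hit=headers.get("X-Cache", "").upper() == "HIT",
        cache_expires_at=_parse_iso(headers.get("X-Cache-Expires-At")),
        rate_limit=_parse_int(headers.get("X-RateLimit-Limit")),
        rate_remaining=_parse_int(headers.get("X-RateLimit-Remaining")),
        rate_reset_at=_parse_iso(headers.get("X-RateLimit-Reset")),
    )


# Illustrative values only; real responses will differ.
sample_headers = {
    "X-Cache": "HIT",
    "X-Cache-Expires-At": "2024-01-01T12:05:00Z",
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "59",
    "X-RateLimit-Reset": "2024-01-01T12:01:00Z",
}
info = parse_pagesight_headers(sample_headers)
print(info.cache_hit, info.rate_remaining)
```

A caller could compare `rate_remaining` against zero and wait until `rate_reset_at` before sending the next request.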