PageSight

Webpage Metadata Checker API

Extract metadata, Open Graph tags, images, and more from any webpage. Build rich link previews with ready-to-use components for React, Vue, Svelte, and vanilla JavaScript.

Try It Out

Test the API with any URL. No API key required for this demo playground.

API Playground
Enter a URL or choose from predefined links to test metadata extraction
Metadata, Open Graph, Tech Stack, Infrastructure

These are free-tier categories available in the playground. Free users have access to 7 categories total.

Powerful Features

Everything you need to extract and display webpage metadata

Rich Metadata Extraction
Extract metadata, Open Graph tags, Twitter Cards, images, and more from any webpage
Framework Components
Ready-to-use components for React, Vue, Svelte, and vanilla JavaScript. Copy-paste and use.
Smart Caching
Built-in caching reduces API calls. Premium users get custom cache duration and revalidation.
Secure API Access
Bearer token authentication with API key management. Rate limiting to ensure fair usage.
Analytics Dashboard
Track your API usage with detailed analytics. Monitor requests, errors, and performance metrics.
Premium Features
Custom cache duration, cache revalidation, higher rate limits, and access to all categories.

Extraction Categories

Extract comprehensive data from any webpage with our 22 specialized categories. Each category is optimized for specific use cases and provides detailed, structured responses.

How It Works
Overview of our extraction capabilities and common use cases

Overview

PageSight extracts comprehensive data from any webpage through intelligent parsing and analysis. Categories are processed efficiently to deliver fast, accurate results. All relative URLs are automatically resolved to absolute URLs, and responses are cached to reduce API calls and improve response times.
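The relative-to-absolute URL resolution described above can be illustrated with Python's standard library. This is a minimal sketch, not PageSight's own code; the function name resolve_urls and the sample URLs are assumptions made for the example.

```python
from urllib.parse import urljoin

def resolve_urls(page_url: str, hrefs: list[str]) -> list[str]:
    """Resolve each href against the URL of the page it was found on."""
    return [urljoin(page_url, href) for href in hrefs]

# Hypothetical page URL and links, for illustration only.
resolved = resolve_urls(
    "https://example.com/blog/post",
    ["/img/cover.png", "thumb.png", "https://cdn.example.com/a.png"],
)
```

Root-relative paths resolve against the host, bare file names resolve against the page's directory, and URLs that are already absolute pass through unchanged.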

Common Use Cases

  • Link Previews: Generate rich previews for social media, messaging apps, and content platforms
  • SEO Analysis: Audit metadata, structured data, and technical SEO elements
  • Content Analysis: Extract and analyze page content, headings, and structure
  • Technology Detection: Identify CMS, frameworks, and tech stack used on websites
  • Accessibility Auditing: Check WCAG compliance and accessibility features
  • Performance Monitoring: Measure page load times and Core Web Vitals
  • Security Auditing: Analyze security headers and HTTPS configuration

Categories

Free
Premium

Metadata

Free

Extracts basic HTML metadata including title, description, keywords, author, viewport settings, and all meta tags from the page head.

Implementation

Parses all <meta> tags, <title> element, canonical links, and HTML lang attributes using Cheerio HTML parsing.
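As a rough illustration of this kind of head parsing, the sketch below collects the title, meta tags, canonical link, and lang attribute. It uses Python's built-in html.parser rather than Cheerio, and the class name MetadataParser and the sample markup are assumptions, not part of the PageSight API.

```python
from html.parser import HTMLParser

class MetadataParser(HTMLParser):
    """Collects <title>, <meta>, canonical link, and <html lang>."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.lang = None
        self.canonical = None
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "html":
            self.lang = a.get("lang")
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Meta tags are keyed by either name= or property=.
            key = a.get("name") or a.get("property")
            if key and a.get("content") is not None:
                self.meta[key] = a["content"]
        elif tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

# Hypothetical sample document.
sample_html = """<html lang="en"><head>
<title>Example Page</title>
<meta name="description" content="A short summary.">
<meta name="viewport" content="width=device-width">
<link rel="canonical" href="https://example.com/page">
</head><body></body></html>"""

parser = MetadataParser()
parser.feed(sample_html)
```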

Usage

  • SEO analysis and optimization
  • Content management systems
  • Link preview generation
  • Page title and description extraction

Open Graph

Free

Extracts Open Graph protocol tags used for rich social media previews on Facebook, LinkedIn, and other platforms.

Implementation

Parses all <meta property="og:*"> tags and resolves relative URLs to absolute URLs for images and media.
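A minimal sketch of this step, assuming Python's standard library in place of the service's actual parser: it keeps every og:* property and resolves the URL-valued ones against the page URL. The class name OpenGraphParser and the set of properties treated as URLs are illustrative assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: these are the properties treated as URLs in this sketch.
URL_PROPERTIES = {"og:image", "og:url", "og:video", "og:audio"}

class OpenGraphParser(HTMLParser):
    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        prop = a.get("property") or ""
        content = a.get("content")
        if not prop.startswith("og:") or content is None:
            return
        if prop in URL_PROPERTIES:
            # Turn relative image/media paths into absolute URLs.
            content = urljoin(self.page_url, content)
        self.tags[prop] = content

og = OpenGraphParser("https://example.com/articles/1")
og.feed(
    '<meta property="og:title" content="Hello">'
    '<meta property="og:image" content="/static/cover.png">'
    '<meta name="description" content="ignored here">'
)
```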

Usage

  • Social media link previews
  • Content sharing platforms
  • Social media management tools
  • Rich link preview cards

Twitter Card

Free

Extracts Twitter Card metadata for rich Twitter link previews with images, titles, and descriptions.

Implementation

Parses all <meta name="twitter:*"> tags and handles Twitter Card types (summary, summary_large_image, app, player).

Usage

  • Twitter link previews
  • Social media automation
  • Content marketing tools
  • Twitter analytics

Favicon

Free

Extracts all favicon variants including standard favicons, Apple touch icons, and shortcut icons with their sizes and types.

Implementation

Searches for <link rel="icon">, <link rel="apple-touch-icon">, and <link rel="shortcut icon"> tags, and checks for default /favicon.ico.
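The lookup order above could be sketched as follows. This is an illustration using Python's standard library, with hypothetical names (FaviconParser, find_favicons); unlike the real service, the fallback here only proposes /favicon.ico and does not fetch it to confirm that it exists.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class FaviconParser(HTMLParser):
    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.icons = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        href = a.get("href")
        # "shortcut icon" splits into two tokens, one of which is "icon".
        if href and ("icon" in rel or "apple-touch-icon" in rel):
            self.icons.append({
                "rel": " ".join(rel),
                "href": urljoin(self.page_url, href),
                "sizes": a.get("sizes"),
                "type": a.get("type"),
            })

def find_favicons(page_url, html_text):
    parser = FaviconParser(page_url)
    parser.feed(html_text)
    if parser.icons:
        return parser.icons
    # No declared icons: fall back to the conventional default location.
    default = urljoin(page_url, "/favicon.ico")
    return [{"rel": "icon", "href": default, "sizes": None, "type": None}]

icons = find_favicons(
    "https://example.com/docs/",
    '<link rel="shortcut icon" href="/fav.ico">'
    '<link rel="apple-touch-icon" sizes="180x180" href="apple.png">',
)
fallback = find_favicons("https://example.com/docs/", "<html></html>")
```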

Usage

  • Favicon detection
  • Brand identification
  • Icon extraction for apps
  • Website analysis tools

Images

Premium

Extracts all images from the page including src, alt text, dimensions, loading attributes, and accessibility metrics.

Implementation

Finds all <img> tags, extracts attributes (src, alt, width, height, loading, srcset), and counts images with/without alt text for accessibility analysis.
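The attribute extraction and alt-text count might look roughly like this. It is a sketch using Python's standard library; the names ImageParser and alt_text_summary are assumptions, and the output shape is not the API's actual response format. Note that this version counts an empty alt attribute as missing, even though alt="" is valid for decorative images.

```python
from html.parser import HTMLParser

IMAGE_ATTRS = ("src", "alt", "width", "height", "loading", "srcset")

class ImageParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            a = dict(attrs)
            # Attributes that are absent come back as None.
            self.images.append({name: a.get(name) for name in IMAGE_ATTRS})

def alt_text_summary(images):
    with_alt = sum(1 for img in images if (img["alt"] or "").strip())
    return {
        "total": len(images),
        "with_alt": with_alt,
        "without_alt": len(images) - with_alt,
    }

img_parser = ImageParser()
img_parser.feed(
    '<img src="a.png" alt="Team photo" width="640" height="480">'
    '<img src="b.png" alt="">'
    '<img src="c.png" loading="lazy">'
)
summary = alt_text_summary(img_parser.images)
```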

Usage

  • Image gallery generation
  • Accessibility auditing
  • Content analysis
  • Image optimization tools

Robots.txt

Premium

Analyzes the robots.txt file to extract crawl rules, disallowed/allowed paths, sitemap locations, and crawl delays.

Implementation

Fetches /robots.txt, parses user-agent rules, disallow/allow directives, sitemap declarations, and crawl-delay settings.
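For illustration, the parsing half of this step (after the file has been fetched) could be sketched as below. The function name parse_robots and the returned dictionary shape are assumptions for the example, not the API's response format.

```python
def parse_robots(text):
    """Group allow/disallow/crawl-delay rules by user-agent; collect sitemaps."""
    groups, sitemaps = [], []
    current = None
    last_was_agent = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "sitemap":
            sitemaps.append(value)  # sitemaps are not tied to a group
            continue
        if field == "user-agent":
            # Consecutive User-agent lines share one group of rules.
            if not last_was_agent:
                current = {"user_agents": [], "allow": [],
                           "disallow": [], "crawl_delay": None}
                groups.append(current)
            current["user_agents"].append(value)
            last_was_agent = True
            continue
        last_was_agent = False
        if current is None:
            continue  # rule before any User-agent line
        if field in ("allow", "disallow"):
            current[field].append(value)
        elif field == "crawl-delay":
            try:
                current["crawl_delay"] = float(value)
            except ValueError:
                pass  # ignore non-numeric delays

    return {"groups": groups, "sitemaps": sitemaps}

robots = parse_robots("""\
User-agent: *
Disallow: /admin/   # private area
Allow: /admin/public/
Crawl-delay: 5

Sitemap: https://example.com/sitemap.xml
""")
```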

Usage

  • SEO auditing
  • Crawl planning
  • Website analysis
  • Search engine optimization

Sitemap

Premium

Extracts XML sitemap data including URLs, last modification dates, change frequencies, priorities, and sitemap indexes.

Implementation

Fetches sitemap.xml (or variants), parses XML structure, extracts URL entries, and handles sitemap index files.
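A minimal sketch of the XML parsing step, assuming the standard sitemaps.org namespace and Python's built-in ElementTree. The function name parse_sitemap, the return shape, and the sample data are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def parse_sitemap(xml_text):
    root = ET.fromstring(xml_text)
    kind = root.tag.rsplit("}", 1)[-1]  # strip the namespace prefix
    if kind == "sitemapindex":
        # An index file lists other sitemaps rather than page URLs.
        locs = [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS)]
        return {"type": "index", "sitemaps": locs}
    urls = []
    for url in root.findall("sm:url", NS):
        urls.append({
            "loc": url.findtext("sm:loc", default="", namespaces=NS).strip(),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        })
    return {"type": "urlset", "urls": urls}

# Hypothetical sample data.
urlset = parse_sitemap(
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc><lastmod>2024-01-15</lastmod>"
    "<priority>0.8</priority></url></urlset>"
)
index = parse_sitemap(
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>"
    "</sitemapindex>"
)
```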

Usage

  • SEO analysis
  • Site structure analysis
  • Crawl planning
  • Content discovery

Content

Premium

Analyzes page content structure including headings hierarchy, links (internal/external), text content, word count, and content elements.

Implementation

Extracts all headings (h1-h6), parses all links with internal/external classification, counts paragraphs/lists/blockquotes, and analyzes text content.

Usage

  • Content analysis
  • SEO content auditing
  • Link analysis
  • Content structure evaluation

Structured Data

Free

Extracts JSON-LD, Microdata, and RDFa structured data including schema types, properties, and semantic markup.

Implementation

Parses <script type="application/ld+json"> for JSON-LD, [itemtype] attributes for Microdata, and identifies all schema.org types present.
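The JSON-LD part of this step can be sketched with Python's standard library. The names JsonLdParser and schema_types are assumptions for the example; the sketch covers only JSON-LD (not Microdata or RDFa) and does not descend into @graph arrays.

```python
import json
from html.parser import HTMLParser

class JsonLdParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None  # list of text chunks while inside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the page
            self._buffer = None

def schema_types(blocks):
    """Return the sorted set of top-level @type values."""
    types = set()
    for block in blocks:
        for item in block if isinstance(block, list) else [block]:
            if isinstance(item, dict) and item.get("@type"):
                t = item["@type"]
                types.update(t if isinstance(t, list) else [t])
    return sorted(types)

ld = JsonLdParser()
ld.feed(
    '<script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Article", "headline": "Hello"}'
    "</script><script>var x = 1;</script>"
)
found_types = schema_types(ld.blocks)
```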

Usage

  • SEO rich snippets
  • Schema validation
  • Semantic web analysis
  • Search engine optimization

Technical

Premium

Analyzes technical aspects of the page including HTML version, doctype, element counts, and SEO technical indicators.

Implementation

Analyzes HTML structure, counts scripts/stylesheets/forms, checks for SEO elements (H1, meta description, canonical), and extracts technical metadata.

Usage

  • Technical SEO auditing
  • Website health checks
  • Performance analysis
  • Code quality assessment

Mobile Screenshot

Premium

Captures a full-page screenshot of the website rendered in a mobile viewport (375x667) for visual analysis.

Implementation

Uses Playwright with mobile viewport settings, waits for page load, and captures a full-page PNG screenshot.

Usage

  • Mobile responsiveness testing
  • Visual regression testing
  • Design validation
  • Mobile preview generation

Desktop Screenshot

Premium

Captures a full-page screenshot of the website rendered in a desktop viewport (1920x1080) for visual analysis.

Implementation

Uses Playwright with desktop viewport settings, waits for fonts and critical rendering, and captures a full-page PNG screenshot.

Usage

  • Desktop preview generation
  • Visual documentation
  • Design validation
  • Website archiving

Performance

Premium

Measures page performance metrics including load times, resource counts, and Core Web Vitals indicators.

Implementation

Uses Playwright performance API to measure navigation timing and resource timing, and calculates performance scores.

Usage

  • Performance monitoring
  • Speed optimization
  • Core Web Vitals tracking
  • Performance auditing

Accessibility

Premium

Analyzes accessibility features including ARIA attributes, semantic HTML usage, alt text coverage, and WCAG compliance indicators.

Implementation

Scans HTML for ARIA attributes, semantic elements, form labels, heading hierarchy, and accessibility best practices.

Usage

  • Accessibility auditing
  • WCAG compliance checking
  • A11y testing
  • Inclusive design validation

Security

Premium

Analyzes security headers, HTTPS configuration, content security policy, and security-related meta tags.

Implementation

Extracts HTTP security headers (CSP, HSTS, X-Frame-Options), checks HTTPS configuration, and analyzes security meta tags.
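As an illustration of the header check, the sketch below reports which of the three headers named above are present on a response. The function name audit_security_headers, the result shape, and the sample headers are assumptions; a real audit would also inspect the header values and the TLS configuration.

```python
SECURITY_HEADERS = (
    "content-security-policy",
    "strict-transport-security",
    "x-frame-options",
)

def audit_security_headers(url, headers):
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return {
        "https": url.lower().startswith("https://"),
        "present": {h: normalized[h] for h in SECURITY_HEADERS if h in normalized},
        "missing": [h for h in SECURITY_HEADERS if h not in normalized],
    }

# Hypothetical response headers.
report = audit_security_headers(
    "https://example.com/",
    {
        "Strict-Transport-Security": "max-age=31536000",
        "X-Frame-Options": "DENY",
        "Server": "nginx",
    },
)
```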

Usage

  • Security auditing
  • Header analysis
  • Security compliance
  • Vulnerability assessment

Social Media

Premium

Extracts social media links, sharing buttons, and social platform integrations from the page.

Implementation

Finds social media links (Facebook, Twitter, LinkedIn, etc.), detects sharing widgets, and extracts social meta tags.

Usage

  • Social media analysis
  • Sharing feature detection
  • Social integration auditing
  • Social media marketing

Analytics

Premium

Detects analytics and tracking scripts including Google Analytics, Facebook Pixel, and other tracking tools.

Implementation

Scans for analytics script tags, detects common analytics platforms (GA, GTM, Facebook Pixel), and extracts tracking IDs.

Usage

  • Analytics auditing
  • Privacy compliance
  • Tracking detection
  • Marketing tool analysis

Links

Premium

Comprehensive link analysis including internal/external links, nofollow attributes, anchor text, and link structure.

Implementation

Extracts all <a> tags, classifies internal vs external, analyzes rel attributes (nofollow, noopener), and extracts anchor text.
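One way to sketch this classification in Python's standard library is shown below. The class name LinkParser and the rule that a link is internal when its host matches the page's host are assumptions; the service may treat subdomains or non-HTTP schemes differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkParser(HTMLParser):
    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc.lower()
        self.links = []
        self._current = None  # link being read, until its closing </a>

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        a = dict(attrs)
        if not a.get("href"):
            return
        url = urljoin(self.page_url, a["href"])
        self._current = {
            "url": url,
            "internal": urlparse(url).netloc.lower() == self.host,
            "rel": (a.get("rel") or "").lower().split(),
            "text": "",
        }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data  # accumulate anchor text

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.links.append(self._current)
            self._current = None

link_parser = LinkParser("https://example.com/")
link_parser.feed(
    '<a href="/about">About us</a> '
    '<a href="https://other.example.org/" rel="nofollow noopener">Partner</a>'
)
```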

Usage

  • Link building analysis
  • SEO link auditing
  • Broken link detection
  • Link structure analysis

Forms

Premium

Extracts form elements including input fields, form actions, methods, validation attributes, and form structure.

Implementation

Finds all <form> elements, extracts inputs, selects, textareas, and form actions/methods, and analyzes form validation.

Usage

  • Form analysis
  • Contact form detection
  • Form validation auditing
  • User interaction analysis

Media

Premium

Extracts media elements including videos, audio files, embedded content, and media metadata.

Implementation

Finds <video>, <audio>, <iframe>, and embedded media elements, extracts sources, dimensions, and media attributes.

Usage

  • Media content analysis
  • Video/audio detection
  • Embedded content auditing
  • Media library generation

Tech Stack

Free

Identifies technologies used on the website including CMS, frameworks, libraries, and server technologies.

Implementation

Analyzes HTML comments, script sources, meta tags, and HTTP headers to detect technologies like WordPress, React, Vue, etc.

Usage

  • Technology detection
  • Competitive analysis
  • Tech stack auditing
  • Framework identification

Infrastructure

Free

Analyzes server infrastructure including hosting provider, CDN, DNS, SSL certificates, and server headers.

Implementation

Extracts HTTP headers (Server, X-Powered-By), analyzes DNS records, checks SSL certificates, and identifies hosting/CDN providers.

Usage

  • Infrastructure analysis
  • Hosting detection
  • CDN identification
  • Server configuration auditing

Note: Free tier users can access 7 categories, one per API call. Premium users have access to all 22 categories and can request up to 3 categories per API call. All responses are cached for 24 hours by default (Premium: customizable cache duration, minimum 5 minutes).
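A client could check these tier limits before sending a request. The sketch below is illustrative only: the category identifiers are hypothetical slugs derived from the category names on this page, and validate_request is not part of the PageSight API.

```python
# Hypothetical slugs for the 7 categories marked Free above.
FREE_CATEGORIES = {
    "metadata", "open-graph", "twitter-card", "favicon",
    "structured-data", "tech-stack", "infrastructure",
}
# Per-call category limits from the pricing tiers.
MAX_PER_CALL = {"free": 1, "premium": 3}

def validate_request(tier, categories):
    """Return a list of problems; an empty list means the request fits the tier."""
    problems = []
    limit = MAX_PER_CALL[tier]
    if len(categories) > limit:
        problems.append(f"{tier} tier allows at most {limit} categories per call")
    if tier == "free":
        locked = sorted(set(categories) - FREE_CATEGORIES)
        if locked:
            problems.append("premium-only categories: " + ", ".join(locked))
    return problems

ok_free = validate_request("free", ["metadata"])
bad_free = validate_request("free", ["metadata", "images"])
ok_premium = validate_request("premium", ["images", "links", "forms"])
bad_premium = validate_request("premium", ["images", "links", "forms", "media"])
```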

Simple Pricing

Choose the plan that fits your needs

Free
$0/month
  • 10 requests per minute
  • 1 category at a time
  • Analytics dashboard
  • 7 categories available
  • 1 day fixed cache
  • API key management
Premium
Popular
.../month
  • 20 requests per minute
  • Up to 3 categories at a time
  • All 22 categories available
  • Custom cache duration (min 5 min)
  • Cache revalidation control
  • Analytics dashboard
  • Priority support

Ready to Get Started?

Start extracting webpage metadata in minutes. No credit card required.
