Gluecrawl

Detail Pages

How Gluecrawl extracts additional data from pages linked within a listing.

Many websites organize data across two layers:

  • A listing page that shows a summary of many items (e.g., a product catalog, a directory of businesses, a list of job postings).
  • A detail page for each item that contains the full information (e.g., the individual product page with specs, reviews, and images).

Gluecrawl can follow the link from each listing item to its detail page and extract fields from both layers in a single job.

How It Works

During the processing step, Gluecrawl's AI analyzes the listing page and determines whether each item links to a detail page with additional information. If it does, Gluecrawl automatically:

  1. Identifies which fields are available only on the detail page
  2. Learns how to navigate to each item's detail page
  3. Includes those extra fields in the extraction

You don't need to configure any of this; Gluecrawl discovers and sets it up automatically.
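
The three steps above can be pictured with a small sketch. This is not Gluecrawl's implementation, which runs automatically and is not described here. The field names, URLs, and helper functions (detail_only_fields, resolve_detail_url, merge_item) are hypothetical and only illustrate the idea.

```python
from urllib.parse import urljoin


def detail_only_fields(listing_fields, detail_fields):
    """Step 1: fields that appear on the detail page but not on the listing."""
    listing = set(listing_fields)
    # Preserve the order in which the detail page presents its fields.
    return [f for f in detail_fields if f not in listing]


def resolve_detail_url(listing_url, href):
    """Step 2: turn an item's (possibly relative) link into an absolute URL."""
    return urljoin(listing_url, href)


def merge_item(listing_item, detail_item, extra_fields):
    """Step 3: add the detail-only fields to a copy of the listing record."""
    merged = dict(listing_item)
    for field in extra_fields:
        merged[field] = detail_item.get(field)
    return merged


# Hypothetical example data, not taken from a real site.
listing_url = "https://shop.example.com/catalog?page=2"
listing_item = {"name": "Desk Lamp", "price": "29.00", "href": "/products/desk-lamp"}
detail_item = {"name": "Desk Lamp", "price": "29.00", "specs": "E27, 40 W", "reviews": 12}

extra = detail_only_fields(listing_item.keys(), detail_item.keys())
detail_url = resolve_detail_url(listing_url, listing_item["href"])
record = merge_item(listing_item, detail_item, extra)
```

In this sketch the listing already provides name and price, so only specs and reviews are treated as detail-only fields and added to the final record.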

Credit Cost

Visiting detail pages requires additional page loads, so detail pages carry an extra credit cost on top of the listing page rate. The exact cost depends on the protection level in use. See Credits for the full rate table.
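
As a rough illustration of how the cost adds up, the sketch below estimates a job's total. The rates are placeholders, not Gluecrawl's actual pricing, and estimate_credits is a hypothetical helper. It also assumes that each item triggers exactly one detail page load, which may not match how Gluecrawl bills in every case. The real rates for each protection level are in the Credits table.

```python
def estimate_credits(listing_pages, items, listing_rate, detail_rate):
    """Estimate credits as listing page loads plus one detail page load per item.

    Rates are supplied by the caller; this function knows nothing about
    actual pricing.
    """
    if min(listing_pages, items, listing_rate, detail_rate) < 0:
        raise ValueError("counts and rates must be non-negative")
    return listing_pages * listing_rate + items * detail_rate


# Placeholder rates, for illustration only.
total = estimate_credits(listing_pages=3, items=60, listing_rate=1, detail_rate=2)
```

With these placeholder numbers, the 60 detail page loads account for most of the total, which is why the detail rate is worth checking before running a large job.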

Graceful Degradation

If a detail page cannot be reached (e.g., the link is broken or the page times out), Gluecrawl still returns the item with whatever data was available from the listing page. You won't lose listing data because of a detail page failure.
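
The fallback behavior can be sketched as follows. This is an illustrative pattern, not Gluecrawl's code: extract_item and fake_fetch are hypothetical names, the fetcher is a stand-in for a real page load, and the example data is made up.

```python
def extract_item(listing_item, fetch_detail):
    """Return the listing data, enriched with detail fields when reachable."""
    record = dict(listing_item)
    try:
        detail = fetch_detail(listing_item["href"])
    except (TimeoutError, ConnectionError):
        # Detail page unreachable: return the listing data unchanged.
        return record
    # Add detail fields without overwriting values from the listing.
    for key, value in detail.items():
        record.setdefault(key, value)
    return record


def fake_fetch(href):
    """Stand-in fetcher: one known page, everything else times out."""
    pages = {"/jobs/1": {"salary": "50k", "location": "Remote"}}
    if href not in pages:
        raise TimeoutError(f"timed out loading {href}")
    return pages[href]


ok = extract_item({"title": "Engineer", "href": "/jobs/1"}, fake_fetch)
broken = extract_item({"title": "Designer", "href": "/jobs/404"}, fake_fetch)
```

Here ok gains the salary and location fields, while broken keeps only its listing fields because its detail page could not be loaded.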

Hidden Content

Some detail pages hide information behind interactive elements, such as a "Show phone number" button or an expandable description. Gluecrawl detects these elements and interacts with them automatically before extracting the data.
