Gluecrawl

Protection Levels

How Gluecrawl adapts its scraping strategy to match a site's defenses.

Not all websites are equally easy to scrape. Some serve content as plain pages, while others use dynamic loading, bot detection, or other protective measures. Gluecrawl handles this automatically by choosing the right protection level for each site.

The Four Levels

| Level    | Typical use case                                             | Credits per page |
|----------|--------------------------------------------------------------|------------------|
| Light    | Straightforward sites that serve content directly            | 1                |
| Moderate | Sites that load content dynamically after the page loads     | 2                |
| Heavy    | Sites with anti-bot measures that block simple requests      | 3                |
| Strict   | Heavily protected sites requiring the most thorough approach | 6                |
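The per-page costs above make it easy to estimate what a crawl will consume. A minimal sketch, using only the level names and credit values from the table (the function itself is illustrative, not part of Gluecrawl's API):

```python
# Credit costs per page, taken from the table above.
CREDITS_PER_PAGE = {
    "light": 1,
    "moderate": 2,
    "heavy": 3,
    "strict": 6,
}

def estimate_credits(level: str, pages: int) -> int:
    """Estimate the credit cost of scraping `pages` pages at a given level."""
    return CREDITS_PER_PAGE[level] * pages

# A 100-page crawl at Heavy costs 3 credits per page:
print(estimate_credits("heavy", 100))  # 300
```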

Light

For sites that serve their content immediately when requested. This is the fastest and cheapest level.

Moderate

For sites that rely on dynamic content loading — the page appears blank at first and populates after scripts run. Gluecrawl uses a full browsing environment to wait for the content to appear.

Heavy

For sites that actively detect and block automated requests. Gluecrawl routes requests through external infrastructure to avoid being blocked.

Strict

For the most heavily guarded sites. Combines a full browsing environment with external routing for maximum reliability. This is the slowest and most expensive level, but works on virtually any site.

Automatic Detection

You never need to choose a protection level manually. During the processing step, Gluecrawl tests the site and selects the appropriate level. The result is saved with the job so subsequent runs use the same level without re-testing.
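The detect-once, save-with-the-job behavior can be sketched as follows. This is an assumption-laden illustration, not Gluecrawl's internal code: the `detect_level` function, the `probe` callback, and the `job` dict are all hypothetical names.

```python
# Levels ordered from cheapest to most thorough, per the table above.
LEVELS = ["light", "moderate", "heavy", "strict"]

def detect_level(job: dict, url: str, probe) -> str:
    """Return the job's protection level, testing the site only once.

    `probe(url, level)` stands in for Gluecrawl's internal test and should
    return True if scraping `url` succeeds at `level`.
    """
    if "protection_level" in job:            # saved from a previous run
        return job["protection_level"]       # skip re-testing
    for level in LEVELS:                     # try cheaper levels first
        if probe(url, level):
            job["protection_level"] = level  # persist for future runs
            return level
    job["protection_level"] = "strict"       # fall back to the strongest level
    return "strict"
```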

Mid-Run Escalation

If a site's defenses change between runs and the current level is no longer sufficient, Gluecrawl automatically escalates to a higher level mid-run. The job's protection level is updated for future runs.
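The escalation loop described above might look like the sketch below. Again, every name here is an assumption: `scrape` stands in for a single-page fetch, and `BlockedError` for whatever signal indicates the current level was defeated.

```python
# Levels ordered from cheapest to most thorough, per the table above.
LEVELS = ["light", "moderate", "heavy", "strict"]

class BlockedError(Exception):
    """Raised when the current protection level is no longer sufficient."""

def scrape_with_escalation(job: dict, urls: list, scrape) -> list:
    """Scrape each URL, escalating the job's level mid-run if blocked."""
    results = []
    for url in urls:
        while True:
            try:
                results.append(scrape(url, job["protection_level"]))
                break
            except BlockedError:
                idx = LEVELS.index(job["protection_level"])
                if idx + 1 >= len(LEVELS):
                    raise  # already at Strict; nothing higher to try
                # Escalate, and persist the new level for future runs.
                job["protection_level"] = LEVELS[idx + 1]
    return results
```

Because the escalated level is written back to the job, later pages in the same run and all subsequent runs start at the higher level rather than re-failing at the old one.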
