Protection Levels
How Gluecrawl adapts its scraping strategy to match a site's defenses.
Not all websites are equally easy to scrape. Some serve content as plain pages, while others use dynamic loading, bot detection, or other protective measures. Gluecrawl handles this automatically by choosing the right protection level for each site.
The Four Levels
| Level | Typical use case | Credits per page |
|---|---|---|
| Light | Straightforward sites that serve content directly | 1 |
| Moderate | Sites that load content dynamically after the page loads | 2 |
| Heavy | Sites with anti-bot measures that block simple requests | 3 |
| Strict | Heavily protected sites requiring the most thorough approach | 6 |
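Credit cost scales linearly with page count, so the level Gluecrawl picks directly determines what a crawl costs. As a quick illustration (the rates below come from the table above; the function itself is just a sketch, not part of any Gluecrawl SDK):

```python
# Per-page credit rates from the table above.
CREDITS_PER_PAGE = {"light": 1, "moderate": 2, "heavy": 3, "strict": 6}

def crawl_cost(level: str, pages: int) -> int:
    """Total credits consumed by a crawl of `pages` pages at `level`."""
    return CREDITS_PER_PAGE[level] * pages

print(crawl_cost("light", 500))   # 500
print(crawl_cost("strict", 500))  # 3000
```

The same 500-page crawl costs six times as much at Strict as at Light, which is why Gluecrawl always starts from the cheapest level that works.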
Light
For sites that serve their full content in the initial HTML response, with no client-side rendering required. This is the fastest and cheapest level.
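Conceptually, a Light-level fetch is nothing more than a single HTTP request. The sketch below uses the `requests` library purely as an illustration; Gluecrawl's internals aren't exposed, and the URL is a placeholder:

```python
import requests

# A Light-level fetch: one plain HTTP request, no browser, no proxy.
# The content we want is already present in the response body.
response = requests.get("https://example.com/article", timeout=10)
response.raise_for_status()
html = response.text
```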
Moderate
For sites that rely on dynamic content loading — the page appears blank at first and populates after scripts run. Gluecrawl uses a full browsing environment to wait for the content to appear.
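What Gluecrawl does at this level is roughly equivalent to the following Playwright sketch. It is illustrative only, and the URL and selector are placeholders:

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/app")
    # The initial HTML is nearly empty; wait until client-side
    # scripts have rendered the content we actually want.
    page.wait_for_selector("article.content")
    html = page.content()
    browser.close()
```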
Heavy
For sites that actively detect and block automated requests. Gluecrawl routes requests through external infrastructure to avoid being blocked.
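In practice, external routing means the request reaches the site from infrastructure the site hasn't flagged. A minimal sketch of the idea using `requests` (the proxy address is a placeholder; Gluecrawl manages its own routing and doesn't expose it):

```python
import requests

# Route the request through an external proxy so the target site
# sees the proxy's address rather than ours.
proxies = {
    "http": "http://proxy.example.net:8080",
    "https": "http://proxy.example.net:8080",
}
response = requests.get(
    "https://example.com/listing", proxies=proxies, timeout=20
)
html = response.text
```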
Strict
For the most heavily guarded sites. Combines a full browsing environment with external routing for maximum reliability. This is the slowest and most expensive level, but works on virtually any site.
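Strict is essentially the Moderate and Heavy techniques stacked: a real browser whose traffic is routed externally. A hedged sketch of that combination, with placeholder proxy, URL, and selector:

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Full browser (as in Moderate) routed through an
    # external proxy (as in Heavy).
    browser = p.chromium.launch(
        proxy={"server": "http://proxy.example.net:8080"}
    )
    page = browser.new_page()
    page.goto("https://example.com/protected")
    page.wait_for_selector("main")
    html = page.content()
    browser.close()
```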
Automatic Detection
You never need to choose a protection level manually. During the processing step, Gluecrawl tests the site and selects the appropriate level. The result is saved with the job so subsequent runs use the same level without re-testing.
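The detection step can be pictured as probing the site with each level in ascending order and caching the first one that succeeds. The sketch below is hypothetical: `fetch_at_level` is a stand-in for internals Gluecrawl doesn't expose, and the job attribute name is invented for illustration:

```python
from typing import Optional

LEVELS = ["light", "moderate", "heavy", "strict"]

def fetch_at_level(url: str, level: str) -> Optional[str]:
    """Stand-in for Gluecrawl's internal fetcher: returns page
    HTML on success, or None if this level was blocked."""
    ...  # hypothetical; the real logic lives inside Gluecrawl

def detect_level(url: str) -> str:
    """Probe the site with each level, cheapest first, and
    return the first one that gets content back."""
    for level in LEVELS:
        if fetch_at_level(url, level) is not None:
            return level
    raise RuntimeError(f"no protection level worked for {url}")

# The result is saved on the job so later runs skip the probe:
# job.protection_level = detect_level(job.start_url)
```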
Mid-Run Escalation
If a site's defenses change between runs and the current level is no longer sufficient, Gluecrawl automatically escalates to a higher level mid-run. The job's protection level is updated for future runs.
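One way to picture the escalation logic, reusing the hypothetical `LEVELS` and `fetch_at_level` from the sketch above (again, the helper and attribute names are placeholders, not Gluecrawl's API):

```python
def fetch_with_escalation(job, url: str) -> str:
    """Try the job's saved level first; on a block, step up
    through the remaining levels and persist whichever works."""
    start = LEVELS.index(job.protection_level)
    for level in LEVELS[start:]:
        html = fetch_at_level(url, level)
        if html is not None:
            job.protection_level = level  # persist for future runs
            return html
    raise RuntimeError(f"all protection levels failed for {url}")
```

Because the escalation starts from the job's saved level rather than from Light, a run never wastes credits re-trying levels that already failed during detection.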