About PixelHarborBot

PixelHarborBot is the web crawler used by Pixel Harbor to help website owners scan, analyze, and optimize their images.

What does PixelHarborBot do?

When a Pixel Harbor user requests a scan of their website, PixelHarborBot:

  • Visits pages on the website to discover images
  • Downloads images for analysis and optimization
  • Respects robots.txt rules and crawl-delay directives
  • Identifies itself clearly in the User-Agent header

PixelHarborBot is designed to be polite and respectful. It rate-limits requests, follows robots.txt rules, and only crawls sites that Pixel Harbor users have explicitly requested to scan.

Identifying PixelHarborBot

You can identify our crawler by its User-Agent string:

PixelHarborBot/1.0 (+https://pixelharbor.io/bot; image optimization service)
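If you filter logs or requests server-side, a simple check against the stable "PixelHarborBot" token is enough; the version number may change over time. A minimal Python sketch (the function name is ours, purely illustrative):

```python
def is_pixelharborbot(user_agent: str) -> bool:
    # Match on the stable product token, not the full string:
    # the version and comment portions may change between releases.
    return "PixelHarborBot" in (user_agent or "")

ua = "PixelHarborBot/1.0 (+https://pixelharbor.io/bot; image optimization service)"
print(is_pixelharborbot(ua))            # True
print(is_pixelharborbot("Mozilla/5.0")) # False
```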

Whitelisting

PixelHarborBot operates from AWS infrastructure in the EU (Frankfurt, eu-central-1). We recommend identifying our bot by User-Agent string rather than IP address. For detailed whitelisting instructions, see our identification guide.

How to allow PixelHarborBot

If you're a Pixel Harbor user and our crawler is being blocked by your site's security settings, you have several options:

Recommended: Allow by User-Agent

Add a rule to allow the PixelHarborBot user agent in your security settings. This is the most reliable method as our IP addresses are dynamic.

Platform-specific guides

See our identification guide for detailed instructions for:

  • Cloudflare: WAF custom rule for User-Agent
  • Nginx: User-Agent map configuration
  • Apache: .htaccess SetEnvIfNoCase directive
  • WordPress + Wordfence: User-Agent whitelist
  • AWS WAF: String match condition
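As one illustration of the Nginx approach, a User-Agent map can exempt PixelHarborBot from a generic bot block. This is only a sketch with variable names of our choosing; the identification guide has the authoritative configuration:

```nginx
# Sketch: regexes in a map are tried in order, so the
# PixelHarborBot entry must come before the generic pattern.
map $http_user_agent $deny_bot {
    default                 0;
    "~*PixelHarborBot"      0;  # matched first: allow our crawler
    "~*(crawler|spider)"    1;  # block other generic bots
}

server {
    location / {
        if ($deny_bot) {
            return 403;
        }
        # normal request handling...
    }
}
```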

How to block PixelHarborBot

If you don't want PixelHarborBot to crawl your site, you can block it using robots.txt:

User-agent: PixelHarborBot
Disallow: /

You can also block specific paths:

User-agent: PixelHarborBot
Disallow: /private/
Disallow: /admin/
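You can verify locally that rules like these behave as intended using Python's standard urllib.robotparser, which evaluates a robots.txt against a given user agent. A quick sketch mirroring the example above (the example.com URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

# robots.txt rules mirroring the example above
rules = """\
User-agent: PixelHarborBot
Disallow: /private/
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("PixelHarborBot", "https://example.com/gallery/photo.jpg"))  # True
print(rp.can_fetch("PixelHarborBot", "https://example.com/private/photo.jpg"))  # False
```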

Crawl rate

PixelHarborBot respects the Crawl-delay directive in robots.txt. By default, it waits at least 1 second between requests to the same domain.

User-agent: PixelHarborBot
Crawl-delay: 5
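Python's urllib.robotparser also exposes the Crawl-delay directive, so you can confirm the value is picked up for the PixelHarborBot user agent (a local sketch, no network access needed):

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: PixelHarborBot
Crawl-delay: 5
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.crawl_delay("PixelHarborBot"))  # 5
```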

Questions or concerns?

If you have questions about PixelHarborBot or need to report an issue, please contact us.

Quick Reference

  • User-Agent: PixelHarborBot/1.0
  • Region: EU (Frankfurt)
  • Default rate: 1 request/second per domain
  • Respects robots.txt: Yes
  • Whitelisting guide: /bot/whitelist