TECHNICAL

Is Your Website Blocking ChatGPT?
How to Check in 60 Seconds

Most business websites are accidentally invisible to AI. Your CDN, firewall, or hosting provider might be blocking AI crawlers right now.
By Faneros AI · March 2026 · 7 min read

The crawlers that power ChatGPT, Claude, Perplexity, and other AI platforms are being blocked from reading your content — not by a decision you made, but by default settings in your CDN, firewall, or security plugin that don't distinguish between malicious bots and AI crawlers.

The result: when someone asks AI for a recommendation in your industry, AI doesn't evaluate your business and choose someone else. It doesn't know you exist. You're not in the candidate pool. And you'd never know because your website looks perfectly fine to human visitors.

Here's how to check — and what to do about it.

The 60-Second Check

Open a new browser tab. Type your website URL followed by /robots.txt:

https://yoursite.com/robots.txt

You'll see one of three things:

Scenario 1: You see block rules for AI crawlers. Look for lines mentioning GPTBot, ClaudeBot, PerplexityBot, or other AI user agents. If a Disallow: / line appears beneath any of them, that AI platform is blocked from reading your site.

User-agent: GPTBot
Disallow: / # ← ChatGPT is BLOCKED

User-agent: ClaudeBot
Disallow: / # ← Claude is BLOCKED

Scenario 2: You see a blanket block. A User-agent: * line followed by Disallow: / blocks everything — including all AI crawlers. This is surprisingly common on sites where the robots.txt was auto-generated and never revisited.

Scenario 3: You get a 404 error. No robots.txt exists. This is actually less dangerous than having one with block rules, since most AI crawlers will attempt to access your site by default. But you're missing an opportunity to explicitly invite them in and direct them to your most important content.
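If you'd rather script the check than eyeball it, Python's standard library ships a robots.txt parser. This is a minimal sketch: it parses an example file matching Scenario 1 and reports which agents would be blocked (yoursite.com is a placeholder, and the embedded robots.txt is illustrative):

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt matching Scenario 1: GPTBot and ClaudeBot blocked.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# can_fetch() answers: may this user agent request this URL?
for agent in ["GPTBot", "ClaudeBot", "PerplexityBot"]:
    allowed = parser.can_fetch(agent, "https://yoursite.com/")
    print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

Swap ROBOTS_TXT for your live file (fetched with urllib.request, for example) to test your own site; an agent with no matching group and no User-agent: * group is allowed by default, which is why Scenario 3 is the least dangerous case.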

The 8 AI Crawlers That Matter

These are the user agents your robots.txt needs to allow. Block any one and you're invisible on that platform.

User Agent          Platform              Company     Why It Matters
GPTBot              ChatGPT               OpenAI      Largest AI platform by users (200M+ weekly)
OAI-SearchBot       ChatGPT Search        OpenAI      Powers ChatGPT's real-time search feature
ClaudeBot           Claude                Anthropic   Fastest-growing enterprise AI platform
PerplexityBot       Perplexity AI         Perplexity  Citation-heavy — links to sources in every response
Googlebot           Google AI Overviews   Google      AI answers integrated directly into Google Search
Bytespider          Doubao                ByteDance   Feeds the AI products of TikTok's parent company
Bingbot             Copilot               Microsoft   AI integrated into Windows, Office, Edge
Meta-ExternalAgent  Meta AI               Meta        AI search across Facebook, Instagram, WhatsApp

Block all of them and you don't exist in AI search. This is not an exaggeration. If every AI crawler gets blocked at your robots.txt or CDN, there is zero chance any AI platform recommends your business, regardless of how good your reviews are, how strong your reputation is, or how high you rank on Google.
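If your robots.txt contains block rules, the fix is to replace them with explicit allow groups. A minimal example covering the crawlers above (the sitemap URL is a placeholder for your own):

```
# Explicitly allow the major AI crawlers
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ClaudeBot
User-agent: PerplexityBot
User-agent: Bytespider
User-agent: Meta-ExternalAgent
Allow: /

# Googlebot and Bingbot are allowed by default when no rule targets them.

Sitemap: https://yoursite.com/sitemap.xml
```

The robots.txt standard (RFC 9309) lets multiple User-agent lines share one rule group; if you want to be safe with stricter parsers, repeat the Allow: / line under each agent separately.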

The 3 Biggest Culprits

1. Cloudflare Bot Fight Mode

Cloudflare's "Bot Fight Mode" was designed to protect websites from malicious automated traffic — scrapers, DDoS bots, credential stuffers. It's a good security feature for its intended purpose. The problem: it treats all unknown bots the same way, and AI crawlers are still "unknown" to many Cloudflare configurations.

When Bot Fight Mode is enabled, GPTBot receives a 403 Forbidden response or a JavaScript challenge page. ClaudeBot gets the same treatment. PerplexityBot is blocked. From the perspective of every major AI platform, your website returns an error instead of content. This is the single most common reason businesses are invisible to AI, and it affects millions of websites that use Cloudflare's free and Pro plans.

The fix: create custom WAF rules in Cloudflare that explicitly allow known AI crawler user agents. This takes about 10 minutes and doesn't reduce your protection against actual malicious bots.
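As a sketch, such a rule matches on the crawler's user agent using Cloudflare's rule-expression language. The http.user_agent field is standard, but confirm the available actions (Skip, Allow) and how your plan handles Bot Fight Mode exceptions in your own dashboard, since behavior varies by plan:

```
(http.user_agent contains "GPTBot")
or (http.user_agent contains "OAI-SearchBot")
or (http.user_agent contains "ClaudeBot")
or (http.user_agent contains "PerplexityBot")
```

You can spot-check the result from a terminal with curl -I -A "GPTBot" https://yoursite.com/ and confirm that the response is a 200 rather than a 403 or a challenge page.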

2. WordPress Security Plugins

Wordfence, Sucuri, iThemes Security, and similar WordPress security plugins protect your site by blocking suspicious traffic. Since AI crawlers are relatively new, they often get caught in these filters — blocked as "unknown bots" or flagged as potential scrapers. The plugins are doing their job; the problem is that their definition of "suspicious" hasn't been updated to account for legitimate AI crawlers.

Check your security plugin's bot blocking settings. Look for options to whitelist specific user agents. Add GPTBot, ClaudeBot, and PerplexityBot to your allow list.
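One way to confirm the whitelist worked is to scan your server access log for AI crawler hits and their status codes. A minimal sketch with hypothetical entries (the log lines, IPs, and version strings are illustrative, not real crawler traffic):

```python
import re

# Hypothetical entries in Apache/Nginx "combined" log format.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Mar/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 5123 '
    '"-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '203.0.113.8 - - [01/Mar/2026:10:01:00 +0000] "GET /about HTTP/1.1" 403 234 '
    '"-" "Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"',
]

AI_AGENTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]
STATUS_RE = re.compile(r'" (\d{3}) ')  # status code follows the quoted request

for line in SAMPLE_LOG:
    for agent in AI_AGENTS:
        if agent in line:
            status = STATUS_RE.search(line).group(1)
            verdict = "OK" if status.startswith("2") else "STILL BLOCKED"
            print(f"{agent}: HTTP {status} -> {verdict}")
```

A crawler that appears in the log with 403s or challenge responses is still being blocked somewhere between your CDN and your origin, even if robots.txt allows it.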

3. Legacy robots.txt Files

Many websites have robots.txt files that were created years ago — often auto-generated by a CMS or website builder — and never updated. These files frequently contain overly broad block rules that were appropriate in 2018 but are actively harmful in 2026. A single User-agent: * group with Disallow: /, intended to keep test environments or staging sites out of Google's index, can make your entire site invisible to every AI platform.

What Happens When You're Blocked

You're not in the candidate pool
AI doesn't evaluate you and choose someone else. It doesn't know you exist. You're absent from the conversation entirely.

Compounding disadvantage
Visible businesses get recommended, generating engagement signals that make AI recommend them more. The gap widens every day.

The compounding effect deserves emphasis. AI platforms learn from patterns. When a business is consistently visible and generates positive engagement, AI develops higher confidence in recommending it. The businesses that are visible early build a feedback loop that late entrants have to fight against. Being blocked for six months while your competitors are visible for six months doesn't just cost you six months of leads — it costs you the compounding advantage your competitors built during that time.

Beyond Bot Access: The Full Picture

Unblocking AI crawlers is necessary but not sufficient. It's the first step — the gateway factor without which nothing else matters. But even after fixing robots.txt and CDN settings, AI still needs structured data to understand your business, security signals to trust your site, fast response times to complete its crawl, and properly structured content to extract and cite.

That's why Faneros runs a comprehensive 9-point audit — not just a robots.txt check. We scan crawler access for all 8 user agents, structured data quality and GEO-relevant schema fields, llms.txt presence, sitemap configuration, security header coverage, page speed and TTFB, JavaScript rendering dependencies, content structure, and cross-platform AI response testing.

What to Do Right Now

1. Check your robots.txt

The 60-second test above. Look for Disallow rules that block AI crawlers. If you find them, you've identified the single highest-impact fix available to you.

2. Check your CDN/security settings

If you use Cloudflare, check Bot Fight Mode and your WAF rules. If you use a WordPress security plugin, check its bot blocking configuration. Create explicit allow rules for GPTBot, ClaudeBot, PerplexityBot, and other AI crawler user agents.

3. Run a free Faneros scan

Get the complete picture — including issues you can't check manually, like whether AI platforms actually return content when they request your pages, whether your schema is sufficient for AI recommendation decisions, and how you compare to competitors on AI visibility.

See Your AI Visibility Score

Faneros scans 7 AI platforms in 60 seconds. Find out if ChatGPT, Claude, and Perplexity can see your business.
