How to Block OpenAI GPT from Copying Your Content?

Block OpenAI IPs with Web Application Firewalls

A WAF can identify and block traffic by IP address, including the ranges OpenAI's crawlers use. Adding the following IP ranges to your block list will stop OpenAI crawlers from reading (and copying) your content.

{
  "prefixes": [
    {
      "ipv4Prefix": "52.230.152.0/24"
    },
    {
      "ipv4Prefix": "52.233.106.0/24"
    }
  ]
}
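If your WAF or application layer lets you hook request filtering with custom code, the check itself is simple. Below is a minimal Python sketch that tests a client IP against the ranges above using the standard ipaddress module; the function names load_blocked_networks and is_blocked are illustrative, not part of any particular WAF API.

# Sketch: check whether an incoming request IP falls inside the blocked ranges.
import ipaddress
import json

GPTBOT_RANGES_JSON = """
{
  "prefixes": [
    {"ipv4Prefix": "52.230.152.0/24"},
    {"ipv4Prefix": "52.233.106.0/24"}
  ]
}
"""

def load_blocked_networks(raw_json):
    """Parse the prefix list into ip_network objects."""
    prefixes = json.loads(raw_json)["prefixes"]
    return [ipaddress.ip_network(p["ipv4Prefix"]) for p in prefixes]

def is_blocked(client_ip, networks):
    """Return True if the client IP belongs to a blocked range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)

if __name__ == "__main__":
    networks = load_blocked_networks(GPTBOT_RANGES_JSON)
    print(is_blocked("52.230.152.17", networks))  # True  -> deny the request
    print(is_blocked("203.0.113.5", networks))    # False -> allow the request

Keep in mind that published crawler IP ranges can change over time, so it is safer to reload the list periodically than to hard-code it permanently.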

Block OpenAI in your robots.txt File

While a robots.txt file is not a security measure per se, it tells well-behaved web crawlers which parts of your site they should not access. Add a directive disallowing OpenAI's GPTBot user agent from accessing your site:

User-agent: GPTBot
Disallow: /
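After deploying the file, it is worth confirming that the directive is actually served and parsed the way you expect. The following is a short Python sketch using the standard library's urllib.robotparser; https://example.com is a placeholder for your own domain.

# Sketch: verify that a deployed robots.txt actually disallows GPTBot.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.set_url("https://example.com/robots.txt")  # placeholder domain
parser.read()  # fetch and parse the live robots.txt

# can_fetch returns False when the given user agent is disallowed for the path
print(parser.can_fetch("GPTBot", "https://example.com/"))     # expect False
print(parser.can_fetch("Googlebot", "https://example.com/"))  # expect True, unless also blocked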

Why would you block OpenAI from accessing your website?

Protect your original content

One of the primary concerns for website owners is content protection. Your website content, whether it's articles, blogs, or proprietary information, represents valuable intellectual property. Allowing AI crawlers unrestricted access can lead to your content being repurposed without your consent, potentially diluting its uniqueness and value. By blocking these crawlers, you maintain control over how and where your content is used.

SEO ranking

The second concern is search engine optimization (SEO). AI-generated content, while useful in many contexts, can flood the internet with repetitive or low-quality material. If AI models like GPT train on your content and then generate similar content, your SEO rankings could suffer from a plethora of competing pages that dilute the originality of your site's content.


👉 You might also be interested in: LLMs Agents with OpenAI's GPT-4 Can Now Hack Your Website - How to Protect Your Website?