ai.robots.txt

This is an open list of web crawlers associated with AI companies and the training of LLMs, published so that site owners can block them. We encourage you to contribute to the list and to implement it on your own site. See table-of-bot-metrics.md for information about the listed crawlers, and FAQ.md for the FAQ.

A number of these crawlers have been sourced from Dark Visitors and we appreciate the ongoing effort they put in to track these crawlers.

If you'd like to add information about a crawler to the list, please make a pull request with the bot name added to robots.txt, ai.txt, and any relevant details in table-of-bot-metrics.md to help people understand what's crawling.

Usage

This repository provides the following files:

robots.txt
.htaccess

robots.txt implements the Robots Exclusion Protocol (RFC 9309).
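As a rough illustration of how a compliant crawler would interpret such a file, the sketch below parses a small robots.txt fragment with Python's standard `urllib.robotparser` module. The bot names `ExampleAIBot` and `AnotherAIBot` are hypothetical placeholders, not entries taken from the list, and the fragment only approximates the style of the generated file.

```python
from urllib.robotparser import RobotFileParser

# Illustrative fragment only. "ExampleAIBot" and "AnotherAIBot" are
# hypothetical names, not entries from the real list.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
User-agent: AnotherAIBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "AnotherAIBot", "SomeBrowser"):
        allowed = is_allowed(SAMPLE_ROBOTS_TXT, agent, "https://example.com/page")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Keep in mind that robots.txt is advisory: it only affects crawlers that choose to honor it.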

.htaccess may be used to configure web servers such as Apache httpd to return an error page when one of the listed AI crawlers sends a request to the web server. Note that, as stated in the httpd documentation, more performant methods than an .htaccess file exist.
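The idea behind server-level blocking is to match each request's User-Agent header against the listed crawler names and return an error status on a match. The sketch below shows that logic in Python rather than Apache configuration. The bot names, the 403 status, and the case-insensitive substring match are all assumptions made for illustration and may differ from what the generated .htaccess does.

```python
import re

# Hypothetical crawler names, used only for illustration.
BLOCKED_AGENTS = ["ExampleAIBot", "AnotherAIBot"]


def build_pattern(names):
    """Compile one case-insensitive pattern that matches any listed name."""
    if not names:
        # An empty alternation would match every request, so refuse it.
        raise ValueError("at least one crawler name is required")
    return re.compile("|".join(re.escape(name) for name in names), re.IGNORECASE)


def status_for(user_agent, pattern):
    """Return 403 for a listed crawler and 200 otherwise (assumed statuses)."""
    return 403 if pattern.search(user_agent or "") else 200


if __name__ == "__main__":
    pattern = build_pattern(BLOCKED_AGENTS)
    print(status_for("Mozilla/5.0 (compatible; ExampleAIBot/1.0)", pattern))
    print(status_for("Mozilla/5.0 (X11; Linux x86_64)", pattern))
```

Unlike robots.txt, this kind of check does not rely on the crawler's cooperation, although a crawler that sends a false User-Agent header will still get through.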

Contributing

Make all updates in robots.json. A GitHub Action then regenerates robots.txt, table-of-bot-metrics.md, and .htaccess from that file.
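To give a sense of what that generation step involves, here is a minimal sketch that turns a JSON document into robots.txt text. It is not the repository's actual script. The schema shown (an object keyed by bot name, with an `operator` field) and the bot names are assumptions for illustration only.

```python
import json

# Hypothetical input. The real robots.json schema may differ.
SAMPLE_ROBOTS_JSON = """
{
  "ExampleAIBot": {"operator": "Example Corp"},
  "AnotherAIBot": {"operator": "Another Inc"}
}
"""


def render_robots_txt(robots_json: str) -> str:
    """Render one robots.txt group that disallows every listed bot."""
    bots = json.loads(robots_json)
    if not bots:
        # A Disallow rule without a User-agent line would be malformed.
        return ""
    lines = [f"User-agent: {name}" for name in bots]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_robots_txt(SAMPLE_ROBOTS_JSON), end="")
```

Keeping a single source file and generating the rest means the outputs cannot drift apart when a crawler is added or removed.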

Subscribe to updates

You can subscribe to list updates via RSS/Atom with the releases feed:

https://github.com/ai-robots-txt/ai.robots.txt/releases.atom

You can subscribe with Feedly, Inoreader, The Old Reader, Feedbin, or any other reader app.

Alternatively, you can subscribe to new releases with your GitHub account: click the ⬇️ on the "Watch" button at the top of this page, click "Custom", and select "Releases".
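If you would rather poll the releases feed from a script than use a reader app, something like the following sketch could work. It uses only the Python standard library. The sample feed content is made up for illustration, and the assumption that each entry carries a `title` and an `updated` element follows the Atom format in general rather than anything verified against the live feed.

```python
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://github.com/ai-robots-txt/ai.robots.txt/releases.atom"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample data so the parser can be exercised offline.
SAMPLE_FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample releases</title>
  <entry>
    <title>sample-release-2</title>
    <updated>2000-01-02T00:00:00Z</updated>
  </entry>
  <entry>
    <title>sample-release-1</title>
    <updated>2000-01-01T00:00:00Z</updated>
  </entry>
</feed>
"""


def fetch_feed(url: str = FEED_URL) -> str:
    """Download the feed and return it as text (requires network access)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def list_releases(feed_xml: str):
    """Return (title, updated) pairs for every entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    return [
        (
            entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
        )
        for entry in root.findall("atom:entry", ATOM_NS)
    ]


if __name__ == "__main__":
    for title, updated in list_releases(SAMPLE_FEED):
        print(updated, title)
```

To check the live feed, pass the result of `fetch_feed()` to `list_releases()` instead of the sample.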

Report abusive crawlers

If you use Cloudflare's hard block alongside this list, you can report abusive crawlers that don't respect robots.txt here.

Additional resources

Blocking Bots with Nginx by Robb Knight
Blockin' bots. by Ethan Marcotte
Blocking Bots With 11ty And Apache by fLaMEd fury
Blockin' bots on Netlify by Jeremia Kimelman
Blocking AI web crawlers by Glyn Normington
Block AI Bots from Crawling Websites Using Robots.txt by Jonathan Gillham, Originality.AI
