Robots.txt Common Mistakes
Robots.txt is simple enough to edit by hand and risky enough to break a site when edited carelessly. Most problems come from blocking too much, forgetting the sitemap, or assuming robots.txt removes indexed URLs by itself.
Mistake 1: Blocking important folders
If you block folders that contain assets (CSS, JavaScript, images) or public pages, search engines may fail to render pages correctly or may never discover them at all.
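A minimal sketch of the problem and a safer alternative (the paths here are illustrative, not prescriptive):

```
# Risky: blocking the folder that serves CSS and JavaScript
# can prevent crawlers from rendering pages correctly.
User-agent: *
Disallow: /assets/

# Safer: block only paths that truly should not be crawled,
# such as internal search result pages.
User-agent: *
Disallow: /search/
```

Before blocking a folder, check what actually lives under it; a single Disallow line can cover far more URLs than intended.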
Mistake 2: Using robots.txt to remove indexed pages
Blocking crawl access does not guarantee removal from search results; a blocked URL can remain indexed based on links from other pages. Use a noindex directive or a proper removal workflow (such as a removal tool in the search engine's webmaster console) when the page should disappear. Note that crawlers can only see a noindex directive on pages they are allowed to fetch, so the URL must not also be blocked in robots.txt.
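A sketch of the two standard ways to signal noindex, assuming the URL remains crawlable:

```
<!-- In the page's HTML <head>: tells crawlers to drop the URL from the index -->
<meta name="robots" content="noindex">
```

```
# Or as an HTTP response header, useful for non-HTML files such as PDFs:
X-Robots-Tag: noindex
```

Once the page has been recrawled and dropped from the index, you can then block it in robots.txt if you also want to stop future crawling.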
Mistake 3: Forgetting the sitemap line
A sitemap directive is optional, but it helps search engines find the URLs you do want crawled and indexed.
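A minimal robots.txt that allows full crawling and declares a sitemap (example.com is a placeholder domain):

```
User-agent: *
Disallow:

# The Sitemap directive takes an absolute URL and may appear
# anywhere in the file, independent of User-agent groups.
Sitemap: https://www.example.com/sitemap.xml
```

You can list multiple Sitemap lines if the site splits its URLs across several sitemap files.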