Robots.txt Common Mistakes
Robots.txt is simple enough to edit by hand and risky enough to break a site when edited carelessly. Most problems come from blocking too much, forgetting the sitemap, or assuming robots.txt removes indexed URLs by itself.
Mistake 1: Blocking important folders
If you block folders that contain assets (CSS, JavaScript, images) or public pages, search engines may fail to render pages correctly or may never discover them at all.
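A minimal sketch of the problem and a safer alternative (the paths here are illustrative, not prescriptive):

```
# Risky: blocking the folder that serves CSS and JavaScript
# can prevent crawlers from rendering pages correctly.
User-agent: *
Disallow: /assets/

# Safer: block only paths that truly should not be crawled,
# such as internal search result pages.
User-agent: *
Disallow: /search/
```

Before blocking a folder, check what actually lives under it; a single Disallow line can cover far more URLs than intended.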
Mistake 2: Using robots.txt to remove indexed pages
Blocking crawl access does not guarantee removal from search results; a blocked URL can remain indexed based on links from other pages. Use a noindex directive or a proper removal workflow (such as a removal tool in the search engine's webmaster console) when the page should disappear. Note that crawlers can only see a noindex directive on pages they are allowed to fetch, so the URL must not also be blocked in robots.txt.
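A sketch of the two standard ways to signal noindex, assuming the URL remains crawlable:

```
<!-- In the page's HTML <head>: tells crawlers to drop the URL from the index -->
<meta name="robots" content="noindex">
```

```
# Or as an HTTP response header, useful for non-HTML files such as PDFs:
X-Robots-Tag: noindex
```

Once the page has been recrawled and dropped from the index, you can then block it in robots.txt if you also want to stop future crawling.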
Mistake 3: Forgetting the sitemap line
A sitemap directive is optional, but it helps search engines find the URLs you do want crawled and indexed.
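A minimal robots.txt that allows full crawling and declares a sitemap (example.com is a placeholder domain):

```
User-agent: *
Disallow:

# The Sitemap directive takes an absolute URL and may appear
# anywhere in the file, independent of User-agent groups.
Sitemap: https://www.example.com/sitemap.xml
```

You can list multiple Sitemap lines if the site splits its URLs across several sitemap files.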