How to Create a Robots.txt File (Step-by-Step)
Quick answer: create a plain text file named robots.txt, add a User-agent: * line, your Allow/Disallow rules, and a Sitemap: line, then upload it to your domain's root so it loads at yourdomain.com/robots.txt. Or skip the syntax entirely with our free robots.txt generator.
Step 1: Understand what robots.txt does (and doesn't do)
Robots.txt tells crawlers which URLs they may request. It manages crawl traffic; it does not hide pages from search results. A disallowed page can still be indexed, typically as a bare URL with no description, if other sites link to it. To keep a page out of Google, use a noindex meta tag (on a page crawlers are allowed to fetch) or password protection instead. Confusing crawl control with index control is one of the most common robots.txt mistakes.
Step 2: Write the file
A minimal, correct robots.txt for most sites is just three lines:
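A minimal sketch of those three lines, using the placeholder domain from this guide and assuming the sitemap lives at /sitemap.xml (an assumption; substitute your real sitemap URL). Lines starting with # are comments and are ignored by crawlers:

```text
# Applies to every crawler
User-agent: *
# Nothing is blocked; the whole site may be crawled
Allow: /
# Full URL of the XML sitemap (path assumed for illustration)
Sitemap: https://yourdomain.com/sitemap.xml
```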
To block specific sections, add Disallow lines with the path prefix. A typical small-business or blog setup:
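One way such a setup might look. The paths below are hypothetical and chosen only for illustration (a WordPress-style admin area, a cart, a checkout, and internal search); replace them with the sections your own site actually has:

```text
User-agent: *
# Hypothetical paths; each Disallow matches every URL that starts with the prefix
Disallow: /wp-admin/
# Re-open one file inside the blocked folder
Allow: /wp-admin/admin-ajax.php
Disallow: /cart/
Disallow: /checkout/
Disallow: /search/

# Sitemap location is an assumption; use your real sitemap URL
Sitemap: https://yourdomain.com/sitemap.xml
```

Everything not matched by a Disallow line stays crawlable, so there is no need to list the pages you want crawled.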
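Path values are case-sensitive (/Blog/ and /blog/ are different rules) and must start with /; directive names such as User-agent and Disallow are not case-sensitive. If you'd rather not hand-write syntax, our robots.txt generator builds the file from checkboxes and gives you a download.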
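A short sketch of how case and the leading slash affect matching, using a hypothetical /private/ folder:

```text
User-agent: *
# Matches /private/, /private/report.html, and any other URL starting with /private/
Disallow: /private/
# A separate rule: /Private/ and /private/ are different paths
Disallow: /Private/
# Not valid, because the path is missing its leading slash:
# Disallow: private/
```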
Step 3 (optional): Block AI training crawlers
To keep your content out of AI training datasets, add a group per AI crawler. Compliance is voluntary, so this only stops crawlers that honor robots.txt. The most common are GPTBot (OpenAI), ClaudeBot (Anthropic), Google-Extended (Gemini training; blocking it does not affect your Google Search ranking), CCBot (Common Crawl), and PerplexityBot:
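A sketch of the five groups named above, each blocking the whole site for that crawler. These groups sit alongside your existing User-agent: * group and do not change the rules for any crawler not listed:

```text
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /
```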
Consider the trade-off: blocking AI crawlers also reduces your chances of being cited in AI answer engines, which are a growing traffic source. The generator adds all five with one checkbox if you decide to block.
Step 4: Upload to your domain root
The file must live at the root: https://yourdomain.com/robots.txt, lowercase, exactly that name. Locations like /blog/robots.txt are ignored, and each subdomain needs its own file. On most hosting, upload via the file manager or FTP into the public_html (or equivalent) folder. On WordPress, use your SEO plugin's robots.txt editor instead: Yoast has one under Tools, and Rank Math has one under General Settings.
Step 5: Test it
Open yourdomain.com/robots.txt in a browser first: if it loads there as plain text, crawlers can fetch it too. Then verify in Google Search Console under Settings → robots.txt, which shows when Google last fetched the file and flags syntax errors. Google generally caches robots.txt for up to 24 hours, so changes usually take effect within a day.
Mistakes that can wreck your SEO
| Mistake | Consequence |
|---|---|
| Disallow: / under User-agent: * | Blocks your entire site from crawling; rankings decay over weeks |
| Blocking CSS/JS folders | Google can't render your pages properly, hurting mobile evaluation |
| Using robots.txt to "hide" private pages | The file is public — you're publishing a map of what you want hidden |
| Uppercase filename (Robots.TXT) | Crawlers request lowercase /robots.txt, so on a case-sensitive server the file is never found and none of its rules apply |
| Blocking a page that has a noindex tag | Crawlers can't reach the page to see the noindex, so it may stay indexed |
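The first mistake in the table often comes down to a single character. A sketch contrasting the two forms, with the harmful one commented out:

```text
# Blocks the whole site for every crawler (usually a mistake):
# User-agent: *
# Disallow: /

# Allows the whole site: an empty Disallow value matches nothing
User-agent: *
Disallow:
```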
Frequently asked questions
Do I need a robots.txt file for my website?
Strictly no: crawlers assume full access without one. But most sites benefit, because the file declares your sitemap and keeps compliant crawlers out of admin or cart pages. Small sites need only three lines.
How do I create a robots.txt file in WordPress?
Use your SEO plugin's editor (Yoast and AIOSEO under Tools, Rank Math under General Settings), or upload a physical file to your site root via FTP. A physical file overrides WordPress's virtual one.
What should a basic robots.txt contain?
A User-agent line, your Allow/Disallow rules, and a Sitemap: line with your XML sitemap's full URL.
How do I test if my robots.txt is working?
Open yourdomain.com/robots.txt in a browser, then check Google Search Console → Settings → robots.txt for fetch status and errors.