What Is robots.txt in WordPress and How to Edit It (2026 Guide)

Posted: Apr 11, 2026 | SEO

If you’ve ever wondered why certain pages on your site aren’t showing up in Google — or why your crawl budget seems to get eaten up on pages that don’t matter — your robots.txt file is likely at the center of it. It’s one of the smallest files on your entire website, but it carries enormous weight in how search engines, AI bots, and other crawlers interact with your content.

In this guide, I’ll walk you through exactly what robots.txt is in WordPress, how WordPress handles it by default, how to edit it the right way, and the modern rules you need to follow in 2026 — including how to handle AI crawlers like GPTBot and ClaudeBot.

robots.txt in simple terms:

  • Controls crawling
  • Doesn’t directly control indexing
  • Helps optimize crawl efficiency

What Is robots.txt?

robots.txt is a plain text file that lives in the root directory of your website — typically accessible at yourdomain.com/robots.txt. It follows a protocol called the Robots Exclusion Protocol (REP), which tells web crawlers which parts of your site they’re allowed to visit and which they should skip.

Think of it like a set of house rules you post at your front door. Most well-behaved bots — Googlebot, Bingbot, and most major AI crawlers — will read and respect those rules before they ever touch a single page on your site.

It’s important to understand one thing right from the start: robots.txt controls crawling, not indexing. If you block a page using robots.txt, Google won’t crawl it, but if another website links to that page, Google may still index the URL and show it in search results. If you want a page completely out of Google’s index, you need a noindex meta tag, not a robots.txt rule, and the page has to stay crawlable so Google can actually see that tag. This is one of the most common and costly misunderstandings I see.
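
If you want to confirm that a page really carries a noindex directive, here’s a minimal Python sketch that scans a page’s HTML for a robots meta tag. The names RobotsMetaFinder and has_noindex are my own for illustration, and the sketch only looks at meta tags, not the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots"> or <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # content looks like "noindex, follow"; split it into single directives
            self.directives += [d.strip().lower() for d in attr.get("content", "").split(",")]


def has_noindex(html: str) -> bool:
    """Return True if any robots meta tag in the HTML contains 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives
```

Feed it the HTML source of a page you expect to be excluded from search. If it returns False, the page has no noindex meta tag, whatever your robots.txt says.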

How WordPress Handles robots.txt by Default

Here’s something most beginners don’t realize: WordPress doesn’t create a physical robots.txt file on your server. Instead, it generates a virtual one on the fly whenever a crawler requests it.

The default virtual robots.txt that WordPress generates looks like this:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://yourdomain.com/wp-sitemap.xml

Let me break this down:

  • User-agent: * applies the rules that follow to all crawlers.
  • Disallow: /wp-admin/ blocks bots from crawling your admin dashboard. Crawlers have no business there, so it’s a sensible default.
  • Allow: /wp-admin/admin-ajax.php grants access to this specific file even though the broader /wp-admin/ folder is blocked. This file handles front-end AJAX requests used by many themes and plugins, so bots may need it to render your pages.
  • Sitemap: points crawlers directly to your XML sitemap so they can discover your content efficiently. Since WordPress 5.5, the built-in core sitemap lives at /wp-sitemap.xml. SEO plugins usually replace it with their own sitemap URL.
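
To see why the Allow line wins over the broader Disallow, here’s a small Python sketch of the longest-match rule described in the REP standard (RFC 9309) and followed by Google: the most specific matching path decides. It’s a simplification that assumes plain path prefixes and ignores the * and $ wildcards. The names is_allowed and DEFAULT_RULES are made up for this example.

```python
def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Longest matching rule wins; on a tie, Allow wins; no match means allowed."""
    best_len, allowed = -1, True
    for directive, prefix in rules:
        # An empty Disallow value matches nothing, so skip empty prefixes.
        if prefix and path.startswith(prefix):
            is_allow = directive.lower() == "allow"
            if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
                best_len, allowed = len(prefix), is_allow
    return allowed


# The two path rules from the default WordPress robots.txt
DEFAULT_RULES = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]

for path in ("/wp-admin/admin-ajax.php", "/wp-admin/options.php", "/blog/hello-world/"):
    print(path, "allowed" if is_allowed(path, DEFAULT_RULES) else "blocked")
```

Note that Python’s built-in urllib.robotparser applies rules in file order instead, so it would report admin-ajax.php as blocked here. That’s why I wrote the matching by hand.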

This default is reasonably solid for most small sites. But as your site grows, or if you’re running an ecommerce store, a blog with hundreds of pages, or a site with sensitive areas, you’ll need to customize it.

Why robots.txt Matters for Your SEO

Search engines don’t have unlimited resources. Google works with what’s called a crawl budget for every website: roughly, how many URLs Googlebot can and wants to crawl over a given period. If your site has hundreds of low-value URLs (internal search results, tag pages, admin paths, duplicate content), Google’s bots may spend much of that budget on junk and take longer to reach your best content. This matters most on large or frequently updated sites. Small sites rarely run into the limit.

A well-configured robots.txt file helps you:

  • Protect your crawl budget by blocking URLs that don’t need to be crawled
  • Help important pages get discovered sooner by steering bots away from low-value URLs
  • Reduce server load by keeping wasteful or aggressive bots out
  • Point crawlers to your sitemap for efficient content discovery
  • Protect sensitive or irrelevant sections (like admin pages)
  • Control AI crawlers that are increasingly shaping how your content surfaces in AI-generated answers

How to View Your Current robots.txt

Before you edit anything, check what you currently have. Just open your browser and go to:

https://yourdomain.com/robots.txt

If you see content there, your file either exists as a physical file on your server or as WordPress’s virtual version. If you see a 404 error, you have no file at all — which isn’t necessarily a crisis, but it does mean you have no crawl control in place.
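
If you’d rather check from a script than a browser, here’s a minimal Python sketch that builds the robots.txt URL for any page on a site and fetches it. The function names robots_url and fetch_robots are my own, and yourdomain.com is the same placeholder used throughout this guide.

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(page_url: str) -> str:
    """robots.txt always sits at the root of the host, whatever page you start from."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots(page_url: str, timeout: float = 10.0) -> tuple[int, str]:
    """Return (HTTP status, body). A 404 comes back as (404, '')."""
    try:
        with urlopen(robots_url(page_url), timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except HTTPError as err:
        return err.code, ""


# Example usage (needs network access):
# status, body = fetch_robots("https://yourdomain.com/blog/some-post/")
# print(status)
# print(body)
```

A 200 status with content means a physical or virtual file is being served. A 404 means crawlers see no rules at all.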

How to Edit robots.txt in WordPress

There are three main ways to do this. I’ll walk through each one.

Method 1: Using an SEO Plugin (Recommended for Most Users)

This is the approach I’d recommend for most WordPress site owners. It’s safe, reversible, and doesn’t require FTP access.

Using Yoast SEO:

  1. Go to your WordPress dashboard → SEO → Tools
  2. Click File Editor
  3. If no robots.txt exists, click Create robots.txt file
  4. Edit the file directly in the editor
  5. Click Save changes

Using All in One SEO (AIOSEO):

  1. Go to All in One SEO → Tools
  2. Click the Robots.txt tab
  3. Toggle on Enable Custom Robots.txt
  4. Add or edit your rules
  5. Click Save Changes

Using Rank Math:

  1. Go to Rank Math → General Settings → Edit robots.txt
  2. Add your custom rules directly

All three plugins make it easy to add rules without touching a text file, and they handle the override of WordPress’s virtual robots.txt automatically.

Method 2: Manually via FTP or File Manager

If you want direct control or your SEO plugin doesn’t have a file editor, you can create and upload the file manually.

  1. Open a text editor (Notepad on Windows, TextEdit on Mac in plain text mode)
  2. Write your robots.txt rules (see examples below)
  3. Save the file as robots.txt — all lowercase, no spaces
  4. Connect to your server via FTP (FileZilla is a popular free client) or use your hosting control panel’s File Manager
  5. Navigate to your website’s root directory (on most shared hosts, this is the public_html folder)
  6. Upload robots.txt here

Once you upload a physical file, it overrides WordPress’s virtual version. This gives you complete control.

Method 3: Via cPanel or Hosting File Manager (No FTP Needed)

Most hosts — including Bluehost, SiteGround, Hostinger, Cloudways, and others — have a built-in file manager in their control panel.

  1. Log into your hosting control panel
  2. Open File Manager
  3. Navigate to public_html
  4. If a robots.txt file already exists, click to edit it. If not, create a New File, name it robots.txt, and open it for editing
  5. Add your directives and save

Basic robots.txt Rules (With Examples)

Allow all bots
User-agent: *
Disallow:

Block WordPress admin area (Recommended)
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Block specific pages
User-agent: *
Disallow: /thank-you/
Disallow: /checkout/

Add your sitemap (Important)
Sitemap: https://yourdomain.com/sitemap_index.xml

This helps search engines find your important pages faster.

Pro Tip: Never block CSS or JS files in robots.txt — Google needs them to properly render your site.

robots.txt Directives Explained

Before you start writing rules, you need to understand the four main building blocks:

Directive    What It Does
User-agent   Specifies which crawler the rules apply to. * means all crawlers.
Disallow     Tells the specified bot not to crawl a path or directory.
Allow        Explicitly permits access to a path, even within a disallowed directory.
Sitemap      Points crawlers to your XML sitemap URL.

Lines starting with # are comments — they’re ignored by crawlers and are useful for leaving notes in your file.
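
To make the structure concrete, here’s a rough Python sketch that reads a robots.txt file the way a crawler would: it strips comments, groups Allow and Disallow rules under their User-agent lines, and collects Sitemap lines separately. The function name parse_groups is hypothetical, and the sketch skips any directive outside the four covered above.

```python
def parse_groups(text: str) -> tuple[dict[str, list[tuple[str, str]]], list[str]]:
    """Return ({user-agent: [(directive, path), ...]}, [sitemap URLs])."""
    groups: dict[str, list[tuple[str, str]]] = {}
    sitemaps: list[str] = []
    current_agents: list[str] = []
    in_rules = False  # True once the current group has at least one Allow/Disallow

    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()  # directive names are not case-sensitive

        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current_agents, in_rules = [], False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in current_agents:
                groups[agent].append((field, value))
        elif field == "sitemap":
            sitemaps.append(value)  # Sitemap lines apply to the whole file

    return groups, sitemaps
```

Rules that appear before any User-agent line end up attached to nothing, which is also how crawlers treat them.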

A Solid WordPress robots.txt Template for 2026

Here’s the optimized robots.txt template I recommend for most WordPress sites:


# Default rules for all crawlers
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-login.php
Disallow: /?s=
Disallow: /search/
Allow: /wp-admin/admin-ajax.php

# Block AI training crawlers (does NOT affect Google Search)
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# Allow AI search/citation bots (these send you traffic)
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Sitemap
Sitemap: https://yourdomain.com/sitemap.xml

Let me explain the key decisions here:

  • Disallow: /?s= blocks WordPress internal search result URLs, which create near-infinite duplicate content paths and waste crawl budget.
  • Disallow: /wp-login.php keeps your login page out of crawlers’ paths.
  • GPTBot and ClaudeBot are AI training crawlers. Blocking them opts your content out of training for those companies’ models, and has zero effect on your Google rankings.
  • Google-Extended is not a separate crawler but a control token that Google’s existing crawlers honor. Blocking it opts your content out of Gemini training and does not affect your visibility in regular Google Search or AI Overviews.
  • OAI-SearchBot and PerplexityBot are search and citation bots. They reference your content in AI-generated answers and can send you referral traffic, so you generally want to allow them. Keep in mind that a bot with its own User-agent group follows only that group, so these two won’t inherit the /wp-admin/ and search rules from the * group unless you repeat them.
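
One detail worth seeing in code: under the Robots Exclusion Protocol, a crawler obeys only the group that matches it most specifically and falls back to the * group when nothing else matches. Here’s a simplified Python sketch of that selection, using the groups from the template above. The names TEMPLATE_GROUPS and select_group are my own, and matching by substring is an assumption that real crawlers may handle differently.

```python
TEMPLATE_GROUPS = {
    "*": [
        "Disallow: /wp-admin/",
        "Disallow: /wp-login.php",
        "Disallow: /?s=",
        "Disallow: /search/",
        "Allow: /wp-admin/admin-ajax.php",
    ],
    "GPTBot": ["Disallow: /"],
    "ClaudeBot": ["Disallow: /"],
    "Google-Extended": ["Disallow: /"],
    "OAI-SearchBot": ["Allow: /"],
    "PerplexityBot": ["Allow: /"],
}


def select_group(user_agent: str, groups: dict[str, list[str]]) -> list[str]:
    """Return the rules of the most specific matching group, else the '*' group."""
    ua = user_agent.lower()
    matches = [name for name in groups if name != "*" and name.lower() in ua]
    if matches:
        # The longest matching name counts as the most specific one.
        return groups[max(matches, key=len)]
    return groups.get("*", [])


for bot in ("Googlebot", "GPTBot", "OAI-SearchBot"):
    print(bot, "->", select_group(bot, TEMPLATE_GROUPS))
```

Googlebot gets the general rules, GPTBot gets its full block, and OAI-SearchBot gets only Allow: / with none of the general Disallow lines.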

The AI Crawler Decision You Need to Make in 2026

This is something almost no beginner-level robots.txt guide properly addresses, and it’s now one of the most important decisions you’ll make for your content strategy.

There are now three distinct types of AI crawlers visiting your site:

  1. AI Training Crawlers — These collect your content to train large language models. You get no attribution, no link, and no traffic. Examples: GPTBot (OpenAI), ClaudeBot (Anthropic), Google-Extended (Google Gemini training).
  2. AI Search/Citation Bots — These fetch your content in real time to answer user queries, and they typically cite your page and link back to it. Examples: OAI-SearchBot, PerplexityBot, Claude-SearchBot.
  3. AI Agents — These browse on behalf of specific users in real time, like ChatGPT-User. They behave more like regular browsers than crawlers.

The strategic move most publishers are making in 2026: block training crawlers, allow search and citation bots. This protects your content from being absorbed into AI models without credit, while keeping you visible inside ChatGPT, Perplexity, and other AI search tools that can drive you traffic.

To completely block AI training while staying open to AI search:


User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

Note: Blocking Google-Extended opts your content out of being used to train and ground Google’s Gemini models. It has no impact on crawling by Googlebot or on your rankings in regular Google Search. Many site owners confuse this.
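
If you manage several sites, you may prefer to generate these groups from a single policy instead of typing them by hand. Here’s a minimal Python sketch under that assumption. AI_POLICY, build_ai_groups, and can_crawl are names I made up, and the check uses Python’s standard urllib.robotparser, which is fine for simple site-wide rules like these.

```python
from urllib.robotparser import RobotFileParser

# True = allow the bot everywhere, False = block it everywhere
AI_POLICY = {
    "GPTBot": False,
    "ClaudeBot": False,
    "Google-Extended": False,
    "OAI-SearchBot": True,
    "PerplexityBot": True,
}


def build_ai_groups(policy: dict[str, bool]) -> str:
    """Turn the policy into one robots.txt group per bot."""
    blocks = []
    for agent, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        blocks.append(f"User-agent: {agent}\n{rule}")
    return "\n\n".join(blocks) + "\n"


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Ask the standard-library parser whether this agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


rules = build_ai_groups(AI_POLICY)
print(rules)
```

Paste the printed groups below your general User-agent: * rules, and use can_crawl to spot-check a bot before you publish.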

Common robots.txt Mistakes to Avoid in WordPress

These are the errors I see most often when auditing WordPress sites, and some of them can quietly destroy your traffic for months before anyone notices.

  1. Blocking CSS and JavaScript files
    Google needs to render your pages to understand them. Blocking /wp-content/ or /wp-includes/ prevents Google from seeing your site as users do, which can hurt how it evaluates your layout and content, and ultimately your rankings.
  2. Using robots.txt to hide sensitive content
    This is a security trap. robots.txt is a public file. If you list a private directory in it, you’re actually advertising its location to anyone who looks. Use proper authentication or password protection for sensitive pages, and use noindex meta tags to keep pages out of search results.
  3. Forgetting to update after a staging migration
    One of the most expensive mistakes I’ve seen: a developer copies a staging configuration (which typically has Disallow: / to block all bots) into production. Traffic drops to zero. Always check your robots.txt immediately after a site migration or launch.
  4. Blocking internal search pages inconsistently
    If you’re going to block /?s=, make sure you’re consistent with case and path format. Paths in robots.txt are case-sensitive, so /?s= and /?S= are treated as different rules by crawlers.
  5. Confusing crawl blocking with de-indexing
    Blocking a URL in robots.txt stops crawling, not indexing. If the page has backlinks from other sites, Google can still index the URL without ever visiting it. Use a noindex meta tag if you want the page out of search results entirely, and don’t block that page in robots.txt, or Google will never see the tag.
  6. Blocking plugin assets or category pages on WooCommerce sites
    Some guides recommend blocking /wp-content/plugins/ or category URLs to save crawl budget. On ecommerce sites, this can break rendering and keep crawlers away from your most important pages. Be very deliberate about what you block on any store with a deep product catalog.

How to Test Your robots.txt File

After making any changes, you should always verify your file is working correctly before moving on. Testing your robots.txt ensures that search engines can access important pages and aren’t being accidentally blocked.

Google Search Console (most reliable):

  1. Log into Google Search Console
  2. Go to Settings → robots.txt to open the robots.txt report, which shows the file Google last fetched along with any errors or warnings. Settings → Crawl stats shows how Googlebot is interacting with your site overall.
  3. Use the URL Inspection tool to test whether specific pages are being blocked by robots.txt.

Manual browser check (quick method):

Simply visit yourdomain.com/robots.txt in your browser. You should see your file content rendered as plain text. If you see a 404, either the file doesn’t exist or it’s in the wrong location.

Third-Party robots.txt testing tools:

Tools like Screaming Frog’s SEO Spider and dedicated robots.txt testers can validate your syntax and simulate how crawlers will interpret your rules before you push changes live.

robots.txt vs. llms.txt — What’s the Difference?

You may have started hearing about a newer file called llms.txt. Here’s how the two relate:

  • robots.txt — Controls whether a bot can crawl specific paths on your site. It’s a decades-old standard that all major crawlers respect.
  • llms.txt — A newer, not-yet-formalized convention that tells AI models what your site is about and which pages matter most. Think of it as a sitemap built specifically for AI systems rather than search engines.

The two files operate on different levels and complement each other. robots.txt handles access control. llms.txt handles content guidance and context for AI models. If you want to appear in AI-generated answers (what’s called Generative Engine Optimization or GEO), having both files set up correctly is increasingly important.

llms.txt isn’t a WordPress-specific concept, but you can add it manually to your root directory the same way you’d add a physical robots.txt file.

robots.txt Checklist Before You Save Changes

A small mistake in robots.txt can block your entire site — use this checklist to verify everything before saving.

  • No accidental site-wide block (Disallow: /) is present
  • CSS and JavaScript files are not accidentally blocked
  • /wp-admin/ is blocked, but /wp-admin/admin-ajax.php is explicitly allowed
  • Internal WordPress search (/?s=) is disallowed
  • You’ve added your sitemap URL as an absolute URL (e.g., https://yourdomain.com/sitemap.xml)
  • You’ve made a deliberate decision on AI training crawlers (GPTBot, ClaudeBot, Google-Extended)
  • You’ve verified the file at yourdomain.com/robots.txt in your browser
  • You’ve tested it in Google Search Console
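
Part of this checklist can be automated. Here’s a rough Python sketch that flags three of the items above: a site-wide block for all crawlers, blocked /wp-content/ or /wp-includes/ paths, and a missing or relative Sitemap URL. The function name lint_robots and the warning texts are my own, and it’s a starting point, not a full validator.

```python
def lint_robots(text: str) -> list[str]:
    """Return a list of warnings for common robots.txt mistakes."""
    warnings = []
    agents: list[str] = []
    in_rules = False
    has_sitemap = False

    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()

        if field == "user-agent":
            if in_rules:  # a new group starts here
                agents, in_rules = [], False
            agents.append(value)
        elif field == "disallow":
            in_rules = True
            if value == "/" and "*" in agents:
                warnings.append("Site-wide block: 'Disallow: /' applies to all crawlers")
            if value.startswith(("/wp-content", "/wp-includes")):
                warnings.append(f"May block CSS/JS needed for rendering: {value}")
        elif field == "allow":
            in_rules = True
        elif field == "sitemap":
            has_sitemap = True
            if not value.lower().startswith(("http://", "https://")):
                warnings.append(f"Sitemap should be an absolute URL: {value}")

    if not has_sitemap:
        warnings.append("No Sitemap line found")
    return warnings
```

An empty list means none of these specific problems were found. You should still run the browser and Search Console checks.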

Key Takeaways for Your robots.txt File

Your robots.txt file is one of the first things a well-behaved crawler checks when it lands on your site. Get it wrong and you could be silently blocking your most important content from Google, sometimes for months without realizing it. Get it right, and it becomes a quiet but powerful lever for SEO efficiency.

The rules have also gotten more nuanced. In 2026, you’re not just managing Googlebot and Bingbot — you’re making strategic decisions about which AI systems get to learn from your content and which ones get to cite it in real-time answers. That’s a new kind of technical SEO decision that didn’t exist just a couple of years ago.

Start with the template I shared above, customize it to your site’s needs, verify it in Search Console, and set a reminder to audit it every few months — especially after major site updates or migrations.

Frequently Asked Questions

What does robots.txt do in WordPress?

It tells search engines which parts of your website they can or cannot crawl.

Does robots.txt affect my Google rankings?

Not directly. It controls which pages Google crawls, not how they rank. However, blocking important pages or wasting your crawl budget on junk pages can indirectly hurt your rankings by slowing indexing or keeping your best content undiscovered.

Where is robots.txt located in WordPress?

You can find it at yourdomain.com/robots.txt or in your site’s root directory.

How do I edit robots.txt safely?

Use an SEO plugin like Rank Math or Yoast, and always take a backup before making changes.

Can I block all AI bots without hurting my SEO?

Blocking AI training crawlers like GPTBot, ClaudeBot, and Google-Extended has no effect on your standard Google Search rankings. However, blocking AI search bots like OAI-SearchBot or PerplexityBot means your content won’t appear in those AI-powered answer engines.

How often does Google re-read my robots.txt?

Google generally caches your robots.txt file for up to 24 hours, so changes can take about a day to be picked up.

Is robots.txt enough to secure private content?

No. robots.txt is a public file and doesn’t enforce any access restriction. For truly sensitive content, use server-side authentication, password protection, or .htaccess rules.

What happens if I don’t have a robots.txt file at all?

Most crawlers will simply treat the entire site as crawlable. It’s not an emergency, but you lose the ability to manage crawl budget, block low-value paths, or point bots to your sitemap.

Not Sure If Your robots.txt Is Hurting Your SEO?

If your robots.txt is misconfigured, it can quietly block important pages from search engines. I can audit your file, identify issues, and fix them before they impact your rankings.



ABOUT THE AUTHOR

Sangeetha M

Web Designer & WordPress Blogger

Sangeetha is a WordPress & SEO specialist with 15+ years of experience designing and building websites, sharing practical tutorials and beginner-friendly guides on WordPress, SEO, and website growth.
