Learn how to configure XML sitemaps and robots.txt in WordPress so search engines can crawl your site efficiently without exposing sensitive areas.
Why XML Sitemaps and robots.txt Matter for Security and SEO
Your XML sitemap and robots.txt file quietly control how search engines discover and crawl your WordPress site. When configured well, they help search engines find your important pages faster and avoid wasting crawl budget on low-value URLs. When configured poorly, they can accidentally expose sensitive paths or block key content from being indexed.
In this guide, you’ll learn how to:
- Understand what XML sitemaps and robots.txt do
- Enable and review WordPress core sitemaps safely
- Create a safe, minimal robots.txt for typical business sites
- Avoid common mistakes that harm SEO or leak information
- Verify that search engines see what you expect
Key Concepts: XML Sitemaps vs robots.txt
What an XML Sitemap Does
An XML sitemap is a machine-readable file that lists important URLs on your site and can include metadata like last modified dates. Search engines use it as a discovery aid, not a guarantee of indexing. WordPress core includes a built-in sitemap system that exposes /wp-sitemap.xml by default on public sites running WordPress 5.5 or later.
What robots.txt Does (and Does Not Do)
The robots.txt file sits at the root of your domain (for example, example.com/robots.txt) and tells compliant crawlers which paths they are allowed or disallowed to crawl. It is a crawl directive, not a security barrier. Sensitive content must still be protected with proper authentication and authorization.
Modern search engines also support a Sitemap: directive inside robots.txt to help them discover your XML sitemap more easily.
Step 1 – Confirm That WordPress XML Sitemaps Are Enabled
Most Compass Production builds rely on WordPress core sitemaps or a dedicated SEO plugin. Before editing anything, confirm what is currently active.
How to Check Your Sitemap
- Open a browser and go to https://yourdomain.com/wp-sitemap.xml.
- If you see a structured XML index listing child sitemaps such as wp-sitemap-posts-post-1.xml and wp-sitemap-posts-page-1.xml, core sitemaps are active.
- If you see a different sitemap URL (for example, from an SEO plugin), note that location.
- If you see a 404 error or a blank page, your sitemap may be disabled or blocked by another plugin.
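If you prefer to script this check, the sketch below parses a sitemap index offline using only Python's standard library. The domain and child sitemap file names are illustrative placeholders, not guaranteed to match your site.

```python
# Sketch: list the child sitemaps inside a sitemap index document.
# The XML below is a hypothetical example of what /wp-sitemap.xml returns
# on a core WordPress site; yourdomain.com is a placeholder.
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

sitemap_index = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://yourdomain.com/wp-sitemap-posts-post-1.xml</loc></sitemap>
  <sitemap><loc>https://yourdomain.com/wp-sitemap-posts-page-1.xml</loc></sitemap>
</sitemapindex>"""

def list_child_sitemaps(xml_text):
    """Return the <loc> URLs listed in a sitemap index."""
    root = ET.fromstring(xml_text)
    return [loc.text for loc in root.iter(NS + "loc")]

print(list_child_sitemaps(sitemap_index))
```

In practice you would fetch the live file (for example with urllib.request) and pass its body to the same function.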
Check Search Engine Visibility Setting
WordPress has a global setting that can disable indexing and affect sitemap behavior.
- In WordPress, go to Dashboard → Settings → Reading.
- Find the option “Search engine visibility”.
- Make sure “Discourage search engines from indexing this site” is unchecked on live, public sites.
When this box is checked, WordPress sends a noindex signal and may discourage crawlers from using your sitemap on production sites.
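To catch a leftover noindex after launch, you can scan a page's HTML for the robots meta tag this setting adds. The sketch below is a minimal example using the standard library; the HTML snippet is illustrative, not output from a real site.

```python
# Sketch: detect a robots meta tag that discourages indexing.
from html.parser import HTMLParser

class RobotsMetaFinder(HTMLParser):
    """Collect the content of any <meta name="robots"> tags."""
    def __init__(self):
        super().__init__()
        self.robots_content = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            self.robots_content.append(attr.get("content", ""))

# Hypothetical page head from a site with "Discourage search engines" enabled.
html = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
finder = RobotsMetaFinder()
finder.feed(html)
print(any("noindex" in c for c in finder.robots_content))  # True means indexing is discouraged
```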
Step 2 – Decide Which Content Should Be in Your Sitemap
For most business websites, you want search engines to focus on:
- Public pages (home, services, about, contact, landing pages)
- Public blog posts or resources
- Key custom post types that represent public content (e.g., portfolio items)
You usually do not want to include:
- Admin or system URLs (anything under /wp-admin/)
- Internal search results pages
- Test or staging content
- Private or password-protected content
If you use an SEO plugin, it will typically provide controls to include or exclude specific post types and taxonomies from the sitemap. Keep the configuration simple: include only content that should be discoverable and useful in search results.
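One way to audit this is to scan the URLs in your sitemap for paths that should not be public. The sketch below assumes a few illustrative suspect patterns and a hypothetical sitemap; adjust both to your own site.

```python
# Sketch: flag sitemap URLs that match patterns which usually
# should not appear in a public sitemap. Patterns are assumptions.
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
SUSPECT_PATTERNS = ("/wp-admin/", "?s=", "/staging/", "/private/")

# Hypothetical sitemap contents; yourdomain.com is a placeholder.
urlset = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yourdomain.com/services/</loc></url>
  <url><loc>https://yourdomain.com/staging/new-design/</loc></url>
</urlset>"""

def audit_sitemap(xml_text):
    """Return sitemap URLs that match any suspect pattern."""
    root = ET.fromstring(xml_text)
    urls = [loc.text for loc in root.iter(NS + "loc")]
    return [u for u in urls if any(p in u for p in SUSPECT_PATTERNS)]

print(audit_sitemap(urlset))  # flags the staging URL
```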
Step 3 – Create a Safe, Minimal robots.txt
Next, you’ll configure a robots.txt file that:
- Allows search engines to crawl your public content
- Blocks obvious non-public paths like /wp-admin/
- References your XML sitemap location
Where to Edit robots.txt in WordPress
There are three common ways this file might be managed:
- SEO plugin UI – Many SEO plugins provide a “File editor” or “robots.txt” screen.
- Virtual robots.txt – WordPress can serve a dynamic robots.txt if no physical file exists.
- Physical file on the server – A real robots.txt file in your site’s web root.
If Compass Production manages your hosting, we typically recommend using the SEO plugin’s interface so changes are version-controlled and easy to audit.
Recommended robots.txt Template for Most Sites
Use this as a starting point and adjust only if you have a specific need:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://yourdomain.com/wp-sitemap.xml
This pattern is consistent with common guidance for WordPress sites and allows crawlers to access necessary AJAX endpoints while avoiding the admin area.
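You can sanity-check these rules locally with Python's standard-library robots.txt parser before deploying them. Note one caveat: urllib.robotparser applies rules first-match, while Google uses the most specific match, so the Allow line is listed first in this sketch to get equivalent local results.

```python
# Sketch: verify the template's intent with the stdlib parser.
# yourdomain.com is a placeholder.
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Allow: /wp-admin/admin-ajax.php",  # listed first for first-match parsing
    "Disallow: /wp-admin/",
]

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "https://yourdomain.com/"))                         # True: public page
print(rp.can_fetch("*", "https://yourdomain.com/wp-admin/"))                # False: admin area
print(rp.can_fetch("*", "https://yourdomain.com/wp-admin/admin-ajax.php"))  # True: AJAX endpoint
```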
Step 4 – Avoid Common Security and SEO Mistakes
Mistake 1: Using robots.txt as a Security Tool
Blocking sensitive paths (for example, /private/) in robots.txt does not protect them. The file is public and can actually highlight where sensitive content lives. True protection requires authentication, authorization, and, where appropriate, proper HTTP status codes (401/403) or password protection at the server level.
Mistake 2: Blocking All Crawlers on a Live Site
Sometimes a site is launched with a legacy Disallow: / rule or the WordPress “Discourage search engines” option still enabled from staging. This can prevent your site from being indexed at all. Always review both robots.txt and the Reading settings before or immediately after launch.
Mistake 3: Over-Blocking Assets (CSS, JS, Images)
Modern search engines render pages like a browser, which means they need access to CSS, JavaScript, and image files to understand layout and mobile friendliness. Avoid broad Disallow rules that block /wp-content/ or theme/plugin asset directories unless you have a very specific, tested reason.
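Both over-blocking mistakes can be caught before launch by testing a handful of representative URLs against your rules. The sketch below uses the stdlib parser; all URLs and rule sets are illustrative examples.

```python
# Sketch: find which sample URLs a "*" crawler would be blocked from.
from urllib.robotparser import RobotFileParser

def check_over_blocking(robots_lines, sample_urls):
    """Return the sample URLs that the given rules block for all crawlers."""
    rp = RobotFileParser()
    rp.parse(robots_lines)
    return [u for u in sample_urls if not rp.can_fetch("*", u)]

# Mistake 2: a leftover staging rule that blocks everything.
blocked = check_over_blocking(
    ["User-agent: *", "Disallow: /"],
    ["https://yourdomain.com/", "https://yourdomain.com/services/"],
)
print(blocked)  # both URLs are blocked

# Mistake 3: a broad rule that also blocks CSS needed for rendering.
blocked_assets = check_over_blocking(
    ["User-agent: *", "Disallow: /wp-content/"],
    ["https://yourdomain.com/wp-content/themes/site/style.css"],
)
print(blocked_assets)
```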
Step 5 – Verify What Search Engines See
Check robots.txt in a Browser
- Visit https://yourdomain.com/robots.txt in your browser.
- Confirm the content matches what you expect (user-agent rules and sitemap URL).
- Check for typos such as SiteMap instead of Sitemap or missing colons.
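A small lint script can automate the typo check. The sketch below flags lines with missing colons or unrecognized directive names; the directive list and sample file are assumptions (directive names are matched case-insensitively here, though some parsers are stricter about casing).

```python
# Sketch: rough robots.txt lint for obviously malformed lines.
KNOWN_DIRECTIVES = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}

def lint_robots(text):
    """Return (line_number, line) pairs that look malformed."""
    problems = []
    for i, line in enumerate(text.splitlines(), start=1):
        line = line.split("#", 1)[0].strip()  # ignore comments and blanks
        if not line:
            continue
        if ":" not in line:
            problems.append((i, line))
            continue
        directive = line.split(":", 1)[0].strip().lower()
        if directive not in KNOWN_DIRECTIVES:
            problems.append((i, line))
    return problems

# Hypothetical file with a misspelled directive and a missing colon.
sample = "User-agent: *\nDisalow: /wp-admin/\nSitemap https://yourdomain.com/wp-sitemap.xml"
print(lint_robots(sample))  # flags lines 2 and 3
```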
Use Search Console’s robots.txt and URL Inspection Tools
Google Search Console provides tools to test how Googlebot sees your site:
- Verify that your robots.txt file is accessible and valid.
- Use URL Inspection to confirm important URLs are not blocked by robots.txt.
- Submit your XML sitemap URL under Indexing → Sitemaps.
What You Should See
After completing the steps above, you should observe:
- Visiting /wp-sitemap.xml (or your plugin’s sitemap URL) shows a clean list of public content types only.
- Visiting /robots.txt shows a short, readable file with a general User-agent: * block, a Disallow: /wp-admin/ rule, and a correct Sitemap: line.
- In Google Search Console, your sitemap status is reported as “Success” and key URLs are not blocked by robots.txt.
- No obviously sensitive or internal paths (such as staging URLs or private directories) are listed in your sitemap or highlighted in robots.txt.
When to Contact Compass Production
Reach out to the Compass Production team before changing sitemap or robots.txt settings if:
- You run a complex site with multiple subdomains or language versions.
- You use advanced caching, a CDN, or firewall rules that might cache or override robots.txt.
- You suspect that important pages have dropped from search results after a configuration change.
We can help you review your configuration, coordinate with hosting, and test changes in a staging environment so your live site remains stable, secure, and search?friendly.