In search engine optimization (SEO), two small files often make a big difference: robots.txt and sitemap.xml.
Although simple, they play a crucial role in determining whether search engines can successfully crawl and index your website.
This article uses a real-world example — https://www.nuface.tw/, running WordPress 6.8.3 with the SureRank SEO Plugin — to explain how to set up both files and fix common indexing issues.
1. What is robots.txt?
robots.txt is a text file located in the root directory of your website.
It tells search engine crawlers which parts of your site they are allowed (or not allowed) to access.
📂 File location:
https://www.yourdomain.com/robots.txt
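Because the file always sits at the root of the host, its URL can be derived from any page address. The sketch below illustrates this with Python's standard library; the function name `robots_url` and the sample page path are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only scheme and host; path, query, and fragment are discarded
    # because robots.txt lives at the root, never in a subdirectory.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page path on the example site.
print(robots_url("https://www.nuface.tw/some/post/?page=2"))
# prints https://www.nuface.tw/robots.txt
```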
🧱 Main purposes:
- Restrict crawlers from accessing system or private directories (e.g. /wp-admin/, /tmp/)
- Guide search engines to your sitemap
- Reduce unnecessary crawling and server load
2. Recommended robots.txt for WordPress
If you’re using WordPress with an SEO plugin like SureRank, Yoast, or Rank Math,
a properly configured robots.txt might look like this:
# =====================================================
# robots.txt for https://www.nuface.tw/
# WordPress 6.8.3 + SureRank SEO Plugin
# =====================================================
User-agent: *
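# Block search engines from crawling system directories.
# Caution: Google needs access to CSS and JavaScript to render pages, so drop the
# wp-includes, plugins, or themes rules if they hide assets your pages load.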
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
# Block non-valuable or duplicate pages
Disallow: /?s=
Disallow: /search
Disallow: /cgi-bin/
Disallow: /trackback/
Disallow: /xmlrpc.php
Disallow: /readme.html
Disallow: /?replytocom
Disallow: /wp-login.php
Disallow: /wp-register.php
# Allow admin AJAX (required by WordPress)
Allow: /wp-admin/admin-ajax.php
# Specify Sitemap (generated by SureRank)
Sitemap: https://www.nuface.tw/sitemap_index.xml
# =====================================================
# Notes:
# Search engines will automatically parse all sub-sitemaps
# listed inside sitemap_index.xml.
# =====================================================
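To see why the Allow line for admin-ajax.php still takes effect even though /wp-admin/ is disallowed, it helps to model how a crawler picks between conflicting rules. The following is a minimal sketch of the longest-match precedence described in RFC 9309 and in Google's robots.txt documentation. It assumes plain prefix rules only (no `*` or `$` wildcards), uses a subset of the rules above, and the names `RULES` and `is_allowed` are hypothetical.

```python
# A subset of the rules from the robots.txt above, as (kind, path-prefix) pairs.
RULES = [
    ("disallow", "/wp-admin/"),
    ("disallow", "/?s="),
    ("disallow", "/wp-login.php"),
    ("allow", "/wp-admin/admin-ajax.php"),
]


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if `path` may be crawled under longest-match precedence.

    Simplification: plain prefix matching only, no '*' or '$' wildcards.
    """
    best_kind, best_len = "allow", -1  # no matching rule means "allowed"
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        # The longest matching prefix wins; on a tie, allow beats disallow.
        if len(prefix) > best_len or (len(prefix) == best_len and kind == "allow"):
            best_kind, best_len = kind, len(prefix)
    return best_kind == "allow"


for path in ["/", "/wp-admin/options.php", "/wp-admin/admin-ajax.php", "/?s=seo"]:
    print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

A hand-written matcher is used here rather than Python's `urllib.robotparser`, which checks rules in file order and so can disagree with longest-match crawlers on the admin-ajax.php exception.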
3. Understanding SureRank’s Sitemap Structure
The SureRank SEO Plugin automatically generates an index-type sitemap,
which can be found here:
👉 https://www.nuface.tw/sitemap_index.xml
This index file contains several sub-sitemaps:
https://www.nuface.tw/post-type-page-sitemap-1.xml
https://www.nuface.tw/post-type-post-sitemap-1.xml
https://www.nuface.tw/post-type-sureforms_form-sitemap-1.xml
https://www.nuface.tw/taxonomy-type-category-sitemap-1.xml
Search engines only need to read the main index file (sitemap_index.xml).
They will automatically follow and crawl all sub-sitemaps listed within,
so you only need to include one Sitemap line in robots.txt.
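The way a crawler walks from the index to its sub-sitemaps can be sketched as follows. This example parses a trimmed-down, hand-written sample rather than the live file; the sample XML is an assumption about what an index in the sitemaps.org format looks like, not a copy of SureRank's actual output, and `list_sub_sitemaps` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hand-written sample in the sitemap index format (not SureRank's real output).
SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.nuface.tw/post-type-page-sitemap-1.xml</loc></sitemap>
  <sitemap><loc>https://www.nuface.tw/post-type-post-sitemap-1.xml</loc></sitemap>
</sitemapindex>
"""


def list_sub_sitemaps(index_xml: str) -> list[str]:
    """Return the <loc> URL of every <sitemap> entry in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS)]


for url in list_sub_sitemaps(SAMPLE_INDEX):
    print(url)
```

Each URL returned would then be fetched and parsed in turn, which is why a single Sitemap line pointing at the index is enough.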
4. Common Issue: Homepage Not Indexable
Sometimes, SureRank’s Site Analysis may display a critical warning like this:
⚠️ Homepage is not indexable by search engines
This means search engines are currently blocked from including your homepage in search results —
even if the rest of your site is accessible.
🧩 Possible Causes and Fixes
| Cause | Solution |
|---|---|
| WordPress visibility setting blocks indexing | Go to Settings → Reading and uncheck “Discourage search engines from indexing this site.” |
| SureRank homepage meta tag is set to “noindex” | Edit your homepage → SureRank → Advanced Settings → Meta Robots → Set to Index. |
| robots.txt mistakenly blocks the entire site | Open https://www.nuface.tw/robots.txt and make sure there’s no Disallow: /. |
| Theme or plugin added a “noindex” meta tag | View page source and ensure there’s no <meta name="robots" content="noindex">. |
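The last two checks in the table can be automated once you have the page source and the robots.txt contents as text. The sketch below is one possible approach using only Python's standard library. The helper names are hypothetical, and `blocks_whole_site` is deliberately crude: it flags any `Disallow: /` line without considering which User-agent group it belongs to.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the content value of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            self.directives.append((attr.get("content") or "").lower())


def has_noindex(html: str) -> bool:
    """True if any robots meta tag in the HTML contains 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return any("noindex" in d for d in finder.directives)


def blocks_whole_site(robots_txt: str) -> bool:
    """True if any line is 'Disallow: /' (ignores User-agent grouping)."""
    for line in robots_txt.splitlines():
        rule = line.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = rule.partition(":")
        if field.strip().lower() == "disallow" and value.strip() == "/":
            return True
    return False


print(has_noindex('<head><meta name="robots" content="noindex, follow"></head>'))
print(blocks_whole_site("User-agent: *\nDisallow: /wp-admin/\n"))
```

With these inputs the first call reports a noindex tag and the second reports that the site is not blocked as a whole.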
5. How to Verify After Fixing
- Resubmit Your Sitemap
  Visit Google Search Console → Sitemaps and submit: sitemap_index.xml
- Search Validation
  In Google Search, type: site:nuface.tw
  If your homepage appears in the results, indexing is working properly.
- Wait for Updates
  Google typically needs a few days to re-crawl and refresh the index after changes, and it can take longer.
6. Conclusion
Configuring robots.txt and sitemap.xml correctly is the foundation of good SEO hygiene.
After every theme update, plugin installation, or site redesign, it’s wise to double-check:
- Whether robots.txt still allows indexing
- Whether your sitemap opens correctly
- Whether the homepage is not marked as noindex
Once properly set up, SureRank will automatically help Google and Bing discover and index your site faster —
ensuring your content and brand stay visible in search results.
Author’s Note:
This tutorial is based on the actual setup of https://www.nuface.tw/.
It applies to any website running WordPress with the SureRank SEO Plugin.