Now that Google is completely removing support for the noindex directive in robots.txt files, it is sending notifications to site owners who still have such directives. On the morning of Jul 29, a number of folks in the SEO community began to receive notifications from Google Search Console with the subject line, “remove ‘noindex’ statements from the robots.txt of…”
Here is a screen shot by Bill Hartzer on Twitter:
September 1 is when you can no longer depend on the noindex mention in your robots.txt file.
If and when you get this message, make sure whatever is covered by the noindex directive is handled in a different, supported way. The key is to make sure you are no longer using the noindex directive in the robots.txt file, and to make the changes before September 1. If you are using the nofollow or crawl-delay commands, switch to the supported method for those directives going forward.
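One way to check a site before the deadline is to scan its robots.txt for the rules named above. The following is a minimal Python sketch, not a Google tool: the function name `find_unsupported_rules` and the sample file contents are made up for illustration, and the parsing is deliberately simple (it splits each line on the first colon and ignores `#` comments).

```python
# Rules that this article says should no longer be relied on in robots.txt.
UNSUPPORTED = {"noindex", "nofollow", "crawl-delay"}


def find_unsupported_rules(robots_txt):
    """Return (line_number, rule, value) for each unsupported rule found."""
    findings = []
    for line_no, raw in enumerate(robots_txt.splitlines(), start=1):
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip().lower()  # rule names are matched case-insensitively
        if field in UNSUPPORTED:
            findings.append((line_no, field, value.strip()))
    return findings


# Hypothetical robots.txt contents, for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /private/
Noindex: /drafts/
Crawl-delay: 10
"""

if __name__ == "__main__":
    for line_no, rule, value in find_unsupported_rules(SAMPLE):
        print(f"line {line_no}: '{rule}: {value}' needs a supported replacement")
```

Any line the script reports is a candidate for one of the supported alternatives; the `Disallow` line in the sample is left alone because it is still a supported rule.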
Some of the alternatives mentioned by Google are the following:
- Noindex in robots meta tags: Supported both in the HTTP response headers and in HTML, the noindex directive is the most effective way to remove URLs from the index when crawling is allowed.
- 404 and 410 HTTP status codes: Both status codes mean that the page does not exist, which will drop such URLs from Google’s index once they’re crawled and processed.
- Password protection: Unless markup is used to indicate subscription or paywalled content, hiding a page behind a login will generally remove it from Google’s index.
- Disallow in robots.txt: Search engines can only index pages that they know about, so blocking the page from being crawled often means its content won’t be indexed. While the search engine may also index a URL based on links from other pages, without seeing the content itself, we aim to make such pages less visible in the future.
- Search Console Remove URL tool: The tool is a quick and easy method to remove a URL temporarily from Google’s search results.
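To make the first two alternatives concrete, here is a small Python sketch of how a site might decide what to send for a given URL path. It is an illustration under assumptions, not a recommended implementation: the path lists and the `plan_response` helper are hypothetical, and a real site would wire this logic into its own web server or framework. The `X-Robots-Tag: noindex` response header and the `<meta name="robots" content="noindex">` tag are the two forms of the robots meta directive referred to in the list.

```python
# Hypothetical site configuration, for illustration only.
NOINDEX_PREFIXES = ("/drafts/",)   # pages that stay online but should not be indexed
GONE_PATHS = {"/old-page"}         # pages that have been permanently removed

# HTML equivalent of the header below; it belongs in the page's <head>.
META_NOINDEX = '<meta name="robots" content="noindex">'


def plan_response(path):
    """Return (status_code, extra_headers) for a requested path."""
    if path in GONE_PATHS:
        # 410 tells crawlers the page no longer exists.
        return 410, {}
    if path.startswith(NOINDEX_PREFIXES):
        # Serve the page normally, but ask search engines not to index it.
        return 200, {"X-Robots-Tag": "noindex"}
    return 200, {}


if __name__ == "__main__":
    for path in ("/", "/drafts/post-1", "/old-page"):
        print(path, plan_response(path))
```

Note that a crawler can only see a noindex header or meta tag if it is allowed to fetch the page, so paths handled this way should not also be blocked with Disallow in robots.txt.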