Last October, Paul Shapiro wrote about how you could use a Python script to determine whether a page has been indexed by Google in the SERPs. In the end, Google’s Gary Illyes wasn’t thrilled with the technique the script used.
— Gary Illyes ᕕ( ᐛ )ᕗ (@methode) October 5, 2016
@greenlaneseo Is this a blackhat tool or does it abide by the webmaster guidelines & robots.txt? (just curious)
— John ☆.o(≧▽≦)o.☆ (@JohnMu) December 14, 2016
How is it possible to learn which pages aren’t indexed by Google, and to do it in a way that doesn’t break Google’s rules? It looks like Google doesn’t indicate in Google Search Console whether a page has been indexed, and they won’t let us scrape search results to get the answer. They also don’t like the idea of getting it indirectly from an undocumented API.
So how can we determine which of our site’s pages aren’t indexed while staying within Google’s rules and guidelines? Paul Shapiro shares some of his methods for doing so. Check out his post on Search Engine Land by following the link I’ve provided below.