Email from Google: Robots.txt File Creates Issues for Google Product Search Images

Yesterday, Google emailed merchants who submit a data feed to Google Product Search while also blocking ‘googlebot’ from accessing their site’s images, warning them about failed image crawls. This is how it read:

 

Hello,

Thank you for participating in Google Product Search. It has come to our attention that a robots.txt file is preventing us from crawling some or all of the images on your site. In order for us to access and display the images you provide in your product listings, we’d like you to modify your robots.txt file to allow user-agent ‘googlebot’ to crawl your site.  Failure for Google to access your images may affect the visibility of your items on Google Product Search and Product Ad results.

To ensure the ‘googlebot’ is not being blocked, please add the following two lines of text to the end of your robots.txt file:

User-agent: googlebot
Disallow:

For more information on robots.txt files, please visit http://www.robotstxt.org. If you have any questions, please contact your webmaster directly.

Sincerely,

The Google Product Search Team

 

Google’s vague email sends merchants to http://www.robotstxt.org, where they are left to sift through a sea of information about robots.txt files on their own. What’s interesting, though, is that Yahoo Stores hosting images on the ep.yimg.com domain seem to be the main target of the recent googlebot crawl error, according to atensoft in the Google Help Forum thread.

 

This appears to be an authentic email, and many of our Yahoo Store clients have also received it.  It appears to be affecting stores whose images are hosted by Yahoo Store on the ep.yimg.com domain.

Here is an example image URL:
http://ep.yimg.com/ca/I/yhst-65077491912261_2151_39904715

The only robots file on this server under any of the above sub-folders is:
http://ep.yimg.com/ca/robots.txt

The robots file excludes the /I/ directory, which is why Google Product Search bot is not crawling the images.

However, according to http://www.robotstxt.org/robotstxt.html – the robots.txt is invalid unless it is placed in the root of the website, like this:
http://ep.yimg.com/robots.txt
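
To illustrate the root-location rule, here is a minimal Python sketch that derives where a crawler following robotstxt.org would look for the robots file, given any URL on the host. The helper name root_robots_url is made up for this example, and only the standard library is used.

```python
from urllib.parse import urlsplit, urlunsplit


def root_robots_url(url: str) -> str:
    """Return the robots.txt URL at the root of the host serving `url`.

    Hypothetical helper: it keeps the scheme and host, and replaces the
    path with /robots.txt, dropping any query string or fragment.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Example image URL quoted in the post.
image_url = "http://ep.yimg.com/ca/I/yhst-65077491912261_2151_39904715"
print(root_robots_url(image_url))
```

For the image URL above, this yields the root location rather than the /ca/ sub-folder where the only robots file was found.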

 

So we have two problems:
1. Yahoo Store is blocking Google (and everyone else) from indexing the images.
2. Google is honoring an invalid robots.txt file.
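
The first problem can be sketched with Python's standard urllib.robotparser module. The robots.txt contents, the shop.example host and the image path below are assumptions for illustration only; the actual Yahoo file is not reproduced here. The sketch shows how a Disallow rule on /I/ blocks every crawler, and how the two lines from Google's email would re-open access for googlebot alone if the owner of the file appended them.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Assumed rules: every crawler is kept out of the /I/ directory.
BLOCKING = """\
User-agent: *
Disallow: /I/
"""

# The same file with the two lines from Google's email appended.
# An empty Disallow value means nothing is disallowed for that agent.
WITH_GOOGLE_LINES = BLOCKING + """
User-agent: googlebot
Disallow:
"""

# Hypothetical image URL under the blocked directory.
IMAGE = "http://shop.example/I/product-123"

print("before:", allowed(BLOCKING, "googlebot", IMAGE))
print("after: ", allowed(WITH_GOOGLE_LINES, "googlebot", IMAGE))
print("others:", allowed(WITH_GOOGLE_LINES, "otherbot", IMAGE))
```

Under these assumptions, googlebot is blocked before the change and allowed after it, while crawlers without their own record remain blocked by the wildcard rule.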

This is a pickle for Yahoo Store merchants.  However, if I were to place blame, I’d place it on Google for not honoring the robots.txt specification.  I’ve never heard of anyone, anywhere honoring a robots.txt in a sub-folder.

Now, saying that Google is maliciously targeting the Yahoo! Store platform would be a stretch, and is pure speculation at this point, but it wouldn’t be the first time that Google and Yahoo! didn’t see eye to eye. Throw in Bing’s growing U.S. search market share and we have the recipe for a prolonged rivalry among the search giants.

 

Below are some links to recent threads from merchants that have received this email from Google Product Search.

 

Feel free to contact us if you recently received the email from Google regarding a robots.txt file and would like to discuss what this means for your Google Product Search campaign.

 

About the Author: David Weichel is the Director of Paid Search at CPC Strategy. He specializes in conversion rate optimization, search behavior research and attribution analysis. David graduated from the University of California, San Diego with a B.S. in Management Science.