Surfing Disguised as the Googlebot

Surely one or another of you has noticed while googling, especially in the "Images" search mode, that the search index oddly contains pictures and data from access-protected pages. Such results are characterized by the fact that you get no further than the landing page without first logging in or having a user account.

But how did the Googlebot get in there, so that the content ended up in the search index after all?

The answer lies in the HTTP headers, such as the browser identifier (the User-Agent), which identify the Googlebot as a bot and distinguish it from normal web surfers. Some site admins have therefore rigged up a browser switch that lets certain bots pass through but requires normal web surfers to log in.
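A browser switch of this kind can be sketched in a few lines. The following is only an illustration of the idea, not the code of any particular site: the function name `is_allowed`, the `BOT_MARKERS` tuple and the `logged_in` flag are hypothetical, and the check assumes the site looks at nothing but the User-Agent header.

```python
# Hypothetical sketch of a User-Agent based "browser switch".
# Assumption: the site trusts the User-Agent header and checks nothing else.

# Substrings that this sketch treats as crawler identifiers (assumed list).
BOT_MARKERS = ("googlebot",)


def is_allowed(headers: dict, logged_in: bool = False) -> bool:
    """Return True if the request may see the protected content."""
    user_agent = headers.get("User-Agent", "").lower()
    looks_like_bot = any(marker in user_agent for marker in BOT_MARKERS)
    # Logged-in users always pass; everyone else passes only if the
    # User-Agent claims to be a known crawler.
    return logged_in or looks_like_bot
```

Because the decision rests on a header that the client sets itself, such a switch cannot tell a real crawler from a visitor who merely claims to be one.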

This makes a site's content available to the Googlebot without releasing it to the general public at the same time. The background for such behavior is, of course, that some sites owe their popularity to search engines, or rather to being findable there, but derive their revenue from subscriptions, pay-per-view or similar models, which necessarily require a user login.

But even for such browser switches there are elegant ways for a normal user to get around them. You can, of course, manipulate the browser's HTTP headers directly, which is fairly easy with Firefox extensions. However, this requires some knowledge and a few manual steps.
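What manipulating the HTTP header means can be shown outside the browser as well. The sketch below, using Python's standard `urllib.request`, only builds a request object with a replaced User-Agent and does not send anything. The helper `build_request` and the target `https://example.com/` are placeholders for illustration; the User-Agent string is the one commonly published for the Googlebot.

```python
import urllib.request

# Commonly published Googlebot identifier; any string could be set here.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; "
    "+http://www.google.com/bot.html)"
)


def build_request(url: str, user_agent: str = GOOGLEBOT_UA) -> urllib.request.Request:
    """Build (but do not send) a request with a replaced User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder URL; the request is only constructed, not sent.
req = build_request("https://example.com/")
print(req.get_header("User-agent"))  # urllib stores the key as "User-agent"
```

Passing the object to `urllib.request.urlopen` would actually send it. A browser extension does the same thing in effect: it replaces the value of this one header before the request leaves the machine.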

The route via a web proxy such as Be-The-Bot is more convenient: you do not have to change anything in the browser, and access to the sites in question is "more or less" anonymous thanks to the proxy.


  1. As interesting as the theory is: it would take a legally naive site admin to grant Google access to genuinely protected content in this way. For genuinely protected content, Google offers the option of defining appropriate access credentials. Google can then log in itself and crawl the content.

    Nevertheless, an interesting article, and perhaps one or another site admin really does use only a switch of the "HTTP header" type. But I would advise everyone against imitating this. 😉

  2. Now that's a well-written article, thanks. It takes a moment to digest. In general I find the blog easy to follow.

