Adding a Robots Text File

What is a Robots Text File?

A robots text file tells web bots/crawlers which parts of your site to crawl and which to skip. For instance, if you do not want certain content indexed, you can disallow access to the spiders. Well-behaved crawlers will honor the file and keep the specified pages or directories out of the search engine results pages.
You should upload the correctly formatted robots text file to the root of your HTML directory, so that it is reachable at http://www.mysite.com/robots.txt.

Here is an example:

User-Agent: *
Sitemap: http://www.mysite.com/sitemap.xml
Disallow: /page-I-want-to-block.html
Disallow: /directory-I-want-to-block

This example can be saved to a text file using Notepad (or any plain-text editor) and edited as necessary. It should be saved as robots.txt and uploaded to your HTML directory using File Manager or FTP.
You can also use Google Webmaster Tools to generate and test your robots.txt file. Log in and go to Site Configuration, then Crawler Access, then Test or Generate.
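If you would rather test from a script, Python's built-in urllib.robotparser module can check whether a given URL is blocked. This is just a quick sketch; swap your own domain in for the www.mysite.com placeholder:

from urllib.robotparser import RobotFileParser

# Point the parser at your live robots.txt (placeholder domain)
rp = RobotFileParser()
rp.set_url("http://www.mysite.com/robots.txt")
rp.read()  # download and parse the file

# can_fetch() returns True if the named bot may crawl the URL
print(rp.can_fetch("Googlebot", "http://www.mysite.com/page-I-want-to-block.html"))
print(rp.can_fetch("*", "http://www.mysite.com/sitemap.xml"))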

Here’s what the commands mean.

User-Agent: names the bot/crawler the rules apply to. They all have names, e.g. Google is Googlebot, Yahoo is Slurp, Bing is MSNBot. An asterisk (*) means the rules apply to all bots/crawlers.
Sitemap: directs the bot to your XML sitemap.
Disallow: asks bots not to access the listed pages and directories. You can also target a particular bot by name, as long as it follows the protocol; see the example below.
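For example, a record like the one below asks only one named bot (BadBot is a made-up name here) to stay out of the entire site, while leaving all other bots unaffected:

# Ask a single named bot (hypothetical name) to skip the whole site
User-agent: BadBot
Disallow: /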

For more information on the robots text file, please see http://www.robotstxt.org/.
