What is a ‘robots.txt’ file?

Do you have private folders or membership areas on your website? Then you need a ‘robots.txt’ file to keep search engines from crawling those folders. If search engines crawl your private folders, the pages in them can end up in the search engine results.

Robots.txt is a simple text file that tells the search engine robots not to crawl certain folders and pages of your web site.

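When a search engine spider comes to your site, it first looks for a ‘robots.txt’ file in your root directory. If the file is there, the spider reads it and follows the instructions. The major search engines obey robots.txt rules, but compliance is voluntary: badly behaved robots can ignore the file, so it is not a substitute for password protection.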
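To make the "root directory" point concrete, here is a minimal Python sketch of how the robots.txt address can be worked out from any page address on a site. The function name robots_url and the sample page address are made up for illustration; real crawlers have their own implementations.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; robots.txt always sits at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.yourdomain.com/privatefolder/page.html"))
# prints http://www.yourdomain.com/robots.txt
```

Whatever page the spider starts from, the file it asks for is the one at the top level of the domain, which is why a robots.txt placed in a subfolder has no effect.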

How to create it?

1. Open Notepad or any other plain text editor (not an HTML editor).

2. If you have a folder named ‘privatefolder’ that you don’t want indexed in search engines (SEs), type this:

User-agent: *
Disallow: /privatefolder/

‘*’ means all search engine robots. The second line tells the spiders that they are not allowed to crawl ‘privatefolder’.

If you only want to keep Yahoo out of ‘privatefolder’, name its robot (Slurp) instead:

User-agent: Slurp
Disallow: /privatefolder/
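If you have Python installed, you can try out a per-robot rule like the one above before uploading anything. This is only a sketch using the urllib.robotparser module from Python's standard library; the helper name is_allowed and the page path are made up for illustration, and a real search engine's own parser may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: Slurp
Disallow: /privatefolder/
"""


def is_allowed(rules: str, robot: str, path: str) -> bool:
    """Return True if the given robot may fetch path under these rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(robot, path)


print(is_allowed(RULES, "Slurp", "/privatefolder/page.html"))      # False
print(is_allowed(RULES, "Googlebot", "/privatefolder/page.html"))  # True
```

Only Slurp is kept out here. A robot that is not named in the file, and has no ‘*’ rule to fall back on, is allowed everywhere.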

Here is an example robots.txt file:

—————————————
# For domain: yourdomain.com

User-agent: *
Disallow: /cgi-bin/
Disallow: /privatefolder/
Disallow: /downloads/
Disallow: /important/members.html

# Edited on 29.01.09
—————————————

Lines that begin with # are comments, kept for your own reference only. Search engine spiders ignore them.
* stands for all search engine robots.
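The example file above can be run through Python's standard urllib.robotparser module to see which pages it blocks. This is a rough sketch: the robot name ExampleBot and the sample paths are invented for the test, and the module is Python's reading of the rules, not any search engine's.

```python
from urllib.robotparser import RobotFileParser

EXAMPLE = """\
# For domain: yourdomain.com

User-agent: *
Disallow: /cgi-bin/
Disallow: /privatefolder/
Disallow: /downloads/
Disallow: /important/members.html

# Edited on 29.01.09
"""

parser = RobotFileParser()
parser.parse(EXAMPLE.splitlines())


def blocked(paths):
    """Return the paths that a robot covered by '*' may not fetch."""
    return [p for p in paths if not parser.can_fetch("ExampleBot", p)]


sample_paths = [
    "/index.html",
    "/cgi-bin/form.cgi",
    "/downloads/file.zip",
    "/important/members.html",
    "/important/index.html",
]
print(blocked(sample_paths))
# prints ['/cgi-bin/form.cgi', '/downloads/file.zip', '/important/members.html']
```

Note that the comment lines make no difference to the result, and that the last Disallow line blocks a single page while the rest of the ‘important’ folder stays open.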

3. Save your text file as robots.txt on your computer. Upload it in ASCII mode to your root directory. Its address will then be:

http://www.yourdomain.com/robots.txt

4. After uploading, check the file for syntax errors at:

http://www.google.com/webmasters/tools/

Register there for a Google webmaster tools account. Read this article:
http://www.google.com/support/webmasters/bin/answer.py?answer=35237

Google webmaster tools are very useful for improving your web site’s search engine performance.
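Before uploading, a quick local check can catch obvious typing mistakes. The sketch below is not Google's checker and does far less: it only confirms that every rule line has a colon and a commonly used field name. The function name find_problems and the KNOWN_FIELDS list are assumptions made for this example.

```python
# Field names commonly seen in robots.txt files (an assumed list, not an official one).
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def find_problems(text: str) -> list[str]:
    """Return a list of human-readable problems found in robots.txt text."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue  # blank or comment-only line
        field, sep, _value = line.partition(":")
        if not sep:
            problems.append(f"line {number}: missing ':'")
        elif field.strip().lower() not in KNOWN_FIELDS:
            problems.append(f"line {number}: unknown field {field.strip()!r}")
    return problems


print(find_problems("User-agent: *\nDisalow: /privatefolder/\n"))
# prints ["line 2: unknown field 'Disalow'"]
```

An empty list means nothing obvious was found. It does not mean the search engines will read the file the way you intended, so the online check is still worth doing.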

5. Tips:

* Never keep these lines in your file. They ban all search engines from crawling your site, so you will never be listed in the search results:

User-agent: *
Disallow: /

* Always upload the file to your root directory. Its address should be:
http://www.yourdomain.com/robots.txt
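The first tip is easy to get wrong by accident, so here is a small Python sketch that warns when a set of rules shuts out every robot. It uses the standard urllib.robotparser module and treats a blocked home page as the sign of a site-wide ban. The function name bans_whole_site and the robot name ExampleBot are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def bans_whole_site(rules: str, robot: str = "ExampleBot") -> bool:
    """Return True if these rules stop the robot from fetching the home page."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # A "Disallow: /" rule matches every path, including the home page itself.
    return not parser.can_fetch(robot, "/")


print(bans_whole_site("User-agent: *\nDisallow: /\n"))                # True
print(bans_whole_site("User-agent: *\nDisallow: /privatefolder/\n"))  # False
```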

 

 

Web Site Protection Tips:

* Step One: Protect folders with robots.txt - Stop search engine robots from indexing your private folders.
* Step Two: Protect folders with an index file - Keep an INDEX file.
* Step Three: Protect folders with permissions - Set directory and script file permissions.
* Step Four: Protect your email addresses - Stop email robots from getting your email address links.
* Step Five: Protect your downloads - Use password folder protection.
* Step Six: Protect your affiliate links - Use affiliate link cloaking to protect or hide your links.
* Step Seven: Protect images - Stop image hotlinking.

Author: Radhika (c)
http://www.webmasters-central.com/


