Controlling Web Robots
From: Wes

Comments

The quick way to prevent robots from visiting your site is to put these two lines into the /robots.txt file on your server:

    User-agent: *
    Disallow: /

By writing a structured text file you can indicate to robots that certain parts of your server are off-limits to some or all robots. It is best explained with an example:

    # this is the /robots.txt file for http://www.blahblah.com/

    User-agent: webthanger
    Disallow:

    User-agent: lycra
    Disallow: /

    User-agent: *
    Disallow: /tmp
    Disallow: /log

Lines starting with '#' are comments.

The first paragraph specifies that the robot called 'webthanger' has nothing disallowed: it may go anywhere.

The second paragraph indicates that the robot called 'lycra' has all relative URLs starting with '/' disallowed. Because all relative URLs on a server start with '/', this means the entire site is closed off.

The third paragraph indicates that all other robots should not visit URLs starting with /tmp or /log. Note that the '*' is a special token meaning "any other User-agent"; you cannot use wildcard patterns or regular expressions in either User-agent or Disallow lines.

Two common errors:

- Wildcards are _not_ supported: instead of 'Disallow: /tmp/*' just say 'Disallow: /tmp'.
- You shouldn't put more than one path on a Disallow line (this may change in a future version of the spec).

There is a new standard for using HTML META tags to keep robots out of your documents. The basic idea is that if you include a tag like:

    <META NAME="ROBOTS" CONTENT="NOINDEX">

in your HTML document, that document won't be indexed. If you do:

    <META NAME="ROBOTS" CONTENT="NOFOLLOW">

the links in that document will not be parsed by the robot.

A short, plain-language description of the document may be provided to search engines by using the 'description' meta tag:

    <META NAME="description" CONTENT="Citrus fruit wholesaler.">

Keywords may be provided to search engines, in addition to words from the title and document body, by using the 'keywords' meta tag:
    <META NAME="keywords" CONTENT="oranges, lemons, limes">
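
If you want to check how a rule set like the /robots.txt example in this post behaves, here is a minimal sketch using Python's standard urllib.robotparser module. It is only an illustration, not part of the original post: the Disallow lines are filled in from the prose description of the three paragraphs, and the helper names build_parser and allowed, the robot name 'somebot', and the test paths are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Example rules as described in the post. The Disallow lines are an
# assumption reconstructed from the prose description of each paragraph.
ROBOTS_TXT = """\
# this is the /robots.txt file for http://www.blahblah.com/

User-agent: webthanger
Disallow:

User-agent: lycra
Disallow: /

User-agent: *
Disallow: /tmp
Disallow: /log
"""


def build_parser(text):
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def allowed(parser, agent, path):
    """Return True if `agent` may fetch `path` on the example host."""
    return parser.can_fetch(agent, "http://www.blahblah.com" + path)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    # 'somebot' is a hypothetical robot that falls under the '*' record.
    checks = [
        ("webthanger", "/tmp/scratch.html"),
        ("lycra", "/index.html"),
        ("somebot", "/tmp/scratch.html"),
        ("somebot", "/index.html"),
    ]
    for agent, path in checks:
        print(agent, path, allowed(rules, agent, path))
```

Note that 'Disallow: /log' is matched as a plain path prefix, so a URL such as /logs/access is closed off as well, with no wildcard needed.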
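
To see which ROBOTS META directives a page carries, something like the following rough sketch could be used. It relies only on Python's standard html.parser module. The class name RobotsMetaParser and the function robots_directives are invented for this example, and splitting the CONTENT value on commas is an assumption about how several directives would be combined in one tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <META NAME="ROBOTS" CONTENT="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Assumption: several directives may be comma-separated.
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of lowercased ROBOTS directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX"></head></html>'
    found = robots_directives(page)
    print("index this page:", "noindex" not in found)
    print("follow its links:", "nofollow" not in found)
```

The 'description' and 'keywords' tags are ignored by this sketch, since they describe the document and do not restrict robots.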