<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>WebSiteSecrets101.com &#187; askjeeves</title>
	<atom:link href="http://www.websitesecrets101.com/tag/askjeeves/feed" rel="self" type="application/rss+xml" />
	<link>http://www.websitesecrets101.com</link>
	<description>Web site secrets and SEO tips</description>
	<lastBuildDate>Sat, 20 Feb 2010 03:02:54 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Harnessing the Power of Robots.txt</title>
		<link>http://www.websitesecrets101.com/harnessing-the-power-of-robotstxt</link>
		<comments>http://www.websitesecrets101.com/harnessing-the-power-of-robotstxt#comments</comments>
		<pubDate>Wed, 05 Apr 2006 03:31:44 +0000</pubDate>
		<dc:creator>Bruce</dc:creator>
				<category><![CDATA[Web Hosting]]></category>
		<category><![CDATA[askjeeves]]></category>
		<category><![CDATA[linux systems]]></category>
		<category><![CDATA[public html directory]]></category>
		<category><![CDATA[reading resources]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[website secrets]]></category>

		<guid isPermaLink="false">http://www.websitesecrets101.com/8/harnessing-the-power-of-robotstxt/</guid>
		<description><![CDATA[<p>By Bruce Hearder</p>
<p>Once we have a website up and running, we need to make sure that all visiting search engines can access all the pages we want them to look at.</p>
<p>Sometimes we may want search engines not to index certain parts of the site, or we may want to ban particular search engines from the site altogether.</p>
<p>This is where a simple little two-line text file called robots.txt comes in.</p>
<p>Robots.txt resides in your website&#8217;s main directory (on many Linux hosts this is your /public_html/ directory) and looks something like the following:</p>
<p>User-agent: *<br />
Disallow:</p>
<p>The first line names the &#8220;bot&#8221; the rule applies to (* means every bot). The second line lists which parts of the site that bot is not allowed to visit. An empty Disallow, as shown here, blocks nothing, so the whole site is open.</p>
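<p>As a minimal sketch of the &#8220;which parts&#8221; case, the record below keeps every bot out of two directories while leaving the rest of the site open. The directory names /cgi-bin/ and /private/ are hypothetical placeholders, not paths taken from this site, and lines starting with # are comments:</p>
<p># Hypothetical paths, for illustration only<br />
User-agent: *<br />
Disallow: /cgi-bin/<br />
Disallow: /private/</p>
<p>Each Disallow value is matched against the start of the URL path, so a rule for /private/ also covers everything beneath that directory.</p>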
<p>If you want to handle multiple bots, simply repeat the above lines once for each bot.<br />
For example:</p>
<p>User-agent: googlebot<br />
Disallow:</p>
<p>User-agent: askjeeves<br />
Disallow: /</p>
<p>This will allow Google (user-agent name Googlebot) to visit every page and directory, while banning Ask Jeeves from the site completely.<br />
To find a &#8220;reasonably&#8221; up-to-date list of robot user-agent names, visit http://www.robotstxt.org/wc/active/html/index.html</p>
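<p>One thing the two-record example leaves open is every other bot: a bot that matches none of the User-agent records is not restricted at all. If that is not what you want, a catch-all record can be added, roughly as in this sketch. The /tmp/ path is a hypothetical placeholder:</p>
<p>User-agent: googlebot<br />
Disallow:</p>
<p>User-agent: askjeeves<br />
Disallow: /</p>
<p># Every bot not named above; /tmp/ is a made-up path<br />
User-agent: *<br />
Disallow: /tmp/</p>
<p>A bot that has its own record follows that record rather than the * record, so in this sketch Google would still be free to visit /tmp/.</p>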
<p>Even if you want to allow every robot to index every page of your site, it&#8217;s still advisable to put a robots.txt file on your site.<br />
It will stop your error logs from filling up with &#8220;not found&#8221; entries from search engines requesting a robots.txt file that doesn&#8217;t exist.</p>
<p>For more information on robots.txt, see the full list of resources at http://www.websitesecrets101.com/robotstxt-further-reading-resources/</p>
<p>Find out more about website secrets, tips, tricks and other ideas at Bruce Hearder&#8217;s <a href="http://www.WebsiteSecrets101.com" target="_self">Website Secrets 101</a> website. Visit the site now at<br />
<a href="http://www.WebsiteSecrets101.com">http://WebsiteSecrets101.com</a></p>
]]></description>
		<wfw:commentRss>http://www.websitesecrets101.com/harnessing-the-power-of-robotstxt/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
