<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Tell the Robots Where Not to Go</title>
	<atom:link href="http://www.lgr.ca/blog/2007/03/tell-the-robots-where-not-to-go.html/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.lgr.ca/blog/2007/03/tell-the-robots-where-not-to-go.html</link>
	<description>Bringing the Internet into Focus!</description>
	<lastBuildDate>Thu, 15 Dec 2011 22:47:36 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
	<item>
		<title>By: LGR</title>
		<link>http://www.lgr.ca/blog/2007/03/tell-the-robots-where-not-to-go.html#comment-7847</link>
		<dc:creator>LGR</dc:creator>
		<pubDate>Mon, 26 Sep 2011 16:51:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.blog2.lgr.ca/2007/03/tell-the-robots-where-not-to-go.html#comment-7847</guid>
		<description>Very true! My robots.txt file is actually out of date, since there are folders listed that no longer exist. In theory one could list a folder and use it to trap hackers looking for exploits. I am sure a quick Google search would find something.</description>
		<content:encoded><![CDATA[<p>Very true! My robots.txt file is actually out of date, since there are folders listed that no longer exist. In theory one could list a folder and use it to trap hackers looking for exploits. I am sure a quick Google search would find something.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex</title>
		<link>http://www.lgr.ca/blog/2007/03/tell-the-robots-where-not-to-go.html#comment-7842</link>
		<dc:creator>Alex</dc:creator>
		<pubDate>Mon, 26 Sep 2011 16:26:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.blog2.lgr.ca/2007/03/tell-the-robots-where-not-to-go.html#comment-7842</guid>
		<description>It&#039;s worth pointing out (or reminding people) that robots.txt is always open and visible to anyone who puts that filename in the URL. (So I could enter www.lgr.ca/robots.txt and see the robots.txt file.)

So don&#039;t try to hide important or super-secret files and folders by listing them in robots.txt. It&#039;s the first place hackers look to find what&#039;s not indexed.</description>
		<content:encoded><![CDATA[<p>It&#8217;s worth pointing out (or reminding people) that robots.txt is always open and visible to anyone who puts that filename in the URL. (So I could enter <a href="http://www.lgr.ca/robots.txt" rel="nofollow">http://www.lgr.ca/robots.txt</a> and see the robots.txt file.)</p>
<p>So don&#8217;t try to hide important or super-secret files and folders by listing them in robots.txt. It&#8217;s the first place hackers look to find what&#8217;s not indexed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: WordPress Ninja Affiliate for Free - LGR Internet Solutions</title>
		<link>http://www.lgr.ca/blog/2007/03/tell-the-robots-where-not-to-go.html#comment-1058</link>
		<dc:creator>WordPress Ninja Affiliate for Free - LGR Internet Solutions</dc:creator>
		<pubDate>Wed, 15 Jul 2009 18:49:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.blog2.lgr.ca/2007/03/tell-the-robots-where-not-to-go.html#comment-1058</guid>
		<description>[...] BlogMechanics KeywordLink allow you to do this. You can also add your custom URL&#8217;s to your robots.txt file to prevent the search engine spiders from following your affiliate links. For example I could [...]</description>
		<content:encoded><![CDATA[<p>[...] BlogMechanics KeywordLink allow you to do this. You can also add your custom URL&#8217;s to your robots.txt file to prevent the search engine spiders from following your affiliate links. For example I could [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: LGR</title>
		<link>http://www.lgr.ca/blog/2007/03/tell-the-robots-where-not-to-go.html#comment-84</link>
		<dc:creator>LGR</dc:creator>
		<pubDate>Mon, 02 Apr 2007 19:18:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.blog2.lgr.ca/2007/03/tell-the-robots-where-not-to-go.html#comment-84</guid>
		<description>It is just a little too metaphysical for me these days, Rhett.</description>
		<content:encoded><![CDATA[<p>It is just a little too metaphysical for me these days, Rhett.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rhett</title>
		<link>http://www.lgr.ca/blog/2007/03/tell-the-robots-where-not-to-go.html#comment-83</link>
		<dc:creator>Rhett</dc:creator>
		<pubDate>Mon, 02 Apr 2007 18:22:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.blog2.lgr.ca/2007/03/tell-the-robots-where-not-to-go.html#comment-83</guid>
		<description>What about the stars, Lee? :&#039;(</description>
		<content:encoded><![CDATA[<p>What about the stars, Lee? :&#8217;(</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: LGR</title>
		<link>http://www.lgr.ca/blog/2007/03/tell-the-robots-where-not-to-go.html#comment-82</link>
		<dc:creator>LGR</dc:creator>
		<pubDate>Sat, 31 Mar 2007 20:39:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.blog2.lgr.ca/2007/03/tell-the-robots-where-not-to-go.html#comment-82</guid>
		<description>Rhett, yes, usually the root web folder is called www or public_html. I know for a fact that is what it is called on your server. For WordPress, I ran across a good post the other day at &lt;a HREF=&quot;http://www.dailyblogtips.com/create-a-robotstxt-file/&quot; REL=&quot;nofollow&quot;&gt;Daily Blog Tips&lt;/a&gt; that had a good sample robots.txt file. &lt;br/&gt;&lt;br/&gt;As for where you don&#039;t want the bots to go: look at all the files and folders in your www folder and ask yourself, &quot;Do I want search engines to index this folder?&quot; If the answer is no, then disallow it. This is why I often disallow image folders. I don&#039;t need the images I use to build a website to be indexed and stored in Google Images, for example. A photo gallery is different. &lt;br/&gt;&lt;br/&gt;Remember though, when you disallow a folder in the robots.txt file, well-behaved robots will ignore the folder, but that does not mean people will, since anyone can read the robots.txt file. Often the very people you don&#039;t want reading the file will look at it to see if there are folders listed that might give them some kind of access to your site. If you really don&#039;t want people or robots to have access to a folder, I would password protect it using .htaccess. If you use cPanel, this can be done easily under the Web Protect section.&lt;br/&gt;&lt;br/&gt;As for the MP3 question, as long as you own the copyright to those works, you could put them in a folder and disallow robots from reading it. That will stop compliant robots from looking in the folder. It does not mean that people won&#039;t find them and download them.&lt;br/&gt;&lt;br/&gt;If the MP3s are others&#039; copyrighted work, don&#039;t put them on the web. Many hosts will remove your account if you have others&#039; copyrighted material on it.&lt;br/&gt;&lt;br/&gt;What&#039;s in the cgi-bin folder? These days not much, since PHP has become so popular. It was used a lot more in the past for Perl scripts and other programs. There also might be reasons not to block bots from that folder. If you actually used a script in that folder that output data, you might want to allow bots access. I just disallow it since it is not used much anymore.</description>
		<content:encoded><![CDATA[<p>Rhett, yes, usually the root web folder is called www or public_html. I know for a fact that is what it is called on your server. For WordPress, I ran across a good post the other day at <a HREF="http://www.dailyblogtips.com/create-a-robotstxt-file/">Daily Blog Tips</a> that had a good sample robots.txt file. </p>
<p>As for where you don&#8217;t want the bots to go: look at all the files and folders in your www folder and ask yourself, &#8220;Do I want search engines to index this folder?&#8221; If the answer is no, then disallow it. This is why I often disallow image folders. I don&#8217;t need the images I use to build a website to be indexed and stored in Google Images, for example. A photo gallery is different. </p>
<p>Remember though, when you disallow a folder in the robots.txt file, well-behaved robots will ignore the folder, but that does not mean people will, since anyone can read the robots.txt file. Often the very people you don&#8217;t want reading the file will look at it to see if there are folders listed that might give them some kind of access to your site. If you really don&#8217;t want people or robots to have access to a folder, I would password protect it using .htaccess. If you use cPanel, this can be done easily under the Web Protect section.</p>
<p>As for the MP3 question, as long as you own the copyright to those works, you could put them in a folder and disallow robots from reading it. That will stop compliant robots from looking in the folder. It does not mean that people won&#8217;t find them and download them.</p>
<p>If the MP3s are others&#8217; copyrighted work, don&#8217;t put them on the web. Many hosts will remove your account if you have others&#8217; copyrighted material on it.</p>
<p>What&#8217;s in the cgi-bin folder? These days not much, since PHP has become so popular. It was used a lot more in the past for Perl scripts and other programs. There also might be reasons not to block bots from that folder. If you actually used a script in that folder that output data, you might want to allow bots access. I just disallow it since it is not used much anymore.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rhett</title>
		<link>http://www.lgr.ca/blog/2007/03/tell-the-robots-where-not-to-go.html#comment-81</link>
		<dc:creator>Rhett</dc:creator>
		<pubDate>Sat, 31 Mar 2007 17:27:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.blog2.lgr.ca/2007/03/tell-the-robots-where-not-to-go.html#comment-81</guid>
		<description>Lee, do you ever wonder if, when I leave my apartment and look up at the stars and you leave your house and look up at the stars, we are gazing upon the same one, somehow connected through time and space... :D&lt;br/&gt;&lt;br/&gt;I swear I was just reading about this the other day and now it&#039;s here. Well, I have a quick question. I think what&#039;s going on here is that you are talking to a certain audience that knows one more thing than I do, so I can&#039;t quite make sense of what you are saying. &lt;br/&gt;&lt;br/&gt;If I create said text file, do I just put it in the &quot;www&quot; folder? Would this be a way to keep bots from finding, say, an audio folder with MP3s so they don&#039;t get ripped off? (This question might be too large.) How do I know where I want bots to go and where I don&#039;t? What&#039;s in the cgi-bin?</description>
		<content:encoded><![CDATA[<p>Lee, do you ever wonder if, when I leave my apartment and look up at the stars and you leave your house and look up at the stars, we are gazing upon the same one, somehow connected through time and space&#8230; <img src='http://cdn.lgr.ca/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
<p>I swear I was just reading about this the other day and now it&#8217;s here. Well, I have a quick question. I think what&#8217;s going on here is that you are talking to a certain audience that knows one more thing than I do, so I can&#8217;t quite make sense of what you are saying. </p>
<p>If I create said text file, do I just put it in the &#8220;www&#8221; folder? Would this be a way to keep bots from finding, say, an audio folder with MP3s so they don&#8217;t get ripped off? (This question might be too large.) How do I know where I want bots to go and where I don&#8217;t? What&#8217;s in the cgi-bin?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

