What is Robots.txt?
It is great when search engines visit your website frequently and index
your content, but there are often cases when having parts of your online
content indexed is not what you want. For example, if you have two versions
of a page (one for viewing in the browser and one for printing), you would
rather have the printing version excluded from crawling; otherwise you risk
a duplicate content penalty. Also, if you happen to have sensitive data on
your website that you do not want the world to see, you will prefer that
search engines do not index those pages (although in this case the only sure
way to keep sensitive data from being indexed is to keep it offline on a
separate machine). In addition, if you want to save some bandwidth by
excluding images, stylesheets and JavaScript from indexing, you also need a
way to tell spiders to keep away from these items.
One way to tell search engines which files and folders on your website to
avoid is to use the Robots metatag. However, since not all search engines
read metatags, the Robots metatag can simply go unnoticed. A better way to
inform search engines of your wishes is to use a robots.txt file.
What Is Robots.txt?
Robots.txt is a text (not HTML) file you put on your website to tell search
robots which pages you would like them not to visit. Robots.txt is by no
means mandatory for search engines, but search engines generally obey what
they are asked not to do. It is important to clarify that robots.txt is not
a way of preventing search engines from crawling your website (i.e. it is
not a firewall or a form of password protection). Putting up a robots.txt
file is something like putting a note saying "Please, do not enter" on an
unlocked door: you cannot stop thieves from coming in, but the good guys
will not open the door and enter. That is why we say that if you have really
sensitive data, it is too naïve to rely on robots.txt to protect it from
being indexed and displayed in search results.
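
The "note on an unlocked door" idea can be seen from the crawler's side. The
following is a minimal sketch, assuming a well-behaved crawler written in
Python that uses the standard library's urllib.robotparser module. The rules,
the paths /print/ and /private/, and the user agent name "ExampleBot" are
hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
RULES = """\
User-agent: *
Disallow: /print/
Disallow: /private/
"""


def build_parser(rules_text):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


def polite_fetch_allowed(parser, user_agent, url):
    """Return True if the rules allow this user agent to fetch the URL.

    Nothing enforces the answer: a crawler that never asks this question
    can still request the page.
    """
    return parser.can_fetch(user_agent, url)


parser = build_parser(RULES)
print(polite_fetch_allowed(parser, "ExampleBot", "http://mydomain.com/index.html"))        # True
print(polite_fetch_allowed(parser, "ExampleBot", "http://mydomain.com/print/index.html"))  # False
```

The check happens entirely inside the crawler, which is why robots.txt only
keeps out robots that choose to honor it.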
The location of robots.txt is very important. It must be in the main
directory, because otherwise user agents (search engines) will not be able
to find it. They do not search the whole website for a file named
robots.txt. Instead, they look first in the main directory (i.e.
http://mydomain.com/robots.txt), and if they do not find it there, they
simply assume that the website does not have a robots.txt file and index
everything they find along the way. So, if you do not put robots.txt in the
right place, do not be surprised that search engines index your whole
website.
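
To make the location rule concrete, here is a small sketch of how a crawler
might work out where to look for robots.txt, given any page on a site. The
function name robots_txt_url and the sample page URL are assumptions made for
this example; only the standard library's urllib.parse is used.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Build the robots.txt address for the site that hosts page_url.

    Only the scheme and host are kept. The path is always /robots.txt,
    no matter how deep in the site the original page lives.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://mydomain.com/articles/2010/page.html?print=1"))
# http://mydomain.com/robots.txt
```

A robots.txt file placed in a subdirectory, such as /articles/robots.txt,
would never be requested by a crawler that derives the address this way.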
The concept and structure of robots.txt were developed more than a decade
ago. If you are interested in learning more about it, visit
http://www.robotstxt.org/, or you can go straight to the Standard for Robot
Exclusion, because in this article we will deal only with the most important
aspects of a robots.txt file. Next we will continue with the structure of a
robots.txt file.