What is Spidering?
A search engine robot is called a “spider” because when it visits a page, it follows all the links on that page, an action that resembles a many-legged spider moving across a web. The robot follows the links to see whether the pages go where they say they go, and it reads each page's meta tags, along with its content, to learn what the page is about. Robots usually follow links at least one level down when they index a page, and many return later to do a “deep crawl” and index every page they find. For this reason, if you submit one page of a site to a search engine that crawls, every page reachable by links from it will usually be indexed eventually.
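The link-following behavior described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the names `LinkCollector`, `extract_links`, and `crawl` are made up for this example, and `fetch` is an assumed callable that returns the HTML for a URL (here it reads from an in-memory dictionary so nothing touches the network).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the target of every <a href="..."> on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def crawl(start_url, fetch, max_depth=1):
    """Visit start_url, then follow links max_depth levels down.

    `fetch` is a hypothetical callable: URL in, HTML text out.
    Returns the visited URLs in the order they were fetched.
    """
    visited = []
    seen = {start_url}
    frontier = [start_url]
    for _ in range(max_depth + 1):
        next_frontier = []
        for url in frontier:
            html = fetch(url)
            visited.append(url)
            for link in extract_links(url, html):
                if link not in seen:  # never queue the same page twice
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return visited


# A tiny made-up site, held in memory for the demonstration.
site = {
    "http://example.com/": '<a href="/a.html">A</a> <a href="b.html">B</a>',
    "http://example.com/a.html": '<a href="/c.html">C</a>',
    "http://example.com/b.html": '<a href="/">Home</a>',
    "http://example.com/c.html": "",
}


def fetch(url):
    return site.get(url, "")


shallow = crawl("http://example.com/", fetch, max_depth=1)
deep = crawl("http://example.com/", fetch, max_depth=2)
```

With `max_depth=1` the sketch visits the submitted page and the two pages it links to; raising the depth plays the role of the later “deep crawl” and also reaches `c.html`, which is only linked from a second-level page.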
If you ever want a page not to be indexed, you can add a line inside the HEAD section (enclosed in angle brackets, of course) like this: <META NAME="robots" CONTENT="noindex,nofollow">. You can also use the variations index,nofollow or noindex,follow, depending on what you want the spider to do.
If the tag is absent, the default is that the robot indexes the page and follows its links. A “robots.txt” file can also be used for a whole domain, a subject too broad to go into here. You don’t need to know about it if you have a free site without its own domain, since you usually can’t use it anyway (the file only works when placed at the root of a domain).
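To make the robots meta tag rules concrete, here is a minimal Python sketch of how a spider might read the tag and fall back to the default when it is missing. The names `RobotsMetaParser` and `robots_directives` are invented for this example, and real spiders may recognize more values than the ones discussed here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Remembers the CONTENT of a <META NAME="robots"> tag, if any."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            self.content = attributes.get("content") or ""


def robots_directives(html):
    """Return (may_index, may_follow) for a page's HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # Default when no tag is present: index the page and follow its links.
    may_index, may_follow = True, True
    if parser.content is not None:
        values = {v.strip().lower() for v in parser.content.split(",")}
        if "noindex" in values:
            may_index = False
        if "nofollow" in values:
            may_follow = False
    return may_index, may_follow


blocked = robots_directives(
    '<head><META NAME="robots" CONTENT="noindex,nofollow"></head>'
)
untagged = robots_directives("<head><title>My page</title></head>")
```

Here `blocked` comes out as `(False, False)` and `untagged` as `(True, True)`, matching the default described above.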



