rogerbetagold.com

Search Engine Optimization




 

 

Roger Gonzales


How Inktomi Works

Inktomi is one of the most popular crawler-based search engines. However, unlike other crawler-based search engines such as Lycos or Alltheweb, it does not make its index available to the public through its own site.

Inktomi licenses other companies to use its search index. These companies are then able to provide search services to their visitors without having to build their own index.

Inktomi uses a robot named Slurp to crawl and index web pages.

Slurp – The Inktomi Robot
Slurp collects documents from the web to build a searchable index for search services using the Inktomi search engine, including Microsoft and HotBot. Some of the characteristics of Slurp are given below:
Frequency of accesses

Slurp accesses a given website at most once every five seconds on average. Because network delays are involved, the rate may appear slightly higher over short periods, but the average frequency generally remains below once per minute.
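To make the idea of a per-site access interval concrete, here is a minimal sketch of how a crawler could space out its requests to one host. It is not Inktomi's code. The class name HostThrottle, the five-second default and the injectable clock and sleep functions are all assumptions made for illustration.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Hypothetical helper: enforce a minimum delay between hits to one host."""

    def __init__(self, min_interval=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between hits (assumed value)
        self._clock = clock
        self._sleep = sleep
        self._last_hit = {}  # host -> time of the most recent request

    def wait(self, url):
        """Pause until the host of `url` may be contacted; return seconds waited."""
        host = urlsplit(url).netloc.lower()
        now = self._clock()
        last = self._last_hit.get(host)
        delay = 0.0
        if last is not None:
            # Time still owed before this host may be contacted again.
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        self._last_hit[host] = now + delay
        return delay
```

A crawler would call wait(url) just before each request. Each host is tracked separately, so a pause for one site does not slow down requests to another.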

robots.txt
Slurp obeys the 1994 Robots Exclusion Standard (RES). Where the 1996 proposed standard disambiguates the 1994 standard, Slurp follows the proposed standard.

Slurp will obey the first record in the robots.txt file with a User-Agent containing "Slurp". If there is no such record, it will obey the first entry with a User-Agent of "*".

The robots.txt file is discussed in detail later in this book.
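The record-selection rule described above can be sketched in a few lines. This is an illustration only, not Slurp's actual parser. The function names parse_records, select_record and is_allowed are hypothetical, the match on "Slurp" is assumed to be case-insensitive, and only User-agent and Disallow lines are handled.

```python
def parse_records(robots_txt):
    """Split robots.txt text into records of (user_agents, disallow_prefixes)."""
    records, agents, rules = [], [], []
    for raw in robots_txt.splitlines() + [""]:  # trailing "" flushes the last record
        if not raw.strip():  # a blank line ends the current record
            if agents:
                records.append((agents, rules))
            agents, rules = [], []
            continue
        line = raw.split("#", 1)[0].strip()
        if not line:  # comment-only line: ignored, not a record boundary
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agents.append(value)
        elif field == "disallow":
            rules.append(value)
    return records


def select_record(records, robot_name="slurp"):
    """Return the Disallow rules of the first matching record, else of the first '*'."""
    for agents, rules in records:
        if any(robot_name in agent.lower() for agent in agents):
            return rules
    for agents, rules in records:
        if "*" in agents:
            return rules
    return []  # no applicable record: nothing is disallowed


def is_allowed(path, rules):
    """A path is blocked if it starts with any non-empty Disallow prefix."""
    return not any(rule and path.startswith(rule) for rule in rules)
```

Under these assumptions, a file that lists a "*" record first and a "Slurp" record second still gets the "Slurp" record applied, because the named record takes priority over the wildcard.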

NOINDEX meta-tag

Slurp obeys the NOINDEX meta-tag. If you place

<META NAME="robots" CONTENT="noindex">

in the head of your web document, Slurp will retrieve the document, but it will not index the document or place it in the search engine's database.
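As a rough illustration of how a robot could honor this tag, the sketch below retrieves nothing itself. It only inspects HTML that has already been downloaded and reports whether the page may be indexed. The names RobotsMetaParser and should_index are hypothetical, and the comma-separated handling of the CONTENT value is an assumption about how the tag is read.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Hypothetical helper: look for <META NAME="robots" CONTENT="...noindex...">."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def should_index(html):
    """Return False when the document carries a robots NOINDEX meta-tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return not parser.noindex
```

Note that the page must still be fetched before the tag can be seen, which matches the behavior described above: the document is retrieved but not added to the index.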

Repeat downloads
In general, Slurp downloads only one copy of each file from your site during a given crawl. Occasionally the crawler is stopped and restarted, and it re-crawls pages it has recently retrieved. These re-crawls happen infrequently and should not be any cause for alarm.
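One common way for a crawler to fetch each file only once per crawl is to remember the URLs it has already queued. The sketch below shows that general technique. It is not a description of Inktomi's internals, and the class name CrawlFrontier and its methods are hypothetical.

```python
from collections import deque
from urllib.parse import urldefrag


class CrawlFrontier:
    """Hypothetical URL queue that hands out each URL at most once per crawl."""

    def __init__(self):
        self._seen = set()     # URLs already queued during this crawl
        self._queue = deque()  # URLs waiting to be downloaded

    def add(self, url):
        """Queue `url` unless it was seen before; return True if it was queued."""
        url, _fragment = urldefrag(url)  # "#top" does not name a different file
        if url in self._seen:
            return False
        self._seen.add(url)
        self._queue.append(url)
        return True

    def next(self):
        """Return the next URL to download, or None when the queue is empty."""
        return self._queue.popleft() if self._queue else None
```

In this sketch the record of seen URLs lives only in memory, so a fresh CrawlFrontier starts with no history. A crawler built this way would fetch recently retrieved pages again after a restart, which is one way the occasional re-crawl described above could arise.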

Searching the results
Documents that Slurp crawls are passed from websites to the Inktomi search engines immediately. They are then indexed and entered into the search database within a short time.

Following links
Slurp follows HREF links. It does not follow SRC links. This means that Slurp does not retrieve or index individual frames referred to by SRC links.
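The difference between HREF and SRC links is easy to see in a small link extractor. The sketch below is an illustration under stated assumptions, not Slurp's implementation. It collects HREF values from anchor tags only, ignores every SRC attribute, and uses the hypothetical names HrefCollector and extract_links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefCollector(HTMLParser):
    """Hypothetical helper: gather HREF targets of <a> tags, skip SRC attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":  # assumption: only anchor tags are treated as links
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all HREF links found in `html`."""
    collector = HrefCollector(base_url)
    collector.feed(html)
    return collector.links
```

Given a frameset page, this extractor returns nothing for the FRAME SRC entries. If you want a robot that works this way to reach the framed pages, they also need to be reachable through ordinary HREF links.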

Dynamic links
Slurp can crawl dynamic links and dynamically generated documents, but it does not do so by default. There are good reasons for this. Dynamically generated documents can make up infinite URL spaces, and dynamically generated links and documents can differ on every retrieval, so there is little use in indexing them.
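A crawler that skips dynamic links by default needs some test for what counts as dynamic. The text above does not say how Slurp decides, so the heuristic in this sketch is purely an assumption: a URL is treated as dynamic if it has a query string or a /cgi-bin/ path segment. The function names is_dynamic and crawlable are hypothetical.

```python
from urllib.parse import urlsplit


def is_dynamic(url):
    """Assumed heuristic: a query string or a /cgi-bin/ path marks a dynamic URL."""
    parts = urlsplit(url)
    return bool(parts.query) or "/cgi-bin/" in parts.path


def crawlable(urls, include_dynamic=False):
    """Filter a list of URLs, dropping dynamic ones unless explicitly included."""
    return [u for u in urls if include_dynamic or not is_dynamic(u)]
```

The include_dynamic flag mirrors the idea that the ability exists but is switched off by default. Leaving it off also keeps a crawler out of the infinite URL spaces mentioned above, such as a calendar script that can always generate a link to the next month.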

Content guidelines for Inktomi
The content guidelines and policies for Inktomi are listed below: first the content Inktomi indexes, then the content it avoids.

Inktomi indexes:
. Original and unique content of genuine value
. Pages designed primarily for humans, with search engine considerations secondary
. Hyperlinks intended to help people find interesting, related content, when applicable
. Metadata (including title and description) that accurately describes the contents of a Web page
. Good Web design in general

Inktomi avoids:
. Pages that harm accuracy, diversity or relevance of search results
. Pages dedicated to directing the user to another page
. Pages that have substantially the same content as other pages
. Sites with numerous, unnecessary virtual hostnames
. Pages in great quantity, automatically generated or of little value
. Pages using methods to artificially inflate search engine ranking
. The use of text that is hidden from the user
. Pages that give the search engine different content than what the end-user sees
. Excessively cross-linking sites to inflate a site's apparent popularity
. Pages built primarily for the search engines
. Misuse of competitor names
. Multiple sites offering the same content
. Pages that use excessive pop-ups, interfering with user navigation
. Pages that seem deceptive, fraudulent or provide a poor user experience

Inktomi's policies are designed to ensure that poor-quality pages do not degrade the user experience in any way. As with Inktomi's other guidelines, Inktomi reserves the right, at its sole discretion, to take any and all action it deems appropriate to ensure the quality of its index.

Inktomi encourages Web designers to focus most of their energy on the content of the pages themselves. Inktomi likes to see truly original text content that is intended to be of value to the public. The search engine algorithm is sophisticated and is designed to match the regular text in Web pages to search queries. Therefore, the text in your pages needs no special treatment.

Inktomi does not guarantee that your web page will appear at the top of the search results for any particular keyword.



Copyright © Roger Gonzales
About The Author


Roger Gonzales is the owner of this article. To learn more, visit Search Engine Optimization: http://www.rogerbetagold.com
You can also check for the latest SEO articles at my personal blog: http://www.rogerbetagold.blogspot.com
Free 8 Day mini-ecourse, You Can Make Your Living Online: http://www.protected-lessons.com
To order the ebook, click here: http://www.rogerbetagold.com/ebook
Anyone may republish this article electronically (in ebooks, blogs, ezines, websites, online article directories etc.)  or in print as long as the resource box above is included.

