rogerbetagold.com

Search Engine Optimization

Roger Gonzales

Order the Book - Articles - Partners - SiteMap - Contact - Directory - Affiliates - Home

What AltaVista Doesn't Index

AltaVista doesn't index everything. In fact, features that Web designers may add to sites at great expense may block crawlers, meaning that those pages will never be indexed and never be found through search engines. As a result, those sites may end up spending far more on promotion than they would have had to otherwise.

Here are some kinds of pages AltaVista doesn't index. The list highlights the importance of using plain text for your web pages.

First, sites that require any kind of registration or password lock out AltaVista. Keep in mind that a web crawler cannot fill out a form of any kind. If a visitor has to fill out a form to get to the next page, the crawler halts right there. If you would like to gather information about your users or members but would also like your pages to be indexed, make the registration optional.

Similarly, the AltaVista crawler cannot get content from a database, because it cannot fill out a form. If the content of your database is largely text, you might consider creating plain text static HTML pages with that same content, so it can be indexed and found.
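
One way to act on this advice is to export each database record as its own static HTML page. The following is a minimal sketch, not a definitive implementation: the record layout (slug, title, body), the page template, and the function names are all assumptions made for illustration.

```python
import html
from pathlib import Path

# Hypothetical page template; a real site would use its own layout.
PAGE_TEMPLATE = """<html>
<head><title>{title}</title></head>
<body>
<h1>{title}</h1>
<p>{body}</p>
</body>
</html>
"""


def render_static_page(title: str, body: str) -> str:
    """Return a plain HTML page for one database record."""
    # Escape the text so characters such as & and < show up as text, not markup.
    return PAGE_TEMPLATE.format(title=html.escape(title), body=html.escape(body))


def export_pages(records, out_dir):
    """Write one static .html file per (slug, title, body) record."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, title, body in records:
        path = out / f"{slug}.html"
        path.write_text(render_static_page(title, body), encoding="utf-8")
        written.append(path)
    return written
```

Each exported file is ordinary HTML with the record's text in the page body, so a crawler can reach it by following a normal link instead of submitting a form.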

Dynamic pages also block AltaVista spiders. While it's great to give visitors to your site unique experiences, tailored to their needs, the techniques you use to do that could stop most search engines, including AltaVista, from indexing your content and hence could greatly reduce your potential traffic. Dynamically generated pages are created on the fly from a variety of elements held in databases. When the AltaVista crawler arrives at such a page, it captures the content but halts immediately and will not follow the links, because it sees ahead of it an infinite number of pages -- a black hole that would bring it to a crash.

Active Server Pages (.asp) with question marks in their URLs (indicating that the page is a script for the construction of a page, rather than just static content) fall into this category.
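
To find which of your own URLs fall into this category, you can check each one for a question mark. This is a rough sketch under that single assumption; the function name is made up for illustration, the example.com addresses are placeholders, and a real crawler's rules may well be more involved.

```python
def has_question_mark(url: str) -> bool:
    """Return True if the URL contains a '?' before any '#fragment'."""
    # The fragment never reaches the server, so ignore anything after '#'.
    without_fragment = url.split("#", 1)[0]
    return "?" in without_fragment


urls = [
    "http://www.example.com/catalog.asp?id=42",
    "http://www.example.com/catalog/42.html",
]
# Keep only the URLs that look like script-generated pages.
dynamic_urls = [u for u in urls if has_question_mark(u)]
```

Here `dynamic_urls` ends up holding only the first address, the one with `?id=42`.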


If you have information inside frames, that will probably prove to be a hindrance, but it is not an absolute barrier. AltaVista indexes the outside of the frame as a distinct page. It will also index each pane of the frame window as a separate page. That means that if the content matching a query is in a pane, visitors clicking on that link will see the pane and only the pane -- not the full page as it was designed. So if you want visitors from search engines to experience your pages the way they were intended to be seen, you should have non-frames as well as frames versions of those pages, and submit the non-frames versions with Add URL.

AltaVista also can't index text that is embedded in graphics. Search engines simply cannot "see" the text unless the Webmaster has put ALT text behind the picture, describing it and listing those important words. But pictures, as pictures, can be indexed for Image search at AltaVista.
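
A quick way to check your own pages is to list which images carry ALT text and which do not. The sketch below uses Python's standard html.parser module; the class name and the sample file names are assumptions made for illustration.

```python
from html.parser import HTMLParser


class AltTextAudit(HTMLParser):
    """Collect ALT text from <img> tags and note images that have none."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []    # ALT strings a text crawler could read
        self.missing_alt = []  # src values of images with no usable ALT text

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = (attributes.get("alt") or "").strip()
        if alt:
            self.alt_texts.append(alt)
        else:
            self.missing_alt.append(attributes.get("src") or "")


audit = AltTextAudit()
audit.feed('<img src="logo.gif" alt="Acme Widgets logo"><img src="banner.gif">')
audit.close()
```

After this runs, `audit.alt_texts` holds the one ALT string and `audit.missing_alt` names `banner.gif` as an image whose text a crawler cannot see.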

Text that appears in multi-media files (audio and video) cannot be indexed. But those same files can be indexed at AltaVista for Multimedia search.

Information that is generated by Java applets or in XML coding cannot be indexed.

Acrobat files cannot be indexed either. But technology exists that will enable AltaVista to convert those files to indexable form.

Exceptionally large pages also present a problem at AltaVista. As a pragmatic compromise, intended to help optimize performance, AltaVista fully indexes only the first 64 Kbytes of text on any single page. It will harvest the hyperlinks from the whole document for following up later, but it will only index the first 64 Kbytes. So if you want to post an entire book, it's best to break it up into chapters, and then all the text can be indexed.
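
Breaking a long text into pieces that each stay under the limit can be automated. The following is a minimal sketch that groups paragraphs into parts of at most 64 Kbytes of UTF-8 text. It assumes the limit applies to the text alone; the function name is made up, and whether markup counts toward the limit is not stated here.

```python
INDEX_LIMIT_BYTES = 64 * 1024  # the 64 Kbyte figure quoted above


def split_into_parts(paragraphs, limit=INDEX_LIMIT_BYTES):
    """Group paragraphs into parts whose UTF-8 size stays within limit.

    A single paragraph larger than limit still becomes its own oversized part.
    """
    parts, current, current_size = [], [], 0
    for paragraph in paragraphs:
        # Count 2 extra bytes for the blank line that joins paragraphs.
        size = len(paragraph.encode("utf-8")) + 2
        if current and current_size + size > limit:
            parts.append("\n\n".join(current))
            current, current_size = [], 0
        current.append(paragraph)
        current_size += size
    if current:
        parts.append("\n\n".join(current))
    return parts
```

Splitting at real chapter boundaries usually reads better for visitors; a size-based split like this is a fallback for when chapters themselves run long.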

Comments, such as <!--change this every Friday-->, aren't indexed at all. Those are intended as private communications, not viewable by Web site visitors, except by using View/Page Source.
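
To see the difference between page text and comments the way a text-only reader would, you can separate the two with Python's standard html.parser module. This is only an illustrative sketch (the class name is an assumption) and does not claim to reproduce AltaVista's own parser.

```python
from html.parser import HTMLParser


class TextAndComments(HTMLParser):
    """Separate visible page text from HTML comments."""

    def __init__(self):
        super().__init__()
        self.text = []      # text between tags, which a crawler could index
        self.comments = []  # comment contents, kept apart from the text

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())

    def handle_comment(self, data):
        self.comments.append(data.strip())


parser = TextAndComments()
parser.feed("<p>Weekly specials</p><!--change this every Friday-->")
parser.close()
```

The comment lands in `parser.comments` and never appears in `parser.text`, so any keywords placed only in comments contribute nothing to the indexable text.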

Also, consider technical factors. If a site has a slow connection, it might time out for the crawler. Very complex pages, too, may time out before the crawler can harvest the text.

If you have a hierarchy of directories at your site, put the most important information high, not deep. AltaVista will presume that the higher you placed the information, the more important it is. And crawlers may not venture deeper than three to five directory levels.
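
To spot pages that sit too deep, you can count the directory levels in each URL. The sketch below is one simple way to do that, assuming the last path segment is a file name unless the path ends with a slash; the function name and the example.com addresses are placeholders.

```python
from urllib.parse import urlsplit


def directory_depth(url: str) -> int:
    """Count the directory levels between the site root and the page."""
    path = urlsplit(url).path
    segments = [segment for segment in path.split("/") if segment]
    # Unless the path ends in '/', treat the last segment as a file name.
    if segments and not path.endswith("/"):
        segments = segments[:-1]
    return len(segments)


shallow = directory_depth("http://www.example.com/index.html")
deep = directory_depth("http://www.example.com/a/b/c/d/page.html")
```

Here `shallow` is 0 and `deep` is 4, so the second page is the one worth moving closer to the top of the site.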

Above all, remember the obvious: full-text search engines such as AltaVista index text. You may well be tempted to use fancy and expensive design techniques that either block search engine crawlers or leave your pages with very little plain text that can be indexed.
 


Copyright © Roger Gonzales
About The Author


Roger Gonzales is the owner of this article. To learn more, visit Search Engine Optimization at http://www.rogerbetagold.com
You can also check for the latest SEO articles at my personal blog: http://www.rogerbetagold.blogspot.com
Free 8 Day mini-ecourse, You Can Make Your Living Online: http://www.protected-lessons.com
To order the ebook, go to http://www.rogerbetagold.com/ebook

Anyone may republish this article electronically (in ebooks, blogs, ezines, websites, online article directories, etc.) or in print as long as the resource box above is included.



