How a Search Engine Works 101

Summer 2002 CSANews Issue 43  |  Posted date: Apr 06, 2007

I wish it were easy to search for things on the Web - someone would simply point a finger and exclaim ... there. Next to e-mail, searching the Web is the most commonly used feature of the Internet, but the amount of information can often be overwhelming.

Today's search engines are far from perfect and, therefore, require some amount of patience to use properly. There is nothing like trying to sift through 267,435 matches to your search criteria. Searching the Internet successfully requires a little skill, a little bit of art and often a little luck. Today, I am going to attempt to touch on the skill part of the equation.

You've probably heard of or used the Yahoo!, Google, AOL and AltaVista search engines. There are dozens of others available, but these are the most popular. The trick is understanding how they work. There are two categories of search engines -- directories and indexes.

Yahoo! is an example of the directory type of search engine, which is good at identifying general information. Like a card catalogue in a library, a directory classifies Web sites into similar categories, such as Scuba Diving Schools or Art Galleries.

If, however, you wish to locate specific information, such as biographical information about Frank Sinatra, Web indexes are the way to go, because they search the entire contents of a Web site. Indexes use software programs called spiders and robots that scour the Internet, analyzing millions of Web pages and newsgroup postings and then indexing all of the words.

Indexes such as AltaVista and Google find individual pages of a Web site that contain your search criteria, even if the site itself has nothing to do with what you are looking for. You can often find unexpected gems of information this way, but be prepared to wade through a lot of irrelevant information too.
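The indexing step described above can be sketched in a few lines of Python. This is only a toy illustration, not how AltaVista or Google actually build their indexes: the page addresses and text in PAGES are made up, and build_index is a hypothetical helper name. It shows the basic idea of recording, for every word, which pages contain it.

```python
import re
from collections import defaultdict

# Made-up pages standing in for what a spider might have fetched.
PAGES = {
    "example.com/sinatra": "Frank Sinatra was born in Hoboken",
    "example.com/music": "Swing music and Frank Sinatra records",
    "example.com/diving": "Scuba diving schools in Florida",
}

def build_index(pages):
    """Map every lowercase word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

index = build_index(PAGES)

# Looking up a word is now a single dictionary access.
print(sorted(index["sinatra"]))
```

Because every word is indexed, a page about swing music turns up in a search for Sinatra even though the site is not about him, which is exactly the "unexpected gems" effect.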

My search engine of choice is Google, so I will show you how to use its advanced search capabilities to narrow down the results. Let's use the example of "Canadian Snowbird Association." Without toggling "pages in Canada," your search returns 3,580 results. If you toggle "pages in Canada" as shown in the diagram, the results filter down to 740. Now here's a helpful hint: if you elect to use the advanced search option, Google will return 177 hits. The advanced search function increases the accuracy of your searches by further qualifying the search criteria using Boolean (remember grade school?) mathematical operators. The ability to search for exact phrases and to exclude pages containing specific words will reduce your search results significantly.
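To see why required and excluded words shrink a result list, here is a minimal Python sketch of Boolean filtering. The documents in DOCS and the function name matches are invented for illustration; this is not Google's advanced search, just the underlying logic of "must contain all of these words, must not contain any of those."

```python
def matches(text, required=(), excluded=()):
    """True if text has every required word and none of the excluded ones."""
    found = set(text.lower().split())
    has_all = all(word in found for word in required)
    has_none = not any(word in found for word in excluded)
    return has_all and has_none

# Made-up documents for illustration only.
DOCS = [
    "canadian snowbird association travel insurance",
    "snowbird ski resort in utah",
    "canadian snowbird aerobatic team",
]

hits = [
    doc for doc in DOCS
    if matches(doc, required=["canadian", "snowbird"], excluded=["aerobatic"])
]
print(hits)
```

Each extra condition can only remove documents from the list, never add them, which is why the hit count keeps dropping as the search is qualified further.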

Understanding how to perform sophisticated searches of online information will greatly increase your chances of finding what you want. While most search engines let you define your search criteria in very specific ways, not all function identically. Here are a few things to consider if you find yourself looking for answers.

Exact Phrase Searching
When your search term contains more than one word in a specific order, enclose the words in quotation marks or toggle the exact phrase or exact text button of your search engine. The engine then returns only documents containing that exact phrase. Here's an example: when searching for information on gun control legislation, using "gun control" will eliminate those documents that contain the words gun or control only by themselves. This will dramatically reduce your number of "hits" and the amount of information you must review.
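The difference between a loose search and an exact phrase search can be shown with a short Python sketch. The three documents and both function names are assumptions made up for this example; real search engines do this against an index rather than by scanning text.

```python
# Made-up documents for illustration only.
DOCS = [
    "new gun control legislation debated",
    "how to control a nail gun safely",
    "remote control cars",
]

def contains_any_word(doc, query):
    """Loose match: the document contains at least one of the query words."""
    doc_words = doc.lower().split()
    return any(word in doc_words for word in query.lower().split())

def contains_phrase(doc, phrase):
    """Exact match: the phrase words appear side by side, in order."""
    doc_words = doc.lower().split()
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    return any(
        doc_words[i:i + n] == phrase_words
        for i in range(len(doc_words) - n + 1)
    )

loose = [d for d in DOCS if contains_any_word(d, "gun control")]
exact = [d for d in DOCS if contains_phrase(d, "gun control")]
print(len(loose), len(exact))
```

All three documents match the loose search, but only the first contains the two words together and in order.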

Wild Card Searches
If you are looking for information on gardening, you could use that word as your keyword. However, if your results are limited in number (though not likely with gardening) and you want to broaden your search, use the root part of the word and simply add an asterisk (garden*). The search engine will return links to documents containing garden, gardens, gardener, gardeners and so on.
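A wild card pattern such as garden* can be tried out with Python's standard fnmatch module, which uses the same asterisk convention. The word list and the wildcard_matches helper are made up for this sketch, and individual search engines may treat wild cards differently.

```python
from fnmatch import fnmatchcase

# Made-up vocabulary for illustration only.
WORDS = ["garden", "gardens", "gardener", "gardeners", "guard", "garage"]

def wildcard_matches(pattern, words):
    """Return the words that fit the pattern; * stands for any run of letters."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]

print(wildcard_matches("garden*", WORDS))
```

The asterisk also matches nothing at all, so the bare word garden is included along with its longer forms.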

Search results may be ranked in order of relevancy, measured either by the number of times your search term appears on a Web page or by how closely the document appears to match a concept you have entered. Ranked results put the likeliest matches at the top of the list, so you can review the most promising pages first.
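The simplest form of relevancy ranking, counting how often the search term appears on each page, can be sketched as follows. The page names and text are invented, rank_by_frequency is a hypothetical helper, and real engines combine many more signals than a raw word count.

```python
# Made-up pages for illustration only.
DOCS = {
    "page-a": "gardening tips and more gardening tips for spring gardening",
    "page-b": "cooking and a little gardening",
    "page-c": "stock advice",
}

def rank_by_frequency(term, docs):
    """List pages containing the term, most occurrences first."""
    term = term.lower()
    counts = {
        name: text.lower().split().count(term)
        for name, text in docs.items()
    }
    matching = [name for name, count in counts.items() if count > 0]
    return sorted(matching, key=lambda name: counts[name], reverse=True)

print(rank_by_frequency("gardening", DOCS))
```

Pages that never mention the term are dropped entirely, and the page that mentions it most often is listed first.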

Bear in mind that Web sites often change. These changes are not always reflected in the search engine database, particularly for directories. Typically, Web sites are registered with search engines when they first go online. After that, changes are not generally reported, because reporting them is left to the site owners. To find the most recent information, your best bet is to use search engines that use Web-indexing robots, software that constantly searches the Internet, recording additions and changes.

Search engines will most likely move toward specialty engines dedicated to specific subjects such as entertainment, government, health, sports or any other category.

While search engines may not always be the most up-to-date tools for finding information on the Web, they are your most effective ones. Whether you are searching for recipes, car deals, stock advice or new places to travel, finding what you are looking for is going to take some work. You would never walk into a library and find the best books open to the best pages that contain the best information. As with all technologies, search engines will continue to evolve and improve over time, and they will continue to be one of the most used resources in your Internet toolkit.

Editor's note: Another site we find very helpful is www.metacrawler.com