According to WorldWideWebSize.com, as of February 11, 2011, there are at least 14.2 billion web pages currently being indexed by the four major search engines (Google, Yahoo!, Bing and Ask). Now of course no one can really say for sure, but you have to admit, that would be a lot of web pages to have to keep track of. It raises the question: How do search engines work? In this article, I will attempt to answer that question and explain how they manage to keep track of all of those pages.
You need information, you need it to be relevant and you need it now. What do you do? If you are like most people these days, you turn to a search engine, such as Google.com. A search engine is a tool designed to search for information on the World Wide Web.
Allow me this example: You get on your computer, open up your browser and type in the web address of your favorite search engine. For some that would be Google, Yahoo!, Bing or even Ask; for others it may be another.
You then type in a word or a phrase describing whatever you are looking for and the search engine returns a list of results. Some of these results are other websites, some are pictures or videos or even a map. You are then left to search through the list and choose the information or item you want.
Tens of millions of people perform searches just like this 24 hours a day, all over the world, but don’t have a clue how the website they trust to provide them information does it. Most don’t care and, frankly, don’t need to know, but if you are curious, let me help you out. In its basic form a search engine is a website located on the Internet that you can use to find information about something. That something could basically be anything: person, place or thing. On the surface all you see is the home page, referred to in computer terms as the user interface, and the list that is returned to you, A.K.A. the search results.
However, behind the scenes there is so much at work. A search engine is a very complicated software program that uses hundreds of factors to determine how to rank websites in its listings. These factors, and the weight each carries, may change continually. The SEO industry refers to the rules that combine these factors as the search engine’s “algorithm”.
How Do Search Engines Find and Collect Information?
Search engines use a series of software programs (A.K.A. spiders) to index websites. Each of these web robots (or automated programs) “browses” your web pages and indexes what it finds.
Allow me another example: When you submit your website pages to a search engine by completing its required submission page, or when the engine reaches your site through a link from another site, the search engine spider will index your entire site. A ‘spider’ is an automated program that is run by the search engine system. The spider visits a website, reads the content on the actual site along with the site’s Meta tags, and also follows the links that the site connects to.
The spider then returns all that information to a central repository (database), where the data is indexed. It will attempt to visit each link you have on your website and index those sites as well. The spider will periodically return to the sites to check for any information that has changed. How often this happens is determined by the administrators of the search engine.
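To make the spider idea a little more concrete, here is a minimal sketch in Python. It is not how any real search engine does it. The tiny in-memory "web" (`FAKE_WEB`), the `PageParser` class and the `crawl` function are all made up for illustration, and a real spider would fetch pages over the network instead of reading them from a dictionary.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML. A real spider would download pages.
FAKE_WEB = {
    "http://a.example/": (
        '<html><head><meta name="description" content="Home of A"></head>'
        '<body><a href="http://a.example/about">About</a>'
        '<a href="http://b.example/">B</a></body></html>'
    ),
    "http://a.example/about": (
        '<html><body>About A <a href="http://a.example/">Home</a></body></html>'
    ),
    "http://b.example/": "<html><body>Site B</body></html>",
}


class PageParser(HTMLParser):
    """Collects the links, Meta tags and visible text of one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.meta = {}
        self.text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"]] = attrs.get("content") or ""

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def crawl(start_url, web):
    """Visit start_url, follow every link found, and return what was collected."""
    repository = {}              # stands in for the central database
    queue = deque([start_url])   # pages still waiting to be visited
    while queue:
        url = queue.popleft()
        if url in repository or url not in web:
            continue             # already visited, or page does not exist
        parser = PageParser()
        parser.feed(web[url])
        repository[url] = {
            "meta": parser.meta,
            "text": " ".join(parser.text),
            "links": parser.links,
        }
        queue.extend(parser.links)
    return repository


repository = crawl("http://a.example/", FAKE_WEB)
for url, page in repository.items():
    print(url, page["meta"], page["links"])
```

Starting from a single address, the sketch ends up visiting all three pages simply because they link to one another, which is the same basic trick a spider uses to discover sites nobody submitted by hand.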
What’s this Index Stuff?
An index is almost like a book: it contains a table of contents, the actual content, and the links and references for all the websites the spider finds during its crawl. A search engine may index millions of pages a day. When you ask a search engine to locate information, it is actually searching through the index it has created and not searching the Internet itself (betcha didn’t know that!).
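Here is a rough sketch of that idea in Python, assuming the simplest kind of index: a lookup table from each word to the pages that contain it (often called an inverted index). The sample pages and the function names `tokenize`, `build_index` and `search` are invented for this example, and real engines store far more than this.

```python
import re
from collections import defaultdict

# Hypothetical pages that a spider has already collected: URL -> page text.
PAGES = {
    "http://a.example/": "Fresh coffee beans roasted daily",
    "http://b.example/": "How to brew coffee at home",
    "http://c.example/": "Tea brewing guide",
}


def tokenize(text):
    """Lowercase the text and split it into simple words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES)
print(search(index, "coffee"))
print(search(index, "brew coffee"))
```

Notice that `search` never looks at the pages themselves, only at the index that was built ahead of time. That is why a lookup can come back quickly even when there are a great many pages.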
Since each search engine uses its own set of algorithms to search through its indices, they will usually produce similar, but not identical, result listings and rankings. One of the things that a search engine algorithm scans for is the frequency and location of keywords on a web page, but it can also detect artificial keyword stuffing or spamdexing.
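As a toy illustration of frequency, location and stuffing, the Python sketch below scores one page for one keyword. Everything in it is an assumption made for the example: the function name `keyword_score`, the idea that a hit in the title counts three times as much as a hit in the body, and the 20% density cutoff for stuffing. Real engines keep their actual factors and weights to themselves.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(title, body, keyword, title_weight=3.0, max_density=0.2):
    """Score a page for a single keyword.

    Location: a hit in the title counts title_weight times a hit in the body.
    Frequency: every hit in the body adds one point.
    Stuffing: if the keyword makes up more than max_density of the body words,
    the page is treated as stuffed and scores zero.
    All of these numbers are made-up values for illustration only.
    """
    keyword = keyword.lower()
    title_hits = tokenize(title).count(keyword)
    body_words = tokenize(body)
    body_hits = body_words.count(keyword)
    density = body_hits / len(body_words) if body_words else 0.0
    if density > max_density:
        return 0.0
    return title_weight * title_hits + body_hits


honest = keyword_score(
    "Coffee brewing guide",
    "This guide explains how to brew coffee at home and how to store coffee beans",
    "coffee",
)
stuffed = keyword_score(
    "Coffee coffee coffee",
    "coffee coffee coffee best coffee cheap coffee",
    "coffee",
)
print(honest, stuffed)
```

The honest page earns points for using the keyword in sensible places, while the stuffed page gets nothing even though it repeats the keyword far more often.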
The algorithms also analyze the way that pages link to other pages on the Internet. By checking how pages link to each other, an engine can determine what a page is about and whether the keywords of the linked pages are similar to the keywords on the original page.
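A simple way to picture link analysis is sketched below in Python. It counts how many other pages link to each page and measures how much the keywords of two linked pages overlap. The link map, the keyword sets and the helper names `inbound_counts` and `keyword_overlap` are hypothetical, and this is a deliberately naive stand-in for the much more elaborate link analysis real engines perform.

```python
# Hypothetical link map: page -> pages it links to.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": [],
}

# Hypothetical keywords already extracted from each page.
KEYWORDS = {
    "a.example": {"coffee", "beans", "roast"},
    "b.example": {"coffee", "brew"},
    "c.example": {"coffee", "beans", "grinder"},
}


def inbound_counts(links):
    """Count how many other pages link to each page."""
    counts = {url: 0 for url in links}
    for source, targets in links.items():
        for target in targets:
            if target != source:  # ignore a page linking to itself
                counts[target] = counts.get(target, 0) + 1
    return counts


def keyword_overlap(first, second):
    """Share of keywords two pages have in common (0.0 to 1.0)."""
    first, second = set(first), set(second)
    union = first | second
    return len(first & second) / len(union) if union else 0.0


print(inbound_counts(LINKS))
for source, targets in LINKS.items():
    for target in targets:
        overlap = keyword_overlap(KEYWORDS[source], KEYWORDS[target])
        print(source, "->", target, round(overlap, 2))
```

In this sketch a page with more inbound links, and with links from pages that share its keywords, would look like a stronger match for those keywords than a page nobody links to.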
The most popular search engines are: