With the widespread growth of the World Wide Web, a specialized tool was developed to search through the available information: the search engine. Using both algorithms and human editing, a search engine presents its results as an organized list of web pages, information, links, and images. The user sees these results after entering a keyword or keyword phrase into the search engine’s search field.

Web crawling, indexing, and searching combine, in that order, to produce the most accurate results. Large amounts of information from millions of websites are stored and then retrieved according to their relevance to the user’s request. A web crawler, also known as a spider, analyzes every link it finds and indexes the information for faster retrieval.
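
The crawl, index, and search order described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "web" is a small hypothetical in-memory dictionary named PAGES, the crawler follows links breadth-first, and the index is a plain inverted index (word to set of URLs). Real search engines are far more elaborate.

```python
from collections import deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example": ("search engines crawl the web", ["b.example", "c.example"]),
    "b.example": ("a spider follows every link", ["c.example"]),
    "c.example": ("an index makes search fast", []),
}

def crawl(start):
    """Visit pages breadth-first from `start`, following every link once."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in PAGES[url][1]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

def build_index(urls):
    """Build an inverted index: each word maps to the set of URLs containing it."""
    index = {}
    for url in urls:
        for word in re.findall(r"[a-z]+", PAGES[url][0].lower()):
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query, sorted by name."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))

index = build_index(crawl("a.example"))
print(search(index, "search"))
print(search(index, "spider link"))
```

Because the index is built once, ahead of time, each query only has to look up a few words instead of rereading every page, which is the point of indexing for faster retrieval.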

Words found in a page are extracted from its description and meta tags, and also from the content of the webpage itself, to establish the page’s relevance. Data from the sites is collected, indexed, and stored so that it can be retrieved when it’s needed.
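
As a rough sketch of this extraction step, the following Python uses the standard library’s html.parser module to collect a page’s meta tags and the words in its text. The sample page, the MetaExtractor class, and the keyword_overlap helper are hypothetical names invented for this illustration; comparing declared keywords with the page text is just one simple way to picture "establishing relevance", not how any particular search engine does it.

```python
from html.parser import HTMLParser
import re

class MetaExtractor(HTMLParser):
    """Collects <meta name=... content=...> tags and the words in the page text."""

    def __init__(self):
        super().__init__()
        self.meta = {}    # meta tag name -> content string
        self.words = []   # lowercase words found in text nodes (title included)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "meta" and name:
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_data(self, data):
        self.words.extend(re.findall(r"[a-z]+", data.lower()))

def keyword_overlap(parser):
    """Return the declared meta keywords that also appear in the page text."""
    declared = {k.strip().lower() for k in parser.meta.get("keywords", "").split(",")}
    declared.discard("")
    return declared & set(parser.words)

# Hypothetical sample page used only for this example.
SAMPLE = """<html><head>
<title>Search basics</title>
<meta name="description" content="How search engines index pages">
<meta name="keywords" content="search, index, crawler">
</head><body><p>A crawler visits pages and builds an index.</p></body></html>"""

parser = MetaExtractor()
parser.feed(SAMPLE)
print(parser.meta["description"])
print(sorted(keyword_overlap(parser)))
```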

Some search engines, such as Google, store all or part of the source web page, while others, such as AltaVista, store every word of every page. This stored and indexed information is known as the cache; it allows for fast updating and makes searches easy to filter. An important factor in a successful search engine is its ability to provide current, usable information with little or no link rot. The cache also keeps an archived copy of a source page, which the user can still access after the site has been updated or the page removed.

A search engine examines the keywords entered by the user and returns an organized list of search results. Summaries may also accompany the web links on the results page.

All search engines try to improve their performance by delivering exactly what the user is looking for. The task is made harder by the abundance of web pages that contain the keyword or keyword phrase. However, by using web crawlers and indexing, search engines manage to filter out the sites that are irrelevant to the search, even if they contain the keywords. Each engine has created its own process for examining web pages and their contents.

PageRank is a more recent addition to the techniques search engines use to sort web pages and their contents. Alongside on-page signals such as meta tags, descriptions, keywords, and the content of the webpage, it judges the importance of a page by the pages that link to it: sites associated with highly ranked web pages are themselves ranked highly. Page rank matters to any web page or site because it affects the probability of appearing at the top of a given search.
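
The idea that a page ranks highly when highly ranked pages link to it can be illustrated with a short power-iteration sketch. It covers only the link-based part of ranking, not meta tags or keywords. The four-page link graph, the damping value of 0.85, and the iteration count are assumptions chosen for the example; it is a simplified textbook version, not the implementation used by any search engine.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its rank equally among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                for target in pages:
                    new[target] += damping * rank[page] / n
        rank = new
    return rank

# Hypothetical link graph: "c" is linked from three pages, "d" from none.
LINKS = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this graph the page with the most incoming links ends up with the highest score, and the page that nothing links to ends up with the lowest, which matches the intuition in the paragraph above.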
