Every major search engine works in a very similar way, following three basic processes to find content and then return it in search results:
A brief look at these three areas will help you understand why SEO works, and why certain things are recommended during the optimisation process.
Search engines start with a 'seed' list of high-quality pages, for example, the BBC or the Open Directory. These are sites regarded as consistently high quality, and also likely to have editors checking the quality of their external links.
A 'crawler', 'spider' or 'bot' is a computer program that automatically retrieves pages, and can identify all of the links on those pages. Crawling the seed list of sites allows search engines to discover all of the external links from them. They can then crawl all of the newly discovered sites, finding new links from those sites, and so on. Eventually, this process will allow discovery of every page that is linked to from another site.
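The discovery process described above is essentially a breadth-first traversal of the link graph. A minimal sketch of the idea (the toy link graph and the `fetch_links` callback are illustrative assumptions, not how any real crawler is built):

```python
from collections import deque

def crawl(seed_urls, fetch_links, max_pages=1000):
    """Breadth-first discovery of pages from a seed list.

    fetch_links(url) is assumed to download a page and return the
    URLs it links to -- a stand-in for a real fetcher and parser.
    """
    discovered = set(seed_urls)
    queue = deque(seed_urls)
    while queue and len(discovered) < max_pages:
        url = queue.popleft()
        for link in fetch_links(url):
            if link not in discovered:   # only queue newly found pages
                discovered.add(link)
                queue.append(link)
    return discovered

# Toy link graph standing in for the live web.
web = {
    "bbc.co.uk": ["example.com", "news.example.org"],
    "example.com": ["orphan-free.example.net"],
    "news.example.org": [],
    "orphan-free.example.net": [],
    # This page exists, but nothing links to it, so the crawl
    # never discovers it -- the situation described in the text.
    "unlinked.example": ["example.com"],
}
found = crawl(["bbc.co.uk"], lambda u: web.get(u, []))
```

Note that `unlinked.example` never appears in `found`: with no inbound links, a page is invisible to this process.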
Note that this means that if no-one links to your page (not even you!), search engines will not discover your content without manual intervention.
Crawling is a continuous process involving many thousands of computers. Google spent $5 billion on computer equipment in the second quarter of 2014!
At this stage, search engines will immediately discard certain content:
If you create a page, and there are links to it, it is highly likely that Google will store it in its database. But this alone does not mean that Google will necessarily rank your page, or that it will attract visitors.
Google crawls in excess of 30 trillion web pages. It is not feasible to 'search' this amount of content in the way you might, for instance, search for files on an individual computer. Or, at least, it would take an incredibly long time to do so.
Instead, search engines make a database (called an index) of the content, which involves:
It's at this early stage that the keywords your pages can rank for are determined (although the final order is not set). Your pages will also be given a 'score' for each word they use, based on numerous basic criteria.
Basic on-page SEO techniques will ensure your pages pass the indexing stage and have the opportunity to rank.
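Conceptually, the index maps each word to the pages containing it, along with a basic score per page. A minimal sketch, in which scoring by simple word frequency is a placeholder for the "numerous basic criteria" mentioned above:

```python
from collections import defaultdict

def build_index(pages):
    """pages: dict mapping url -> page text.

    Returns an inverted index: word -> {url: score}.
    The score here is just how often the word appears on the
    page -- an illustrative stand-in for real indexing criteria.
    """
    index = defaultdict(dict)
    for url, text in pages.items():
        words = text.lower().split()
        for word in set(words):
            index[word][url] = words.count(word)
    return index

pages = {
    "a.example": "seo guide seo tips",
    "b.example": "cooking tips",
}
index = build_index(pages)
```

Looking up a query word in `index` is a single dictionary access, which is why an index makes searching billions of documents feasible where scanning every page would not be.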
For most searches there are a very large number of results that are a potential match (based on indexing, as described above). The result count reported for a search such as "Google" is the number of pages matching that word according to the indexing process. This is still far too many results to search in a reasonable amount of time. Instead, Google selects only the top 1,000 results, based on indexing scores. At this point, the most basic criteria are still being used, but if your page is not among the top 1,000 selected, nothing you do will enable it to rank in Google's results.
Once it has the top 1,000 results, Google can then perform more refined processing to come up with a final order for the results - the rankings. This process can be conducted very quickly and so more complex criteria can be used.
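The two stages can be sketched as a cheap pre-selection followed by expensive re-ranking. The cut-off of 1,000 comes from the text above; the scoring dictionaries are invented placeholders for the cheap indexing score and the complex ranking criteria:

```python
import heapq

def rank(candidates, index_score, detailed_score, cutoff=1000):
    """candidates: iterable of page ids matching the query.

    Stage 1: keep the top `cutoff` pages by the cheap index score.
    Stage 2: re-order only that shortlist with the expensive criteria.
    """
    shortlist = heapq.nlargest(cutoff, candidates, key=index_score)
    return sorted(shortlist, key=detailed_score, reverse=True)

# Placeholder scores: a cheap score from the index, and a costly
# score standing in for complex ranking criteria (links etc.).
index_scores = {"a": 3, "b": 9, "c": 5, "d": 1}
detail_scores = {"a": 0.9, "b": 0.2, "c": 0.7, "d": 0.99}
order = rank(index_scores, index_scores.get, detail_scores.get, cutoff=3)
```

Here page "d" has the best detailed score, but its weak index score kept it out of the top-3 shortlist, so it never reaches stage 2 at all: exactly the situation where no amount of ranking strength can help a poorly indexed page.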
If you hear discussion of the 'Google algorithm', it is usually (and incorrectly) referring to this stage only. Don't make the common mistake of ignoring the indexing stage, which is equally important.
Ranking criteria are the most complicated and the least understood. These include:
Google will also use the ranking process to 'demote' pages perceived as having issues that should prevent rankings, including:
Unless you are making serious errors in your content, or are actively trying to manipulate search engine results, you should mainly be concerned with favourable indexing. Complicated ranking criteria are rarely affected by the quality of an individual page's copy.