This video is about how Google's infrastructure works and how all the processes involved fit together. More specifically, it covers how the crawling and indexing processes and the serving pipeline work. In simpler terms, this video explains how Google search works.
Matt Cutts shares that explaining how Google search works is almost like explaining everything about Google. If you want to be the world's best search engine, you have to crawl the web deeply and comprehensively, index it, and return the most relevant information first. To achieve this, you need to understand the intricate processes behind Google's ranking and website evaluation, from crawling and site analysis through crawl priorities and frequencies to the indexing and filtering that happen within the database.
Matt Cutts points out that crawling is actually more difficult than most people think. Back in 2000, a full crawl took Google something like 30 days: it would crawl sites for several weeks and then spend about another week indexing them before pushing out the data. A graph of the whole process was not pretty to look at, and by the time the information was released after indexing, it was already out of date.
Matt Cutts highlights the following in this video:
- Google takes PageRank as the primary determinant when crawling sites. The more PageRank you have (that is, the more people link to you, and the more reputable those linkers are), the more likely you are to be found and crawled by Google.
- Crawling allows for some tricks. Pages with high PageRank can be re-crawled more often, and at times different data centers may hold a mix of old and new data.
- In layman's terms, indexing basically means recording things in word order: you have a document and the words it contains. For Google, the meaning is reversed: you start with a word and look up the documents. For each word, Google finds the documents it thinks contain that word, whether on the page itself or in the backlinks and anchor text pointing to the document.
- In 2003, Google switched to a more advanced crawling technique. The new approach, known as Update Fritz, shortened the crawl cycle, which used to take 30 days or more and left the data out of date by the time it reached the public. Update Fritz takes a significant chunk of the web and divides it into segments; Google crawls a segment every night and can therefore update its index incrementally. As a result, Google is able to serve reasonably fresh information at any given time. Alongside this sits the supplemental index, which contains more documents but is not refreshed as often.
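The PageRank-driven crawl ordering described above can be sketched as a simple priority queue: higher-ranked pages get crawled first. This is a toy illustration, not Google's actual scheduler, and the URLs and scores below are invented:

```python
import heapq

def crawl_order(pages):
    """pages: dict mapping url -> PageRank-like score.
    Yield URLs highest-score first, modeling the idea that
    higher-PageRank pages are crawled sooner and more often."""
    # heapq is a min-heap, so negate scores to pop the largest first.
    heap = [(-rank, url) for url, rank in pages.items()]
    heapq.heapify(heap)
    while heap:
        _neg_rank, url = heapq.heappop(heap)
        yield url

# Invented example scores for illustration only.
pages = {"example.com": 0.9, "smallblog.net": 0.2, "news.org": 0.7}
print(list(crawl_order(pages)))  # -> ['example.com', 'news.org', 'smallblog.net']
```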
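The reversed word-to-document mapping described above is what is usually called an inverted index. Here is a minimal Python sketch with made-up documents; real systems also index backlinks and anchor text, which this toy version omits:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """docs: dict mapping doc_id -> text.
    Returns a mapping of word -> set of doc_ids containing that word."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the set of doc_ids that contain every word in the query."""
    results = None
    for word in query.lower().split():
        docs_with_word = index.get(word, set())
        # Intersect with the matches so far.
        results = docs_with_word if results is None else results & docs_with_word
    return results or set()

# Invented documents for illustration only.
docs = {
    "page1": "google crawls the web",
    "page2": "the web is large",
    "page3": "google indexes the web nightly",
}
index = build_inverted_index(docs)
print(sorted(search(index, "google web")))  # -> ['page1', 'page3']
```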
If you want to achieve a better SEO ranking, consider the following:
- Work on getting more reputable people (websites with high PageRank) to link to you.
- Build links from reputable sources rather than from just any site.
- Work on SEO-targeted content whose words or strings of words (keywords) closely match the information people search for.
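The keyword-proximity idea in the last point can be illustrated with a naive scoring function: how close together two query terms appear in a page's text. This is a hypothetical sketch; actual ranking signals are far more elaborate:

```python
def proximity_score(text, term_a, term_b):
    """Smallest distance, in words, between two query terms in the text.
    Lower is better; returns None if either term is missing.
    A naive sketch of keyword proximity, not a real ranking formula."""
    words = text.lower().split()
    positions_a = [i for i, w in enumerate(words) if w == term_a]
    positions_b = [i for i, w in enumerate(words) if w == term_b]
    if not positions_a or not positions_b:
        return None
    return min(abs(a - b) for a in positions_a for b in positions_b)

# Invented page text for illustration only.
text = "cheap flights to tokyo and hotel deals for tokyo flights"
print(proximity_score(text, "cheap", "flights"))  # -> 1
```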