Google PageRank: An Overview for SEO Novices (Festive Flashback)
Enjoy the holidays by reading some of SEJ's top articles from 2023.
From December 21 to January 5, our Festive Flashback series offers daily reads on significant events, fundamentals, actionable strategies, and expert opinions.
The SEO industry saw a lot of change in 2023, and our writers produced some excellent articles to keep up with and reflect on these developments.
Catch up on the best reads of 2023 so you have plenty to think about in 2024.
Once upon a time, PageRank was the foundation of search and helped build Google into the massive company it is today.
There's no denying that PageRank has long been a pervasive concept in the search industry, even if you believe search has moved on.
Any expert in search engine optimization should be well-versed in PageRank's history and current state.
This essay will discuss:
- What is PageRank?
- The history of how PageRank evolved.
- How PageRank revolutionized search.
- Toolbar PageRank vs. PageRank.
- How PageRank works.
- How PageRank flows between pages.
- Is PageRank still used?
What Does PageRank Mean?
PageRank, an algorithm developed by Google founders Larry Page and Sergey Brin, is based on the total relative strengths of all hyperlinks on the Internet.
While some contend that the name was inspired by Larry Page's last name, others assert that "Page" refers to a web page. Both views are probably true, and the overlap was most likely deliberate.
The PageRank Citation Ranking: Bringing Order to the Web is a paper that Page and Brin published while attending Stanford University.
The study, which was published in January 1999, presents a rather straightforward algorithm for assessing a web page's strength.
The work was eventually patented in the United States, but not in Europe, where mathematical formulas are not patentable.
Stanford University owns the patent and has licensed it to Google. The patent is currently due to expire in 2027.
The Evolution of PageRank Through Time
Brin and Page both studied information retrieval techniques in the late 1990s while they were at Stanford.
Using links to determine the relative "importance" of each page was a novel approach to page ordering at the time. Though computationally challenging, it was not insurmountable.
The idea quickly turned into Google, which was a mere blip in the search engine world at the time.
Some parties had such strong institutional belief in Google's approach that the company initially launched its search engine with no ability to earn revenue.
And while Google (known at the time as "BackRub") was the search engine, PageRank was the algorithm it used to rank pages in the search engine results pages (SERPs).
The Google Dance
Although the math behind PageRank was straightforward, it had to be processed iteratively, which presented a hurdle. The calculation is run multiple times over every page and link on the Internet. At the turn of the millennium, this math took several days to process.
During that time, rankings in the Google SERPs moved up and down. Because each page was being assigned a new PageRank, these changes were often erratic.
Known as the "Google Dance," this was notorious for stopping the SEO experts of the day in their tracks every time Google started a new update.
(The Google Dance later became the name of an annual event that Google hosted for SEO specialists at its Mountain View offices.)
Trusted Seeds
A later version of PageRank introduced the concept of a "trusted seed" set to start the algorithm, rather than giving every page on the Internet the same starting value.
Reasonable Surfer
The concept of a "reasonable surfer" was added in a later version of the model.
Under this approach, a page's PageRank is no longer distributed equally among the pages it links to. Instead, each link is weighted according to how likely a user is to click on it.
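The weighting idea can be sketched in a few lines of Python. This is only an illustration of splitting a score in proportion to click likelihood, not Google's implementation: the function name `distribute_weighted`, the link labels, and the weights are all hypothetical.

```python
def distribute_weighted(page_rank, link_weights):
    """Split a page's PageRank across its outlinks in proportion to each weight.

    `link_weights` maps a link target to a made-up "click likelihood" weight.
    """
    total = sum(link_weights.values())
    if total == 0:
        # No link is ever clicked in this toy model, so nothing is passed on.
        return {target: 0.0 for target in link_weights}
    return {
        target: page_rank * weight / total
        for target, weight in link_weights.items()
    }


# Hypothetical page with three outlinks and assumed click-likelihood weights.
shares = distribute_weighted(1.0, {"main-nav": 6, "body-link": 3, "footer": 1})
for target, share in shares.items():
    print(target, round(share, 3))
```

With equal weights, this reduces to the even split used in the original model.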
PageRank's Retreat
Internally, Google's algorithm was once believed to be "unspam-able," because a page's importance was determined not just by its content but also by a kind of "voting system" created by the links pointing to it.
But Google's assurance was short-lived.
As the backlink industry expanded, PageRank began to cause issues. As a result, Google removed it from the public eye but kept using it in its ranking algorithms.
By 2016, the PageRank Toolbar had been withdrawn, and eventually all public access to PageRank was cut off. By this point, however, Majestic (an SEO tool) in particular had been able to correlate its own calculations quite well with PageRank.
For many years, Google steered SEO specialists away from manipulating links through its "Google Guidelines" documentation and through advice from its spam team, headed by Matt Cutts until January 2017.
During this period, Google's algorithms were also evolving.
The company was relying less on PageRank and, with its acquisition of MetaWeb and its proprietary knowledge graph (called "Freebase") in 2010, Google began to index the world's information in new ways.
PageRank vs. Toolbar PageRank
At first, Google was so proud of its algorithm that it was happy to share the result of the calculation with anyone who wanted to see it.
The most famous example was a toolbar add-on for Firefox and other browsers that displayed a score from 0 to 10 for every page on the Internet.
Although PageRank actually has a much larger range of scores, the 0–10 range offered consumers and SEO experts a quick way to determine the significance of any page on the Internet.
However, making the algorithm so visible through the PageRank Toolbar also brought complications. In particular, it made clear that links were the easiest way to "game" Google.
The more links pointing to a page (or, more precisely, the better those links), the higher the page could rank in Google's SERPs for any targeted phrase.
As a result, a secondary market emerged for buying and selling links, priced according to the PageRank of the URL selling the link.
Yahoo made the problem worse by releasing Yahoo Site Explorer, a free tool that let anyone start finding the links pointing to any given page.
Later, two tools, Moz and Majestic, improved on the free option by building their own indexes of the web and evaluating links independently.
How Search Was Revolutionized by PageRank
Other search engines relied heavily on analyzing the content of each page individually. These methods had little ability to tell an influential page apart from one simply written with random or manipulative text.
This meant that SEO experts could easily manipulate the retrieval methods of other search engines.
Thus, Google's PageRank algorithm was groundbreaking.
Combined with the relatively simple idea of "nGrams" to help establish relevancy, it gave Google a winning formula.
It quickly surpassed the dominant players of the day, including AltaVista and Inktomi (which, among other things, powered MSN).
By working at the page level, Google also found a far more scalable solution than the "directory" based approach used by Yahoo and, later, DMOZ, even though DMOZ (also known as the Open Directory Project) was initially able to give Google an open-source directory of its own.
How PageRank Operates
Although there are several variations of the PageRank formula, it can be summed up in a few sentences.
Every page on the Internet is initially given an estimated PageRank score. This could be any number. Historically, PageRank was shown to the public as a score between 0 and 10, but in practice the estimates do not have to start in this range.
Next, the PageRank of that page is divided by the number of links out of the page, resulting in a smaller fraction.
That fraction is then passed to each of the linked pages, and the same is done for every other page on the Internet.
For the next iteration of the algorithm, the new PageRank estimate for each page is the sum of all the fractions passed on by the pages that link to it.
The algorithm also includes a "damping factor," described as the chance that a person surfing the web might stop surfing altogether.
Before each new iteration begins, the proposed new PageRank is reduced by the damping factor.
This process is repeated until the PageRank scores settle at stable values. For simplicity, the resulting numbers were then typically rescaled into the more recognizable range of 0 to 10.
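The steps above can be sketched as a short Python function. This is a minimal illustration on a made-up three-page web, not the production algorithm. The names `pagerank` and `toy_web` are hypothetical, the value d = 0.85 is a commonly quoted assumption, and d is treated here as the probability that the surfer keeps clicking, so 1 - d is the chance of giving up.

```python
def pagerank(links, d=0.85, tol=1e-9, max_iter=200):
    """Iterate PageRank over a dict mapping each page to the pages it links to.

    Assumes every page has at least one outlink and that every link target
    is itself a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}  # same starting estimate for every page
    for _ in range(max_iter):
        # Every page keeps a small baseline, modelling surfers who start afresh.
        new_ranks = {p: (1.0 - d) / n for p in pages}
        for page, outlinks in links.items():
            share = ranks[page] / len(outlinks)  # divide by links OUT of the page
            for target in outlinks:
                new_ranks[target] += d * share   # pass on the damped fraction
        change = sum(abs(new_ranks[p] - ranks[p]) for p in pages)
        ranks = new_ranks
        if change < tol:  # scores have stabilized
            break
    return ranks


# Hypothetical toy web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 4))
```

In this toy graph, page C ends up with the highest score because both other pages link to it, and the scores always sum to 1.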
One mathematical representation of this uses the following symbols:
- PR = PageRank in the next iteration of the algorithm.
- d = damping factor.
- j = the page number on the Internet (if every page had a unique number).
- n = total number of pages on the Internet.
- i = the iteration of the algorithm (initially set as 0).
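A commonly cited form of the iterative update, written with the symbols listed above, is sketched below. Two extra symbols are assumptions added for this sketch: M(j), the set of pages that link to page j, and L(k), the number of links out of page k. It also assumes that d is the probability that the surfer keeps following links (often quoted as about 0.85), so that 1 - d is the chance of giving up.

```latex
PR_{i+1}(j) = \frac{1 - d}{n} + d \sum_{k \in M(j)} \frac{PR_i(k)}{L(k)}
```

Read term by term: each page k passes an equal fraction of its current score to every page it links to, page j adds up the fractions it receives, the total is scaled by d, and the remaining 1 - d is spread evenly across all n pages.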
Another way to describe the formula is in matrix form.
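One standard way to write the iteration in matrix terms is sketched below. The notation is an assumption made for illustration: R_i is the column vector of all n PageRank scores at iteration i, A is the n-by-n matrix whose entry A_{jk} is 1/L(k) when page k links to page j (with L(k) the number of links out of page k) and 0 otherwise, 1 is a column vector of ones, and d is the damping factor taken as the probability of continuing to click.

```latex
\mathbf{R}_{i+1} = d \, A \, \mathbf{R}_i + \frac{1 - d}{n} \, \mathbf{1},
\qquad
A_{jk} =
\begin{cases}
  1 / L(k) & \text{if page } k \text{ links to page } j, \\
  0 & \text{otherwise.}
\end{cases}
```

Repeating this multiplication until the vector stops changing is the same process as the page-by-page description, just applied to every page at once.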
Issues And Revisions To The Equation
There are various difficulties with the formula.
If a page does not link out to any other page, the formula will not reach an equilibrium.
In that case, the page's PageRank is instead distributed among every page on the Internet. In this way, even a page with no inbound links could still gain some PageRank, but not enough to be significant.
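A minimal Python sketch of this revision follows. It shows one common way to handle pages with no outbound links, by sharing their score equally among all pages on each iteration. The function name `step_with_dangling`, the toy graph, and d = 0.85 (taken as the probability of continuing to click) are assumptions for illustration.

```python
def step_with_dangling(ranks, links, d=0.85):
    """Run one PageRank iteration, sharing dead-end pages' score with all pages."""
    n = len(ranks)
    # Total score currently held by pages that link to nothing.
    dangling_total = sum(ranks[p] for p, out in links.items() if not out)
    # Every page receives the usual baseline plus an equal slice of that total.
    base = (1.0 - d) / n + d * dangling_total / n
    new_ranks = {p: base for p in ranks}
    for page, outlinks in links.items():
        for target in outlinks:
            new_ranks[target] += d * ranks[page] / len(outlinks)
    return new_ranks


# Hypothetical toy web: B is a dead end, and A and C have no inbound links.
toy_web = {"A": ["B"], "B": [], "C": ["B"]}
ranks = {p: 1 / 3 for p in toy_web}
for _ in range(50):
    ranks = step_with_dangling(ranks, toy_web)
print({p: round(r, 4) for p, r in ranks.items()})
```

Pages A and C have no inbound links, yet they still end up with a small, non-zero score, while B, which both of them link to, holds the most.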
Another, less well-documented issue is that newer pages, though potentially more important than older pages, will have a lower PageRank. This means that, over time, old content can end up with a disproportionately high PageRank.
The algorithm does not take into account the duration of a page's existence.
The Way PageRank Moves Across Pages
If a page starts with a value of 5 and has 10 links out, then every page it links to receives 0.5 PageRank (less the damping factor).
Between iterations, the PageRank is distributed over the Internet in this manner.
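The arithmetic in that example can be checked with a short Python sketch. The function name `share_per_link` is hypothetical. Reading "less the damping factor" as multiplying by d = 0.85 is an assumption, based on the commonly quoted value.

```python
def share_per_link(page_rank, outlink_count, d=0.85):
    """Return the raw and damped PageRank passed through each outbound link."""
    raw = page_rank / outlink_count  # even split across the page's outlinks
    return raw, d * raw              # damped share that actually flows on


raw_share, damped_share = share_per_link(5, 10)
print(raw_share, damped_share)
```

Under these assumptions, each of the 10 linked pages receives a raw share of 0.5, which damping reduces to about 0.425.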
New pages on the Internet have very little PageRank when they first appear. However, their PageRank rises over time as more pages begin to link to them.
Is PageRank Still Used?
PageRank was closed to the public in 2016, although search engineers at Google may still have access to the score.
A leak of the ranking factors used by Yandex showed that PageRank remained a factor it could use.
Google engineers have suggested that the original form of PageRank was replaced with a new approximation that requires less processing power to calculate. Although the formula now carries less weight in how Google ranks pages, it remains a constant for each web page.
And regardless of what other algorithms the search giant chooses to use, PageRank probably remains embedded in many of Google's systems today.
Original Patents and Documents for a Detailed Reading
- Method for node ranking in a linked database
- The PageRank Citation Ranking: Bringing Order to the Web
- The Anatomy of a Large-Scale Hypertextual Web Search Engine
Additional resources:
- Information Retrieval: An Introduction For SEOs
- Get to Know the Google Knowledge Graph & How it Works
- Advanced Technical SEO: A Complete Guide