The PageRank algorithm (PR for short) is a website ranking system created by Larry Page and Sergey Brin at Stanford University in the late 1990s, and it was in fact the basis on which they built the Google search engine.
Many years have passed since then, and of course Google’s ranking algorithm has grown and become more complex. How important is PageRank today for ranking a page higher in the results, and what is it that we should “expect” in the immediate future?
Let’s try to summarize the data we have, and the “mystery” we speculate about around PageRank, in order to get a clearer picture, as far as we are allowed to, of course.
The past of PageRank
As mentioned above, Brin and Page, in their university research project, tried to invent a system for “rating” websites. This system was based on links, which had the function of votes of confidence for a page.
The logic of the system was that the more links (known as backlinks) from external resources (other websites, forums, blog posts, etc.) leading to a page, the more “useful” that page would be for users.
That is, the more people discuss, mention and recommend a page via links, the more reasonable it is to conclude that the page is important.
The PageRank scale ran from 0 to 10; the score was calculated based on the quantity and quality of “incoming links” and indicated the so-called page authority a webpage had on the internet.
The original formula for calculating PageRank
Let’s take a closer look at how PageRank works.
Each link from page A to page B gives a so-called vote, the weight of which depends on the “collective” weight of all the pages linking to page A.
The mathematical formula of the initial PageRank is as follows:
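PR(A) = (1 − d) / N + d × ( PR(B) / L(B) + PR(C) / L(C) + PR(D) / L(D) )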
Where B, C and D are the pages linking to page A, L is the number of outbound links on each of them, and N is the total number of pages in the collection (i.e. on the Internet).
As far as d is concerned, it is the so-called damping factor. PageRank is calculated by simulating the behavior of a user who randomly arrives at a page and clicks on links; the damping factor d is the probability that the user keeps clicking, while the remaining 1 − d is the probability that they get bored and leave.
As you can see from the formula, even if no other pages point to a page, its PR will not be zero but (1 − d) / N, since there is always a chance that the user reaches the page not from other pages but, let’s say, from their bookmarks.
Yes, you are right, mathematical formulas can be a bit confusing, but you don’t need to dwell on them too much.
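Still, for those who like to see things in practice, here is a minimal sketch in Python of how the iterative calculation behind the formula works. The four-page link graph, the damping factor of 0.85 and the number of iterations are purely illustrative assumptions, not anything Google has published:

```python
# A tiny, made-up link graph: each page maps to the pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],   # D links to C; nothing links to D
}

def pagerank(links, d=0.85, iterations=50):
    """Iteratively apply PR(p) = (1 - d)/N + d * sum(PR(q)/L(q)) over the pages q linking to p."""
    n = len(links)
    pr = {page: 1 / n for page in links}          # start with an even distribution
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            incoming = sum(
                pr[q] / len(links[q])             # each linking page shares its PR among its links
                for q in links
                if page in links[q]
            )
            new_pr[page] = (1 - d) / n + d * incoming
        pr = new_pr
    return pr

for page, score in sorted(pagerank(links).items(), key=lambda x: -x[1]):
    print(f"{page}: {score:.3f}")
```

Notice that page D, which nothing links to, still ends up with (1 − d) / N = 0.15 / 4 ≈ 0.0375 rather than zero, exactly as described above.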
The manipulation of PageRank and Google’s “war” against link spam
Like any system, Google’s PageRank was not perfect, and it was just a matter of time before its “obvious” vulnerabilities were exploited and manipulated with practices that were later called black hat SEO.
Originally, PageRank was publicly visible to everyone through the Google Toolbar: each page had its score from 0 to 10, probably on a logarithmic scale.
Google’s ranking algorithms at that time were quite simple; a high PageRank and repetition of keywords in the text were pretty much the only criteria a page needed to rank high in a SERP (Search Engine Results Page). As a result, websites were stuffed with keywords, and site owners began to manipulate PageRank with spammy backlink-building techniques.
This was relatively easy to do, which is why dedicated automation and backlink-building programs appeared, and a large “market” of mass link selling was created, giving a significant “helping hand” to anyone wishing to exploit the weakness of the algorithm of the time.
Google decided to react and go on the offensive: in 2003 it penalized the advertising company SearchKing’s website for manipulation. SearchKing sued Google and Google won the lawsuit, but that did not actually stop the trend, which kept gaining more and more fans.
It is worth pointing out that those who engaged in algorithm manipulation practices at the time (and not only then) did not necessarily do so as their first choice. When you see the competition using such practices, to a large extent, and going unpunished while you sit back and watch, wasting time and money, it seems inevitable to resort to the same means and practices, in conditions of an unorthodox “normality” that, although informally, has in a way been “normalized”.
Whether this is ethical or not is an issue that still concerns us today when shaping our SEO strategy, and it is discussed in more detail in our article on grey hat SEO.
We can now continue after that quick mention…
Another “weapon” introduced by Google in the war against link manipulation was about to be turned against itself again.
The reason is the nofollow tag, which Google introduced in 2005. It allowed website owners to determine which links on their pages would count as dofollow and which as nofollow (i.e. would not count towards the PR of the page they point to).
The purpose of this feature was to limit the spam backlinks created mainly in comments left by “users” (actually bots) under blog posts on WordPress sites and blogs such as blogspot.com.
Some page owners, however, did not take long to realize that they could manipulate this feature to their advantage, a tactic called PageRank sculpting. A simple example: a page has 5 links to our other pages, and we want all of the “benefit” (the juice, as it is called) to flow to only 2 of the 5, so we tag the other 3 links as nofollow and leave dofollow on the 2 we are interested in.
The algorithm calculated PR based only on dofollow links, and this was the loophole that those using PageRank sculpting exploited.
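A rough sketch of the arithmetic behind the trick, with purely illustrative numbers rather than anything Google has confirmed:

```python
# How PageRank sculpting exploited the old calculation, with made-up numbers.
page_pr = 1.0          # hypothetical amount of PR the page has to pass on
links_total = 5        # internal links on the page
links_nofollow = 3     # links we tag as nofollow

# The old algorithm split PR only among dofollow links,
# so nofollow-ing 3 of the 5 links concentrated the "juice" on the other 2.
dofollow_links = links_total - links_nofollow
print(f"Without sculpting: each link passes {page_pr / links_total:.2f}")               # 0.20
print(f"With sculpting:    each dofollow link passes {page_pr / dofollow_links:.2f}")   # 0.50
```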
In 2009, Google started to calculate the “weight” of dofollow links differently, as you can see in the following graph:
But it didn’t stop there: Google went on to remove PageRank from public view, so no one outside Google is now able to know the actual PageRank of a page.
The new Google Chrome browser was released without the Google Toolbar, which displayed the controversial PR score. Google then stopped reporting the PR score in Google Search Console as well, and the Firefox browser also dropped support for the Google Toolbar.
In 2013, Toolbar PageRank was updated for Internet Explorer for the last time, and in 2016 Google officially shut the Google Toolbar down for the public.
Another way Google combated link manipulation was the Penguin update, which downgraded the rankings of websites with suspicious backlink profiles.
Penguin, released in 2012, did not initially become part of Google’s real-time algorithm; rather, it was a “filter” that was updated and re-applied to search results every so often.
If a website was penalized by Penguin, its link profile had to be carefully examined so that “toxic” links could be removed or added to a disavow list (a feature introduced around that time to tell Google which inbound links to ignore when calculating PageRank).
Even after cleaning up the link profile in this way, webmasters then had to wait about half a year, until the next Penguin refresh, for the algorithm to recalculate the data.
In 2016, Google made Penguin part of its core ranking algorithm. Since then, it has been working in real-time, algorithmically tackling spam backlinks much better than before.
At the same time, Google gave more weight to the quality rather than the quantity of links.
PageRank nowadays
Well, references to PageRank’s past are all well and good. But what is happening right now?
In 2019, a former Google employee reported that the original PageRank algorithm had not been used since 2006 and was replaced by another algorithm that required “fewer resources” to “run” as the Internet grew.
This claim may have some basis, as Google filed a new patent that same year titled “Producing a ranking for pages using distances in a web-link graph”.
Is the PageRank algorithm currently in use?
The answer is yes. It is not the same PageRank as in the early 2000s, but Google still relies heavily on link authority; in other words, it still places a lot of weight on backlinks.
For example, another former Google employee, Andrey Lipattsev, mentioned this in 2016. In a Google Q&A, a user asked him what the main ranking signals Google used were, and Andrey’s answer was pretty straightforward.
In 2020 John Mueller confirmed it once again:
As you can see, PageRank is still alive and well and is actively used by Google when ranking pages on the web.
What’s interesting is that Google officials continue to remind us at every opportunity they get that there are many, many, MANY other ranking signals and that backlinks are like a grain of salt in the salt shaker.
Considering how much effort Google has devoted to fighting unwanted links, it might be in Google’s best interest to shift SEOs’ attention away from vulnerable, manipulable factors (like backlinks) and turn it towards something innocent and ideal…
But since SEOs (Search Engine Optimizers) have necessarily developed the “talent” of reading between the lines, such an attempt at misdirection would probably not have much success: they still consider PageRank a strong ranking signal and build backlinks in every way they can, whether out of conviction or out of necessity, as we mentioned earlier.
They are still using PBN (private blog networks), using more Grey Hat SEO practices, buying links and so on, as has been the case for years.
As long as PageRank lives, spam backlinks will continue to exist.
These are practices we do not recommend, but they are today’s reality in SEO, and we should both know and understand it.
Models of PageRank: Random Surfer vs. Reasonable Surfer
It should be clear by now that although PR is still valid and in fact carries particular weight in the ranking of our pages, it is not the same PR as it was 20 years ago.
One of the key modernisations of PR was the move from the Random Surfer model, briefly mentioned above, to the Reasonable Surfer model in 2012. The Reasonable Surfer model assumes that users do not behave chaotically on a page and only click on the links that interest them at the time.
Say, when reading a blog article, you are more likely to click on a link in the content of the article instead of a Terms of Use link in the footer.
In addition, Reasonable Surfer can potentially use a wide variety of other factors when evaluating the “attractiveness” of a link.
These factors were carefully examined by Bill Slawski in his article, but I’d like to focus on the two factors that SEOs (those involved in optimizing pages) discuss most often.
These are firstly the position of the link and secondly the traffic of the page. What can we say about these two factors?
Correlation between link position and link weight
A link can be located anywhere on the page – in its content, in the navigation menu, in the author’s bio, in the footer, and in fact in any structural element on the page. Different link locations affect its value.
John Mueller confirmed this by saying that links placed in the main content “carry more weight” than all the others:
Footer and navigation links are said to carry “less weight”, something that is occasionally confirmed not only by Google representatives but also by actual tests.
In a recent case presented by Martin Hayman at BrightonSEO, Martin added the links he already had in his navigation menu to the main content of his pages. As a result, those category pages and the pages they linked to saw a 25% increase in traffic.
The results slide:
This experiment suggests that content links carry more weight than any other.
As for links from the author’s biography, we assume they carry a certain “weight”, but not a particularly significant one; they are less valuable, say, than content links.
Full presentation by Martin Hayman
Correlation between traffic, user behaviour and link authority
In one of the Search Console Central hangouts, John Mueller clarified how Google treats traffic and user behaviour when it comes to passing the PR benefit of a page to another page via a link.
One user asked Mueller an excellent question about whether Google tries to somehow calculate the likelihood and number of clicks for a link in the process of evaluating the quality of that link.
Mueller replied the following:
Google does not take into account the likelihood of clicks, nor does it make any “estimates” of their number for a link in the process of evaluating the quality of the link.
Google understands that links are often added to content by referral logic, and users are not expected to click on every link they come across.
However, as always, the SEO community believes that Google has every reason not to be completely transparent in its formulations, for obvious reasons, so “experimentation” is our favourite pastime.
So, in an “experimental” mood, a team from Ahrefs conducted a study to test whether the position of a page in a SERP is related to the number of backlinks it has from high-traffic pages.
The study revealed that there is almost no correlation. Moreover, some top pages turned out to have no backlinks at all from high-traffic pages.
Nofollow, sponsored and UGC tags
As we mentioned earlier, Google introduced nofollow tags in 2005 as a way to combat backlink spam.
Has anything changed today? The answer is yes!
First of all, Google recently introduced two more types of the nofollow attribute.
Before that, Google suggested marking as nofollow any link that you did not want to count towards the calculation of a page’s ranking, whether it was a blog comment or a paid ad.
Today, Google recommends using:
- rel="sponsored": for sponsorships, paid listings and affiliate links
- rel="ugc": for links in content created by users or visitors to your pages
These two new tags are not mandatory (at least not yet) and Google points out that you do not need to manually change all rel=”nofollow” to rel=”sponsored” and rel=”ugc”.
These two new features now work in the same way as a regular nofollow tag.
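To make this concrete, here is a minimal sketch, assuming Python with the BeautifulSoup library installed, that audits the rel attributes of the links on a page so you can see which ones are plain links and which are marked nofollow, sponsored or ugc; the HTML snippet and URLs are made up for illustration:

```python
from bs4 import BeautifulSoup

# A made-up HTML snippet for illustration.
html = """
<p>Read our <a href="/guide">guide</a>,
check this <a href="https://partner.example" rel="sponsored">partner offer</a>,
see a <a href="https://forum.example/post" rel="ugc">reader's post</a>, or an
<a href="https://example.org" rel="nofollow">untrusted source</a>.</p>
"""

soup = BeautifulSoup(html, "html.parser")
for a in soup.find_all("a", href=True):
    # rel comes back as a list of tokens, e.g. ["nofollow"], or None if absent
    rel = set(a.get("rel") or [])
    plain_link = not rel & {"nofollow", "sponsored", "ugc"}
    print(f"{a['href']:35} rel={sorted(rel) or '-'}  plain (no rel hint): {plain_link}")
```

A quick audit like this is useful simply for checking that comment sections, ads and affiliate links actually carry the attribute you intended.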
Outbound links and their influence on the ranking of a page
Apart from inbound links (backlinks), there are also outbound links, i.e. links that lead to pages other than your own.
More than a few people hold the view that outbound links can also influence a page’s ranking, but the majority of the community treats this assumption as just another SEO myth (yes, there are many of those).
But there is an interesting study that is worth a look.
Reboot Online conducted an experiment in 2015 and repeated it in 2020. They wanted to understand if the presence of outbound links on high authority pages affected the page’s position in a SERP.
They created 10 websites with 300-word articles, all optimized for a non-existent keyword – Phylandocic.
5 websites had no outbound links at all and the other 5 sites contained outbound links to high-authority pages.
In the end, the websites with valid outbound links ranked the highest, and those with no links at all ranked the lowest.
On the one hand, the results of this research can tell us that outbound links have a positive influence on page rankings.
On the other hand, the search term in the research is brand new and the content of the websites is about medicine and drugs.
Therefore, there are strong chances that the question will be classified as YMYL (“Your Money or Your Life”).
And Google has stated many times the importance of E-A-T (Expertise, Authoritativeness, Trustworthiness) for YMYL sites.
Thus, external links could very well be treated as an E-A-T signal, indicating that the pages have genuinely accurate content, and receive a “more favorable” rating from Google for that reason.
Regarding common queries (not YMYL), John Mueller has said many times that you don’t need to be afraid to link to external sources from your content, as outbound links are good for your users.
In addition, outbound links can be beneficial for SEO as well, as they may be taken into account by Google’s AI (artificial intelligence) when filtering the web for spam, since spammy pages tend to have very few, if any, outbound links. They either link to pages on the same domain (if they are built for SEO) or contain only paid links.
So, if you link to some trusted resources, you are showing Google that your page is not spammy.
There was once a view that Google could impose a certain “penalty” for using excessive outbound links, but John Mueller said that this is only possible when the outbound links are “obviously” part of a link exchange system, as well as the website being generally of poor quality.
Of course, what the word “obviously” means to Google is a mystery, so use common sense: we always aim for high-quality content and perform basic SEO on our pages.
Google’s battle against link spam
Logically, as long as PageRank exists, new ways of manipulating it will be sought.
Back in 2012, Google was more likely to respond to link spam with manual actions and penalties.
Today, however, with the now well “trained” anti-spam algorithms, Google is able to ignore certain unwanted links when calculating PageRank instead of downgrading the entire site. As John Mueller said,
The same applies to “negative SEO”: since the world we live in is anything but innocent, there is always the possibility of a “provocative intervention” in our backlink profile by a “bad” competitor.
However, this does not mean there is no cause for concern at all. If too many of our website’s backlinks have to be ignored too often, we still stand a good chance of being impacted. As Marie Haynes states in her tips for link management in 2021:
To identify whether, and which, links are potentially causing a problem, you can use a backlink profile checker such as BACKLINK PROFILE ANALYSIS, which shows whether your site’s backlinks put it at risk of a Google penalty. Among other things, you receive a detailed report of the high- and medium-risk backlinks that need special attention, so you can take measures such as asking the administrator of the page hosting the backlink to delete it, or adding it to a disavow list that tells Google to ignore these toxic backlinks and thus avoid any repercussions.
Internal linking
Speaking of PageRank, we’d be remiss not to mention internal links.
Inbound PR is something we can’t control completely, meaning we can’t control who, when and where a backlink will be created for one of our pages.
We can, however, have relatively complete control over the way PR is spread through the pages of our site.
Just as a backlink from another website “transfers” weight to our own site, a page of ours with high PR transfers weight to other pages of our website through internal links.
Google has also stated the importance of internal linking many times. John Mueller emphasized it once again in one of the latest Search Console Central hangouts, when a user asked how to make some pages more powerful. John Mueller said the following:
Internal linking means a lot. It helps you share the inbound PageRank between different pages on your website, boosting your low-performing pages and making your site stronger overall.
Regarding the approach to internal links, there are several theories on the SEO side. A quite popular one is the “website click depth” approach, the idea that every page of our website should be at most 3 clicks away from the home page.
The idea is not unfounded, since Google has pointed out its “preference” for pages with less “depth”, i.e. for a “shallow” website structure. In practice, however, it cannot always be applied to large sites with a wide variety of content.
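For anyone who wants to check this on their own site, here is a minimal sketch that measures click depth from the home page with a simple breadth-first search. The internal-link map is made up for the example; in a real audit it would come from a crawl:

```python
from collections import deque

# Made-up internal link map: each page maps to the pages it links to.
site = {
    "/": ["/blog", "/services"],
    "/blog": ["/blog/post-1", "/blog/post-2"],
    "/services": ["/services/seo"],
    "/blog/post-1": ["/blog/post-2"],
    "/blog/post-2": [],
    "/services/seo": ["/contact"],
    "/contact": [],
}

def click_depth(site, home="/"):
    """Breadth-first search from the home page; returns the clicks needed to reach each page."""
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for linked in site.get(page, []):
            if linked not in depth:
                depth[linked] = depth[page] + 1
                queue.append(linked)
    return depth

for page, d in sorted(click_depth(site).items(), key=lambda x: x[1]):
    flag = "  <-- deeper than 3 clicks" if d > 3 else ""
    print(f"{d} clicks: {page}{flag}")
```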
Another approach is based on the concept of centralized and decentralized internal linking, as Kevin Indig describes it. With centralized internal linking, we funnel PR towards a small group of conversion pages, or even a single high-PR page; with decentralized internal linking, we spread it out so that all pages end up with comparably high PR.
Which option is better? It all depends on the specifics of your website and your business, as well as the keywords you are going to target.
For example, centralized internal linking is best suited to keywords with high and medium search volumes, as it leads to a narrow set of extremely powerful pages.
Long-tail key-phrases with lower search volume, on the other hand, are better managed with decentralized internal linking, which spreads PR across multiple pages more equally.
Another aspect of successful internal linking is the balance of inbound and outbound links on the page.
PR is, in a way, the “power” a page receives, and CheiRank (CR) is the corresponding “power” it passes on through its links; a reverse PageRank, you might say.
By calculating the PR and CR of your pages, you can spot link anomalies, i.e. cases where a page receives a lot of PageRank but passes on very little, and vice versa.
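To make the idea tangible, here is a minimal sketch that computes PR on a toy link graph and CR on the reversed graph, then flags pages where the two are badly out of balance. The graph, the damping factor and the anomaly thresholds are illustrative assumptions, not anything the tools mentioned here actually use:

```python
def rank(links, d=0.85, iterations=50):
    """Simple power iteration of the PageRank formula over a link map."""
    n = len(links)
    scores = {p: 1 / n for p in links}
    for _ in range(iterations):
        scores = {
            p: (1 - d) / n + d * sum(scores[q] / len(links[q]) for q in links if p in links[q])
            for p in links
        }
    return scores

def reverse(links):
    """Flip every edge: CheiRank is PageRank computed on the reversed graph."""
    rev = {p: [] for p in links}
    for src, targets in links.items():
        for dst in targets:
            rev[dst].append(src)
    return rev

links = {
    "home": ["blog", "services"],
    "blog": ["home", "services", "post"],
    "services": [],            # receives plenty of PR but links out to nothing
    "post": ["home"],
}

pr = rank(links)
cr = rank(reverse(links))
for page in links:
    ratio = pr[page] / cr[page]
    note = "  <-- possible anomaly" if ratio > 2 or ratio < 0.5 else ""
    print(f"{page:10} PR={pr[page]:.3f}  CR={cr[page]:.3f}  ratio={ratio:.2f}{note}")
```

In this toy graph the "services" page soaks up PR but passes nothing on, which is exactly the kind of imbalance the PR-to-CR ratio is meant to surface.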
An interesting experiment was done by Kevin Indig, who tried to eliminate these “anomalies” on his pages by making sure their PageRank was balanced.
The results were very impressive: (The arrow shows where the changes were made)
Link anomalies are not the only thing that can hurt the flow of PageRank.
Make sure you’re not stuck with technical issues that could destroy your hard-earned PageRank:
- “Orphan” pages: pages that no other page on your website links to, so they simply sit idle and receive no benefit from any link (link juice); a simple way to detect them is sketched after this list.
- Links in JavaScript that cannot be parsed: if Google cannot “read” them, they will not pass PageRank.
- 404 links: Links with a 404 error lead nowhere, therefore PageRank, as you can understand, doesn’t go anywhere.
- Links to non-“important” pages: Of course, you can’t leave any of your pages without any links at all, but if a page is “less” important, there’s no reason to put too much effort into optimizing that page’s link profile.
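As a small illustration of the first point, here is a sketch that finds orphan pages, i.e. pages that no other page links to. Both the page list and the internal-link map are made up; in practice they would come from your sitemap and a crawl:

```python
# All pages that exist on the site (e.g. from the sitemap), made up for the example.
all_pages = {"/", "/blog", "/blog/post-1", "/services", "/old-landing-page"}

# Internal link map discovered by crawling: page -> pages it links to.
internal_links = {
    "/": ["/blog", "/services"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/"],
    "/services": ["/"],
    # note: nothing links to /old-landing-page
}

linked_to = {target for targets in internal_links.values() for target in targets}
orphans = all_pages - linked_to - {"/"}   # the home page is reachable directly, so exclude it

print("Orphan pages:", sorted(orphans) or "none")
```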
The future of PageRank
This year PageRank turns 22 years old. It is very likely to be older than some of you :).
But what’s coming for PageRank in the future? Will it completely disappear one day?
Search engines without backlinks
Could there be a search engine – at the level of Google – without including backlinks in its algorithm?
An interesting “experiment” was carried out by the search engine Yandex in 2014, which announced that dropping backlinks from its algorithm might stop backlink manipulation and turn the attention of creators and webmasters to the real task at hand: creating quality content.
Maybe it was a serious attempt to find alternative ranking factors, or maybe a diversion to curb link spamming; we may never know the truth. In any case, Yandex’s “experiment” didn’t last long, and about a year later it announced that backlinks had been reintroduced into its algorithm.
But why are backlinks so essential for search engines?
While there is endless other data for rearranging search results after they start appearing (such as user behavior and BERT adjustments), backlinks remain one of the most reliable criteria needed to shape the initial SERP.
What was Bill Slawski’s response when he was asked about the future of PageRank?
However, Google does not want to discard something in which it has invested decades of development.
Another point Bill made concerned the PageRank of news pages and fast-moving sources like Twitter, where the time of publication matters more than timeless authority.
A news item stays in the displayed results for a relatively short time, only while it is fresh, so Google does not evaluate news and current affairs in the same way or give the same “weight” to their backlinks (with such a short lifespan, they logically don’t have time to accumulate many); instead, it keeps developing other evaluation and ranking factors for these specific categories of results.
Currently, the greatest “weight” for the ranking of news is “authoritativeness” which replaces the “weight” of backlinks in other result categories.
The term “authoritativeness” refers to pages that demonstrate expertise, validity and credibility on a given topic.
This suggests that a similar rating system, one that leaves backlinks out or changes the balance of “weight” that backlinks still tip today, could probably be extended to other categories in the future.
The new rel="sponsored" and rel="ugc" tags
Last but equally important, Google is trying to distinguish user-generated and sponsored links from other nofollow links.
The question is: if they are all indeed treated as nofollow, why is there a “need” to distinguish them?
One hypothesis is that Google wants to test whether and how much these categories of nofollow links (“sponsored” and “UGC”) can be a “positive” signal for a page’s ranking.
In any case, advertising on popular sites and platforms is expensive, so it can be deduced that anyone spending large amounts of money on advertising is probably an important “brand”; something similar is rumored about Google Ads anyway, namely that they are a positive signal for strengthening a brand.
Similarly, when content created by users and visitors includes links, then, spam aside, most of those references are recommendations, and therefore mostly positive evaluations that, objectively, cannot be ignored.
However, some experts have their doubts about this connotation.
Epilogue
It is true that I was surprised, but at the same time fascinated, by the magic and complexity of some of the data I came across while writing this article. I hope it proves useful to you, especially if it cleared up some “blurry” points.
Do you have any unanswered questions? What are your opinions or ideas about the future of PageRank?
Leave your comments on anything relevant, I’ll be happy to discuss it.