What is PageRank – shattering the myth

Written by Dave Collins, SoftwarePromotions Ltd.

One of the most fascinating aspects of the web is its dynamism. We all know that it develops at an astonishing speed – yesterday’s craze is today’s old news, and bigger and better things seem to be springing up every few days. Some of them crumble quickly into dust, while others seem destined to tower above the rest.

Naturally, search engines also follow this pattern. Some of the early search engine giants remain with us today, but many of them are gone – and every so often, a new champion seems to emerge. Recent years have seen the growth and development of a search engine that puts all others to shame. It might have once stood at the same level as its rivals, but there is no doubt that for now at least, Google rules the web.

Many of the companies we work with see more traffic from Google than all the other search engines put together, and there are more than a few Search Engine Optimisation services who focus almost exclusively on this one engine.

What is Google’s secret?
So why is Google so successful? The answer is simply that when a user goes searching on Google, they’re likely to find what they’re looking for, and more quickly than on any other search engine. Exactly how Google manages to do this is trickier to answer, as they tend to guard their secrets well. They don’t want us to know too much about how they determine their search results, simply because they don’t want anyone to be able to manipulate their own ranking.

Of course, human nature dictates that many of us aren’t satisfied with this. We desperately want to be able to affect the ranking of our sites, and some of us will go to great lengths to do so. We work hard to find the perfect keywords, tweak our meta tags and optimise the content of our site to what we hope is Google perfection.

But recently, a new word has entered our vocabulary, and is surrounded by so much hype that very few people actually have a realistic understanding of what it is – or what it isn’t. PageRank is where the attention is focused today, and many companies are determined to find a means of improving their magic number. “I want to be an eight,” they say, as if PageRank was a dress size that they could grow into with the help of some heavy-duty calorie shots. Unfortunately, it’s not quite as easy as that.

So what is PageRank? There’s a surprisingly simple answer: it is Google’s way of estimating how important a web page is. On a basic level, Google decides that if one page links to another, the second page must be considered important. If one page on one site has 15,000 pages linking to it, it must be for a good reason, right?

PageRank is about pages, not websites
Let’s begin by straightening out a few basic points. First of all, PageRank is assigned on a page-by-page basis. A whole website does not have this score, and different pages within a site can have very different PageRank values assigned. Another important point is that the rating (out of ten) assigned is essentially little more than an approximation of a given page’s PageRank. The actual values cover a far greater range than zero to ten.

Before going any further, we should take a look at the most important point of all, often overlooked when we get caught up in the PageRank frenzy. PageRank is only one factor that Google takes into account when displaying the results of a search. There are still other factors of equal significance in performing well on Google – so don’t make the mistake of thinking that you would live happily ever after if your PageRank was a little bit higher. Other factors include a page’s title, and the use of keywords within the page’s text – not in the keyword meta tag.

PageRank is still one of Google’s more ingenious strategies, and is certainly one of the many reasons that it stands head and shoulders above the rest. Partly, this is due to a combination of two factors: firstly, that the very nature of PageRank makes it difficult (but not impossible) to manipulate, and secondly, that the exact details of how the value is assigned are a closely guarded secret.

However, there is one very useful source of data – an academic paper detailing the formula used to calculate PageRank from Google’s early beginnings as a university project. This formula will certainly have been altered and expanded over the years, but it is generally accepted that it still represents the essence of their PageRank system.

The PageRank Formula
The exact details are lengthy, and far beyond what I am capable of dissecting. But the basic formula is as follows:

PR(A) = (1 - d) + d × (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

PR(A) is the PageRank of a particular page (A) – not a website as a whole.

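d is the damping factor (often called the “dampening factor”), as explained below. The original academic paper suggests a value of about 0.85, which makes (1 - d) roughly 0.15.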

PR(T1) is the PageRank of a page (T1) that links to our page (A), and C(T1) is the total number of links contained on that same T1 page.

The PR(T)/C(T) term is repeated for every single page (T1 through Tn) that contains a link to this (A) page, and the results are added together.
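
To make the formula concrete, here is a minimal Python sketch that evaluates it once for a single page. The function name page_rank_of, the sample numbers and the default of d = 0.85 (the value suggested in the original academic paper) are assumptions chosen for illustration; this is not Google’s actual implementation.

```python
def page_rank_of(inbound, d=0.85):
    """Evaluate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) for one page A.

    inbound is a list of (page_rank, link_count) pairs, one pair for
    each page T that links to A.
    """
    return (1 - d) + d * sum(pr / links for pr, links in inbound)


# Hypothetical example: three pages link to page A.
inbound_pages = [(1.0, 5), (2.0, 10), (0.5, 1)]
pr_a = page_rank_of(inbound_pages)
print(round(pr_a, 3))  # 0.915
```

Note that the third page, with a modest PageRank but only one link, contributes more than the other two combined.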

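There are two important points to take into account. First of all, if you’re thinking that the formula would in practice be an infinite loop, then you’re correct: the PageRank of each page depends on the PageRank of the pages linking to it, which in turn depends on the pages linking to them. This is the very nature of the web itself, and is also why Google has introduced the so-called dampening factor.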
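
One common way to deal with this circularity is to start every page at the same value and apply the formula over and over until the numbers stop changing much. The sketch below does this for a tiny, made-up web of three pages. The function name compute_pagerank, the sample link structure, the starting value of 1.0 and the fixed count of 50 passes are all assumptions for illustration, and the sketch assumes every page links to at least one other page.

```python
def compute_pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply the PageRank formula to every page.

    links maps each page name to the list of pages it links to.
    """
    pages = list(links)
    pr = {page: 1.0 for page in pages}  # arbitrary starting guess
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Add up PR(T) / C(T) for every page T that links to this page.
            total = sum(pr[src] / len(links[src])
                        for src in pages if page in links[src])
            new_pr[page] = (1 - d) + d * total
        pr = new_pr
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = compute_pagerank(web)
for page, value in sorted(ranks.items()):
    print(page, round(value, 2))
```

In this example page C ends up with the highest value, because it is linked to by both of the other pages.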

The second point concerns the way that PageRank is awarded by one page to another. The generally accepted means of understanding this is to consider that a given page has, according to its own PageRank, a certain amount of voting power. If the page in question links to five other pages, then each of the pages being linked to receives a PageRank “award” of one fifth of the original page’s voting power. It’s also worth noting that the number of links on a page includes the website’s own internal links.
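
The effect of internal links on this voting power can be shown with a little arithmetic. In the sketch below, the function name vote_per_link and the sample numbers are hypothetical, and the damping factor is left out so that only the sharing of the vote is visible.

```python
def vote_per_link(page_rank, link_count):
    """Share of a page's voting power passed along each of its links."""
    return page_rank / link_count


# A page with a PageRank of 4.0 that links to five external pages only.
external_only = vote_per_link(4.0, 5)

# The same page with 15 internal navigation links added to those five.
with_internal = vote_per_link(4.0, 5 + 15)

print(external_only)  # 0.8
print(with_internal)  # 0.2
```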

Link farms don’t work
This makes it quite obvious that the so-called link farms, where each page of a website contains many hundreds of links in an attempt to artificially boost so-called “link popularity”, are doomed to fail from the start. In addition to this, Google has its own system for not only minimising the effect that these sites have, but eliminating it altogether. As the formula shows, a link passes on a share of the linking page’s own PageRank, so Google has made sure that link farms have a value of zero – which means that a link from them counts for nothing, quite literally.

There is a scare story doing the rounds which claims that being listed on link popularity sites, or for that matter any site with a large number of links, can get your site penalised or even banned from Google. This is simply not the case. If it were, you’d effectively be able to wipe out your competition’s Google presence with one afternoon’s work. It doesn’t work that way.

Having links to your web pages on sites with a low PageRank and a large number of links means that the benefits are reduced to practically zero. But this will not detract from your current PageRank at all.

Obviously, what people really want to know is whether PageRank can be manipulated. In the past it was often considered impossible to do so, but nowadays this is not always the case. There are two simple factors involved:
Firstly: who links to you, and how they choose to do so. Secondly: your own website’s navigation and internal links.

Clearly, the sheer number of pages linking to you is not, on its own, what determines your PageRank. Of far greater importance is the PageRank of each of these pages, and how many links appear on them. Common sense certainly needs to be applied here. In theory, one simple way to improve your PageRank might be to have Microsoft link to you from the front page of their website. In practice, this might be a little difficult to achieve.
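
A quick calculation with the basic formula shows why quality can outweigh quantity. The figures below are invented for illustration (they are raw formula values, not toolbar scores out of ten), and the function name pagerank_from_links and the damping factor of 0.85 are assumptions.

```python
def pagerank_from_links(inbound, d=0.85):
    """Apply PR(A) = (1 - d) + d * sum(PR(T) / C(T)) to a list of
    (page_rank, link_count) pairs describing the pages linking to A."""
    return (1 - d) + d * sum(pr / links for pr, links in inbound)


# One hundred links from weak pages that each carry 200 links.
many_weak = pagerank_from_links([(0.2, 200)] * 100)

# A single link from a strong page that carries only 10 links.
one_strong = pagerank_from_links([(8.0, 10)])

print(round(many_weak, 3))   # 0.235
print(round(one_strong, 3))  # 0.83
```

Under these assumptions, the single strong link is worth several times more than the hundred weak ones put together.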

It is already quite clear that linking out to another website, even if it opens in a new browser window, actually involves potentially giving away a lot more than a little space on your website. My advice would be to look at your link exchanges as you would your food. You always want to make sure you’re not leaving yourself hungry, and if you do choose to share, be selective. Exchanging a piece of your sirloin steak for a small piece of stale bread, shared between hundreds of people, is far from an even trade. If you’re doing so to help another site, as an act of charity, then this is fine and well, as long as you know what you’re giving away. Choose wisely.

Well-known websites and their PageRank
Now that we have a basic understanding of how PageRank works, let’s take a look at some of the more well-known websites on the web today, and see how their main pages perform.

Finding out a page’s PageRank couldn’t be simpler. Follow the link to Services and Tools from the Google home page, and find the Google Toolbar. After installing the software, a bar appears at the top of the browser showing a value for each page you’re visiting. Hold the mouse over the bar, and you’ll be told the page’s PageRank – a score out of ten. As already mentioned, this figure is little more than a representation of a page’s actual PageRank.

Not surprisingly, very few pages score ten out of ten, and those that do include the likes of Microsoft, Yahoo, Google itself, AltaVista, Adobe, AOL, Mozilla.org and others. In other words we’re looking at the biggest of the biggest websites – and not something that most of us could ever hope to achieve!

Of course, there is a simple reason that search engines and directories have such a high PageRank. Not only do they link to a huge, ever-growing list of sites and pages, but more importantly, a truly staggering number of these sites and pages link back to them. When you consider the importance of reciprocal linking, you start to understand why they do so well. With Adobe, you only need to consider the sheer number of web pages out there that link to a PDF file (with links to Adobe for their free reader software), and you will see why they have achieved such a high number.

A nine out of ten score still puts you within a very small minority of the web. Should you be able to achieve this high a PageRank, you’ll be rubbing shoulders with the likes of MSN, BBC News, Winzip and Internet.com. We’re talking about the web’s upper classes here – not really attainable for the majority of normal website owners.

Eight out of ten starts bringing you to the “reachable” web. You’ll find sites such as CNN, TuCows, Simtel, the Association of Shareware Professionals, the Shareware Industry Conference site and Lockergnome.

A PageRank of seven is starting to appear reasonably attainable, as long as we’re willing to work hard on the content and reputation of our site. The sevens include companies such as D-Link, MSNBC, CNET’s Download.com and our very own SoftwarePromotions.com.

Don’t lose your perspective!
At this point, a little perspective might be in order. A critical point to remember is that PageRank only plays a part in performing well in Google. PageRank’s primary aim involves ranking the results of a search – but in order to show up in the search to start with, your site needs to be properly optimised and have good, solid content. So contrary to popular belief, the era of Search Engine Optimisation is far from over. It’s only had a new, interesting factor thrown into it.

Finally, a note of caution. This article has been an attempt to very briefly summarise an enormously complicated subject. Aside from constraints of space, much of the workings of PageRank remain shrouded in mystery. The ideas presented are based on available data, known facts, speculation and my own experience – but none of it should be considered as incontrovertible fact!

PageRank is undoubtedly an important factor in how much traffic you will receive from Google. It is, however, merely one component in your arsenal of tools to win the battle for one particular search engine. Even with the constantly evolving web, and the ever-tightening systems employed by the search engines to quantify the usefulness of a website, content is still by far the most important factor, and will invariably form the base on which everything else is built. Be seen, be sold.
