The Formula


The Formula - A short introduction

The principle behind the PageRank calculation is that every link is a vote. The more links point to a page, the better it must be, and the higher the PageRank it is assigned. But, unlike in a democratic society, not all votes are equal: pages with a high PageRank cast a bigger vote.

Google's formula looks like this:

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)) .

Let's forget about d (the damping factor) for a moment, which amounts to setting d = 1, and the formula becomes:

PR(A) = PR(T1)/C(T1) + ... + PR(Tn)/C(Tn) .

In other words: to find the PageRank of a page, called A, we must first find all the pages that link to page A. Let's say we find a page, T1, which has a PageRank of 20 and links to 5 pages (so C(T1) = 5); then page T1 gives 4 points (i.e. 20/5) to page A. We do the same for T2, T3 and all other pages linking to page A, and add up the values.
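The sum described above can be sketched in a few lines of Python. This is only an illustration of the simplified formula (damping ignored): the function name `simplified_pagerank` is made up for this example, and while the first inbound page uses the T1 figures from the text, the values for the second and third pages are invented.

```python
def simplified_pagerank(inbound):
    """Add up PR(T)/C(T) for every page T that links to page A.

    `inbound` is a list of (pagerank, outbound_link_count) pairs,
    one pair per page linking to A. The damping factor is ignored.
    """
    return sum(pr / outlinks for pr, outlinks in inbound)


# T1 comes from the text: PageRank 20, links to 5 pages -> gives 4 points.
# The other two pages are hypothetical values for illustration.
inbound_pages = [(20, 5), (6, 3), (1, 1)]

print(simplified_pagerank(inbound_pages))  # 4.0 + 2.0 + 1.0 -> 7.0
```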

The only snag is that page A might link back to some of these pages, or link to pages that in turn link to them, and if page A's PageRank increases, so will theirs. Therefore these calculations have to be done again and again, i.e. iteratively. This is where the spreadsheet on this site comes in handy.
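To make the "again and again" part concrete, here is a minimal sketch of the iteration using the full formula with the damping factor. It is not the spreadsheet's actual method: the three-page link structure, the function name `iterate_pagerank`, the damping factor of 0.85, the initial PageRank of 1.0 and the choice of 50 rounds are all assumptions made for this example. The sketch also assumes every page has at least one outbound link and that all pages are updated at once from the previous round's values.

```python
def iterate_pagerank(links, d=0.85, initial=1.0, iterations=50):
    """Repeatedly apply PR(A) = (1-d) + d * sum(PR(T)/C(T)).

    `links` maps each page to the list of pages it links to.
    d, initial and iterations are assumed values for illustration.
    """
    pr = {page: initial for page in links}
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            # Collect PR(T)/C(T) from every page T that links to `page`.
            inbound = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * inbound
        pr = new_pr  # the next round uses the freshly computed values
    return pr


# Hypothetical three-page site: A links to B and C, B to C, C back to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = iterate_pagerank(links)
for page, rank in sorted(ranks.items()):
    print(page, round(rank, 3))
```

In this small example the values settle after a few dozen rounds, and the PageRanks of the three pages keep adding up to 3, the number of pages.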

PageRank can also be thought of as a probability. If your page has a PageRank of 20, and there are 4,285,199,774 pages indexed by Google, it follows that the probability that a "random surfer" is reading your page right now is 20/4,285,199,774.
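The arithmetic is easy to check. The snippet below simply divides the two figures quoted above; the function name `visit_probability` is made up for this example.

```python
def visit_probability(pagerank, total_pages):
    """Chance that a random surfer is on a page with the given PageRank."""
    return pagerank / total_pages


# Figures from the text: PageRank 20 and 4,285,199,774 indexed pages.
probability = visit_probability(20, 4_285_199_774)
print(f"{probability:.3e}")
```

That is a little under five chances in a billion.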

In this section we will explore two values on the spreadsheet that should normally be left untouched, namely the initial PageRank and the damping factor.