Personalized PageRank
PageRank is a metric developed by Google founders Larry Page and Sergey Brin. PageRank is defined by the on a directed graph $G=(V,E)$ as the probability that a random walk will be on any particular node $v_i$ when walking over the edges randomly. As there may be vertices that only point to each other (spider traps) they introduced the idea of a random restart, so with probability $\beta$ the random walk with jump randomly to a new vertices which is also known as the dampening factor.
Specifically using $|V|$ as the number of vertices in the graph $G$ and $|E_j|$ as the number of outgoing edges from vertices $v_j$ also known as the out degree of $v_j$.
Let’s create a matrix formulation of this by defining a matrix $\mathbf{M}$, let $v_i$ have out degree $d_i$. If $e_{ij} \in E$ then $M_{ji} = \frac{1}{d_i}$. Thus the columns of $\mathbf{M}$ sum to 1 and $\mathbf{M}$ is a stochastic matrix, which gets to Markov properties and we’ll leave for another day (when I’m more familiar). In another notation if $\mathbf{A}$ is the adjacency matrix of graph $G$ and $\mathbf{K}$ is the a diagonal matrix with out degrees of each vertice on the diagonal them $\mathbf{M} = (\mathbf{K}^{-1}\mathbf{A})^{T}$. For notation we’ll also define $\mathbf{R} := P(v)$
With the equation (4) being true as $\mathbf{R}$ is a probability vector and sums to 1 as do the columns of $\mathbf{M}$. To make sure that $\mathbf{M}$ is stochastic it is necessary for vertices with no edges out (sinks) to split probability to $1/|V|$ for all vertices.
If we define $\widehat{\mathbf{M}} = \frac{1-\beta}{|V|}\mathbf{1} + \beta\mathbf{M}$ then we can reformulate as
Indicating that $\mathbf{R}$ is the eigenvector of $\widehat{\mathbf{M}}$ with an eigenvalue $\lambda$ of 1 and since $\widehat{\mathbf{M}}$ is stochastic that is the biggest eigenvalue.
In general calculating eigenvectors and eigenvalues is hard but one example to get only the first $k$ eigenvectors then we can use the power method which through the miracle of eigenvectors any random vector multiplied enough times by $\widehat{\mathbf{M}}$ will converge to the eigenvector of that matrix. If the eigenvalue of that eigenvector is significantly larger than the next eigenvalue then it should converge quickly.
In Personalized PageRank instead of restarting from any random vertice we only restart with probability $\beta$ from particular vertices that we want to personalize from. In the random walk framework we are seeing what are the likely vertices a walker will be at if the keep restarting from the personalized set. So instead of adding $\frac{1-\beta}{|V|}$ to every entry in $\mathbf{M}$ we only add it to particular vertices that we want to personalize from.
In particular