Price's model

From HandWiki

Price's model (named after the physicist Derek J. de Solla Price) is a mathematical model for the growth of citation networks.[1][2] It was the first model to generalize the Simon model[3] to networks, in particular to growing networks. Price's model belongs to the broader class of network growth models (together with the Barabási–Albert model) whose main aim is to explain the emergence of networks with strongly skewed degree distributions. The model adopts the central idea of the Simon model, the rich-get-richer principle, also known as the Matthew effect. Price took a network of citations between scientific papers as his example and described its properties. His idea was that the rate at which an old vertex (an existing paper) gains new edges (new citations) should be proportional to the number of edges (citations) the vertex already has. He called this cumulative advantage; it is now also known as preferential attachment. Price's work is also significant for providing the first known example of a scale-free network (although this term was introduced later). His ideas have been used to describe many real-world networks, such as the Web.

The model

Basics

Consider a directed graph with n nodes. Let [math]\displaystyle{ p_k }[/math] denote the fraction of nodes with in-degree k, so that [math]\displaystyle{ \sum_{k}{p_k}=1 }[/math]. Each new node has a given out-degree (the papers it cites), which does not change afterwards. Out-degrees may vary from node to node; we only assume that the mean out-degree m is constant over time. Since every edge contributes one unit of out-degree and one unit of in-degree, the mean in-degree is also m, that is [math]\displaystyle{ \sum_{k}{kp_k}=m }[/math]; because m is a mean, it is not restricted to integers. In the simplest form of preferential attachment, a new node connects to an existing node with probability proportional to that node's in-degree. In other words, a new paper cites an existing paper in proportion to the citations that paper already has. The difficulty with this idea is that a paper has no citations when it joins the network, so it would have zero probability of ever being cited, which is not what happens in practice. To overcome this, Price proposed that attachment should be proportional to [math]\displaystyle{ k+k_0 }[/math] with [math]\displaystyle{ k_0 }[/math] constant. In general [math]\displaystyle{ k_0 }[/math] can be arbitrary, but Price chose [math]\displaystyle{ k_0=1 }[/math], so that each paper is treated as carrying one initial citation of its own (the proportionality factor is then k + 1 instead of k). The probability that a new edge connects to any node of in-degree k is then

[math]\displaystyle{ \frac{(k+1)p_k}{\sum_k (k+1)p_k}=\frac{(k+1)p_k}{m+1} }[/math]
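The attachment rule above can be sketched as a small simulation. This is only an illustrative sketch, not Price's own procedure: the function name `simulate_price`, the single uncited starting node, the fixed integer out-degree m and the permission of repeated citations are all assumptions made here for simplicity.

```python
import random

def simulate_price(n, m, seed=0):
    """Grow a Price-style citation network; return the in-degree of each node.

    Assumptions (not from the source): every new node cites exactly m
    existing nodes, repeated citations of the same target are allowed,
    and the network starts from a single node with no citations.
    """
    rng = random.Random(seed)
    in_degree = [0]  # node 0 starts uncited
    # Each node appears in `urn` once per citation received plus once for
    # itself, so a uniform draw picks a node with probability ~ (k + 1).
    urn = [0]
    for new_node in range(1, n):
        # Draw all m targets before updating, so they see the same network.
        targets = [rng.choice(urn) for _ in range(m)]
        for t in targets:
            in_degree[t] += 1
            urn.append(t)
        in_degree.append(0)
        urn.append(new_node)  # the "+1" entry for the new paper itself
    return in_degree

degrees = simulate_price(n=10_000, m=3)
print("largest in-degree:", max(degrees))
print("fraction uncited:", degrees.count(0) / len(degrees))
```

Drawing uniformly from the list of entries avoids recomputing the normalizing sum at every step, at the cost of storing one entry per node and per edge.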

Evolution of the network

The next question is how the number of nodes with degree k changes when a new node is added to the network. This number decreases when some degree-k nodes receive new edges and become degree-(k + 1) nodes; it increases when some degree-(k − 1) nodes receive new edges and become degree-k nodes. To express the net change formally, let [math]\displaystyle{ p_{k,n} }[/math] denote the fraction of degree-k nodes in a network of n vertices. Each new node brings on average m new edges, and each of them lands on a degree-k node with probability [math]\displaystyle{ (k+1)p_{k,n}/(m+1) }[/math], so that

[math]\displaystyle{ (n+1)p_{k,n+1}-np_{k,n}=[kp_{k-1,n}-(k+1)p_{k,n}]\frac{m}{m+1}\text{ for }k\geq 1, }[/math]

and

[math]\displaystyle{ (n+1)p_{0,n+1}-np_{0,n}=1-p_{0,n}\frac{m}{m+1}\text{ for }k=0. }[/math]
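The two recurrences can also be iterated numerically to watch the fractions settle as n grows. The sketch below makes two assumptions that are not in the text: the network starts from a single node of degree 0, and degrees above a cutoff `k_max` are simply dropped. The function name is hypothetical.

```python
def iterate_master_equation(m, n_max, k_max):
    """Iterate the finite-n recurrences for p_{k,n}, truncated at k_max.

    Assumed initial condition: one node with degree 0 at n = 1.
    Returns the list [p_0, ..., p_{k_max}] for a network of n_max nodes.
    """
    c = m / (m + 1)
    p = [1.0] + [0.0] * k_max
    for n in range(1, n_max):
        new_p = [0.0] * (k_max + 1)
        # k = 0: the new node arrives with degree 0 (the "+1" term).
        new_p[0] = (n * p[0] + 1 - p[0] * c) / (n + 1)
        # k >= 1: gain from degree k-1 nodes, loss of degree k nodes.
        for k in range(1, k_max + 1):
            new_p[k] = (n * p[k] + (k * p[k - 1] - (k + 1) * p[k]) * c) / (n + 1)
        p = new_p
    return p

p = iterate_master_equation(m=3, n_max=5000, k_max=50)
print(p[:5])
```

Because the equation for degree k only involves degrees k and k − 1, the cutoff does not disturb the values below it; it only discards the probability mass of the tail.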

To obtain the stationary solution, set [math]\displaystyle{ p_{k,n+1}=p_{k,n}=p_k }[/math] in the master equations above, which gives

[math]\displaystyle{ p_k=\begin{cases}[kp_{k-1}-(k+1)p_k]\frac{m}{m+1} & \text{for } k\geq 1 \\1-p_0\frac{m}{m+1} & \text{for } k=0\end{cases} }[/math]

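Rearranging the case [math]\displaystyle{ k\geq 1 }[/math] gives the recurrence [math]\displaystyle{ p_k=\frac{k}{k+2+1/m}p_{k-1} }[/math]. Solving the case k = 0 for [math]\displaystyle{ p_0 }[/math] and iterating the recurrence yields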

[math]\displaystyle{ p_0=\frac{m+1}{2m+1} }[/math]

and

[math]\displaystyle{ p_k=\frac{k!}{(k+2+1/m)\cdots(3+1/m)}p_0=(1+1/m)\mathbf{B}(k+1,2+1/m), }[/math]

with [math]\displaystyle{ \mathbf{B}(a,b) }[/math] being the Beta function. As a consequence, [math]\displaystyle{ p_k\sim k^{-(2+1/m)} }[/math] for large k. That is, [math]\displaystyle{ p_k }[/math] follows a power-law distribution with exponent [math]\displaystyle{ \alpha=2+1/m }[/math]. For [math]\displaystyle{ m\geq 1 }[/math] this places the exponent between 2 and 3, which is the case for many real-world networks. Price tested his model against citation network data and concluded that the resulting value of m gives a sufficiently good power-law fit.
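The closed form is easy to evaluate numerically, which gives a quick check of the normalization and of the tail exponent. The helper names below (`log_beta`, `price_pk`, `local_exponent`) are introduced here for illustration only; the Beta function is computed from `math.lgamma` to avoid overflow at large k.

```python
import math

def log_beta(a, b):
    """Natural log of the Beta function B(a, b), for a, b > 0."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def price_pk(k, m):
    """Stationary p_k = (1 + 1/m) * B(k + 1, 2 + 1/m) for k_0 = 1."""
    return (1 + 1 / m) * math.exp(log_beta(k + 1, 2 + 1 / m))

def local_exponent(k, m):
    """Slope of -log p_k against log k, measured between k and 2k."""
    ratio = math.log(price_pk(2 * k, m)) - math.log(price_pk(k, m))
    return -ratio / math.log(2)

m = 3
print("p_0:", price_pk(0, m), "expected:", (m + 1) / (2 * m + 1))
print("partial sum:", sum(price_pk(k, m) for k in range(5000)))
print("local exponent:", local_exponent(10_000, m), "alpha:", 2 + 1 / m)
```

The local exponent approaches 2 + 1/m only for large k; at small k the Beta function deviates noticeably from a pure power law.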

Generalization

It is straightforward to generalize the above results to the case [math]\displaystyle{ k_0\neq 1 }[/math]. Basic calculations show that

[math]\displaystyle{ p_k=\frac{m+k_0}{m(k_0+1)+k_0}\frac{\mathbf{B}(k+k_0,2+k_0/m)}{\mathbf{B}(k_0,2+k_0/m)}, }[/math]

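which once more yields a power-law distribution of [math]\displaystyle{ p_k }[/math], now with exponent [math]\displaystyle{ \alpha=2+k_0/m }[/math], for large k and fixed [math]\displaystyle{ k_0 }[/math]. For [math]\displaystyle{ k_0=1 }[/math] this reduces to the exponent [math]\displaystyle{ 2+1/m }[/math] obtained above.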
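As a sanity check, the generalized expression can be evaluated numerically and compared with the [math]\displaystyle{ k_0=1 }[/math] result. The function names are hypothetical, and the sketch assumes [math]\displaystyle{ k_0\gt 0 }[/math] so that every Beta function argument is positive.

```python
import math

def log_beta(a, b):
    """Natural log of the Beta function B(a, b), for a, b > 0."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def price_pk_general(k, m, k0):
    """Stationary p_k for attachment proportional to k + k0 (k0 > 0 assumed)."""
    prefactor = (m + k0) / (m * (k0 + 1) + k0)
    b = 2 + k0 / m
    return prefactor * math.exp(log_beta(k + k0, b) - log_beta(k0, b))

m = 3
# With k0 = 1 the general formula should match (1 + 1/m) * B(k + 1, 2 + 1/m).
for k in (0, 1, 5, 50):
    special = (1 + 1 / m) * math.exp(log_beta(k + 1, 2 + 1 / m))
    print(k, price_pk_general(k, m, 1), special)

# Partial sum of the distribution for a different offset, here k0 = 2.
print(sum(price_pk_general(k, m, 2) for k in range(5000)))
```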

Properties

The key difference from the more recent Barabási–Albert model is that the Price model produces a graph with directed edges, whereas the Barabási–Albert model is the same model with undirected edges. The direction is central to the citation network application that motivated Price. Because every edge points from a newer node to an older one, the Price model produces a directed acyclic graph, and such networks have distinctive properties.

For example, in a directed acyclic graph both longest paths and shortest paths are well defined. In the Price model, the length of the longest path from the n-th node added to the network to the first node in the network scales as [math]\displaystyle{ \ln(n) }[/math].[4]
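Because edges only point to older nodes, the longest path from each node back to the first node can be computed while the network is being grown. The sketch below is an illustration under assumptions made here, not the method of [4]: a single starting node, a fixed integer out-degree m of at least 1, repeated citations allowed, and a hypothetical function name.

```python
import random

def longest_paths_to_root(n, m, seed=0):
    """Grow a Price-style network and return, for every node, the length
    of the longest directed path from that node to node 0.

    Assumptions (not from the source): single starting node, fixed integer
    out-degree m >= 1, attachment proportional to in-degree + 1, repeated
    citations allowed.
    """
    rng = random.Random(seed)
    urn = [0]      # node i appears (in-degree + 1) times
    longest = [0]  # node 0 is the root, at distance 0 from itself
    for new_node in range(1, n):
        targets = [rng.choice(urn) for _ in range(m)]
        # Every path leaving the new node starts with one of its citations,
        # and every cited node already has its longest path recorded.
        longest.append(1 + max(longest[t] for t in targets))
        urn.extend(targets)
        urn.append(new_node)
    return longest

lengths = longest_paths_to_root(n=20_000, m=3)
for size in (100, 1_000, 10_000, 20_000):
    print(size, lengths[size - 1])
```

Since node 0 is the only node without outgoing edges in this setup, every maximal path ends there, which is why a single pass in order of arrival is enough.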

Notes

For further discussion, see [5][6] and [7][8]. Price derived these results analytically but, without access to computational resources, could not take the analysis further. Much subsequent work on preferential attachment and network growth has been enabled by later advances in computing.

References

  1. de Solla Price, D. J. (1965-07-30). "Networks of Scientific Papers". Science (American Association for the Advancement of Science (AAAS)) 149 (3683): 510–515. doi:10.1126/science.149.3683.510. ISSN 0036-8075. PMID 14325149. Bibcode: 1965Sci...149..510D.
  2. de Solla Price, Derek J. (1976), "A general theory of bibliometric and other cumulative advantage processes", J. Amer. Soc. Inform. Sci. 27 (5): 292–306, doi:10.1002/asi.4630270505 
  3. Simon, Herbert A. (1955). "On a class of skew distribution functions". Biometrika (Oxford University Press (OUP)) 42 (3–4): 425–440. doi:10.1093/biomet/42.3-4.425. ISSN 0006-3444. 
  4. Evans, T.S.; Calmon, L.; Vasiliauskaite, V. (2020), "The Longest Path in the Price Model", Scientific Reports 10 (1): 10503, doi:10.1038/s41598-020-67421-8, PMID 32601403, Bibcode: 2020NatSR..1010503E
  5. Dorogovtsev, S. N.; Mendes, J. F. F.; Samukhin, A. N. (2000-11-20). "Structure of Growing Networks with Preferential Linking". Physical Review Letters 85 (21): 4633–4636. doi:10.1103/physrevlett.85.4633. ISSN 0031-9007. PMID 11082614. Bibcode: 2000PhRvL..85.4633D.
  6. Krapivsky, P. L.; Redner, S. (2001-05-24). "Organization of growing random networks". Physical Review E (American Physical Society (APS)) 63 (6): 066123. doi:10.1103/physreve.63.066123. ISSN 1063-651X. PMID 11415189. Bibcode: 2001PhRvE..63f6123K.
  7. Dorogovtsev, S. N.; Mendes, J. F. F. (2002). "Evolution of networks". Advances in Physics 51 (4): 1079–1187. doi:10.1080/00018730110112519. ISSN 0001-8732. Bibcode: 2002AdPhy..51.1079D.
  8. Krapivsky, P. L. and Redner, S., Rate equation approach for growing networks, in R. Pastor-Satorras and J. Rubi (eds.), Proceedings of the XVIII Sitges Conference on Statistical Mechanics, Lecture Notes in Physics, Springer, Berlin (2003).
