Katz centrality

From HandWiki
Short description: Measure of centrality in a network based on nodal influence

In graph theory, the Katz centrality or alpha centrality of a node is a measure of centrality in a network. It was introduced by Leo Katz in 1953 and is used to measure the relative degree of influence of an actor (or node) within a social network.[1] Unlike typical centrality measures, which consider only the shortest path (the geodesic) between a pair of actors, Katz centrality measures influence by taking into account the total number of walks between a pair of actors.[2]

It is similar to Google's PageRank and to eigenvector centrality.[3]

Measurement

A simple social network: the nodes represent people or actors and the edges between nodes represent some relationship between actors

Katz centrality computes the relative influence of a node within a network by counting its immediate neighbors (first-degree nodes) as well as all other nodes in the network that connect to the node under consideration through these immediate neighbors. Connections made with distant neighbors are, however, penalized by an attenuation factor [math]\displaystyle{ \alpha }[/math].[4] Each path or connection between a pair of nodes is assigned a weight [math]\displaystyle{ \alpha^d }[/math], determined by [math]\displaystyle{ \alpha }[/math] and the distance [math]\displaystyle{ d }[/math] between the nodes.

For example, in the figure on the right, assume that John's centrality is being measured and that [math]\displaystyle{ \alpha = 0.5 }[/math]. The weight assigned to each link that connects John with his immediate neighbors Jane and Bob will be [math]\displaystyle{ (0.5)^1 = 0.5 }[/math]. Since Jose connects to John indirectly through Bob, the weight assigned to this connection (composed of two links) will be [math]\displaystyle{ (0.5)^2 = 0.25 }[/math]. Similarly, the weight assigned to the connection between Agneta and John through Aziz and Jane will be [math]\displaystyle{ (0.5)^3 = 0.125 }[/math] and the weight assigned to the connection between Agneta and John through Diego, Jose and Bob will be [math]\displaystyle{ (0.5)^4 = 0.0625 }[/math].
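The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch: the dictionary `hops_to_john` and its labels are illustrative names, and the link counts are taken from the description above rather than computed from the figure.

```python
# Attenuation factor used in the example in the text.
alpha = 0.5

# Number of links on each connection to John, as described in the text.
# The labels are illustrative strings, not part of any library.
hops_to_john = {
    "Jane": 1,
    "Bob": 1,
    "Jose via Bob": 2,
    "Agneta via Aziz and Jane": 3,
    "Agneta via Diego, Jose and Bob": 4,
}

# A connection made of d links receives the weight alpha ** d.
weights = {label: alpha ** d for label, d in hops_to_john.items()}

for label, weight in weights.items():
    print(f"{label}: {weight}")
```

Each additional link halves the weight, so long connections contribute little to John's score.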

Mathematical formulation

Let A be the adjacency matrix of a network under consideration. Elements [math]\displaystyle{ (a_{ij}) }[/math] of A take the value 1 if node i is connected to node j and 0 otherwise. The powers of A indicate the presence (or absence) of links between two nodes through intermediaries. For instance, in matrix [math]\displaystyle{ A^3 }[/math], if element [math]\displaystyle{ (A^3)_{2,12} = 1 }[/math], it indicates that node 2 and node 12 are connected through exactly one walk of length 3. If [math]\displaystyle{ C_{\mathrm{Katz}}(i) }[/math] denotes Katz centrality of a node i, then, given a value [math]\displaystyle{ \alpha\in(0,1) }[/math], mathematically:

[math]\displaystyle{ C_{\mathrm{Katz}}(i) = \sum_{k=1}^\infty \sum_{j=1}^n \alpha^k (A^k)_{ji} }[/math]
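The walk-counting property of matrix powers that this sum relies on can be checked on a small graph. The sketch below uses a hypothetical undirected graph with edges 0-1, 0-2, 1-2 and 2-3; the helper `matmul` is an illustrative name, written in plain Python to keep the example self-contained.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Adjacency matrix of the assumed example graph (edges 0-1, 0-2, 1-2, 2-3).
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

A2 = matmul(A, A)   # entry (i, j) counts walks of length 2 from i to j
A3 = matmul(A2, A)  # entry (i, j) counts walks of length 3 from i to j

print(A2[2][2])  # 3: node 2 can step to any of its 3 neighbours and back
print(A3[0][3])  # 1: the only walk of length 3 from 0 to 3 is 0-1-2-3
print(A3[2][3])  # 3: the walks 2-0-2-3, 2-1-2-3 and 2-3-2-3
```

Walks, unlike paths, may revisit nodes, which is why node 3 can be reached from node 2 in three steps in three different ways.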

Note that the above definition uses the fact that the element at location [math]\displaystyle{ (i,j) }[/math] of [math]\displaystyle{ A^k }[/math] gives the total number of walks of length [math]\displaystyle{ k }[/math] between nodes [math]\displaystyle{ i }[/math] and [math]\displaystyle{ j }[/math]. For the infinite sum to converge, the value of the attenuation factor [math]\displaystyle{ \alpha }[/math] has to be chosen such that it is smaller than the reciprocal of the absolute value of the largest eigenvalue of A.[5] In this case the following expression can be used to calculate Katz centrality:

[math]\displaystyle{ \overrightarrow{C}_{\mathrm{Katz}} = ((I - \alpha A^T)^{-1}-I)\overrightarrow{I} }[/math]

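Here [math]\displaystyle{ I }[/math] is the identity matrix and [math]\displaystyle{ \overrightarrow{I} }[/math] is a vector of size n (n is the number of nodes) consisting of ones. [math]\displaystyle{ A^T }[/math] denotes the transpose of A and [math]\displaystyle{ (I - \alpha A^T)^{-1} }[/math] denotes the inverse of the matrix [math]\displaystyle{ (I - \alpha A^T) }[/math].[5] The expression follows from summing the geometric series [math]\displaystyle{ \sum_{k=1}^\infty (\alpha A^T)^k = (I - \alpha A^T)^{-1} - I }[/math], which converges under the condition on [math]\displaystyle{ \alpha }[/math] stated above.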
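As a rough check that the closed form agrees with the series definition, the following sketch computes both on a small directed graph. The graph (edges 0→1, 0→2, 1→2, 2→0, 3→2), the value of alpha and the function names `solve`, `katz_series` and `katz_closed_form` are assumptions made for illustration; the linear solver is a plain Gauss-Jordan elimination written out so that no external library is needed.

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in M[r]] + [float(b[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] / aug[r][r] for r in range(n)]

def katz_series(A, alpha, terms=200):
    """Truncated sum over k >= 1 of alpha^k * sum_j (A^k)_{ji}."""
    n = len(A)
    total = [0.0] * n
    term = [1.0] * n  # (alpha A^T)^0 applied to the vector of ones
    for _ in range(terms):
        term = [alpha * sum(A[j][i] * term[j] for j in range(n)) for i in range(n)]
        total = [t + s for t, s in zip(total, term)]
    return total

def katz_closed_form(A, alpha):
    """((I - alpha A^T)^{-1} - I) applied to the vector of ones."""
    n = len(A)
    M = [[(1.0 if i == j else 0.0) - alpha * A[j][i] for j in range(n)]
         for i in range(n)]
    y = solve(M, [1.0] * n)
    return [v - 1.0 for v in y]

# A[i][j] = 1 if there is an edge from node i to node j (assumed example graph).
A = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]

# Every row sum is at most 2, so the spectral radius is at most 2
# and alpha = 0.4 < 1/2 satisfies the convergence condition.
alpha = 0.4
series_scores = katz_series(A, alpha)
closed_scores = katz_closed_form(A, alpha)
print(series_scores)
print(closed_scores)
```

Node 3 has no incoming edges and therefore scores zero, while node 2, which receives edges from all three other nodes, scores highest. For large networks a sparse linear solver would normally replace the dense elimination used here.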

An extension of this framework allows the walks to be computed in a dynamical setting.[6][7] Given a time-dependent series of adjacency snapshots that record the transient edges, walks that pass through successive snapshots contribute towards a cumulative effect. The arrow of time is preserved, so the contribution of activity is asymmetric in the direction of information propagation.

Consider a network producing data of the form:

[math]\displaystyle{ \left \{A^{[k]} \in \R^{N \times N} \right \} \qquad \text{for} \quad k=0,1,2,\ldots,M, }[/math]

representing the adjacency matrix at each time [math]\displaystyle{ t_k }[/math]. Hence:

[math]\displaystyle{ \left( A^{[k]} \right)_{ij} = \begin{cases} 1 & \text{there is an edge from node } i \text{ to node } j \text{ at time } t_k \\ 0 & \text{otherwise} \end{cases} }[/math]

The time points [math]\displaystyle{ t_0 \lt t_1 \lt \cdots \lt t_M }[/math] are ordered but not necessarily equally spaced. Let [math]\displaystyle{ \mathcal{Q} \in \R^{N \times N} }[/math] be the matrix whose entry [math]\displaystyle{ (\mathcal{Q})_{ij} }[/math] is a weighted count of the dynamic walks from node [math]\displaystyle{ i }[/math] to node [math]\displaystyle{ j }[/math], in which each walk of length [math]\displaystyle{ w }[/math] is weighted by [math]\displaystyle{ \alpha^w }[/math]. The form for the dynamic communicability between participating nodes is:

[math]\displaystyle{ \mathcal{Q} = \left(I-\alpha A^{[0]} \right)^{-1} \cdots \left( I - \alpha A^{[M]} \right)^{-1}. }[/math]

This can be normalized via:

[math]\displaystyle{ \hat{\mathcal{Q}}^{[k]} = \frac{\hat{\mathcal{Q}}^{[k-1]} \left(I-\alpha A^{[k]} \right)^{-1}}{\left \|\hat{\mathcal{Q}}^{[k-1]} \left( I - \alpha A^{[k]} \right)^{-1} \right \|}. }[/math]

The centrality measures that quantify how effectively node [math]\displaystyle{ n }[/math] can 'broadcast' and 'receive' dynamic messages across the network are therefore:

[math]\displaystyle{ C_n^{\mathrm{broadcast}} := \sum_{k=1}^{N} \mathcal{Q}_{nk} \quad \mathrm{and} \quad C_n^{\mathrm{receive}} := \sum_{k=1}^{N} \mathcal{Q}_{kn} }[/math].
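The following sketch illustrates the unnormalized dynamic communicability matrix and the two centralities on a toy example. The two snapshots (an edge 0→1 at the first time point and an edge 1→2 at the second), the value of alpha and the helper names `identity`, `matmul`, `inverse` and `dynamic_communicability` are all assumptions made for illustration, and the normalization step is omitted for brevity.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in M[r]] + identity(n)[r] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def dynamic_communicability(snapshots, alpha):
    """Product of the resolvents (I - alpha A^[k])^{-1}, taken in time order."""
    n = len(snapshots[0])
    Q = identity(n)
    for A in snapshots:
        resolvent = inverse([[(1.0 if i == j else 0.0) - alpha * A[i][j]
                              for j in range(n)] for i in range(n)])
        Q = matmul(Q, resolvent)
    return Q

# Assumed toy data: edge 0 -> 1 exists at t_0, edge 1 -> 2 exists at t_1.
A0 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
A1 = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
alpha = 0.5

Q = dynamic_communicability([A0, A1], alpha)
broadcast = [sum(Q[n][k] for k in range(3)) for n in range(3)]  # row sums
receive = [sum(Q[k][n] for k in range(3)) for n in range(3)]    # column sums

# Reversing the time order removes the walk 0 -> 1 -> 2.
Q_reversed = dynamic_communicability([A1, A0], alpha)
print(Q[0][2], Q_reversed[0][2])
```

In this example the entry for the pair (0, 2) equals alpha squared when the snapshots are taken in their true order and zero when they are reversed, which reflects the asymmetry in time: a message can travel from node 0 to node 2 only if the edge 0→1 is active before the edge 1→2.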

Alpha centrality

Given a graph with adjacency matrix [math]\displaystyle{ A_{i,j} }[/math], Katz centrality is defined as follows:

[math]\displaystyle{ \vec{x} = (I-\alpha A^T)^{-1}\vec{e} - \vec{e} \, }[/math]

where [math]\displaystyle{ e_j }[/math] is the external importance given to node [math]\displaystyle{ j }[/math], and [math]\displaystyle{ \alpha }[/math] is a nonnegative attenuation factor which must be smaller than the inverse of the spectral radius of [math]\displaystyle{ A }[/math]. The original definition by Katz[8] used a constant vector [math]\displaystyle{ \vec{e} }[/math]. Hubbell[9] introduced the usage of a general [math]\displaystyle{ \vec{e} }[/math].

Half a century later, Bonacich and Lloyd[10] defined alpha centrality as:

[math]\displaystyle{ \vec{x} = (I-\alpha A^T)^{-1}\vec{e} \, }[/math]

which is essentially identical to Katz centrality. More precisely, the score of a node [math]\displaystyle{ j }[/math] differs exactly by [math]\displaystyle{ e_j }[/math], so if [math]\displaystyle{ \vec{e} }[/math] is constant the order induced on the nodes is identical.
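The effect of the vector of external importance on the two rankings can be seen in a small sketch. The three-node graph with a single edge 0→1, the two choices of the vector e and the function names `alpha_centrality` and `ranking` are hypothetical; the linear system is solved by simple fixed-point iteration, which converges when alpha is below the inverse of the spectral radius.

```python
def alpha_centrality(A, e, alpha, iterations=200):
    """Approximate (I - alpha A^T)^{-1} e by iterating x <- e + alpha A^T x."""
    n = len(A)
    x = [float(v) for v in e]
    for _ in range(iterations):
        x = [e[i] + alpha * sum(A[j][i] * x[j] for j in range(n))
             for i in range(n)]
    return x

def ranking(scores):
    """Node indices ordered from highest to lowest score (ties keep index order)."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])

# Assumed example: three nodes and a single directed edge 0 -> 1.
A = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
alpha = 0.5

# Constant external importance: both measures induce the same order.
e_const = [1.0, 1.0, 1.0]
alpha_const = alpha_centrality(A, e_const, alpha)
katz_const = [x - ej for x, ej in zip(alpha_const, e_const)]

# Non-constant external importance: the orders can differ.
e_skew = [1.0, 1.0, 5.0]
alpha_skew = alpha_centrality(A, e_skew, alpha)
katz_skew = [x - ej for x, ej in zip(alpha_skew, e_skew)]

print(ranking(katz_const), ranking(alpha_const))
print(ranking(katz_skew), ranking(alpha_skew))
```

With the constant vector both scores place node 1 first. With the skewed vector, alpha centrality places node 2 first because of its large external importance, whereas the Katz score, which subtracts that importance, still places node 1 first.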

Applications

Katz centrality can be used to compute centrality in directed networks such as citation networks and the World Wide Web.[11]

Katz centrality is particularly suitable for the analysis of directed acyclic graphs, where traditionally used measures like eigenvector centrality are rendered useless.[11]
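One way to see the difference on an acyclic graph is the sketch below. It uses an assumed three-node directed acyclic graph (edges 0→1, 0→2, 1→2) and illustrative function names. Because the adjacency matrix of a directed acyclic graph is nilpotent, repeatedly applying it, as an unnormalized power iteration for eigenvector centrality would, drives every score to zero, whereas the Katz series still separates the nodes.

```python
def propagate(A, x):
    """One step x <- A^T x: each node collects the scores of its in-neighbours."""
    n = len(A)
    return [sum(A[j][i] * x[j] for j in range(n)) for i in range(n)]

def katz_series(A, alpha, terms=50):
    """Truncated Katz sum; on a DAG only finitely many terms are non-zero."""
    n = len(A)
    total = [0.0] * n
    term = [1.0] * n
    for _ in range(terms):
        term = [alpha * v for v in propagate(A, term)]
        total = [t + s for t, s in zip(total, term)]
    return total

# Assumed example DAG: A[i][j] = 1 if there is an edge from i to j.
A = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]

# Unnormalized power iteration collapses to the zero vector after 3 steps.
x = [1.0, 1.0, 1.0]
for _ in range(3):
    x = propagate(A, x)
print(x)

katz_scores = katz_series(A, alpha=0.5)
print(katz_scores)
```

The Katz scores are 0 for the source node 0, 0.5 for node 1 and 1.25 for node 2, so the measure still ranks the nodes even though the iteration for eigenvector centrality carries no information here.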

Katz centrality can also be used in estimating the relative status or influence of actors in a social network. A case study presented in [12] applies a dynamic version of Katz centrality to data from Twitter, focusing on particular brands which have stable discussion leaders. The application allows the methodology to be compared with the judgment of human experts in the field, and the results are in agreement with those of a panel of social media experts.

In neuroscience, Katz centrality has been found to correlate with the relative firing rate of neurons in a neural network.[13] The temporal extension of Katz centrality is applied in [14] to fMRI data obtained from a musical learning experiment, where data is collected from the subjects before and after the learning process. The results show that the changes to the network structure over the musical exposure yielded, in each session, a quantification of the cross-communicability that produced clusters in line with the success of learning.

A generalized form of Katz centrality can be used as an intuitive ranking system for sports teams, such as in college football.[15]

Alpha centrality is implemented in the igraph library for network analysis and visualization.[16]

References

  1. Katz, L. (1953). A New Status Index Derived from Sociometric Analysis. Psychometrika, 18 (1): 39–43.
  2. Hanneman, R. A., & Riddle, M. (2005). Introduction to Social Network Methods. Retrieved from http://faculty.ucr.edu/~hanneman/nettext/
  3. Vigna, S. (2016). "Spectral ranking". Network Science 4 (4): 433–445. doi:10.1017/nws.2016.21. 
  4. Aggarwal, C. C. (2011). Social Network Data Analysis. New York, NY: Springer.
  5. Junker, B. H., & Schreiber, F. (2008). Analysis of Biological Networks. Hoboken, NJ: John Wiley & Sons.
  6. Grindrod, Peter; Parsons, Mark C; Higham, Desmond J; Estrada, Ernesto (2011). "Communicability across evolving networks". Physical Review E (APS) 83 (4): 046120. doi:10.1103/PhysRevE.83.046120. PMID 21599253. Bibcode: 2011PhRvE..83d6120G. http://centaur.reading.ac.uk/19357/1/Coomunicability_accepted.pdf.
  7. Peter Grindrod; Desmond J. Higham. (2010). "Evolving graphs: Dynamical models, inverse problems and propagation". Proc. R. Soc. A 466 (2115): 753–770. doi:10.1098/rspa.2009.0456. Bibcode: 2010RSPSA.466..753G.
  8. Leo Katz (1953). "A new status index derived from sociometric analysis". Psychometrika 18 (1): 39–43. doi:10.1007/BF02289026. 
  9. Charles H. Hubbell (1965). "An input-output approach to clique identification". Sociometry 28 (4): 377–399. doi:10.2307/2785990. 
  10. P. Bonacich, P. Lloyd (2001). "Eigenvector-like measures of centrality for asymmetric relations". Social Networks 23 (3): 191–201. doi:10.1016/S0378-8733(01)00038-7. 
  11. Newman, M. E. (2010). Networks: An Introduction. New York, NY: Oxford University Press.
  12. Laflin, Peter; Mantzaris, Alexander V; Ainley, Fiona; Otley, Amanda; Grindrod, Peter; Higham, Desmond J (2013). "Discovering and validating influence in a dynamic online social network". Social Network Analysis and Mining (Springer) 3 (4): 1311–1323. doi:10.1007/s13278-013-0143-7. 
  13. Fletcher, Jack McKay; Wennekers, Thomas (2017). "From Structure to Activity: Using Centrality Measures to Predict Neuronal Activity". International Journal of Neural Systems 28 (2): 1750013. doi:10.1142/S0129065717500137. PMID 28076982. 
  14. Mantzaris, Alexander V.; Danielle S. Bassett; Nicholas F. Wymbs; Ernesto Estrada; Mason A. Porter; Peter J. Mucha; Scott T. Grafton; Desmond J. Higham (2013). "Dynamic network centrality summarizes learning in the human brain". Journal of Complex Networks 1 (1): 83–92. doi:10.1093/comnet/cnt001. 
  15. Park, Juyong; Newman, M. E. J. (31 October 2005). "A network-based ranking system for American college football". Journal of Statistical Mechanics: Theory and Experiment 2005 (10): P10014. doi:10.1088/1742-5468/2005/10/P10014. ISSN 1742-5468. 
  16. "Welcome to igraph's new home". http://igraph.sourceforge.net/doc/R/alpha.centrality.html.