Soft configuration model


In applied mathematics, the soft configuration model (SCM) is a random graph model subject to the principle of maximum entropy under constraints on the expectation of the degree sequence of sampled graphs.[1] Whereas the configuration model (CM) uniformly samples random graphs of a specific degree sequence, the SCM retains the specified degree sequence only on average over all network realizations; in this sense the SCM's constraints are relaxed relative to those of the CM ("soft" rather than "sharp" constraints[2]). The SCM for graphs of size [math]\displaystyle{ n }[/math] assigns nonzero probability to every graph of size [math]\displaystyle{ n }[/math], whereas the CM is restricted to graphs realizing precisely the prescribed degree sequence.

Model formulation

The SCM is a statistical ensemble of random graphs [math]\displaystyle{ G }[/math] having [math]\displaystyle{ n }[/math] vertices ([math]\displaystyle{ n=|V(G)| }[/math]) labeled [math]\displaystyle{ \{v_j\}_{j=1}^n=V(G) }[/math], producing a probability distribution on [math]\displaystyle{ \mathcal{G}_n }[/math] (the set of graphs of size [math]\displaystyle{ n }[/math]). The ensemble is subject to [math]\displaystyle{ n }[/math] constraints: the ensemble average of the degree [math]\displaystyle{ k_j }[/math] of vertex [math]\displaystyle{ v_j }[/math] equals a designated value [math]\displaystyle{ \widehat{k}_j }[/math], for all [math]\displaystyle{ v_j\in V(G) }[/math]. The model is fully parameterized by its size [math]\displaystyle{ n }[/math] and expected degree sequence [math]\displaystyle{ \{\widehat{k}_j\}_{j=1}^n }[/math]. These constraints are both local (one constraint associated with each vertex) and soft (constraints on the ensemble average of certain observable quantities), and thus yield a canonical ensemble with an extensive number of constraints.[2] The conditions [math]\displaystyle{ \langle k_j \rangle = \widehat{k}_j }[/math] are imposed on the ensemble by the method of Lagrange multipliers (see Maximum-entropy random graph model).

Derivation of the probability distribution

The probability [math]\displaystyle{ \mathbb{P}_\text{SCM}(G) }[/math] of the SCM producing a graph [math]\displaystyle{ G }[/math] is determined by maximizing the Gibbs entropy [math]\displaystyle{ S[G] }[/math] subject to constraints [math]\displaystyle{ \langle k_j \rangle = \widehat{k}_j, \ j=1,\ldots,n }[/math] and normalization [math]\displaystyle{ \sum_{G\in \mathcal{G}_n}\mathbb{P}_\text{SCM}(G)=1 }[/math]. This amounts to optimizing the multi-constraint Lagrange function below:

[math]\displaystyle{ \begin{align} & \mathcal{L}\left(\alpha,\{\psi_j\}_{j=1}^n\right) \\[6pt] = {} & -\sum_{G\in\mathcal{G}_n}\mathbb{P}_\text{SCM}(G)\log\mathbb{P}_\text{SCM}(G) + \alpha\left(1-\sum_{G\in \mathcal{G}_n}\mathbb{P}_\text{SCM}(G) \right)+\sum_{j=1}^n\psi_j\left(\widehat{k}_j-\sum_{G\in\mathcal{G}_n}\mathbb{P}_\text{SCM}(G)k_j(G)\right), \end{align} }[/math]

where [math]\displaystyle{ \alpha }[/math] and [math]\displaystyle{ \{\psi_j\}_{j=1}^n }[/math] are the [math]\displaystyle{ n+1 }[/math] multipliers to be fixed by the [math]\displaystyle{ n+1 }[/math] constraints (normalization and the expected degree sequence). Setting to zero the derivative of the above with respect to [math]\displaystyle{ \mathbb{P}_\text{SCM}(G) }[/math] for an arbitrary [math]\displaystyle{ G\in \mathcal{G}_n }[/math] yields

[math]\displaystyle{ 0 = \frac{\partial \mathcal{L}\left(\alpha,\{\psi_j\}_{j=1}^n\right)}{\partial \mathbb{P}_\text{SCM}(G)}= -\log \mathbb{P}_\text{SCM}(G) -1-\alpha-\sum_{j=1}^n\psi_j k_j(G) \ \Rightarrow \ \mathbb{P}_\text{SCM}(G)=\frac{1}{Z}\exp\left[-\sum_{j=1}^n\psi_jk_j(G)\right], }[/math]

the constant [math]\displaystyle{ Z:=e^{\alpha+1}=\sum_{G\in\mathcal{G}_n}\exp\left[-\sum_{j=1}^n\psi_jk_j(G)\right]=\prod_{1\le i \lt j \le n}\left(1+e^{-(\psi_i+\psi_j)}\right) }[/math][3] being the partition function normalizing the distribution; the above exponential expression applies to all [math]\displaystyle{ G\in\mathcal{G}_n }[/math], and thus is the probability distribution. Hence we have an exponential family parameterized by [math]\displaystyle{ \{\psi_j\}_{j=1}^n }[/math], which are related to the expected degree sequence [math]\displaystyle{ \{\widehat{k}_j\}_{j=1}^n }[/math] by the following equivalent expressions:
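The factorization of [math]\displaystyle{ Z }[/math] over vertex pairs reflects that edges in the SCM are mutually independent, with the pair [math]\displaystyle{ (i,j) }[/math] connected with probability [math]\displaystyle{ e^{-(\psi_i+\psi_j)}/\left(1+e^{-(\psi_i+\psi_j)}\right) }[/math]. The product formula can be checked by brute force for small [math]\displaystyle{ n }[/math]; the sketch below uses illustrative multiplier values (not drawn from the source):

```python
import numpy as np
from itertools import combinations, product

# Illustrative Lagrange multipliers for a small example (n = 3).
psi = np.array([0.2, 0.7, 1.1])
n = len(psi)
pairs = list(combinations(range(n), 2))

# Brute-force partition function: sum exp[-sum_j psi_j k_j(G)]
# over all 2^(n(n-1)/2) labeled graphs G on n vertices.
Z_sum = 0.0
for edges in product([0, 1], repeat=len(pairs)):
    k = np.zeros(n)                      # degree sequence of this graph
    for present, (i, j) in zip(edges, pairs):
        if present:
            k[i] += 1
            k[j] += 1
    Z_sum += np.exp(-np.dot(psi, k))

# Closed form: product over vertex pairs, 1 + e^{-(psi_i + psi_j)}.
Z_prod = np.prod([1 + np.exp(-(psi[i] + psi[j])) for i, j in pairs])
```

The two computations agree because each edge contributes an independent factor [math]\displaystyle{ e^{-(\psi_i+\psi_j)} }[/math] when present and [math]\displaystyle{ 1 }[/math] when absent.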

[math]\displaystyle{ \langle k_q \rangle = \sum_{G\in \mathcal{G}_n}k_q(G)\mathbb{P}_\text{SCM}(G) = -\frac{\partial \log Z}{\partial \psi_q} =\sum_{j\ne q}\frac{1}{e^{\psi_q+\psi_j}+1} = \widehat{k}_q, \ q=1,\ldots,n. }[/math]
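In practice the multipliers [math]\displaystyle{ \{\psi_j\}_{j=1}^n }[/math] are obtained by solving this system numerically for a given target sequence [math]\displaystyle{ \{\widehat{k}_j\}_{j=1}^n }[/math]. A minimal sketch using SciPy's general-purpose root finder; the degree sequence and initial guess below are illustrative assumptions, not values from the source:

```python
import numpy as np
from scipy.optimize import fsolve

def expected_degrees(psi):
    """Ensemble averages <k_q> = sum_{j != q} 1 / (e^{psi_q + psi_j} + 1)."""
    psi = np.asarray(psi, dtype=float)
    p = 1.0 / (np.exp(psi[:, None] + psi[None, :]) + 1.0)
    np.fill_diagonal(p, 0.0)   # exclude j = q (no self-loops)
    return p.sum(axis=1)

def solve_multipliers(k_hat):
    """Solve <k_q> = k_hat_q for the Lagrange multipliers psi."""
    k_hat = np.asarray(k_hat, dtype=float)
    psi0 = np.zeros_like(k_hat)          # initial guess: all multipliers zero
    return fsolve(lambda psi: expected_degrees(psi) - k_hat, psi0)

# Hypothetical target expected degree sequence:
k_hat = [1.0, 1.5, 2.0, 2.5]
psi = solve_multipliers(k_hat)
```

Once the multipliers are fixed, graphs are sampled by drawing each edge [math]\displaystyle{ (i,j) }[/math] independently with probability [math]\displaystyle{ 1/\left(e^{\psi_i+\psi_j}+1\right) }[/math].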

References

  1. van der Hoorn, Pim; Lippner, Gabor; Krioukov, Dmitri (2017-10-10). "Sparse Maximum-Entropy Random Graphs with a Given Power-Law Degree Distribution".
  2. Garlaschelli, Diego; den Hollander, Frank; Roccaverde, Andrea (2018-01-30). "Covariance structure behind breaking of ensemble equivalence in random graphs". http://eprints.imtlucca.it/4040/1/1711.04273.pdf.
  3. Park, Juyong; Newman, M.E.J. (2004-05-25). "The statistical mechanics of networks".