[2021-01-13] [CS224W] Community Structure in Networks

1 minute read

Published:

Communities

Triadic closure = high clustering coefficient Edge overleap: \(O_{i,j} = \frac{N(i) \cap N(j)\\{i,j}}{N(i) \cup N(j)\\{i,j}}\) where N(i) is the set of neighbors of node i

Network communities: sets of tightly connected nodes

Modularity Q: A measure of how well a network is partitioned into communities \(Q \propto \sum_{s \in S} [(# edges within group s) - \underset{need a null model}{(expected # edges within group s)} ]\)

Null Model - Configuration Model Given real $G$ on $n$ nodes and $m$ edges, construct rewired network $G\prime$ The expected number of edges between nodes $i$ and $j$ of degrees $k_i$ and $k_j$ is $k_ik_j/(2m)$

\(Q(G,S) = 1/2m \sum_{s \in S}\sum_{i \in s} \sum_{j \in s} (A_{ij} - k_ik_j/(2m))\) $A_{ij} = 1$ if i,j has edges

Louvain Algorithm - Greedy algorithm for community detection

O($n\log n$) run time Each pass is made of 2 phases: Phase 1: Modularity is optimized by allowing only local changes to node-communities memberships Phase 2: The identified communities are aggregated into super-nodes to build a new network

BigCLAM - Detecting Overlapping Communities

Step 1) Define a generative model for graphs that is based on node community affiliations Community Affiliation Graph Model (AGM) Step 2) Given graph $G$, make the assumption that $G$ was generated by AGM Find the best AGM that could have generated $G$, maximize graph likelihood $P(G|F)$