[2021-01-13] [CS224W] Community Structure in Networks
Published:
Communities
Triadic closure = high clustering coefficient Edge overleap: \(O_{i,j} = \frac{N(i) \cap N(j)\\{i,j}}{N(i) \cup N(j)\\{i,j}}\) where N(i) is the set of neighbors of node i
Network communities: sets of tightly connected nodes
Modularity Q: A measure of how well a network is partitioned into communities \(Q \propto \sum_{s \in S} [(# edges within group s) - \underset{need a null model}{(expected # edges within group s)} ]\)
Null Model - Configuration Model Given real $G$ on $n$ nodes and $m$ edges, construct rewired network $G\prime$ The expected number of edges between nodes $i$ and $j$ of degrees $k_i$ and $k_j$ is $k_ik_j/(2m)$
\(Q(G,S) = 1/2m \sum_{s \in S}\sum_{i \in s} \sum_{j \in s} (A_{ij} - k_ik_j/(2m))\) $A_{ij} = 1$ if i,j has edges
Louvain Algorithm - Greedy algorithm for community detection
O($n\log n$) run time Each pass is made of 2 phases: Phase 1: Modularity is optimized by allowing only local changes to node-communities memberships Phase 2: The identified communities are aggregated into super-nodes to build a new network
BigCLAM - Detecting Overlapping Communities
Step 1) Define a generative model for graphs that is based on node community affiliations Community Affiliation Graph Model (AGM) Step 2) Given graph $G$, make the assumption that $G$ was generated by AGM Find the best AGM that could have generated $G$, maximize graph likelihood $P(G|F)$