[2021-01-11] [CS224W] Properties of Networks and Random Graph Models
Published:
I would like to learn GNN to see if there is any opportunity to optimize/accelerate it. But the surveying paper for GNN has too many expressions, notations which are hard to understand. So I will watch the CS224W Machine Learning with Graphs lecture to learn it in a smooth way. I will write down some key points I learned.
Key Netowrk Properties
Degree distribution P(k): Probability that a randomly chosen node has degree k
Path length h
- Distance (shortest path, geodesic) between a pair of nodes is defined as the number of edges along the shortest path connecting the nodes
- Network Diameter: The maximum (shortest path) distance between any pair of nodes in a graph
- Average path length for a connected graph or a strongly connected directed graph $\bar{h} = 1 / 2 E_{max} \sum_{i,j \neq i} h_ij$, where $h_{i,j}$ is the distance from node $i$ to node $j$, and $E_{max}$ the max number of edges (total number of node pairs) = $n(n-1)/2$
Clustering coefficient C: how connected are i’s neighbors to each other
- $C_i = 2e_i/(k_i(k_i-1)) \in [0,1]$, where $e_i$ is the number of edges between the neighbors of node $i$.
- Average clusing coefficient $C = 1/N \sum_i^N C_i$
Connected components s:
- Size of the largest connected component is Largest set where any two vertices can be joined the by a path
- Largest component = Giant component
- Found by BFS
Erdös-Renyi Random Graphs
It is the simplest model of graphs. $Gnp$: undirected graph on n nodes where each edge (u,v) appears i.i.d. with probability p
- Binomial degree distribution
- very small clustering coefficient
- path length and largest connected component is similar to the real network
Small World Model
Goal: high clustering but also short length Two components to the model:
- Start with a low-dimensional regular lattice - Has high clustering coefficient
- Rewire: Introduce randomness (“shortcuts”) - reduce diameter Drawback: Does not lead to the correct degree distribution
Kronecker Graph Model
Idea : recursive graph generation to capture the self-similarity of network. Kronecker graph is obtained by growing sequence of graphs by iterating the Kronecker product over the initiator matrix $K_1$: \(K_1^{[m]} = K_m = \underset{m \; times}{K1 \otimes K1 \otimes \cdots K_1} = K_{m-1} \otimes K_1\) The Kronecker graph and real network looks very close.