[2021-01-11] [CS224W] Properties of Networks and Random Graph Models

2 minute read

Published:

I would like to learn GNN to see if there is any opportunity to optimize/accelerate it. But the surveying paper for GNN has too many expressions, notations which are hard to understand. So I will watch the CS224W Machine Learning with Graphs lecture to learn it in a smooth way. I will write down some key points I learned.

Key Netowrk Properties

Degree distribution P(k): Probability that a randomly chosen node has degree k

Path length h

  • Distance (shortest path, geodesic) between a pair of nodes is defined as the number of edges along the shortest path connecting the nodes
  • Network Diameter: The maximum (shortest path) distance between any pair of nodes in a graph
  • Average path length for a connected graph or a strongly connected directed graph $\bar{h} = 1 / 2 E_{max} \sum_{i,j \neq i} h_ij$, where $h_{i,j}$ is the distance from node $i$ to node $j$, and $E_{max}$ the max number of edges (total number of node pairs) = $n(n-1)/2$

Clustering coefficient C: how connected are i’s neighbors to each other

  • $C_i = 2e_i/(k_i(k_i-1)) \in [0,1]$, where $e_i$ is the number of edges between the neighbors of node $i$.
  • Average clusing coefficient $C = 1/N \sum_i^N C_i$

Connected components s:

  • Size of the largest connected component is Largest set where any two vertices can be joined the by a path
  • Largest component = Giant component
  • Found by BFS

Erdös-Renyi Random Graphs

It is the simplest model of graphs. $Gnp$: undirected graph on n nodes where each edge (u,v) appears i.i.d. with probability p

  • Binomial degree distribution
  • very small clustering coefficient
  • path length and largest connected component is similar to the real network

Small World Model

Goal: high clustering but also short length Two components to the model:

  • Start with a low-dimensional regular lattice - Has high clustering coefficient
  • Rewire: Introduce randomness (“shortcuts”) - reduce diameter Drawback: Does not lead to the correct degree distribution

Kronecker Graph Model

Idea : recursive graph generation to capture the self-similarity of network. Kronecker graph is obtained by growing sequence of graphs by iterating the Kronecker product over the initiator matrix $K_1$: \(K_1^{[m]} = K_m = \underset{m \; times}{K1 \otimes K1 \otimes \cdots K_1} = K_{m-1} \otimes K_1\) The Kronecker graph and real network looks very close.