[2021-01-15] [CS224W] Spectral Clustering

1 minute read


Graph Partitioning

Graph cut: Set of edges with one endpoint in each group \(cut(A,B) = \sum_{i \in A, j \in B} w_{ij}\) where $w_{ij}$ is the weighted edges between i and j

Graph Cut Criterion:

  • Minumin cut
    • problems: only consider external cluster connections
  • Conductance
    • $\phi(A,B) = \frac{cut(A,B)}{min(vol(A), vol(B))}$, where $vol(A)$ is the total weighted degreee of nodes in A
    • Produces more balanced partitions
    • problem: Computing the best cut is NP-hard

Adjacency matrix(A) Degree Matrix(D) Laplacian matrix(L): $L = D - A$

We would like to find the 2nd smallest eigenvalues and eigenvectors of $L$ \(\lambda_2 = min_{x: x^Tw_1 = 0} \frac{x^TMx}{x^Tx} = min_{\sumx_i=0} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_i x_i^2}\)

Spectral Clustering Algorithm

1) Pre-processing Construct a matrix representation of the graph 2) Decomposition Compute eigenvalues and eigenvectors of the matrix (only care the 2nd smallest eigenvalues) Map each point to a lower-dimensional representation based on one or more eigenvectors 3) Grouping Assign points to two or more clusters, based on the new representation

Motif-based spectral clustering

motifs cut $vol_M(S)$ = #(motif end-points in S) \(\phi(S) = \frac{#(motifs cut)}{vol_M(S)}\)

Three steps 1) Pre-processing $W_{i,j}^(M)$= # times edge (i,j) participates in the motif $M$ 2) Decomposition (standard sprctral clustering) set $L^(M) = D^(M) - W^(M)$, get 2nd eigenvalues and eigenvectors 3) Grouping Sort nodes by their values in $x$: x1, x2, …xn. Let Sr = {x1, …, xr} and compute the motif conductance of each $S_r$