Relaxation of Ideal Clustering

The optimization problem for given cluster sizes $n_1, n_2$ is

$$C = \min_{s} \; s^\top L s \quad \text{such that} \quad \sum_{k} s_k = n_1 - n_2, \quad s \in \{-1, +1\}^n.$$
Derivation:

$$
\begin{aligned}
x^\top L x &= \sum_{i,j=1}^{n} L_{ij} x_i x_j = \sum_{i,j=1}^{n} (D_{ij} - A_{ij})\, x_i x_j \\
&= \sum_{i=1}^{n} D_{ii} x_i^2 - \sum_{i,j=1}^{n} A_{ij} x_i x_j \\
&= \frac{1}{2} \sum_{i,j=1}^{n} A_{ij} \left( x_i^2 + x_j^2 - 2 x_i x_j \right) \\
&= \frac{1}{2} \sum_{i,j=1}^{n} A_{ij} \left( x_i - x_j \right)^2,
\end{aligned}
$$

where we use the fact that $D_{ii} = \sum_{j=1}^{n} A_{ij}$ and $D_{ij} = 0$ for $i \neq j$.
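The identity above can be checked numerically. The sketch below builds the Laplacian $L = D - A$ for an arbitrary small random graph (the graph is an assumption for illustration, not one from the text) and compares both sides of the quadratic-form identity:

```python
# Numerically check the identity x^T L x = (1/2) * sum_ij A_ij (x_i - x_j)^2
# on a small random undirected graph (an arbitrary example).
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.integers(0, 2, size=(n, n))
A = np.triu(A, 1)
A = A + A.T                       # symmetric adjacency matrix, zero diagonal
D = np.diag(A.sum(axis=1))        # degree matrix, D_ii = sum_j A_ij
L = D - A                         # graph Laplacian

x = rng.normal(size=n)
lhs = x @ L @ x
rhs = 0.5 * sum(A[i, j] * (x[i] - x[j]) ** 2
                for i in range(n) for j in range(n))
print(np.isclose(lhs, rhs))       # True: both sides agree
```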
Minimizing the above leads to a solution $x$ that can be interpreted as follows. In the original ideal clustering setup, the multiplier of $A_{ij}$ equals 0 when nodes $i, j$ belong to the same cluster. In the same way, we can treat any two nodes $i, j$ whose values $x_i, x_j$ are close as being in the same cluster, and if $x_i, x_j$ are far apart we can assign them to different clusters.

Beyond this intuitive understanding, the relaxed problem has important properties. First, the graph Laplacian $L$ is a symmetric matrix. Then, using the above expansion, we can clearly see that if $A_{ij} \ge 0$ for all $i, j$ (the likely scenario in most applications), $L$ is a positive semidefinite matrix. For a positive semidefinite matrix, the eigenvalues are nonnegative. In particular, for the Laplacian the smallest eigenvalue equals 0. This can be seen from the fact that $L\mathbf{1} = 0$, where $\mathbf{1}$ is the vector of ones.

Self-exercise: It can also be shown that the multiplicity of the zero eigenvalue equals the number of connected components of the simple, undirected graph.

Getting back to the optimization problem

$$\hat{C} = \min_{x \in \mathbb{R}^n} x^\top L x, \quad \|x\| = 1,$$

we now know that the optimal value of this problem equals 0, since there is an eigenvector (which can be normalized) with eigenvalue 0. This solution is not satisfying for our clustering, since the eigenvector(s) corresponding to the zero eigenvalue(s) essentially pick out the connected components of the graph. In particular, if there is only one component in the graph, then (the normalized) $\mathbf{1}$ attains the optimal cost $\hat{C} = 0$, and this vector provides no information as to how to pick out different clusters in the graph.

Let $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ be the $n$ eigenvalues of $L$, and let $\lambda$ be the second smallest eigenvalue, i.e., the smallest nonzero eigenvalue of $L$. If there is only one connected component in the graph, this value is also equal to $\lambda_2$.
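The two Laplacian facts used above ($L\mathbf{1} = 0$, and the zero-eigenvalue multiplicity counting connected components) can be illustrated directly. The example graph below, two disjoint triangles, is an assumption chosen to make the component count visible:

```python
# Illustrate: L @ 1 = 0, eigenvalues are nonnegative, and the multiplicity
# of the zero eigenvalue equals the number of connected components.
# Example graph (assumed for illustration): two disjoint triangles.
import numpy as np

A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

ones = np.ones(6)
print(np.allclose(L @ ones, 0))   # True: 1 is an eigenvector with eigenvalue 0

eigvals = np.linalg.eigvalsh(L)   # all nonnegative, since L is PSD
zero_mult = int(np.sum(np.isclose(eigvals, 0)))
print(zero_mult)                  # 2, the number of connected components
```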
The eigenvector corresponding to this eigenvalue, denoted $v^*$, turns out to be a good heuristic vector for clustering the graph, according to the following rule: nodes whose corresponding entries in $v^*$ are close to each other can be assigned to the same cluster. We can use this procedure to obtain any number of clusters that we wish in the graph. With $x = v^*$, the cost of the relaxed clustering problem becomes $v^{*\top} L v^* = \lambda_2$.
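A minimal sketch of this rule, grouping nodes by the sign of their entry in $v^*$ (the simplest notion of "close"). The barbell graph here is an illustrative choice, not a graph from the text:

```python
# Sketch of spectral bi-clustering: compute the eigenvector of the second
# smallest Laplacian eigenvalue (the Fiedler vector) and split nodes by sign.
# The barbell graph (two 4-cliques joined by one edge) is an assumed example.
import networkx as nx
import numpy as np

G = nx.barbell_graph(4, 0)        # nodes 0-3 and 4-7 form the two cliques
v = np.asarray(nx.fiedler_vector(G))
v = v / np.linalg.norm(v)         # normalize so that sum_i v_i^2 = 1

cluster_a = sorted(i for i in G.nodes if v[i] < 0)
cluster_b = sorted(i for i in G.nodes if v[i] >= 0)
print(cluster_a, cluster_b)       # the two cliques, one per cluster
```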
Spectral Clustering - I

8 points possible (graded)

Consider the kite graph shown in the figure (Figure: A Kite Graph). Compute the eigenvector corresponding to the second smallest eigenvalue. You may use any computational tool at your disposal; we recommend `networkx.linalg.algebraicconnectivity.fiedler_vector`. Ensure your eigenvector is normalized so that $\sum_i v_i^2 = 1$. Round your values to three significant figures.
Node 1:
Node 2:
Node 3:
Node 4:
Node 5:
Node 6:
Node 7:
Node 8:
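A recipe for this exercise, using the recommended networkx function. The edge list below is a placeholder, since the kite-graph figure is not reproduced here; replace it with the edges read off the figure:

```python
# Recipe for the exercise above. The edge list is a PLACEHOLDER (a 4-cycle),
# NOT the kite graph from the figure -- substitute the actual edges.
import networkx as nx
import numpy as np

edges = [(1, 2), (2, 3), (3, 4), (4, 1)]   # placeholder edges
G = nx.Graph(edges)

v = np.asarray(nx.fiedler_vector(G))       # second-smallest-eigenvalue eigenvector
v = v / np.linalg.norm(v)                  # ensure sum_i v_i^2 = 1
print(np.round(v, 3))                      # round before entering answers
```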
Spectral Clustering - II

2 points possible (graded)

Answer the following questions based on the eigenvector computed above:

1. If we clustered the kite graph into two clusters of 4 nodes each, which of the following would be in the cluster with node 1?

[ ] 2  [ ] 3  [ ] 4  [ ] 5  [ ] 6  [ ] 7  [ ] 8
2. If we instead had only 3 nodes in the cluster that contained node 1, which of the following nodes would make it into that cluster?

[ ] 2  [ ] 3  [ ] 4  [ ] 5  [ ] 6  [ ] 7  [ ] 8