In RStudio, you will generate simulated data and then perform PCA and k-means clustering on the data. First run the fol



Post by answerhappygod »

In RStudio, you will generate simulated data and then perform
PCA and k-means clustering on the data. First run the following to
obtain the data.
library(mvtnorm)
n <- 20
p <- 10
x <- rmvnorm(n*3, rep(0, p))
# shift means
x[seq_len(n), ] <- x[seq_len(n), ] +
  matrix(rep(runif(p, min = 1, max = 3), n), nrow = n, byrow = TRUE)
x[seq_len(n) + 2*n, ] <- x[seq_len(n) + 2*n, ] +
  matrix(rep(runif(p, min = -3, max = -1), n), nrow = n, byrow = TRUE)
# add class labels
y <- c(rep("-1", n), rep("0", n), rep("1", n))
a) Perform PCA on the 60 observations and plot the first
two principal component score vectors. Use a different color to
indicate the observations in each of the true classes (`y`).
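One possible way to approach part a) is sketched below; it is not an official solution. Several things here are assumptions that are not in the question: the `set.seed(1)` call (added only so the run is repeatable), the use of base R `rnorm()` as a stand-in for `rmvnorm(n*3, rep(0, p))` (the two should be equivalent in distribution because the default covariance is the identity, and it avoids needing the `mvtnorm` package), the choice not to scale the variables in `prcomp()`, and the colors in `cols`.

```r
set.seed(1)  # assumption: seed added only for reproducibility
n <- 20
p <- 10
# Stand-in for rmvnorm(n*3, rep(0, p)): independent standard normal entries
x <- matrix(rnorm(n * 3 * p), nrow = n * 3, ncol = p)
# shift means of the first and last group of n rows, as in the question
x[seq_len(n), ] <- x[seq_len(n), ] +
  matrix(rep(runif(p, min = 1, max = 3), n), nrow = n, byrow = TRUE)
x[seq_len(n) + 2 * n, ] <- x[seq_len(n) + 2 * n, ] +
  matrix(rep(runif(p, min = -3, max = -1), n), nrow = n, byrow = TRUE)
y <- c(rep("-1", n), rep("0", n), rep("1", n))

# PCA on the 60 x 10 matrix; variables are centered but not rescaled
pca <- prcomp(x, center = TRUE, scale. = FALSE)
scores <- pca$x[, 1:2]  # first two principal component score vectors

# One color per true class; the named vector is indexed by the labels in y
cols <- c("-1" = "red", "0" = "darkgreen", "1" = "blue")
plot(scores, col = cols[y], pch = 19,
     xlab = "PC1", ylab = "PC2",
     main = "First two PC scores by true class")
legend("topright", legend = names(cols), col = cols, pch = 19,
       title = "True class")
```

With your own data from the question's code, only the lines from `prcomp()` onward are needed.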
b) Perform K-means clustering of the observations with K =
3. How well do the clusters you obtained in K-means clustering
compare to the true class labels? (**Hint:** `table()` may be
useful here.)
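Part b) could be sketched along the following lines, again as an illustration rather than an official answer. The seed, the base R `rnorm()` stand-in for `rmvnorm()`, the `nstart = 20` setting, and the `purity` summary are all assumptions added here. Note that k-means cluster numbers are arbitrary, so cluster 1 need not correspond to class "-1"; the cross-tabulation is read by looking for one dominant class in each row.

```r
set.seed(1)  # assumption: seed added only for reproducibility
n <- 20
p <- 10
# Stand-in for rmvnorm(n*3, rep(0, p)): independent standard normal entries
x <- matrix(rnorm(n * 3 * p), nrow = n * 3, ncol = p)
x[seq_len(n), ] <- x[seq_len(n), ] +
  matrix(rep(runif(p, min = 1, max = 3), n), nrow = n, byrow = TRUE)
x[seq_len(n) + 2 * n, ] <- x[seq_len(n) + 2 * n, ] +
  matrix(rep(runif(p, min = -3, max = -1), n), nrow = n, byrow = TRUE)
y <- c(rep("-1", n), rep("0", n), rep("1", n))

# K-means with K = 3; nstart = 20 tries 20 random starts and keeps the best
km <- kmeans(x, centers = 3, nstart = 20)

# Cross-tabulate cluster assignments against the true class labels
tab <- table(cluster = km$cluster, truth = y)
print(tab)

# Hypothetical summary: share of observations in their cluster's majority class
purity <- sum(apply(tab, 1, max)) / sum(tab)
print(purity)
```

If each row of the table has nearly all of its counts in a single column, the clusters recover the true classes well; counts spread across columns indicate disagreement.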