Problem 2. (25 points) In this assignment, you will explore PCA as a technique for discerning whether low-dimensional
-
answerhappygod
Problem 2. (25 points) In this assignment, you will explore PCA as a technique for discerning whether low-dimensional structure exists in a set of data and for finding good representations of the data in that subspace. To this end, you will do PCA on the Iris dataset, which can be loaded in scikit-learn using the following commands:

    from sklearn.datasets import load_iris
    iris = load_iris()
    X = iris.data
    y = iris.target

a) Carry out a principal component analysis on the entire raw dataset (4-dimensional instances) for k = 1, 2, 3, 4 components. How much of the variance in the data can be explained by the first principal component? How does the fraction of variance explained vary as k varies?

b) Apply the standardization operations from Problem 1 to the raw features and repeat part (a) on the processed data. Explain any differences you observe compared to part (a) and justify them.

c) Project the raw four-dimensional data down to the two-dimensional subspace spanned by the top two principal components (PCs) from part (b) and produce a scatter plot of the data. Make sure you plot each of the three classes differently (using color or different markers). Can you see the three Iris flower clusters?

d) Use either your k-means++ implementation from the previous homework or the one from scikit-learn to cluster the data from part (c) into three clusters. Explain your observations.
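One possible way to set up parts (a) through (d) is sketched below. It is a starting point, not a reference solution, and it rests on several assumptions that are not in the problem statement: the "standardization operations from Problem 1" are taken to be z-scoring (scikit-learn's `StandardScaler`), the projection in part (c) is applied to the standardized data, scikit-learn's `KMeans` with `init="k-means++"` stands in for a homework implementation, and the helper names `cumulative_explained_variance` and `plot_projection` are made up for illustration. The adjusted Rand index is one optional way to compare clusters with the true labels; the problem does not ask for it.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X = iris.data    # shape (150, 4)
y = iris.target  # class labels 0, 1, 2


def cumulative_explained_variance(data):
    """Cumulative fraction of variance explained by the top k PCs, k = 1..n_features."""
    pca = PCA(n_components=data.shape[1]).fit(data)
    return np.cumsum(pca.explained_variance_ratio_)


# Part (a): PCA on the raw features.
raw_cumulative = cumulative_explained_variance(X)

# Part (b): ASSUMPTION: "standardization" means zero mean, unit variance per feature.
X_std = StandardScaler().fit_transform(X)
std_cumulative = cumulative_explained_variance(X_std)

for k in range(1, X.shape[1] + 1):
    print(f"k={k}: raw={raw_cumulative[k - 1]:.4f}  standardized={std_cumulative[k - 1]:.4f}")

# Part (c): ASSUMPTION: project the standardized data onto its top two PCs.
Z = PCA(n_components=2).fit_transform(X_std)  # shape (150, 2)

# Part (d): scikit-learn k-means with k-means++ seeding on the 2-D projection.
kmeans = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0)
clusters = kmeans.fit_predict(Z)

# Optional check: agreement between clusters and true species labels.
ari = adjusted_rand_score(y, clusters)
print(f"Adjusted Rand index (clusters vs. true labels): {ari:.3f}")


def plot_projection(points, labels, names):
    """Scatter plot of 2-D points, one color per label (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported lazily; requires matplotlib

    fig, ax = plt.subplots()
    for index, name in enumerate(names):
        mask = labels == index
        ax.scatter(points[mask, 0], points[mask, 1], label=str(name))
    ax.set_xlabel("PC 1")
    ax.set_ylabel("PC 2")
    ax.legend()
    return fig
```

To draw the scatter plot for part (c), call `plot_projection(Z, y, iris.target_names)` and then `matplotlib.pyplot.show()`. Calling it with `clusters` in place of `y` (and, say, `["cluster 0", "cluster 1", "cluster 2"]` as names) gives a side-by-side view for part (d). Comparing the two printed columns should make the effect of feature scaling on the first component's share of variance visible, which is the point of part (b).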