In this question, we will simulate new data and explore linear and quadratic discriminant analysis (LDA and QDA) techniq

Post by **answerhappygod** » Fri Apr 29, 2022 10:41 am

In this question, we will simulate new data and explore linear
and quadratic discriminant analysis (LDA and QDA) techniques. In
particular, we will use p = 2 features, K = 3 classes, and n = 50 ×
K = 150 total data points (50 per class).
(a) (1 pt) Fix μ1 = [-3, 2], μ2 = [1, 0], μ3 = [5, 2], and fix
the covariance matrices Σ1 = Σ3 = 0 2 and Σ2 = [[4, 3], [3, 5]]. Generate 50 data
points for each class k = 1, 2, 3 that follow a distribution N(μk,
Σk) using the mvrnorm function from the MASS package. Make a single
plot that shows a scatter plot of all 150 data points color coded
by class. Scale your axes appropriately.
(b) (1 pt) Fit a linear discriminant analysis model (LDA).
Report the training error and 3 x 3 confusion matrix.
(c) (1 pt) Fit a quadratic discriminant analysis model (QDA).
Report the training error and 3 x 3 confusion matrix. Compare with
the results in part (b).
(d) (2 pt) Generate n = 500 test observations for each class
using the same procedure in part (a). Report the test error using
LDA. Also report the test error using QDA. Compare these with the
training errors. Which model do you prefer?
(e) (2 pt) What is the key difference between LDA and QDA? Which
of the two methods has more parameters to estimate? Based on the
number of parameters each model estimates, which model is simpler
and which model is more generalizable?
(f) (2 pt) Based on the true data generation process in part (a)
and your answer to part (e), would you prefer LDA or QDA? Reconcile
any discrepancies with your answer in part (d).
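
The computational parts (a) through (d) could be approached along the lines of the R sketch below. It is only a rough outline under stated assumptions, not a reference solution. The entries of Σ1 = Σ3 are garbled in the post, so the code uses 2·I purely as a placeholder; Σ2 is read as the matrix with rows (4, 3) and (3, 5). The seed, the helper names simulate and evaluate, and the column names x1, x2 and class are hypothetical choices made for illustration.

```r
library(MASS)  # provides mvrnorm(), lda() and qda()

set.seed(1)  # arbitrary seed, only so that reruns give the same draws

# Class means as given in part (a)
mu <- list(c(-3, 2), c(1, 0), c(5, 2))

# ASSUMPTION: Sigma1 = Sigma3 is garbled in the post; 2 * I is a placeholder.
# Sigma2 is read as rows (4, 3) and (3, 5).
Sigma13 <- matrix(c(2, 0, 0, 2), nrow = 2)
Sigma2  <- matrix(c(4, 3, 3, 5), nrow = 2)
Sigma   <- list(Sigma13, Sigma2, Sigma13)

# Hypothetical helper: draw n points per class from N(mu_k, Sigma_k)
simulate <- function(n) {
  parts <- lapply(1:3, function(k) {
    x <- mvrnorm(n, mu = mu[[k]], Sigma = Sigma[[k]])
    data.frame(x1 = x[, 1], x2 = x[, 2], class = k)
  })
  out <- do.call(rbind, parts)
  out$class <- factor(out$class, levels = 1:3)
  out
}

train <- simulate(50)   # part (a): 50 per class, 150 in total
test  <- simulate(500)  # part (d): 500 per class

# Part (a): one scatter plot, colored by class, equal axis scaling
plot(train$x1, train$x2, col = as.integer(train$class) + 1,
     pch = 19, asp = 1, xlab = "x1", ylab = "x2")
legend("topleft", legend = paste("class", 1:3), col = 2:4, pch = 19)

# Parts (b) and (c): fit both models on the training data
fit_lda <- lda(class ~ x1 + x2, data = train)
fit_qda <- qda(class ~ x1 + x2, data = train)

# Hypothetical helper: confusion matrix and misclassification rate
evaluate <- function(fit, data) {
  pred <- predict(fit, newdata = data)$class
  list(confusion = table(truth = data$class, predicted = pred),
       error = mean(pred != data$class))
}

res <- list(
  lda_train = evaluate(fit_lda, train),
  qda_train = evaluate(fit_qda, train),
  lda_test  = evaluate(fit_lda, test),   # part (d)
  qda_test  = evaluate(fit_qda, test)
)

print(res$lda_train$confusion)
print(res$qda_train$confusion)
print(sapply(res, function(r) r$error))
```

For parts (e) and (f), the parameter count follows directly from p = 2 and K = 3. Both models estimate three mean vectors (6 numbers). LDA adds one shared covariance matrix (3 free entries), while QDA adds one covariance matrix per class (9 free entries). The error rates printed above depend on the placeholder Σ1 = Σ3 and on the seed, so they should be recomputed with the actual matrices from the assignment.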