Please answer these in a Python program in Google Colab.
1. (5 marks) Load the "avocado_data.csv" file into a pandas DataFrame. The response/target variable is contained in the "Price" column, and all other columns are predictors/features. Extract the predictors and the response, making sure that you include only the columns with numerical values. Scale the predictors to have zero mean and unit variance. Split your data into training and test sets.
2. (5 marks) Plot some predictors versus the price in whatever way you find most convenient. Which predictors do you think will be most important?
3. (5 marks) Fit a standard multilinear regression model which uses all the predictors/features. Estimate the R² and MSE values of your model.
4. (5 marks) Use Lasso regression to create a model which uses only four features. What is the R² of this simpler model?
5. (10 marks) Open-ended question: Using any method you wish, build an avocado price predictor with the best possible predictive power. Credit will be given for clear coding and comments, creative and rigorous use of methods, and quality of predictions on the test set.
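A possible starting point for parts 1, 3 and 4 is sketched below. It is a minimal sketch, not a model answer: the helper names (prepare_data, fit_linear, lasso_with_k_features), the 80/20 split, the random seed and the grid of alpha values are all assumptions made for illustration. The scaler is fitted on the training rows only, which is one common way to avoid leaking test-set statistics; fitting it on the full data before splitting would follow the wording of part 1 more literally. Lasso has no direct "number of features" setting, so the sketch scans the penalty strength until exactly four coefficients are non-zero.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def prepare_data(df, target="Price", test_size=0.2, seed=0):
    """Keep numeric columns, split into train/test, then standardise the predictors."""
    numeric = df.select_dtypes(include="number").dropna()
    X = numeric.drop(columns=[target])
    y = numeric[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    # Fit the scaler on the training rows only (assumption: avoids test-set leakage).
    scaler = StandardScaler().fit(X_train)
    X_train = pd.DataFrame(scaler.transform(X_train), columns=X.columns, index=X_train.index)
    X_test = pd.DataFrame(scaler.transform(X_test), columns=X.columns, index=X_test.index)
    return X_train, X_test, y_train, y_test


def fit_linear(X_train, X_test, y_train, y_test):
    """Ordinary least squares on all predictors; returns the model, test R^2 and test MSE."""
    model = LinearRegression().fit(X_train, y_train)
    pred = model.predict(X_test)
    return model, r2_score(y_test, pred), mean_squared_error(y_test, pred)


def lasso_with_k_features(X_train, y_train, k=4, alphas=None):
    """Scan alpha from a strong to a weak penalty and return the first fit
    with exactly k non-zero coefficients, or (None, None) if none is found."""
    if alphas is None:
        alphas = np.logspace(1, -4, 200)  # illustrative grid, not tuned
    for alpha in alphas:
        model = Lasso(alpha=alpha, max_iter=10_000).fit(X_train, y_train)
        if np.count_nonzero(model.coef_) == k:
            return model, alpha
    return None, None


# Example usage in Colab (assumes avocado_data.csv has been uploaded to the session):
# df = pd.read_csv("avocado_data.csv")
# X_train, X_test, y_train, y_test = prepare_data(df)
# linear, r2, mse = fit_linear(X_train, X_test, y_train, y_test)
# lasso, alpha = lasso_with_k_features(X_train, y_train, k=4)
# if lasso is not None:
#     print(list(X_train.columns[lasso.coef_ != 0]))
#     print(r2_score(y_test, lasso.predict(X_test)))
```

If no alpha on the grid gives exactly four non-zero coefficients, a finer or wider grid may be needed. The selected column names are also a useful check against the guesses made in part 2.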