Use Python (Jupyter) for the code. Download the dataset here: https://www.kaggle.com/datasets/saurabh00007/diabetescsv
Posted: Fri Jul 01, 2022 5:42 am
Use the same dataset as in question 2 (diabetes.csv). You may use any version of the cleaned dataset that you created in question 2.
1. Create a new DataFrame by sampling the dataset. Sample the dataset to randomly select 30% of the data.
2. Visualize each attribute as a scatter plot. Show each attribute along the x-axis and 'Outcome' along the y-axis. Hint: you must use Seaborn and matplotlib figures.
3. Use matplotlib to create the following statistical graphics/figures:
3.1. Histogram of 'Glucose'
3.2. Scatter plot of 'Outcome' against 'SkinThickness'
4. Split the data into training and testing data and create a linear regression model. Hint: 'Outcome' is the target that we want to learn.
5. Test the model. What is the accuracy of your model?
6. Reduce the dimensionality of the data to two and visualize the data by creating a colored scatter plot.
7. Cluster the data by using k-means clustering.
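One possible way to approach the seven steps is sketched below. It is not a definitive solution and rests on several assumptions that the question does not state: diabetes.csv sits in the notebook's working directory and contains only numeric columns, the train/test split is 80/20, "accuracy" is measured by thresholding the linear regression output at 0.5 (the R^2 score is also returned), PCA is the dimensionality reduction method, and k-means uses 2 clusters. All function names are made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

TARGET = "Outcome"  # label column named in the question


def sample_frame(df, frac=0.30, seed=42):
    """Step 1: randomly select a fraction (30% by default) of the rows."""
    return df.sample(frac=frac, random_state=seed).reset_index(drop=True)


def plot_attributes_vs_outcome(df, ncols=3):
    """Step 2: one Seaborn scatter plot per attribute, Outcome on the y-axis."""
    import seaborn as sns  # imported here because only this step needs it

    features = [c for c in df.columns if c != TARGET]
    nrows = -(-len(features) // ncols)  # ceiling division
    fig, axes = plt.subplots(nrows, ncols, figsize=(4 * ncols, 3 * nrows),
                             squeeze=False)
    for ax, col in zip(axes.ravel(), features):
        sns.scatterplot(data=df, x=col, y=TARGET, ax=ax)
    for ax in axes.ravel()[len(features):]:
        ax.set_visible(False)  # hide unused panels
    fig.tight_layout()
    return fig


def plot_glucose_and_skin(df):
    """Step 3: matplotlib histogram of Glucose and Outcome vs SkinThickness."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.hist(df["Glucose"], bins=20)
    ax1.set(title="Histogram of Glucose", xlabel="Glucose", ylabel="Count")
    ax2.scatter(df["SkinThickness"], df[TARGET])
    ax2.set(title="Outcome vs SkinThickness", xlabel="SkinThickness",
            ylabel=TARGET)
    fig.tight_layout()
    return fig


def fit_linear_regression(df, seed=42):
    """Steps 4-5: split, fit, and score a linear regression on Outcome."""
    X, y = df.drop(columns=TARGET), df[TARGET]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed)  # assumed 80/20 split
    model = LinearRegression().fit(X_train, y_train)
    r2 = model.score(X_test, y_test)  # R^2 on the held-out rows
    # Assumption: predictions >= 0.5 count as class 1, so that a
    # classification-style accuracy can be reported for a regression model.
    y_pred = (model.predict(X_test) >= 0.5).astype(int)
    accuracy = float((y_pred == y_test.to_numpy()).mean())
    return model, r2, accuracy


def reduce_and_cluster(df, n_clusters=2, seed=42):
    """Steps 6-7: PCA to two components and k-means on the scaled features."""
    X = StandardScaler().fit_transform(df.drop(columns=TARGET))
    coords = PCA(n_components=2).fit_transform(X)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(X)
    return coords, labels


def plot_2d(coords, colors, title):
    """Colored scatter plot of the two-dimensional projection."""
    fig, ax = plt.subplots(figsize=(6, 5))
    points = ax.scatter(coords[:, 0], coords[:, 1], c=colors,
                        cmap="coolwarm", s=15)
    ax.set(title=title, xlabel="PC 1", ylabel="PC 2")
    fig.colorbar(points, ax=ax)
    return fig


def run_all(csv_path="diabetes.csv"):
    """Run every step in order; csv_path is an assumed file location."""
    df = sample_frame(pd.read_csv(csv_path))
    plot_attributes_vs_outcome(df)
    plot_glucose_and_skin(df)
    _, r2, accuracy = fit_linear_regression(df)
    print(f"R^2 = {r2:.3f}, thresholded accuracy = {accuracy:.3f}")
    coords, labels = reduce_and_cluster(df)
    plot_2d(coords, df[TARGET], "PCA projection colored by Outcome")
    plot_2d(coords, labels, "PCA projection colored by k-means cluster")
    plt.show()
```

In a Jupyter cell, calling run_all() after placing diabetes.csv next to the notebook should produce the figures and print the two scores. Since 'Outcome' is a 0/1 label, a classifier such as logistic regression would usually be a more natural fit, but the question asks for linear regression, so the sketch keeps it and thresholds the output.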