can someone please do this asap ? Due Thursday, February 24 By deleting the "skin" and "insulin" attributes, which had a

Posted: Fri Mar 04, 2022 10:26 am
by answerhappygod
Can someone please do this ASAP?
Due Thursday, February 24
By deleting the "skin" and "insulin" attributes, which had an
excessive amount of missing values, we were able to slightly
improve our results on all the algorithms we tried: naive Bayes,
Bayes net, logistic regression, and J48 decision trees.
For homework 2, try to predict the missing values of skin and
insulin, as well as any missing values in other attributes (0s in
plasma, pressure, and mass are missing values), and compare your
results with what you got by just deleting the features. Try using
mean/median first, which should be simple, then try using linear
regression to estimate the values.
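
A minimal Python sketch of the mean/median step, assuming (as stated above) that a 0 encodes a missing value. The function name `impute_zeros` and the toy column are hypothetical illustrations, not values from the actual dataset.

```python
from statistics import mean, median

def impute_zeros(values, how="median"):
    """Replace 0 entries (treated as missing) with the mean or median
    of the non-zero entries in the same column."""
    observed = [v for v in values if v != 0]
    fill = median(observed) if how == "median" else mean(observed)
    return [fill if v == 0 else v for v in values]

# Toy column for illustration only.
toy_insulin = [0, 94, 168, 0, 88]
filled_median = impute_zeros(toy_insulin, how="median")
filled_mean = impute_zeros(toy_insulin, how="mean")
```

The fill value is computed from observed entries only, so the zeros do not drag the mean or median down. The same idea works in a spreadsheet with a conditional average over the non-zero cells.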
Since Weka doesn't support filling in missing values with
formulas, you'll have to enter the missing values some other way,
such as with a spreadsheet. You can copy the data rows from the
.arff file into a spreadsheet (they are comma-separated, so they
import as CSV), do whatever work you need in the spreadsheet, then
export the result as a CSV file. Weka can open CSV files.
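
If copying by hand is tedious, the conversion could also be scripted. The sketch below assumes a simple dense ARFF file with no quoted commas and no sparse rows; the helper names `arff_to_rows` and `rows_to_csv` and the toy ARFF text are hypothetical, made up for illustration.

```python
import csv
import io

def arff_to_rows(arff_text):
    """Return (header, rows) from a simple dense ARFF file.
    Assumes no quoted attribute names and no commas inside values."""
    header, rows, in_data = [], [], False
    for line in arff_text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        if in_data:
            rows.append([field.strip() for field in line.split(",")])
        elif line.lower().startswith("@attribute"):
            header.append(line.split()[1])    # attribute name
        elif line.lower().startswith("@data"):
            in_data = True
    return header, rows

def rows_to_csv(header, rows):
    """Serialize the header and rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

# Toy ARFF text for illustration only.
toy_arff = """@relation toy
@attribute plasma numeric
@attribute insulin numeric
@attribute class {neg,pos}
@data
148,0,pos
85,94,neg
"""
header, rows = arff_to_rows(toy_arff)
csv_text = rows_to_csv(header, rows)
```

Keeping the attribute names as the CSV header row lets Weka label the columns when the file is loaded back in.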
You'll also have to come up with your own solution for dealing
with data points with multiple missing values, since the iterative
solution I went over in class would be really hard and awkward to
implement using Weka and a spreadsheet. Hint: it may help to have
two copies of your dataset side by side in your spreadsheet to
compute your changes based on the unchanged original. This will
help you avoid circular formulas.
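
One possible way to handle this, sketched in Python rather than a spreadsheet: fit a one-predictor linear regression on rows where both attributes are observed in the unchanged original, predict the target where it is missing, and fall back to the median when the predictor is missing too. This is only one option among many, not the iterative method from class. The function names and the toy rows are hypothetical.

```python
from statistics import mean, median

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def impute_from_original(original, target, predictor):
    """Fill missing (0) target values using only the original rows.
    Returns a new list; the original rows are never modified."""
    complete = [r for r in original if r[target] != 0 and r[predictor] != 0]
    a, b = fit_line([r[predictor] for r in complete],
                    [r[target] for r in complete])
    fallback = median(r[target] for r in original if r[target] != 0)
    filled = []
    for r in original:
        new = dict(r)  # work on a copy, read from the original
        if r[target] == 0:
            if r[predictor] != 0:
                new[target] = a + b * r[predictor]
            else:
                new[target] = fallback  # predictor is missing too
        filled.append(new)
    return filled

# Toy rows for illustration only; 0 means missing.
original = [
    {"mass": 20, "skin": 10},
    {"mass": 30, "skin": 20},
    {"mass": 40, "skin": 30},
    {"mass": 25, "skin": 0},
    {"mass": 0, "skin": 0},
]
filled = impute_from_original(original, target="skin", predictor="mass")
```

Because every prediction reads from `original` and writes to a copy, no filled-in value ever feeds another prediction, which is the same reason the two-copy spreadsheet layout avoids circular formulas.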
Also, explore whether there's evidence that the missing values
of skin and insulin are missing at random or not.
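
A rough first look at this question, sketched in Python: compare how often an attribute is missing within each class, and compare the mean of another attribute between rows where the first one is missing and rows where it is observed. Large differences would be evidence against the values being missing completely at random; a formal statistical test would be the natural next step. The function names and the toy rows are hypothetical, not the real data.

```python
from collections import defaultdict
from statistics import mean

def missing_rate_by_class(rows, attr, label="class"):
    """Fraction of rows with attr == 0 (missing) within each class."""
    counts = defaultdict(lambda: [0, 0])  # label -> [missing, total]
    for r in rows:
        counts[r[label]][0] += int(r[attr] == 0)
        counts[r[label]][1] += 1
    return {k: m / t for k, (m, t) in counts.items()}

def mean_by_missingness(rows, attr, other):
    """Mean of `other` where attr is missing vs. where attr is observed,
    skipping rows where `other` is itself missing (0)."""
    when_missing = [r[other] for r in rows if r[attr] == 0 and r[other] != 0]
    when_observed = [r[other] for r in rows if r[attr] != 0 and r[other] != 0]
    return mean(when_missing), mean(when_observed)

# Toy rows for illustration only; 0 means missing.
toy_rows = [
    {"insulin": 0, "mass": 30, "class": "pos"},
    {"insulin": 0, "mass": 34, "class": "pos"},
    {"insulin": 94, "mass": 28, "class": "pos"},
    {"insulin": 100, "mass": 26, "class": "pos"},
    {"insulin": 0, "mass": 32, "class": "neg"},
    {"insulin": 88, "mass": 24, "class": "neg"},
    {"insulin": 70, "mass": 22, "class": "neg"},
    {"insulin": 60, "mass": 20, "class": "neg"},
]
rates = missing_rate_by_class(toy_rows, "insulin")
mass_missing, mass_observed = mean_by_missingness(toy_rows, "insulin", "mass")
```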
You should go through this process by class on Tuesday, filling
in the missing values with some prediction and making sure you can
get your modified data back into Weka and use it for your overall
prediction.
We'll discuss your results on Tuesday, then you'll submit a 1-2
page write-up of the approaches you tried and results you got,
along with a screenshot of your final results, on Gradescope by 5pm
Thursday.