In this Module 2 Discussion, we shall discuss how to use R to obtain information by exploring, cleaning, and preprocessi


Post by answerhappygod »

In this Module 2 Discussion, we shall discuss how to use R to obtain information by exploring, cleaning, and preprocessing the data. The following is a checklist of frequent steps in data preparation. More precisely, they are typical steps in “cleansing” data. Such steps include (at least):
No.  Steps                                                                        R functions
1    Loading and looking at the dataset in R
2    Identify missing values
3    Identify outliers
4    Check for overall plausibility and errors (e.g., typos)
5    Identify highly correlated variables
6    Identify variables with (nearly) no variance
7    Identify variables with strange names or values
8    Check variable classes (e.g., characters vs. factors)
9    Remove/transform some variables (maybe your model does not like categorical variables)
10   Rename some variables or values (especially useful when there are many of them)
11   Check some overall patterns (statistical/numerical summaries, graphical illustrations)
12   Center/scale variables
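As a rough illustration of the inspection steps (1 to 8), here is a minimal sketch in base R. It is not taken from the readings: the built-in airquality dataset, the 0.8 correlation cutoff, and the 1e-8 variance cutoff are all assumptions chosen for the example, and other functions could serve each step equally well.

```r
# Base R only; `airquality` ships with R (datasets package).
df <- airquality

# Step 1: look at the dataset (structure and first rows)
str(df)
head(df)

# Step 2: count missing values per column
na_counts <- colSums(is.na(df))

# Step 3: flag outliers in one variable using the boxplot (1.5 * IQR) rule
ozone_outliers <- boxplot.stats(df$Ozone)$out

# Step 4: check plausibility, e.g. do the value ranges make sense?
summary(df)
month_range <- range(df$Month)

# Step 5: find highly correlated pairs (|r| > 0.8 is an assumed cutoff)
cor_mat <- cor(df, use = "pairwise.complete.obs")
high_cor <- which(abs(cor_mat) > 0.8 & upper.tri(cor_mat), arr.ind = TRUE)

# Step 6: find variables with (nearly) no variance (1e-8 is an assumed cutoff)
variances <- sapply(df, var, na.rm = TRUE)
low_var <- names(variances)[variances < 1e-8]

# Step 7: inspect names and distinct values for anything strange
names(df)
unique(df$Month)

# Step 8: check the class of every variable
col_classes <- sapply(df, class)
```

Each object (na_counts, high_cor, low_var, and so on) can then be printed and read off directly. An empty high_cor or low_var simply means that no variable crossed the assumed cutoff.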
In view of the above steps, please scan through the three examples (Examples 1, 2, and 3) in Data Mining and Business Analytics with R, Chapter 2, and Data Mining for Business Analytics: Concepts, Techniques, and Applications in R, Section 2.4 (found in this week's Reading & Resources). Then fill in the blanks in the above table with the R functions we can use to handle each step. For example, you may put read.csv() and View() in the first row, as they are ways to carry out that step. You may also refer to open resources to find relevant R functions, and each blank can have multiple R functions as answers.
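To make the read.csv() and View() example concrete, and to show one possible way through the transformation steps (9 to 12), here is a minimal base R sketch. It is an illustration under assumptions, not the textbooks' own code: the built-in iris dataset is written to a temporary CSV file only so that the example is self-contained, and the new names (sepal_length, Setosa) are hypothetical. Note that the base R viewer is spelled View() with a capital V.

```r
# Write a built-in dataset to a temporary CSV so there is a file to load.
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# Step 1: load the data and look at it
df <- read.csv(csv_path, stringsAsFactors = FALSE)
if (interactive()) View(df)  # spreadsheet-style viewer, interactive sessions only
str(df)

# Steps 8 and 9: turn a character column into a factor, then into
# 0/1 dummy columns for models that cannot use categorical variables
df$Species <- factor(df$Species)
dummies <- model.matrix(~ Species - 1, data = df)

# Step 10: rename a variable and a value (the new names are hypothetical)
names(df)[names(df) == "Sepal.Length"] <- "sepal_length"
levels(df$Species)[levels(df$Species) == "setosa"] <- "Setosa"

# Step 11: numerical summaries, overall and by group
summary(df)
group_means <- aggregate(sepal_length ~ Species, data = df, FUN = mean)

# Step 11 (graphical): plots are drawn only in interactive sessions
num_cols <- sapply(df, is.numeric)
if (interactive()) {
  hist(df$sepal_length)
  pairs(df[, num_cols])
}

# Step 12: center and scale the numeric variables (mean 0, sd 1)
scaled <- scale(df[, num_cols])
```

After running this, colMeans(scaled) should be numerically zero and apply(scaled, 2, sd) should be one for every column, which is a quick way to confirm that the centering and scaling worked.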