questions as fast as possible and do it well.
Thank you!
Question 5

The objective of this question is to give further practice with lists and writing functions with different types of output.

Some additional functions which may be useful:

• unique() - returns the unique values of a vector, preserving the order in which they occur.
• %in% - a vectorized predicate function which returns, for each element on the left-hand side, whether it is present in the right-hand side.

(a) Write a function called my_unlist() that inputs a list of vectors x and combines all vector components into a single vector without using the unlist() function. If x is a list of factors, the output should be a new factor which combines the levels of all of the factor elements in the list, including any levels which aren't actually present in the factor. For mixed-mode lists (lists with more than one vector type), the output should be of the highest mode in the hierarchy.

    my_unlist(list(c(2, 1, 1), c(3, 2, 1), 2))
    [1] 2 1 1 3 2 1 2

    my_unlist(list(factor(c("a", "a", "b", "c")), factor(c("b", "c", "e")), factor(c("a", "d", "b"))))
    [1] a a b c b c e a d b
    Levels: a b c e d

Note: If x is a mixed-mode list which includes any factors, the output will still be of the highest mode in the hierarchy, but you must treat all factors as their integer equivalents. See below:

    my_unlist(list(factor(c("a", "b")), c(1, 2)))
    [1] 1 2 1 2

    my_unlist(list(factor(c("a", "b")), factor(c("b", "c")), c("a", "b")))
    [1] "1" "2" "1" "2" "a" "b"

Optional: Expand upon this problem to handle lists of lists (i.e., recursive lists). It may be helpful to try a recursive approach. For more guidance on how recursion works, read the "Notes on Recursion" document in the Required Reading on CCLE.

    my_unlist(list(list(1:3, 1:4), list(list(1:3, 1:3, list(1:4, 1:5)))))
    [1] 1 2 3 1 2 3 4 1 2 3 1 2 3 1 2 3 4 1 2 3 4 5
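One possible way to approach part (a), offered as a minimal sketch rather than the intended solution: flatten any nested lists first, then either rebuild a factor from the pooled levels (when every piece is a factor) or convert factors to their integer codes and let c() coerce to the highest mode. The helper name flatten_list() is hypothetical and not part of the assignment.

```r
# Hypothetical helper (not named in the assignment): recursively flattens
# nested lists into one flat list of atomic vectors / factors.
flatten_list <- function(x) {
  out <- list()
  for (i in seq_along(x)) {
    item <- x[[i]]
    if (is.list(item)) {
      out <- c(out, flatten_list(item))   # recurse into sub-lists
    } else {
      out[[length(out) + 1]] <- item      # keep the vector or factor as-is
    }
  }
  out
}

my_unlist <- function(x) {
  parts <- flatten_list(x)
  if (length(parts) == 0) return(NULL)

  is_fac <- vapply(parts, is.factor, logical(1))

  if (all(is_fac)) {
    # All factors: pool the levels in order of first appearance,
    # then rebuild a single factor from the character labels.
    all_levels <- unique(do.call(c, lapply(parts, levels)))
    labels <- do.call(c, lapply(parts, as.character))
    return(factor(labels, levels = all_levels))
  }

  # Mixed case: factors become their integer codes; c() then coerces
  # everything to the highest mode present.
  parts[is_fac] <- lapply(parts[is_fac], as.integer)
  do.call(c, parts)
}
```

Calling my_unlist() on the example lists from the question should reproduce the printed results, including the level order a b c e d for the all-factor case.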
(b) The statistical mode of a set of data values is the value or values that appear most often. Using your my_unlist() function from (a), write a function called stat_mode() that returns all of the statistical modes of an input vector or list of vectors x. Include an optional argument first with a default value of FALSE which indicates if only one mode value (the first encountered) should be returned. The returned mode must be of the same type/class as x. For example:

    stat_mode(c(FALSE, FALSE, TRUE, TRUE, FALSE))
    [1] FALSE

    stat_mode(list(c(2, 1, 1), c(3, 2, 1), 2))
    [1] 2 1

    stat_mode(list(c(2, 1, 1), c(3, 2, 1), 2), first = TRUE)
    [1] 2

Note: The first mode is 2, not 1, since the 2 is encountered first.

    stat_mode(list(factor(c("control", "treatment1", "control")), factor(c("control", "treatment2"))))
    [1] control
    Levels: control treatment1 treatment2

Hint: If you are unable to get a working my_unlist() function from (a), you may use the built-in unlist() function.

(c) Using the stat_mode() function from (b), write a function called df_summary() that inputs a data frame and outputs a list with the following named components:

• n_obs: The number of observations in the data frame.
• n_var: The number of variables in the data frame.
• var_names: A vector of the variable names in the data frame.
• column_data: A list object, where each list item appears in alphabetical order and is a list object with the name of a column which:
  1. contains either the class: class of the column, min: minimum, mean: mean, and max: maximum of that variable, as well as na_count: the number of NA values present in the data for that column,
  2. OR contains the class: class of the column, modes: vector of the statistical modes of that variable, as well as mode_count: the number of times the modal values are each represented in the data frame.

For instance, if you had the following data frame:
    Homework_One  Homework_Two  Homework_Three  Lecture
              88            95              NA  Lecture 1
              84            90              NA  Lecture 1
              93            99              88  Lecture 1
              NA            60              23  Lecture 2

The structure of your output list (e.g., if you use the str() function on your output list) would look like this:

    List of 4
     $ n_obs      : int 4
     $ n_var      : int 4
     $ var_names  : chr [1:4] "Homework_One" "Homework_Two" "Homework_Three" "Lecture"
     $ column_data:List of 4
      ..$ Homework_One  :List of 5
      .. ..$ class   : chr "numeric"
      .. ..$ min     : num 84
      .. ..$ mean    : num 88.3
      .. ..$ max     : num 93
      .. ..$ na_count: int 1
      ..$ Homework_Three:List of 5
      .. ..$ class   : chr "numeric"
      .. ..$ min     : num 23
      .. ..$ mean    : num 55.5
      .. ..$ max     : num 88
      .. ..$ na_count: int 2
      ..$ Homework_Two  :List of 5
      .. ..$ class   : chr "numeric"
      .. ..$ min     : num 60
      .. ..$ mean    : num 86
      .. ..$ max     : num 99
      .. ..$ na_count: int 0
      ..$ Lecture       :List of 3
      .. ..$ class     : chr "character"
      .. ..$ modes     : chr "Lecture 1"
      .. ..$ mode_count: int 3

Note: This is the structure of the output, not the output list itself.

(d) Download the starwars.RData file from CCLE and load it into your workspace.

Side Note: The starwars.RData is a modified version of the starwars data found in the dplyr package. Do not use the version in dplyr.

Use your df_summary() function from part (c) on the starwars data, and store the result in your workspace.

(e) Using only the output object from (d), find the most common starships that the characters in the starwars data have piloted. Do not refer to the original starwars data.
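Parts (b) and (c) could be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive answer: the built-in unlist() stands in for my_unlist() (which the hint in (b) permits), NA is counted like any other value when finding modes, and the inner helper summarise_column() is a hypothetical name.

```r
# Sketch only: unlist() stands in for my_unlist(), per the hint in (b).
stat_mode <- function(x, first = FALSE) {
  if (is.list(x)) x <- unlist(x)
  if (length(x) == 0) return(x)
  vals <- unique(x)                                   # order of first appearance
  counts <- tabulate(match(x, vals), nbins = length(vals))
  modes <- vals[counts == max(counts)]                # keeps the class of x
  if (first) modes[1] else modes
}

df_summary <- function(df) {
  # Hypothetical helper: builds the summary list for one column.
  summarise_column <- function(col) {
    if (is.numeric(col)) {
      list(class = class(col),
           min = min(col, na.rm = TRUE),
           mean = mean(col, na.rm = TRUE),
           max = max(col, na.rm = TRUE),
           na_count = sum(is.na(col)))
    } else {
      # List columns are flattened before the modes are counted.
      flat <- if (is.list(col)) unlist(col) else col
      modes <- stat_mode(flat)
      list(class = class(col),
           modes = modes,
           mode_count = sum(flat %in% modes[1]))     # every mode shares this count
    }
  }
  col_order <- sort(names(df))                        # alphabetical column order
  list(n_obs = nrow(df),
       n_var = ncol(df),
       var_names = names(df),
       column_data = lapply(df[col_order], summarise_column))
}

# The example data frame from the question.
grades <- data.frame(
  Homework_One = c(88, 84, 93, NA),
  Homework_Two = c(95, 90, 99, 60),
  Homework_Three = c(NA, NA, 88, 23),
  Lecture = c("Lecture 1", "Lecture 1", "Lecture 1", "Lecture 2"),
  stringsAsFactors = FALSE
)
str(df_summary(grades))
```

For (d) and (e), the same function would be applied after load("starwars.RData"), with the result stored in an object such as sw_summary (a hypothetical name). Assuming the modified data keeps a column named starships, as the dplyr version does, the most common starships would then be available as sw_summary$column_data$starships$modes without touching the original data.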
Please solve these.