A data set has 5,000 labeled examples. If you partition it into training and testing data using 5-fold cross-validation,
Posted: Fri May 20, 2022 2:53 pm
A data set has 5,000 labeled examples. If you partition it into
training and testing data using 5-fold cross-validation, each
training set will contain (i) ________ examples, each test set will
contain (ii) ________ examples, and a total of
(iii) ________ models will be generated.
(a) (i) 1000, (ii) 4000, (iii) 10
(b) (i) 1000, (ii) 4000, (iii) 5
(c) (i) 4000, (ii) 1000, (iii) 5
(d) (i) 4000, (ii) 1000, (iii) 10
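
The split sizes asked about here can be checked by building the folds directly. This is a minimal sketch, assuming contiguous, equal-sized folds and one model trained per fold; the helper name k_fold_indices is made up for illustration and is not from any library.

```python
def k_fold_indices(n_examples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    fold_size = n_examples // k
    for fold in range(k):
        start = fold * fold_size
        # The last fold absorbs any remainder when n_examples % k != 0.
        stop = n_examples if fold == k - 1 else start + fold_size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_examples))
        yield train, test


# One (train, test) pair per fold, so one model is fitted per pair.
splits = list(k_fold_indices(5000, 5))
sizes = [(len(train), len(test)) for train, test in splits]
print(len(splits), sizes)
```

Each example lands in exactly one test fold, so the test folds together cover the whole data set.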
Which of the following algorithms might not work well for a
binary classification task?
(a) Decision Tree
(b) Naïve Bayes
(c) Logistic Regression
(d) Linear Regression
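
One commonly cited issue with fitting a straight line to 0/1 labels is that the fitted values are not confined to the range [0, 1], so they cannot be read as probabilities. The sketch below fits ordinary least squares to a toy one-feature data set; the data and the helper names are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy binary labels: class 0 for small x, class 1 for large x.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
slope, intercept = fit_line(xs, ys)


def predict(x):
    """Raw linear output; nothing keeps it inside [0, 1]."""
    return slope * x + intercept


print(predict(0), predict(7))
```

On this toy data the prediction at x = 0 is negative and the prediction at x = 7 exceeds 1, which is the behavior logistic regression avoids by passing the linear output through a sigmoid.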
In the Apriori algorithm for association analysis, the support
counting step is expensive. Which of the following data structures
can be used to make the support counting step faster?
(a) Queue
(b) Linked List
(c) Stack
(d) Hash Tree
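
To make the hash tree option concrete, here is a simplified sketch of how candidate itemsets can be stored in a hash tree so that each transaction is compared only against the candidates in the leaves it reaches, rather than against every candidate. The class names, the fanout, and the leaf size are illustrative assumptions, not taken from a particular textbook implementation.

```python
class HashTreeNode:
    def __init__(self):
        self.children = {}   # bucket number -> child node (internal nodes)
        self.itemsets = []   # candidate itemsets (leaf nodes)
        self.is_leaf = True


class HashTree:
    def __init__(self, k, fanout=3, max_leaf_size=2):
        self.k = k                          # size of every candidate itemset
        self.fanout = fanout
        self.max_leaf_size = max_leaf_size
        self.root = HashTreeNode()

    def _bucket(self, item):
        return hash(item) % self.fanout

    def insert(self, itemset):
        itemset = tuple(sorted(itemset))
        node, depth = self.root, 0
        # Follow the hash of the depth-th item until a leaf is reached.
        while not node.is_leaf:
            bucket = self._bucket(itemset[depth])
            node = node.children.setdefault(bucket, HashTreeNode())
            depth += 1
        node.itemsets.append(itemset)
        # Split an overfull leaf, unless all k items are already hashed.
        if len(node.itemsets) > self.max_leaf_size and depth < self.k:
            node.is_leaf = False
            stored, node.itemsets = node.itemsets, []
            for s in stored:
                child = node.children.setdefault(self._bucket(s[depth]), HashTreeNode())
                child.itemsets.append(s)

    def count_supports(self, transactions):
        counts = {}
        for t in transactions:
            items = tuple(sorted(set(t)))
            matched = set()   # avoids double counting when a leaf is revisited
            self._visit(self.root, items, 0, set(items), matched)
            for s in matched:
                counts[s] = counts.get(s, 0) + 1
        return counts

    def _visit(self, node, items, start, item_set, matched):
        if node.is_leaf:
            # Only candidates in reached leaves are checked against the transaction.
            matched.update(s for s in node.itemsets if item_set.issuperset(s))
            return
        for i in range(start, len(items)):
            child = node.children.get(self._bucket(items[i]))
            if child is not None:
                self._visit(child, items, i + 1, item_set, matched)


candidates = [(1, 2), (1, 3), (2, 3), (2, 5), (3, 5)]
transactions = [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]
tree = HashTree(k=2)
for c in candidates:
    tree.insert(c)
supports = tree.count_supports(transactions)
print(supports)
```

Candidates that never occur in any transaction are simply absent from the returned dictionary.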
Which of the following statements about the Random Forest classifier
is not true?
(a) Bootstrap samples are generated from the training set to model
the base detectors
(b) Random Forest is a linear classifier
(c) At each internal node of a base decision tree p attributes are
randomly chosen
(d) The decision trees used as base detectors are unpruned
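
Two of the ingredients mentioned in these options, bootstrap sampling of the training set and random selection of p attributes at a node, can be sketched in a few lines. This is only an illustration of those two steps, not a full Random Forest; the function names and the toy sizes are assumptions.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (a bootstrap sample)."""
    return [rng.choice(rows) for _ in rows]


def random_attribute_subset(n_attributes, p, rng):
    """Pick p distinct attribute indices to consider at one tree node."""
    return sorted(rng.sample(range(n_attributes), p))


rng = random.Random(42)
rows = list(range(10))   # stand-ins for 10 training examples
sample = bootstrap_sample(rows, rng)
candidate_attributes = random_attribute_subset(n_attributes=8, p=3, rng=rng)
print(sample, candidate_attributes)
```

Because sampling is done with replacement, a bootstrap sample usually repeats some examples and leaves others out, which is what makes the individual trees differ from one another.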
Which of the following methods is more suitable than random
selection for selecting the initial centroids of a k-means
clustering algorithm?
(a) Select the first centroid at random, and pick the rest of the
centroids far away from the selected centroids.
(b) Use hierarchical clustering to decide the initial
centroids.
(c) Select the first centroid at random, and pick the rest of the
centroids as close as possible to the selected centroids.
(d) Both (a) and (b)
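
The "far away" strategy described in one of the options can be sketched as a farthest-first selection: after a random first pick, each new centroid is the point whose distance to its nearest already-chosen centroid is largest. This is a minimal sketch on made-up 2-D points; the function names are illustrative, and k-means++ is a randomized variant of the same idea.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centroids(points, k, seed=0):
    """Pick k initial centroids: first at random, the rest far from those chosen."""
    rng = random.Random(seed)
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Score each point by its distance to the nearest chosen centroid,
        # then take the point with the largest score.
        next_centroid = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(next_centroid)
    return centroids


# Three well-separated toy groups of points.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]
centroids = farthest_first_centroids(points, k=3)
print(centroids)
```

On well-separated data like this, the three picks fall in three different groups whichever point is chosen first. A known weakness of the purely greedy version is that it tends to select outliers.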