Q2. Perceptrons Instead of using Naive Bayes, you decide to try applying Perceptron to the interrogation data. You gener


Post by answerhappygod »

Q2. Perceptrons

Instead of using Naive Bayes, you decide to try applying Perceptron to the interrogation data. You generate features from the training data as follows:

• φ1(x) = n, where "not" appears n times.
• φ2(x) = m, where "swear" appears m times.
• φ3(x) = 1, a bias term.

We use the labels +1 for G and -1 for I. Given a weight vector w = (w1, w2, w3), our classifier returns +1 if w1 φ1(x) + w2 φ2(x) + w3 >= 0 and -1 otherwise.

Our training set from part 1 yields the following features and labels:

Training Statements                    φ1   φ2   φ3   Label
I am definitely innocent, officer      0    0    1    +1
Officer, I swear I am not lying        1    1    1    +1
I am not lying, I swear                1    1    1    +1
I am innocent, officer, I swear        0    1    1    -1
Officer, I am definitely not lying     1    0    1    -1

(a) [2 pts] Compute the first two updates of the Perceptron algorithm and fill in the following table, using the given initial Perceptron weights w = (w1, w2, w3) and data points (φ1, φ2, φ3, Label).

                           w1    w2    w3
Initial                    1     2     -0.5
Observing (0, 0, 1, +1)
Observing (1, 0, 1, -1)

(b) [2 pts] What convergence guarantees can you give for the Perceptron algorithm applied to this data set?

(c) [2 pts] Linear classifiers are often insufficient to represent a dataset using a given set of features. However, it is often possible to find new features using nonlinear functions of our existing features which do allow linear classifiers to separate the data. Nonlinear features result in more expressive linear classifiers. For example, consider the following data set, where +'s represent positive examples and o's represent negative examples.

[Figure: + at x = -1, o at x = 0, + at x = +1]

No linear classifier can separate the positive examples (-1, 0) and (1, 0) from the negative example (0, 0). Rather than using a single feature, if we perform a nonlinear mapping φ(x1, x2) = (x1^2, 1), the positive examples are both mapped to (1, 1) and the negative example is mapped to (0, 1), and we see the data can be separated by a linear classifier. One example is the line w = [1, -0.5], i.e. the classifier w^T φ(x) = x1^2 - 0.5 >= 0.

[Figure: the mapped data, with + at (1, 1) and o at (0, 1)]

For what values of the weight vector w = (w1, w2) does the classifier w^T φ(x) >= 0 separate the given data?

(d) [3 pts] Which of the following feature sets allows linear classifier w = (w1, w2, w3) to separate the original interrogation data set? Justify your answer briefly.
(i) [1 pt] φ' = (φ1 + φ2, φ1 - φ2, 1)
(ii) [1 pt] φ' = (010201)
(iii) [1 pt] φ' = ((φ1 xor φ2), φ2, 1), where a xor b is 1 if either a = 1 or b = 1 but not both.

(e) [2 pts] Given the features φ(x) = [x^2, x, 1], how many data points are we guaranteed to be able to separate with zero error using a linear classifier w^T φ(x) = w1 x^2 + w2 x + w3? Assume that a data point x cannot have conflicting labels. Justify your answer briefly.

(f) [2 pts] In general, if we use features φ(x) = [x^(N-1), x^(N-2), ..., 1], i.e. an (N-1)th order polynomial, how many points can we separate with zero error using a linear classifier w = [w1, ..., wN]? Justify your answer briefly.

(g) [2 pts] Assume we have N labeled training data points, which we would like to use for classification, i.e. to predict the labels of unseen test data points. What are the disadvantages of using an Nth order polynomial to fit this data?
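
For part (a), the mechanics of a single Perceptron step can be sketched in a few lines of Python. This is only an illustration, not an official solution: the helper names `predict` and `perceptron_step` are made up here, and the initial weights are taken to be w = (1, 2, -0.5), which should be treated as an assumption. The rule used is the standard one: predict with the sign of the weighted sum (ties go to +1, as stated in the question), and on a mistake add label times the feature vector to w.

```python
def predict(w, f):
    """Return +1 if the dot product w . f is >= 0, otherwise -1."""
    score = sum(wi * fi for wi, fi in zip(w, f))
    return 1 if score >= 0 else -1


def perceptron_step(w, f, label):
    """One Perceptron step: keep w if the prediction is right,
    otherwise return w + label * f."""
    if predict(w, f) == label:
        return list(w)
    return [wi + label * fi for wi, fi in zip(w, f)]


# Assumed initial weights (w1, w2, w3); treat these values as an assumption.
w_initial = [1, 2, -0.5]

# The two observations named in part (a), as (features, label).
w_after_first = perceptron_step(w_initial, (0, 0, 1), +1)
w_after_second = perceptron_step(w_after_first, (1, 0, 1), -1)

print(w_after_first)
print(w_after_second)
```

Under these assumed initial weights both observations are misclassified, so both steps change w. With different starting weights one or both steps could leave w unchanged.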