I need help with parts b, c and d in the question below. My answer to part (a) is attached in the bottom figure below th



Post by answerhappygod »

I need help with parts b, c and d in the question below. My
answer to part (a) is attached in the bottom figure below the
question. Thanks
In the following table, we have 8 instances with 3 attributes (Suburb, Area, New) and a class label. Each row shows one instance. (N.B. Calculations up to two decimal places.)

#   Suburb   Area     New   Class
1   S1       Large    N     1
2   S2       Large    N     1
3   S3       Large    Y     1
4   S4       Large    Y     2
5   S5       Medium   Y     2
6   S6       Large    Y     3
7   S4       Large    Y     3
8   S7       Small    N     3

(a) Calculate the information gain and gain ratio of the "New" feature on the dataset. [7 marks] (N.B. Use log₂ to compute the results of each step to get full marks.)
(b) Does a decision tree exist which can perfectly classify the given instances? If yes, draw that decision tree; otherwise, explain why not by referring to the data. [2 marks]
(c) If we use "Area" to build a decision stump, what is the predicted label of the decision stump for each of the 8 instances in the data set? [4 marks]
(d) If we use "Suburb" to build a decision stump, what would you expect to see for the accuracy of the decision stump given an evaluation dataset that you have not seen before? Explain why the stump has good/bad accuracy. [2 marks]
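For parts (b) and (c), a minimal Python sketch may help when checking the table by hand. It assumes the rows are transcribed exactly as in the table above, and that a decision stump predicts the majority class of the training rows that share the chosen attribute value. The names `ROWS`, `conflicting_groups` and `stump_predictions` are made up for this illustration and do not come from the question.

```python
from collections import Counter, defaultdict

# Rows copied from the question's table: (suburb, area, new, class_label).
ROWS = [
    ("S1", "Large", "N", 1),
    ("S2", "Large", "N", 1),
    ("S3", "Large", "Y", 1),
    ("S4", "Large", "Y", 2),
    ("S5", "Medium", "Y", 2),
    ("S6", "Large", "Y", 3),
    ("S4", "Large", "Y", 3),
    ("S7", "Small", "N", 3),
]


def conflicting_groups(rows):
    """Return attribute tuples that occur with more than one class label."""
    labels = defaultdict(set)
    for *attrs, label in rows:
        labels[tuple(attrs)].add(label)
    return {attrs: sorted(seen) for attrs, seen in labels.items() if len(seen) > 1}


def stump_predictions(rows, attr_index):
    """Predict, for every row, the majority class of rows sharing its attribute value.

    Assumption: ties (none occur for Area here) would go to the class seen first.
    """
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[attr_index]][row[-1]] += 1
    majority = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    return [majority[row[attr_index]] for row in rows]


# Part (b): any attribute tuple listed here carries two different labels,
# so no tree built on these three attributes can separate those rows.
print(conflicting_groups(ROWS))

# Part (c): stump on Area (column index 1), one prediction per instance.
print(stump_predictions(ROWS, attr_index=1))
```

Calling `stump_predictions(ROWS, attr_index=0)` builds the Suburb stump for part (d). Almost every suburb value appears only once in the table, so that stump mostly memorises single rows, which is worth keeping in mind when reasoning about unseen data.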

Q6 (a)

H(Root) = -(3/8 log₂ 3/8 + 2/8 log₂ 2/8 + 3/8 log₂ 3/8) = 1.56
H(N) = -(2/3 log₂ 2/3 + 1/3 log₂ 1/3 + 0) = 0.918
H(Y) = -(1/5 log₂ 1/5 + 2/5 log₂ 2/5 + 2/5 log₂ 2/5) = 1.52
Mean info(New) = P(Y) H(Y) + P(N) H(N) = 5/8 × 1.52 + 3/8 × 0.918 = 1.29
Information gain = H(Root) - Mean info(New) = 1.56 - 1.29 = 0.27
Split info (SI) = -(P(N) log₂ P(N) + P(Y) log₂ P(Y)) = -(3/8 log₂ 3/8 + 5/8 log₂ 5/8) = 0.95
Gain ratio = Information gain / SI = 0.27 / 0.95 = 0.28
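To double-check the part (a) arithmetic, here is a minimal Python sketch that recomputes the entropy, information gain and gain ratio for "New" from the table. It assumes the usual definitions (gain = root entropy minus the weighted entropy of the split, gain ratio = gain divided by split info). The names `DATA`, `entropy` and `gain_and_ratio` are illustrative only. It prints unrounded values, so a hand calculation that rounds at each step may differ in the last digit.

```python
from collections import Counter
from math import log2

# (new, class_label) pairs copied from the question's table, rows 1 to 8.
DATA = [("N", 1), ("N", 1), ("Y", 1), ("Y", 2),
        ("Y", 2), ("Y", 3), ("Y", 3), ("N", 3)]


def entropy(items):
    """Shannon entropy (base 2) of the value distribution in items."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_and_ratio(pairs):
    """Return (information gain, gain ratio) for the attribute in pairs."""
    labels = [label for _, label in pairs]
    values = [value for value, _ in pairs]
    # Weighted average entropy of the class labels within each attribute value.
    mean_info = 0.0
    for value in set(values):
        subset = [label for v, label in pairs if v == value]
        mean_info += len(subset) / len(pairs) * entropy(subset)
    gain = entropy(labels) - mean_info
    split_info = entropy(values)  # entropy of the attribute's own distribution
    return gain, gain / split_info


gain, ratio = gain_and_ratio(DATA)
print("H(Root)    =", entropy([label for _, label in DATA]))
print("Info gain  =", gain)
print("Gain ratio =", ratio)
```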