Question 2 (20 marks) 2 To develop an algorithm which can identify whether a banknote is genuine or fake, data were extr

answerhappygod
Site Admin
Posts: 899603
Joined: Mon Aug 02, 2021 8:13 am


Post by answerhappygod »

Question 2 (20 marks)

To develop an algorithm which can identify whether a banknote is genuine or fake, data were extracted from images that were taken from genuine (G) and fake (F) banknotes. An image processing tool was then used to extract the following variables:

Variable    Description
variance    Describes how each pixel varies from the neighbouring pixels
skewness    A measure of the lack of symmetry
entropy     Amount of information which must be coded for by a compression algorithm
class       G = genuine, F = fake

An analyst would like to use K-Means clustering to study the characteristics of the notes.

(a) What is the value of k to be used for K-Means clustering? Briefly explain. (2 marks)

(b) A parallel coordinates plot of the data is given below. Comment on any attributes which can help characterize the clusters. (2 marks)

[Parallel coordinates plot with axes variance, skewness, entropy and class; legend: F, G]

(c) Explain why normalization is important in K-Means clustering. (2 marks)

(d) A small subset of the data is given below. Use min-max normalization to fill in the blanks (A), (B), (C) and (D) below. (4 marks)

Original data:

ID   Variance   Skewness   Class
1     1.635      3.286     G
2     3.23       7.838     G
3     3.912      2.974     G
4     3.78      -3.311     G
5    -1.6       -9.583     F
6    -3.59      -6.572     F
7    -0.878      3.257     F

Min-max normalisation:

ID   Variance   Skewness   Class
1    (B)        0.739      G
2    0.909      1          G
3    1          0.721      G
4    0.982      0.360      G
5    0.262      (D)        F
6    0          0.173      F
7    0.361      0.737      F

Workings:

(i)  (B) = (1.635 - (-3.59)) / (A)
(ii) (D) = (-9.583 - (C)) / 17.421

Answers: (A) ___  (B) ___  (C) ___  (D) ___
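
The min-max step in part (d) can be sketched in a few lines of Python. This is only an illustrative sketch, not a model answer: the helper name `min_max` is hypothetical, and it assumes the usual definition (x - min) / (max - min), applied column by column to the seven rows of the table above.

```python
# Raw values copied from the table in part (d), in ID order 1..7.
variance = [1.635, 3.23, 3.912, 3.78, -1.6, -3.59, -0.878]
skewness = [3.286, 7.838, 2.974, -3.311, -9.583, -6.572, 3.257]


def min_max(values):
    """Rescale a column to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


norm_variance = min_max(variance)
norm_skewness = min_max(skewness)

# Print the normalised table, rounded to three decimals.
for note_id, (v, s) in enumerate(zip(norm_variance, norm_skewness), start=1):
    print(f"{note_id}  {v:.3f}  {s:.3f}")
```

The denominators are the column ranges, which is where the 17.421 in working (ii) comes from (7.838 - (-9.583)). With the raw values as printed, the variance for ID 5 comes out slightly above the 0.262 shown in the table, possibly because the raw value in the table is rounded.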

Statistics and Analytics for Engineers MS2215/MS4215/MS6215

(e) A scatterplot of the data in part (d) above with two clusters is given below. The centroid of cluster F is (0.208, 0.303). Write down the cluster centroid for cluster G. Show your workings clearly. (Hint: Refer to the table in part (d) above.) (4 marks)

[Scatterplot of the normalised data from part (d), showing the two clusters F and G]

(f) Suppose a new note has the measurements variance = 2.20 and skewness = 6.00 (before standardization). Compute the Euclidean distance of the new note from each of the centroids, F and G. Which cluster is the new note likely to belong to? Explain and show your workings clearly. (6 marks)
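
Parts (e) and (f) can be checked with a short Python sketch. It rests on several assumptions that are not stated in the question: that a cluster centroid is the mean of its members' normalised coordinates, that cluster G consists of the notes labelled G (IDs 1 to 4), and that the new note is rescaled with the same column minima and maxima as the table in part (d). The helper names `scale` and `centroid` are hypothetical.

```python
import math

# Raw values and class labels from the table in part (d), in ID order 1..7.
variance = [1.635, 3.23, 3.912, 3.78, -1.6, -3.59, -0.878]
skewness = [3.286, 7.838, 2.974, -3.311, -9.583, -6.572, 3.257]
labels = ["G", "G", "G", "G", "F", "F", "F"]


def scale(x, column):
    """Min-max scale x using the min and max of the given column."""
    lo, hi = min(column), max(column)
    return (x - lo) / (hi - lo)


def centroid(points):
    """Mean of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


points = [(scale(v, variance), scale(s, skewness))
          for v, s in zip(variance, skewness)]

# Assumption: cluster G is made up of the notes labelled G.
centroid_g = centroid([p for p, c in zip(points, labels) if c == "G"])
centroid_f = (0.208, 0.303)  # given in part (e)

# New note from part (f), rescaled with the same column ranges.
new_note = (scale(2.20, variance), scale(6.00, skewness))

dist_f = math.dist(new_note, centroid_f)
dist_g = math.dist(new_note, centroid_g)
nearest = "G" if dist_g < dist_f else "F"

print(f"centroid G = ({centroid_g[0]:.3f}, {centroid_g[1]:.3f})")
print(f"distance to F = {dist_f:.3f}, distance to G = {dist_g:.3f}")
print(f"nearest centroid: {nearest}")
```

Rescaling the new note before measuring distance matters because the centroids are expressed in normalised units; comparing raw measurements against them would mix two different scales.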