Sonia is a manager at a health insurance company. She needs to identify her company's customers who may be most at risk
Sonia is a manager at a health insurance company. She needs to identify her company's customers who may be most at risk of developing coronary heart disease. Customers identified as at risk would then be invited to enroll in a pilot health management program to help them avoid heart disease through dietary and exercise initiatives. Sonia, however, would like to limit the number of participants in the pilot program in order to minimize cost. Her other objective is to enroll in the pilot program those customers who are most at risk of coronary heart disease.

Sonia has customer data containing 547 records and features (columns) such as gender, weight, and cholesterol. High weight and high cholesterol are generally associated with coronary heart disease. Sonia has conducted k-means cluster analysis on the customer data three times, with k = 3, k = 4, and k = 5. The results of the three versions of the cluster analysis are copied below for your reference.

K = 3 Analysis Results

Cluster 0: 191 items
Cluster 1: 185 items
Cluster 2: 171 items
Total number of items: 547

Attribute      cluster_0    cluster_1    cluster_2
Weight         110.461      141.995      182.263
Cholesterol    125.979      173.249      217.041
Gender         0.550        0.411        0.585

K = 4 Analysis Results

Cluster 0: 135 items
Cluster 1: 140 items
Cluster 2: 118 items
Cluster 3: 154 items
Total number of items: 547

Attribute      cluster_0    cluster_1    cluster_2    cluster_3
Weight         127.726      106.850      152.093      184.318
Cholesterol    154.385      119.536      185.907      218.916
Gender         0.459        0.543        0.441        0.591

K = 5 Analysis Results

Cluster 0: 107 items
Cluster 1: 101 items
Cluster 2: 104 items
Cluster 3: 110 items
Cluster 4: 125 items
Total number of items: 547

Attribute      cluster_0    cluster_1    cluster_2    cluster_3    cluster_4
Weight         139.028      160.525      120.260      104.618      187.440
Cholesterol    168.551      196.861      142.827      115.864      221.680
Gender         0.430        0.535        0.490        0.582        0.528

1. Which version of the analysis (i.e., k = 3, k = 4, or k = 5) is better in meeting Sonia's objectives?
2. Consider the version of the analysis you selected in Question 1 above. Which cluster of customers from the selected version of the analysis should be enrolled in the pilot program? Explain why you selected this particular cluster.
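One way to compare the three runs is to tabulate, for each k, the cluster whose centroid has the highest weight and cholesterol, together with its size. The Python sketch below is illustrative only. The dictionary layout and the helper names `highest_risk` and `overall_mean` are hypothetical, and the pairing of values to clusters for k = 4 and k = 5 is an assumption: it is the reading of the flattened results under which the size-weighted mean weight agrees across all three runs.

```python
# Each entry is (cluster size, mean weight, mean cholesterol), indexed by cluster number.
# Assumption: the k=4 and k=5 pairings are reconstructed from the flattened listing;
# they are the reading under which the size-weighted means agree across runs.
results = {
    3: [(191, 110.461, 125.979), (185, 141.995, 173.249), (171, 182.263, 217.041)],
    4: [(135, 127.726, 154.385), (140, 106.850, 119.536),
        (118, 152.093, 185.907), (154, 184.318, 218.916)],
    5: [(107, 139.028, 168.551), (101, 160.525, 196.861), (104, 120.260, 142.827),
        (110, 104.618, 115.864), (125, 187.440, 221.680)],
}


def highest_risk(clusters):
    """Return (index, size, weight, cholesterol) of the cluster with the highest
    mean weight, using mean cholesterol to break ties."""
    idx = max(range(len(clusters)), key=lambda i: (clusters[i][1], clusters[i][2]))
    size, weight, chol = clusters[idx]
    return idx, size, weight, chol


def overall_mean(clusters, col):
    """Size-weighted mean of column `col` (1 = weight, 2 = cholesterol)."""
    total = sum(c[0] for c in clusters)
    return sum(c[0] * c[col] for c in clusters) / total


summary = {k: highest_risk(clusters) for k, clusters in results.items()}
for k, (idx, size, weight, chol) in summary.items():
    print(f"k={k}: cluster {idx}, {size} customers, "
          f"weight {weight:.3f}, cholesterol {chol:.3f}")
```

Under these assumptions, the highest-centroid cluster is cluster 2 (171 customers) for k = 3, cluster 3 (154 customers) for k = 4, and cluster 4 (125 customers) for k = 5, so a larger k isolates a smaller group with higher average weight and cholesterol. Which trade-off best fits Sonia's objectives is still a judgment the answer has to justify.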