4 Fuel Usage It is particularly difficult to identify outliers in this dataset, due, for example, to the multiple curren
answerhappygod
- Site Admin
- Posts: 899604
- Joined: Mon Aug 02, 2021 8:13 am
4 Fuel Usage

It is particularly difficult to identify outliers in this dataset, due, for example, to the multiple currencies. As an example, refilling a vehicle in South Africa might cost R1000, but in the US it would be about $70. One would therefore have to perform outlier detection on each currency separately. (Alternatively, we could convert everything into a single currency, based on the time of the transaction, and use that; however, that is not required in this assignment.) We will focus on the top five currencies only (Rands will be one of them) to simplify things.

4.1 Outlier Removal

1. Identify the top 5 currencies by number of transactions. [2]

2. For each of the top 5 currencies separately, remove outliers by considering the total spend, litres, cost per litre, gallons, etc. Choose values you believe are reasonable and provide your reasoning. As an example of something you would want to look out for, there are some SA users that have their currency set to dollars. This will show a user refuelling with several hundred dollars, but only putting in tens of litres, which is clearly wrong. [10]

3. How many values have been removed after accounting for outliers? [1]
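The three steps above could be sketched in pandas roughly as follows. This is only a minimal illustration, not a model answer: the column names `currency`, `total_spend` and `litres`, the helper names, the toy data, and every threshold are assumptions, since the post does not show the dataset's schema. The real bounds should be chosen and justified per currency from the actual data.

```python
import pandas as pd

def top_currencies(df, n=5):
    """Return the n currency codes with the most transactions."""
    return df["currency"].value_counts().head(n).index.tolist()

def remove_outliers(df, price_bounds, litre_bounds=(1.0, 200.0)):
    """Keep rows whose litres and cost per litre are plausible for their currency.

    price_bounds maps a currency code to (min, max) cost per litre.
    Rows in currencies that are not listed in price_bounds are dropped.
    """
    cost_per_litre = df["total_spend"] / df["litres"]
    litres_ok = df["litres"].between(*litre_bounds)
    keep = pd.Series(False, index=df.index)
    for currency, (low, high) in price_bounds.items():
        in_currency = df["currency"] == currency
        keep |= in_currency & litres_ok & cost_per_litre.between(low, high)
    return df[keep]

# Toy data (hypothetical): one absurd ZAR total and one SA user whose
# currency is set to USD (hundreds of dollars for tens of litres).
demo = pd.DataFrame({
    "currency":    ["ZAR", "ZAR", "ZAR", "USD", "USD", "EUR"],
    "total_spend": [1000.0, 900.0, 50000.0, 70.0, 800.0, 60.0],
    "litres":      [50.0, 45.0, 40.0, 40.0, 45.0, 35.0],
})

top = top_currencies(demo, n=2)            # step 1 (n=5 on the real data)
subset = demo[demo["currency"].isin(top)]

# Illustrative cost-per-litre ranges only; these are assumed values.
bounds = {"ZAR": (5.0, 40.0), "USD": (0.3, 3.0)}
cleaned = remove_outliers(subset, bounds)  # step 2
removed = len(subset) - len(cleaned)       # step 3
```

Filtering on cost per litre catches the mislabelled-currency case directly, because a rand-sized total divided by the litres gives a per-litre price far outside any believable dollar range.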