
PACE UNIVERSITY Summer 2022 - CS 673: Scalable Databases Assignment #3 Total Points: 100 Using the data set which has cu

Posted: Fri Jul 01, 2022 5:37 am
by answerhappygod
PACE UNIVERSITY
Summer 2022 - CS 673: Scalable Databases
Assignment #3
Total Points: 100

Using the data set which has customers data, perform the following tasks.

Tasks (points for each in parentheses):

Count the total amount ordered by each customer in Scala using RDD. (20 points)
• Split each comma-delimited line into RDD
• Map each line to key/value pairs of cust_id and amount spent
• Use reduce by key to add up the amount for each customer
• Collect() the results and print

List the customer who spent the highest amount. (10 points)

List the Top 5 customers based on the amount they have spent, along with the customer ID and product ID. (10 points)

List the Bottom 5 customers based on the amount spent, along with the customer ID and product ID. (10 points)

Find out customers who spent an average amount of the total amount spent in data. (10 points)

Create a function to give rewards ($5) to all the customers who spent more than the average of the total amount. (10 points)

Sort the customers based on the amount spent (high to low). (10 points)

List the product IDs that the Top 5 customers have purchased. (10 points)

List the most sold product IDs. (10 points)

Submission Details:
• Submit the link of Databricks
• Late submission up to one week will incur 10% of total points earned.
• Attach this file with self-assessment
• Plagiarism will be checked; up to 15% similarity score is acceptable.

NOTE: Extra credit for good documentation - 5 Points.

Self-assessment

CS673 - Summer 2022-Kaleema
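
The RDD steps named in the first task (split each line, map to key/value pairs, reduce by key, collect) might be sketched roughly as below. This is only a sketch under stated assumptions, not the expected solution: the column order `cust_id,product_id,amount`, the inline sample rows, the reading of "average" as the mean of per-customer totals, and the names `CustomerTotals`, `parseLine` and `reward` are all hypothetical, since the post does not show the data set.

```scala
import org.apache.spark.sql.SparkSession

object CustomerTotals {

  // ASSUMPTION: each line is "cust_id,product_id,amount".
  // The real data set may order its columns differently.
  def parseLine(line: String): (Int, Double) = {
    val fields = line.split(",")
    (fields(0).trim.toInt, fields(2).trim.toDouble)
  }

  // Hypothetical reward rule: $5 if the customer's total exceeds the average.
  def reward(total: Double, average: Double): Double =
    if (total > average) 5.0 else 0.0

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CustomerTotals")
      .master("local[*]") // local run only; a cluster supplies its own master
      .getOrCreate()
    val sc = spark.sparkContext

    // Made-up sample rows standing in for sc.textFile(<path to the data set>).
    val lines = sc.parallelize(Seq(
      "44,8602,37.19",
      "35,5368,65.89",
      "44,3391,40.64",
      "2,6991,12.00"
    ))

    // Split each line, map to (cust_id, amount), then sum amounts per customer.
    val totals = lines.map(parseLine).reduceByKey(_ + _)

    // Collect the per-customer totals to the driver and print them.
    totals.collect().foreach { case (custId, total) =>
      println(f"customer $custId%d spent $total%.2f")
    }

    // Sort high to low; the first rows are then the top spenders.
    val sorted = totals.sortBy(_._2, ascending = false)
    sorted.take(5).foreach(println)

    // ASSUMPTION: "average" means the mean of the per-customer totals.
    val average = totals.values.mean()
    val rewards = totals.mapValues(total => reward(total, average))
    rewards.collect().foreach(println)

    spark.stop()
  }
}
```

In a Databricks notebook the predefined `spark` and `sc` values would normally be used instead of building a session, and the `spark.stop()` call would be left out. Keeping `parseLine` and `reward` as plain functions makes them easy to check without a cluster.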