Question 6 (3 marks) In this question we use the same file hightemp.txt as in Question 1 of the examination paper. A fil


Post by answerhappygod »

Question 6 (3 marks)

In this question we use the same file hightemp.txt as in Question 1 of the examination paper.

The file hightemp.txt contains information about the highest temperatures recorded every day in a number of cities all over the world. It is a text file in which the highest temperature recorded on a given day in a given city is stored in a single row. Data items (date, temperature, and city name) are separated by a single blank. The file has been uploaded to HDFS at the location /bigdata/hightemp.

(1) Load the contents of the file hightemp.txt located in HDFS into a Resilient Distributed Dataset (RDD) and use the RDD to find the average temperature in Sydney in 2020. (1 mark)
(2) Load the contents of the file hightemp.txt located in HDFS into a Dataset and use the Dataset to find the total number of temperature measurements per city. (1 mark)
(3) Load the contents of the file hightemp.txt located in HDFS into a DataFrame and use SQL to find the average temperature per city, and per year and city. (1 mark)

01-JAN-1991 25 Sydney
01-JAN-1991 30 Brisbane
01-JAN-1991 32 Singapore
02-JAN-1991 25 Sydney
02-JAN-1991 31 Brisbane
02-JAN-1991 35 Singapore
05-JUN-2022 15 Sydney
05-JUN-2022 20 Brisbane
05-JUN-2022 25 Singapore
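
Not a Spark solution, but a minimal plain-Python sketch of the logic the three parts ask for, so the parsing and aggregation steps can be checked locally before porting them to an RDD, Dataset, or SQL query. The function names (parse_line, average_for, count_per_city, average_by) are made up for illustration, the sample rows are my reading of the sample in the question, and the date format DD-MON-YYYY is assumed from those rows. The sample has no 2020 rows, so the example queries 1991 instead.

```python
from collections import defaultdict

# Rows as they appear to read in the question's sample (assumed reconstruction).
SAMPLE_LINES = [
    "01-JAN-1991 25 Sydney",
    "01-JAN-1991 30 Brisbane",
    "01-JAN-1991 32 Singapore",
    "02-JAN-1991 25 Sydney",
    "02-JAN-1991 31 Brisbane",
    "02-JAN-1991 35 Singapore",
    "05-JUN-2022 15 Sydney",
    "05-JUN-2022 20 Brisbane",
    "05-JUN-2022 25 Singapore",
]


def parse_line(line):
    """Turn 'DD-MON-YYYY temp city' into a (year, city, temperature) tuple."""
    # Split on the first two blanks only, so a city name containing
    # spaces (an assumption; the sample has none) stays in one piece.
    date_str, temp, city = line.strip().split(" ", 2)
    year = int(date_str.rsplit("-", 1)[1])
    return year, city, float(temp)


def average_for(records, city, year):
    """Part (1): average temperature for one city in one year, or None."""
    temps = [t for y, c, t in records if c == city and y == year]
    return sum(temps) / len(temps) if temps else None


def count_per_city(records):
    """Part (2): number of measurements per city."""
    counts = defaultdict(int)
    for _, city, _ in records:
        counts[city] += 1
    return dict(counts)


def average_by(records, key_fn):
    """Part (3): average temperature grouped by an arbitrary key."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum, count]
    for rec in records:
        acc = totals[key_fn(rec)]
        acc[0] += rec[2]
        acc[1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


records = [parse_line(line) for line in SAMPLE_LINES]

print(average_for(records, "Sydney", 1991))            # one city, one year
print(count_per_city(records))                         # measurements per city
print(average_by(records, lambda r: r[1]))             # per city
print(average_by(records, lambda r: (r[0], r[1])))     # per (year, city)
```

The sum-and-count accumulator in average_by is the same shape one would typically carry through a reduce-by-key step on an RDD, while the two groupings in part (3) correspond to grouping by city and by (year, city) in SQL.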