You are a member of a data science team that wants to analyse the Opal Card transport data. This dataset contains million

answerhappygod

Post by answerhappygod »

You are a member of a data science team that wants to analyse the Opal Card transport data. This dataset contains millions of tap-on / tap-off events recorded when transport users swiped on and off trains, buses and ferries. The format is:

CardEvents ( id, day, card, mode, time_tap_on, time_tap_off )

Your colleague suggests storing this data partitioned over multiple computers so that the processing can be parallelised. In particular, he suggests using horizontal hash partitioning on the card attribute.

Your task in the project is to summarise the weekly usage data per transport mode (train, bus etc.). How well does your colleague's suggestion to use hash partitioning for the data help with your task?
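To make the scenario concrete, here is a minimal Python sketch of the setup described in the question. It is an illustration under stated assumptions, not a reference answer: the partition count, the CRC32 hash, the sample rows, and the convention that day is an integer day index (so the week is day // 7) are all hypothetical. The point it shows is that the partitioning key (card) differs from the grouping key (week, mode), so each partition can hold rows for every week and every mode; each partition can only produce partial counts, which then have to be merged.

```python
import zlib
from collections import Counter, defaultdict

NUM_PARTITIONS = 4  # hypothetical number of machines


def partition_of(card: str) -> int:
    """Hash-partition on the card attribute (CRC32 is an assumed hash choice)."""
    return zlib.crc32(card.encode("utf-8")) % NUM_PARTITIONS


def hash_partition(events):
    """Horizontally split rows so that all rows of one card share a partition."""
    partitions = defaultdict(list)
    for event in events:
        partitions[partition_of(event["card"])].append(event)
    return partitions


def local_weekly_counts(events):
    """Partial aggregate computed independently on one partition."""
    counts = Counter()
    for event in events:
        week = event["day"] // 7  # assumption: day is an integer day index
        counts[(week, event["mode"])] += 1
    return counts


def weekly_usage_per_mode(partitions):
    """Merge step: partial counts from every partition must be combined."""
    total = Counter()
    for part in partitions.values():
        total.update(local_weekly_counts(part))
    return total


# Hypothetical sample rows; the tap-on/tap-off times are omitted for brevity.
events = [
    {"id": 1, "day": 0, "card": "A", "mode": "train"},
    {"id": 2, "day": 1, "card": "B", "mode": "train"},
    {"id": 3, "day": 3, "card": "A", "mode": "bus"},
    {"id": 4, "day": 8, "card": "C", "mode": "train"},
    {"id": 5, "day": 9, "card": "B", "mode": "bus"},
    {"id": 6, "day": 6, "card": "C", "mode": "ferry"},
]

partitions = hash_partition(events)
weekly_usage = weekly_usage_per_mode(partitions)
```

In this sketch the local counting step runs in parallel on every partition, but no partition can answer the per-mode question on its own, because hashing on card spreads each mode and each week across all machines. The final merge over the small partial results is what produces the weekly summary.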