- You Are Member Of A Data Science Team Which Wants To Analyse The Opal Card Transport Data This Dataset Contains Million
- Site Admin
You are a member of a data science team which wants to analyse the Opal Card transport data. This dataset contains millions of tap-on / tap-off events where transport users were swiping on and off from trains, buses and ferries. The format is:

CardEvents ( id, day, card, mode, time_tap_on, time_tap_off )

Your colleague suggests storing this data partitioned over multiple computers so that the processing can be parallelised. In particular, he suggests using horizontal hash partitioning on the card attribute.

Your task in the project is to summarise the weekly usage data per transport mode (train, bus etc.). How well does your colleague's suggestion to use hash partitioning help with your task?
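To make the setup concrete, here is a minimal Python sketch of the scenario, not a model answer. It hash-partitions a handful of made-up events on card, counts trips per (week, mode) inside each partition, and then merges the partial counts. Everything beyond the CardEvents attributes is an assumption for illustration: the function names, the sample rows, the use of CRC32 as the hash function, and the idea that day is an integer day number so that day // 7 gives the week.

```python
import zlib
from collections import Counter


def partition_of(card: str, n_partitions: int) -> int:
    """Map a card id to a partition index using a deterministic hash (CRC32, an assumption)."""
    return zlib.crc32(card.encode("utf-8")) % n_partitions


def hash_partition(events, n_partitions):
    """Horizontal hash partitioning on the card attribute: whole rows go to one partition."""
    partitions = [[] for _ in range(n_partitions)]
    for event in events:
        partitions[partition_of(event["card"], n_partitions)].append(event)
    return partitions


def local_weekly_counts(partition):
    """Count events per (week, mode) using only the rows stored in one partition."""
    counts = Counter()
    for event in partition:
        week = event["day"] // 7  # assumes day is an integer day number
        counts[(week, event["mode"])] += 1
    return counts


def merge_counts(partials):
    """Combine the per-partition partial counts into the global weekly summary."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts for matching keys
    return total


# Made-up sample rows (tap times omitted because the summary does not need them).
events = [
    {"id": 1, "day": 0, "card": "c1", "mode": "train"},
    {"id": 2, "day": 1, "card": "c2", "mode": "bus"},
    {"id": 3, "day": 2, "card": "c1", "mode": "train"},
    {"id": 4, "day": 6, "card": "c3", "mode": "ferry"},
    {"id": 5, "day": 7, "card": "c2", "mode": "bus"},
    {"id": 6, "day": 8, "card": "c1", "mode": "train"},
]

partitions = hash_partition(events, n_partitions=3)
weekly = merge_counts(local_weekly_counts(p) for p in partitions)

for (week, mode), trips in sorted(weekly.items()):
    print(f"week {week}, {mode}: {trips} trips")
```

In this sketch all events of one card land in the same partition, but events of one mode or one week are spread across partitions. Each partition can therefore only produce partial counts, and the weekly per-mode summary needs the final merge step across all partitions.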