Question 19 5 pts

During the semester, we used Jupyter notebooks to access and analyse large datasets from various sources. For example, the following are two approaches to create a bar chart of the number of COVID vaccinations per vaccine type in the year 2021:

Approach 1:

    import pandas as pd
    vaccinations = pd.read_csv("covid_vaccinations.csv")
    vaccinations21 = vaccinations[vaccinations['year'] == '2021']
    result = vaccinations21.groupby('vaccine').count()
    result.plot.bar()

Approach 2:

    import pandas as pd
    result = pd.read_sql("""SELECT vaccine, COUNT(*)
                            FROM CovidVaccinations
                            WHERE year = 2021
                            GROUP BY vaccine""", db_connection)
    result.plot.bar()

Both approaches use Python and Pandas, but they differ in where they access and analyse the data. Briefly explain in your own words the two approaches and their difference, and discuss how these two approaches differ if the vaccination dataset grows very large over time. Warning: Do not simply copy/paste definitions from the Internet - your answer must use your own thoughts and formulations.
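
For anyone who wants to try the two styles side by side, here is a minimal sketch. It is not part of the question: the sample rows, the in-memory SQLite database standing in for db_connection, and the column alias n are all assumptions made for illustration, and the year column is assumed to hold integers. Plotting is left out so the sketch only needs pandas and the standard library.

```python
import sqlite3
import pandas as pd

# Hypothetical sample rows standing in for the vaccination dataset.
rows = [
    ("Pfizer", 2021), ("Pfizer", 2021), ("Moderna", 2021),
    ("AstraZeneca", 2020), ("Moderna", 2022),
]

# In-memory SQLite database, used here as a stand-in for db_connection.
db_connection = sqlite3.connect(":memory:")
db_connection.execute(
    "CREATE TABLE CovidVaccinations (vaccine TEXT, year INTEGER)"
)
db_connection.executemany("INSERT INTO CovidVaccinations VALUES (?, ?)", rows)
db_connection.commit()

# Approach 1 style: every row is loaded into a DataFrame first,
# then filtered and counted inside the Python process.
vaccinations = pd.DataFrame(rows, columns=["vaccine", "year"])
client_side = (
    vaccinations[vaccinations["year"] == 2021].groupby("vaccine").size()
)

# Approach 2 style: the database filters and counts,
# and only the small summary table is returned to pandas.
server_side = pd.read_sql(
    """SELECT vaccine, COUNT(*) AS n
       FROM CovidVaccinations
       WHERE year = 2021
       GROUP BY vaccine""",
    db_connection,
).set_index("vaccine")["n"]

print(client_side.to_dict())
print(server_side.to_dict())
```

Both variables should hold the same counts per vaccine; what differs is how many rows had to travel into the notebook to get there.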