Question 19 5 pts During the semester, we used Jupyter notebooks to access and analyse large datasets from various sourc

Posted: Mon Jun 06, 2022 5:59 pm
by answerhappygod
Question 19 (5 pts)

During the semester, we used Jupyter notebooks to access and analyse large datasets from various sources. For example, the following are two approaches to create a bar chart of the number of COVID vaccinations per vaccine type in the year 2021:

Approach 1:

    import pandas as pd
    vaccinations = pd.read_csv("covid_vaccinations.csv")
    vaccinations21 = vaccinations[vaccinations['year'] == '2021']
    result = vaccinations21.groupby('vaccine').count()
    result.plot.bar()

Approach 2:

    import pandas as pd
    result = pd.read_sql("""SELECT vaccine, COUNT(*)
                            FROM CovidVaccinations
                            WHERE year = 2021
                            GROUP BY vaccine""", db_connection)
    result.plot.bar()

Both approaches use Python and Pandas, but they differ in where they access and analyse the data. Briefly explain in your own words the two approaches and their difference, and discuss how these two approaches differ if the vaccination dataset grows very large over time. Warning: Do not simply copy/paste definitions from the Internet - your answer must use your own thoughts and formulations.
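
To make the contrast concrete, here is a minimal, runnable sketch of both styles on a tiny made-up table. The sample rows, the in-memory SQLite database standing in for `db_connection`, and the integer `year` column are all assumptions for illustration; they are not part of the original question. Plotting is left out so the example needs only pandas and the standard library.

```python
import sqlite3
import pandas as pd

# Hypothetical sample rows standing in for the real vaccination dataset.
rows = [
    ("Pfizer", 2021), ("Pfizer", 2021), ("Moderna", 2021),
    ("AstraZeneca", 2020), ("Moderna", 2021), ("Pfizer", 2020),
]
df = pd.DataFrame(rows, columns=["vaccine", "year"])

# Approach 1 style: every row is already in Python memory,
# and pandas does the filtering and grouping.
in_memory = df[df["year"] == 2021].groupby("vaccine").size()

# Approach 2 style: the database filters and aggregates,
# and only the small summary table is sent back to pandas.
conn = sqlite3.connect(":memory:")  # assumed stand-in for db_connection
df.to_sql("CovidVaccinations", conn, index=False)
in_database = pd.read_sql(
    """SELECT vaccine, COUNT(*) AS n
       FROM CovidVaccinations
       WHERE year = 2021
       GROUP BY vaccine""",
    conn,
).set_index("vaccine")["n"]
conn.close()

print(in_memory)
print(in_database)
```

Both variables hold the same per-vaccine counts. The difference is how much data crosses into the notebook: the first style has to load every row before filtering, so its memory use grows with the dataset, while the second transfers only one row per vaccine type no matter how large the table becomes.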