Now that you have worked with the dataset, I assume you are comfortable reading data frames using Pandas. A real or randomly generated dataset is analyzed in the same manner, e.g., cleaning the dataset and drawing frequency/distribution plots. But if the data is random with very high noise values, you would most probably find out in this step, as it would not follow any distribution pattern and the frequencies would be evenly distributed.
(NOTE: Noise values are measurement errors or readings caused by humans or another external factor. Please do not confuse noise with the garbage values or empty cells that we cleaned as a first step.)
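To see what "evenly distributed" frequencies look like in practice, here is a minimal sketch. It assumes two made-up columns (`uniform_noise` and `normal_like` are hypothetical names, not part of the assignment data) and simply counts how many values fall into each of ten equal-width bins.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only: one purely random column and
# one column that follows a bell-shaped pattern.
rng = np.random.default_rng(0)
samples = pd.DataFrame({
    "uniform_noise": rng.uniform(0, 100, size=1000),
    "normal_like": rng.normal(50, 10, size=1000).clip(0, 100),
})

# Ten equal-width bins covering 0..100.
bins = np.linspace(0, 100, 11)
labels = [f"{int(lo)}-{int(hi)}" for lo, hi in zip(bins[:-1], bins[1:])]

# Frequency of values per bin for each column.
freq = pd.DataFrame(
    {col: np.histogram(samples[col], bins=bins)[0] for col in samples.columns},
    index=labels,
)
print(freq)


def peak_to_mean(counts):
    """Ratio of the fullest bin to the average bin; close to 1 means flat."""
    return counts.max() / counts.mean()


print(freq.apply(peak_to_mean))
```

The random column should show roughly the same count in every bin, while the bell-shaped column piles up in the middle bins. This ratio is only a rough indicator, not a formal test.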
Suppose you are analyzing the performance of an ingress firewall, and as an analyst you are asked to find out whether the firewall's performance can be enhanced, or its latency removed or reduced.
As an analyst, you would require information about three parameters: the total number of rules in the firewall; the number of times each rule has been hit/triggered (network packets contain header info, and the task of a firewall is to match that header info against its rules and decide whether the packets should be allowed or blocked, so each allow/deny request can be noted down); and finally the average time against each rule.
Hence your mission in this assignment is to populate these three columns with 1000 random values each. You could use Pandas to randomly populate the columns, or you could use Excel or another method. In the end, I want you to take a screenshot of the randomly generated data, as you did in the previous assignment, and upload it. Just populate and plot; no further analysis is required at this point.
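One possible way to do this with Pandas and NumPy is sketched below. The column names (`rule_id`, `hit_count`, `avg_time_ms`), the value ranges, and the helper names are all assumptions made for illustration; the assignment does not prescribe them, so adjust them to whatever fits your firewall scenario.

```python
import numpy as np
import pandas as pd


def make_firewall_data(n_rows=1000, seed=42):
    """Build a data frame with three randomly populated columns.

    Column names and ranges are assumed, not given by the assignment.
    """
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        # Assumed: rule numbers between 1 and 500.
        "rule_id": rng.integers(1, 501, size=n_rows),
        # Assumed: each rule was hit between 0 and 9999 times.
        "hit_count": rng.integers(0, 10_000, size=n_rows),
        # Assumed: average matching time in milliseconds.
        "avg_time_ms": rng.uniform(0.1, 50.0, size=n_rows).round(3),
    })


def plot_columns(frame):
    """Draw one histogram per column (requires matplotlib)."""
    import matplotlib.pyplot as plt

    frame.hist(bins=20, figsize=(10, 4))
    plt.tight_layout()
    plt.show()


df = make_firewall_data()
print(df.head())
print(df.describe())
```

Calling `plot_columns(df)` afterwards draws the frequency plots, and `df.to_csv("firewall_random.csv", index=False)` (the file name is just an example) saves the data if you want to open it in Excel for the screenshot.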