Daily depletion reports from the field are received by a major food distributor in the form of gzip archives or CSV file

Post by answerhappygod

A major food distributor receives daily depletion reports from the field as gzip archives or CSV files uploaded to Amazon S3. The files range from 500 MB to 5 GB in size, and an EMR job processes them each day.

Recently, it has been observed that the file sizes vary and that the EMR job takes too long to run. With only this limited information, the distributor must tune and optimize the data processing workflow to improve the EMR job's performance.

Which recommendation should an administrator make?

A. Reduce the HDFS block size to increase the number of task processors.
B. Use bzip2 or Snappy rather than gzip for the archives.
C. Decompress the gzip archives and store the data as CSV files.
D. Use Avro rather than gzip for the archives.
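
One way to reason about the options is to estimate how many map tasks can read a single input file in parallel. The sketch below is only a rough model, not EMR's actual scheduling logic: it assumes a 128 MB split size (the real value depends on cluster and file system configuration), and the function name `estimated_map_tasks` is hypothetical. It relies on the fact that a plain gzip file is not splittable in Hadoop, so one task must read the whole file, whereas a splittable format such as bzip2 can be divided among several tasks.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3

# Assumption: 128 MB per input split. The effective split size on a real
# cluster depends on its configuration and may differ.
ASSUMED_SPLIT_SIZE = 128 * MB


def estimated_map_tasks(file_size_bytes: int,
                        splittable: bool,
                        split_size_bytes: int = ASSUMED_SPLIT_SIZE) -> int:
    """Roughly estimate how many map tasks can process one file in parallel.

    Hypothetical helper for illustration only.
    """
    if file_size_bytes <= 0:
        return 0
    if not splittable:
        # A non-splittable file (for example plain gzip) is read by one task.
        return 1
    return math.ceil(file_size_bytes / split_size_bytes)


if __name__ == "__main__":
    for label, size in (("500 MB", 500 * MB), ("5 GB", 5 * GB)):
        single = estimated_map_tasks(size, splittable=False)
        parallel = estimated_map_tasks(size, splittable=True)
        print(f"{label}: non-splittable={single} task(s), "
              f"splittable={parallel} task(s)")
```

Under these assumptions, a non-splittable 5 GB file is handled by a single task no matter how large the cluster is, while the same data in a splittable format could be spread over roughly 40 tasks. Changing the block or split size alone does not alter the non-splittable case.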