You have a 928 MB file stored on HDFS as part of a Hadoop 2.x distribution. A data analytics program uses this file and

Post by answerhappygod »

You have a 928 MB file stored on HDFS as part of a Hadoop 2.x distribution. A data analytics program uses this file and runs in parallel across the cluster nodes. [6 marks]

a. The default block size and replication factor are used in the configuration. How many total blocks, including replicas, will be stored in the cluster? What are the unique HDFS block sizes you will find for this specific file?

b. The cluster has 64 cores to speed up the processing. If at best 60% of the program's code can be parallelized to exploit the multiple cores and the rest is sequential, what is the theoretical limit on the speed-up you can expect with 64 cores compared to a sequential version of the same program running on one core with the same file? How will this limit change if you double the compute power to 128 cores? You can simplify the system by assuming that cluster nodes and cores mean the same thing, and you can ignore overheads such as communication that depend on the specific cluster configuration, scheduling, etc.

c. Suppose you could use a more scalable algorithm with 80% parallelism and a larger file as you move to a 128-core system. What would be the theoretical speed-up limit for 128 cores?
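The arithmetic behind the three parts can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not an official solution: it takes the Hadoop 2.x defaults to be a 128 MB block size (`dfs.blocksize`) and a replication factor of 3 (`dfs.replication`), it reads "MB" as the same unit HDFS uses for block sizes, and it assumes parts b and c are meant to be answered with Amdahl's law (fixed problem size). Because part c mentions a larger file, Gustafson's law (scaled problem size) is shown as an alternative reading. The function names are made up for illustration.

```python
def hdfs_block_layout(file_mb, block_mb=128, replication=3):
    """Return (unique blocks, total blocks with replicas, distinct block sizes in MB).

    Assumes Hadoop 2.x defaults: 128 MB blocks, replication factor 3.
    HDFS does not pad the last block, so it only holds the remainder.
    """
    full_blocks, remainder = divmod(file_mb, block_mb)
    unique_blocks = full_blocks + (1 if remainder else 0)
    sizes = ([block_mb] if full_blocks else []) + ([remainder] if remainder else [])
    return unique_blocks, unique_blocks * replication, sizes


def amdahl_speedup(p, n):
    """Amdahl's law: speed-up for parallel fraction p on n cores, fixed problem size."""
    return 1.0 / ((1.0 - p) + p / n)


def gustafson_speedup(p, n):
    """Gustafson's law: scaled speed-up when the problem grows with the core count."""
    return (1.0 - p) + p * n


# Part a: block count and block sizes for the 928 MB file.
print(hdfs_block_layout(928))

# Part b: 60% parallel code on 64 and then 128 cores (Amdahl).
print(f"{amdahl_speedup(0.6, 64):.2f}", f"{amdahl_speedup(0.6, 128):.2f}")

# Part c: 80% parallel code on 128 cores, under both readings.
print(f"{amdahl_speedup(0.8, 128):.2f}", f"{gustafson_speedup(0.8, 128):.2f}")
```

Under these assumptions the file occupies 8 unique blocks (seven of 128 MB and one of 32 MB), or 24 blocks with replicas. The Amdahl limit for part b is about 2.44 on 64 cores and about 2.47 on 128 cores, so doubling the cores barely helps, since the bound as the core count grows is 1 / 0.4 = 2.5. For part c, Amdahl gives about 4.85, while the scaled-workload reading (Gustafson) gives 102.6.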