A Spark application has a 128 GB DataFrame A and a 1 GB DataFrame B. If a broadcast join were to be performed on these t

Business, Finance, Economics, Accounting, Operations Management, Computer Science, Electrical Engineering, Mechanical Engineering, Civil Engineering, Chemical Engineering, Algebra, Precalculus, Statistics and Probabilty, Advanced Math, Physics, Chemistry, Biology, Nursing, Psychology, Certifications, Tests, Prep, and more.
Post Reply
answerhappygod
Site Admin
Posts: 899559
Joined: Mon Aug 02, 2021 8:13 am

A Spark application has a 128 GB DataFrame A and a 1 GB DataFrame B. If a broadcast join were to be performed on these t

Post by answerhappygod »

A Spark application has a 128 GB DataFrame A and a 1 GB DataFrame B. If a broadcast join were to be performed on these two DataFrames, which of the following describes which DataFrame should be broadcasted and why?

A. Either DataFrame can be broadcasted. Their results will be identical in result and efficiency.
B. DataFrame B should be broadcasted because it is smaller and will eliminate the need for the shuffling of itself.
C. DataFrame A should be broadcasted because it is larger and will eliminate the need for the shuffling of DataFrame B.
D. DataFrame B should be broadcasted because it is smaller and will eliminate the need for the shuffling of DataFrame A.
E. DataFrame A should be broadcasted because it is smaller and will eliminate the need for the shuffling of itself.
Join a community of subject matter experts. Register for FREE to view solutions, replies, and use search function. Request answer by replying!

This question has been solved and has 1 reply.

You must be registered to view answers and replies in this topic. Registration is free.


Register Login
 
Post Reply