Page 1 of 1

Note: This question is part of a series of questions that present the same Scenario.Each question I the series contains

Posted: Wed Aug 17, 2022 7:02 am
by answerhappygod
Note: This question is part of a series of questions that present the same Scenario.Each question I the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution while others might not have correct solution.You are implementing a batch processing solution by using Azure HDlnsight.You have a table that contains sales data.You plan to implement a query that will return the number of orders by zip code.You need to minimize the execution time of the queries and to maximize the compression level of the resulting data.What should you do?

A. Use a shuffle join in an Apache Hive query that stores the data in a JSON format.
B. Use a broadcast join in an Apache Hive query that stores the data in an ORC format.
C. Increase the number of spark.executor.cores in an Apache Spark job that stores the data in a text format.
D. Increase the number of spark.executor.instances in an Apache Spark job that stores the data in a text format. E. Decrease the level of parallelism in an Apache Spark job that Mores the data in a text format. F. Use an action in an Apache Oozie workflow that stores the data in a text format. Azure Data Factory linked service that stores the data in Azure Data lake. Azure DocumentDB database.