A customer's machine learning process involves many rapid read-write-read cycles on Amazon S3

Post by **answerhappygod** » Thu Jul 21, 2022 10:00 pm

A customer's machine learning process involves many rapid read-write-read cycles on Amazon S3. The customer needs to run the process on Amazon EMR but is concerned that reads in later cycles may miss new data written by earlier cycles.

How should the customer address this concern?

A. Turn on EMRFS consistent view when configuring the EMR cluster.
B. Use AWS Data Pipeline to orchestrate the data processing cycles.
C. Set hadoop.data.consistency = true in the core-site.xml file.
D. Set hadoop.s3.consistency = true in the core-site.xml file.
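
For context on what option A refers to: EMRFS consistent view is enabled through the `emrfs-site` configuration classification when the cluster is created, using the `fs.s3.consistent` property. The sketch below only builds that configuration as JSON. The helper name `emrfs_consistent_view_config` and the retry values are illustrative assumptions, not something taken from the question.

```python
import json


def emrfs_consistent_view_config(retry_count=5, retry_period_seconds=10):
    """Build an EMR configuration list that turns on EMRFS consistent view.

    The helper name and the retry values are illustrative assumptions;
    the classification and property names are EMR's emrfs-site settings.
    """
    return [
        {
            "Classification": "emrfs-site",
            "Properties": {
                # Enables EMRFS consistent view for the cluster.
                "fs.s3.consistent": "true",
                # How many times EMRFS retries when S3 looks inconsistent.
                "fs.s3.consistent.retryCount": str(retry_count),
                # Seconds to wait between those retries.
                "fs.s3.consistent.retryPeriodSeconds": str(retry_period_seconds),
            },
        }
    ]


if __name__ == "__main__":
    # Print JSON that could be saved to a file and supplied as the
    # cluster's configurations at creation time.
    print(json.dumps(emrfs_consistent_view_config(), indent=2))
```

The printed JSON could be saved to a file and passed as the cluster's configurations when it is launched. All property values are strings, which is the form EMR configuration classifications expect.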
