A customer's machine learning process entails numerous rapid-fire cycles of reads-writes-reads on Amazon S3. The client

Posted: Thu Jul 21, 2022 10:00 pm
by answerhappygod
A customer's machine learning process involves many rapid cycles of read-write-read operations on Amazon S3. The customer needs to run the process on Amazon EMR but is concerned that reads in later cycles may miss new data written by earlier cycles.

How should the customer accomplish this?

A. Turn on EMRFS consistent view when configuring the EMR cluster.
B. Use AWS Data Pipeline to orchestrate the data processing cycles.
C. Set hadoop.data.consistency = true in the core-site.xml file.
D. Set hadoop.s3.consistency = true in the core-site.xml file.
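
For context on option A, EMRFS consistent view is normally switched on through the emrfs-site configuration classification when the cluster is created. The snippet below is a minimal sketch of what that configuration might look like when built in Python. The helper name emrfs_consistent_view_config and the default retry values are assumptions chosen for illustration, not taken from the question; the property names follow the fs.s3.consistent family used by EMRFS.

```python
import json


def emrfs_consistent_view_config(retry_count=5, retry_period_seconds=10):
    """Build an EMR configuration list that enables EMRFS consistent view.

    Hypothetical helper for illustration; the default retry values are
    assumptions, not recommendations from the original question.
    """
    return [
        {
            "Classification": "emrfs-site",
            "Properties": {
                # Enables consistent view for EMRFS reads of S3 objects.
                "fs.s3.consistent": "true",
                # How many times EMRFS retries when S3 and its metadata disagree.
                "fs.s3.consistent.retryCount": str(retry_count),
                # Seconds to wait between those retries.
                "fs.s3.consistent.retryPeriodSeconds": str(retry_period_seconds),
            },
        }
    ]


if __name__ == "__main__":
    # Print the JSON that could be supplied as the cluster's configurations.
    print(json.dumps(emrfs_consistent_view_config(), indent=2))
```

The resulting list has the shape EMR expects for cluster configurations, so it could be passed along when the cluster is launched, for example as the Configurations argument of a cluster-creation call.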