Posted by: Pdfprep
Post Date: January 29, 2021
A customer has a machine learning workflow that consists of multiple quick read-write-read cycles on Amazon S3. The customer needs to run the workflow on EMR but is concerned that reads in later cycles will miss new data, written in earlier cycles, that is critical to the machine learning.
How should the customer address this concern?
A. Turn on EMRFS consistent view when configuring the EMR cluster
B. Use AWS Data Pipeline to orchestrate the data processing cycles
C. Set Hadoop.data.consistency = true in the core-site.xml file
D. Set Hadoop.s3.consistency = true in the core-site.xml file
Answer: A
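For illustration, option A is usually expressed as an emrfs-site configuration classification supplied at cluster creation. The sketch below is not from the original post: the function name emrfs_consistent_view_config is hypothetical, and the retry values are assumptions to tune for your workload. It only builds the configuration structure and makes no AWS calls.

```python
import json


def emrfs_consistent_view_config(retry_count=5, retry_period_seconds=10):
    """Build an EMR configuration list that enables EMRFS consistent view.

    The function name and the default retry values are illustrative
    assumptions; only the emrfs-site classification and the
    fs.s3.consistent* property names come from EMRFS itself.
    """
    return [
        {
            "Classification": "emrfs-site",
            "Properties": {
                # Turns on consistent view for EMRFS reads and listings.
                "fs.s3.consistent": "true",
                # How many times to retry when an inconsistency is detected.
                "fs.s3.consistent.retryCount": str(retry_count),
                # How long to wait between those retries, in seconds.
                "fs.s3.consistent.retryPeriodSeconds": str(retry_period_seconds),
            },
        }
    ]


if __name__ == "__main__":
    # Print the JSON so it can be saved to a file for cluster creation.
    print(json.dumps(emrfs_consistent_view_config(), indent=2))
```

The resulting list can be saved as JSON and supplied when the cluster is created, or passed as the Configurations argument of boto3's run_job_flow. EMR property values are strings, which is why the numbers are converted with str().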