A novel method for identifying SPAM e-mails has been developed in Python. The program examines the unstructured text inc
Posted: Thu Jul 21, 2022 10:00 pm
A novel method for identifying SPAM e-mails has been developed in Python. The program examines the unstructured text included in a sample set of one million e-mails hosted on Amazon S3. The method must be scaled to a 5 PB production dataset that is likewise stored in Amazon S3.
Which AWS service plan is most appropriate for this scenario?
A. Copy the data into Amazon ElastiCache to perform text analysis on the in-memory data and export the results of the model into Amazon Machine Learning.
B. Use Amazon EMR to parallelize the text analysis tasks across the cluster using a streaming program step.
C. Use Amazon Elasticsearch Service to store the text and then use the Python Elasticsearch Client to run analysis against the text index.
D. Initiate a Python job from AWS Data Pipeline to run directly against the Amazon S3 text files.
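For readers unfamiliar with what a "streaming program step" in option B involves: Hadoop streaming, which Amazon EMR supports, runs an ordinary script as the mapper, feeding it input records on stdin and collecting tab-separated key/value lines from stdout. The sketch below is only an illustration of that shape. The keyword list, the `classify` rule, and the function names are hypothetical stand-ins (the actual classifier is not shown in the question), and it assumes each e-mail has already been flattened to one line of text.

```python
import io

# Hypothetical keyword list; a placeholder for the real spam model.
SPAM_HINTS = ("free money", "act now", "winner")


def classify(text):
    """Return 'spam' or 'ham' using a placeholder keyword rule."""
    lowered = text.lower()
    return "spam" if any(hint in lowered for hint in SPAM_HINTS) else "ham"


def run_mapper(lines, out):
    """Streaming-style mapper: one record per input line, emit 'label<TAB>1'."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank records
            out.write(f"{classify(line)}\t1\n")


# Local demonstration with in-memory streams instead of stdin/stdout.
sample = io.StringIO("You are a WINNER, act now\nMeeting moved to 3pm\n")
result = io.StringIO()
run_mapper(sample, result)
```

In an actual streaming step the script would call `run_mapper(sys.stdin, sys.stdout)` instead of using the in-memory streams, and each node in the cluster would run its own copy of the mapper against a different slice of the S3 input.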