A media advertising firm manages a significant number of real-time messages originating from more than 200 different web

Posted: Thu Jul 21, 2022 10:00 pm
by answerhappygod
A media advertising firm manages a significant number of real-time messages originating from more than 200 different websites.
The company's data engineer needs to use Spark Streaming on Amazon Elastic MapReduce (EMR) to collect and process the messages in real time for analysis. The data engineer must adhere to the following company directive:

As a major priority, retain ALL raw messages as they are received.

Which setup of Amazon Kinesis fits these requirements?

A. Publish messages to Amazon Kinesis Firehose backed by Amazon Simple Storage Service (S3). Pull messages off Firehose with Spark Streaming in parallel to persistence to Amazon S3.
B. Publish messages to Amazon Kinesis Streams. Pull messages off Streams with Spark Streaming in parallel to AWS Lambda pushing messages from Streams to Firehose backed by Amazon Simple Storage Service (S3).
C. Publish messages to Amazon Kinesis Firehose backed by Amazon Simple Storage Service (S3). Use AWS Lambda to pull messages from Firehose to Streams for processing with Spark Streaming.
D. Publish messages to Amazon Kinesis Streams, pull messages off with Spark Streaming, and write raw data to Amazon Simple Storage Service (S3) before and after processing.
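
For readers unfamiliar with the "AWS Lambda pushing messages from Streams to Firehose" step mentioned in option B, here is a rough sketch of what such a forwarder could look like in Python. It is only an illustration of that one step, not a statement about which option is correct. The delivery stream name `raw-messages-to-s3`, the extra `firehose` parameter (there so a stub client can be passed in), and the helper `to_firehose_batches` are all made up for this example. It assumes the function is triggered by a Kinesis stream, in which case each record's payload arrives base64-encoded, and it relies on boto3's `put_record_batch`, which accepts at most 500 records per call.

```python
import base64

FIREHOSE_MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def to_firehose_batches(event, batch_size=FIREHOSE_MAX_BATCH):
    """Decode the Kinesis records in a Lambda event and group them into batches."""
    records = [
        # Kinesis payloads arrive base64-encoded inside the Lambda event.
        {"Data": base64.b64decode(record["kinesis"]["data"])}
        for record in event.get("Records", [])
    ]
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def handler(event, context, firehose=None, stream_name="raw-messages-to-s3"):
    """Forward every raw message, unmodified, to a Firehose delivery stream.

    `stream_name` is a hypothetical delivery stream assumed to be backed by S3.
    """
    if firehose is None:
        import boto3  # imported lazily so the batching logic can be tested offline
        firehose = boto3.client("firehose")

    forwarded = 0
    failed = 0
    for batch in to_firehose_batches(event):
        response = firehose.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=batch,
        )
        forwarded += len(batch)
        failed += response.get("FailedPutCount", 0)

    if failed:
        # Raising makes Lambda retry the whole Kinesis batch, so no raw message
        # is silently dropped (at the cost of possible duplicates in S3).
        raise RuntimeError(f"{failed} records were rejected by Firehose")
    return {"forwarded": forwarded}
```

In this sketch the Spark Streaming job on EMR would read from the same Kinesis stream independently, so the raw copy lands in S3 regardless of what happens in processing. Retrying on any rejected record favours "keep everything" over "no duplicates", which seems to match the stated directive.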