A media advertising company handles a large number of real-time messages sourced from over 200 websites. The company’s data engineer needs to collect and process records in real time for analysis using Spark Streaming on Amazon Elastic MapReduce (EMR). The data engineer needs to fulfill a corporate mandate to keep ALL raw messages as they are received as a top priority.
Which Amazon Kinesis configuration meets these requirements?
A . Publish messages to Amazon Kinesis Firehose backed by Amazon Simple Storage Service (S3). Pull messages off Firehose with Spark Streaming in parallel to persistence to Amazon S3
B . Publish messages to Amazon Kinesis Streams. Pull messages off Streams with Spark Streaming in parallel to AWS Lambda pushing messages from Streams to Firehose backed by Amazon Simple Storage Service (S3)
C . Publish messages to Amazon Kinesis Firehose backed by Amazon Simple Storage Service (S3). Use AWS Lambda to pull messages from Firehose to Streams for processing with Spark Streaming
D . Publish messages to Amazon Kinesis Streams, pull messages off with Spark Streaming and write new data to Amazon Simple Storage Service (S3) before and after processing
Answer: B
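
For illustration, the AWS Lambda step described in option B (copying each raw record from the stream into Firehose while Spark Streaming reads the same stream independently) could be sketched roughly as below. This is a minimal sketch, not a reference implementation: the delivery stream name `raw-messages` and the helper names `to_firehose_records` and `forward` are hypothetical, and the Firehose client is passed in so the logic can be exercised without AWS credentials.

```python
import base64

# Firehose PutRecordBatch accepts at most 500 records per call.
MAX_BATCH = 500


def to_firehose_records(event):
    """Decode the base64 payloads of a Kinesis-triggered Lambda event.

    The payload bytes are passed through unchanged so that the raw
    messages land in S3 exactly as they were received.
    """
    return [
        {"Data": base64.b64decode(record["kinesis"]["data"])}
        for record in event.get("Records", [])
    ]


def forward(event, firehose, stream_name="raw-messages"):
    """Send every record in the event to Firehose in batches.

    `stream_name` is a hypothetical delivery stream name. Raises if
    Firehose reports failed puts, so that Lambda retries the batch
    instead of silently dropping raw messages.
    """
    records = to_firehose_records(event)
    failed = 0
    for start in range(0, len(records), MAX_BATCH):
        response = firehose.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=records[start:start + MAX_BATCH],
        )
        failed += response.get("FailedPutCount", 0)
    if failed:
        raise RuntimeError(f"{failed} record(s) were not accepted by Firehose")
    return len(records)


def handler(event, context):
    """Lambda entry point; boto3 is available in the Lambda runtime."""
    import boto3

    return forward(event, boto3.client("firehose"))
```

Because Kinesis Streams supports multiple independent consumers, the Spark Streaming job on EMR and this Lambda function can both read the same records without interfering with each other.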