Use Cases and Deployment Scope
We use Apache Hadoop to handle data that arrives continuously, in real time, from devices in different geographical locations around the globe. We then run Spark jobs and notebooks to ingest and process the data, and load it into other external systems for further processing.
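As a rough illustration of the ingest, process, and load flow described above, here is a minimal PySpark sketch. It is not our production job: the JSON record layout (`device_id`, `region`, `value`), the function names, and the input and output paths are all hypothetical and chosen only for the example.

```python
import json
from typing import Optional


def parse_reading(line: str) -> Optional[dict]:
    """Parse one JSON device reading; return None for malformed input."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    # Hypothetical required fields; the real schema is not described here.
    if not {"device_id", "region", "value"} <= record.keys():
        return None
    try:
        value = float(record["value"])
    except (TypeError, ValueError):
        return None
    return {
        "device_id": str(record["device_id"]),
        "region": str(record["region"]),
        "value": value,
    }


def run_job(input_path: str, output_path: str) -> None:
    """Ingest raw readings, drop malformed ones, and write Parquet output."""
    # Imported lazily so parse_reading can be exercised without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("device-ingest-sketch").getOrCreate()
    try:
        lines = spark.sparkContext.textFile(input_path)
        cleaned = lines.map(parse_reading).filter(lambda r: r is not None)
        rows = cleaned.map(lambda r: (r["device_id"], r["region"], r["value"]))
        df = spark.createDataFrame(
            rows, schema="device_id string, region string, value double"
        )
        # Partitioned Parquet is one common hand-off format for external systems.
        df.write.mode("overwrite").partitionBy("region").parquet(output_path)
    finally:
        spark.stop()
```

A job like this would typically be launched with `spark-submit` and pointed at HDFS locations, for example `run_job("hdfs:///raw/devices", "hdfs:///clean/devices")`, where both paths are placeholders.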
Alternatives Considered
Amazon EMR (Elastic MapReduce)
Other Software Used
Amazon EMR (Elastic MapReduce), Amazon RDS on VMware, Amazon EventBridge, Amazon Managed Streaming for Apache Kafka (Amazon MSK), Apache Kafka, Google Compute Engine, Apache Airflow