TrustRadius: an HG Insights company

Apache Kafka

Score 7.8 out of 10

137 Reviews and Ratings

What is Apache Kafka?

Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. The Kafka event streaming platform is used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.

The versatile Apache Kafka

Use Cases and Deployment Scope

We use Apache Kafka for asynchronous communication.

For any processing that we need to do in the background, we use Apache Kafka. We also configure it so that we can use it to retry messages in a topic.

We also use it for data streaming which powers our data platform.
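
The review does not say how the retries are configured. One common pattern is a separate retry topic that carries an attempt counter with each message. The sketch below models that idea with plain in-memory queues; it is not Kafka client code, and the names `process_with_retries` and `MAX_ATTEMPTS` are hypothetical.

```python
from collections import deque

MAX_ATTEMPTS = 3  # hypothetical limit; a real setup would make this configurable


def process_with_retries(messages, handler, max_attempts=MAX_ATTEMPTS):
    """Model a main topic plus a retry topic as two in-memory queues."""
    main_topic = deque((msg, 1) for msg in messages)  # (message, attempt number)
    retry_topic = deque()
    done, failed = [], []
    while main_topic or retry_topic:
        # Drain the main topic first, then work through the retry topic.
        source = main_topic if main_topic else retry_topic
        msg, attempt = source.popleft()
        try:
            handler(msg)
            done.append(msg)
        except Exception:
            if attempt < max_attempts:
                # Re-publish to the retry topic with an incremented counter.
                retry_topic.append((msg, attempt + 1))
            else:
                # Out of attempts: give up on this message.
                failed.append(msg)
    return done, failed
```

In a real deployment the attempt counter would typically travel in a message header, and the retry topic would be consumed with some delay so that transient failures have time to clear.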

Pros

  • It's extremely fast. It is able to deliver messages very quickly.
  • It is very reliable. I have not yet seen any cases where messages were dropped.
  • Using different configurations, we can model it in many ways and cater to a large number of business use cases.

Cons

  • It would be great to have some way of scheduling messages to reappear.
  • There should be a way to decrease the number of partitions on the fly so that we can scale down when needed.
  • Apache Kafka should have a better consumer UI view so that we can see more details on the attached consumers.
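
Some background on the partition point above: with the standard tooling, a topic's partition count can be increased but not reduced. One reason is that keyed messages are assigned to a partition from a hash of the key and the partition count, so changing the count changes where keys land. The sketch below illustrates the effect. It uses `zlib.crc32` purely as a stand-in hash (Kafka's default partitioner uses a different hash function), and the key names are made up.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration only; not the hash Kafka itself uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def moved_keys(keys, old_count, new_count):
    """Return the keys whose partition changes when the count changes."""
    return [
        k for k in keys
        if partition_for(k, old_count) != partition_for(k, new_count)
    ]


keys = [f"order-{i}" for i in range(100)]  # hypothetical message keys
moved = moved_keys(keys, old_count=6, new_count=4)
print(f"{len(moved)} of {len(keys)} keys would map to a different partition")
```

Any key that moves loses its ordering relative to the messages already written under the old layout, which is part of why scaling down usually means creating a new topic and migrating to it.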

Return on Investment

  • It has reduced the overhead of managing a separate system to maintain the infrastructure for asynchronous programming.
  • Due to its simplicity and reliability, we don't have to invest much in training engineers on it.

Alternatives Considered

Amazon Simple Queue Service (SQS) and RabbitMQ

Other Software Used

MySQL, RabbitMQ, Kubernetes

Apache Kafka - Default Choice For Large Scale Messaging

Use Cases and Deployment Scope

Apache Kafka is really the bedrock of all things streaming and data processing. I cannot imagine any other product doing it better. My last two companies used it, and my current one does as well. If you want your data stream to be organized and delivered, Apache Kafka has become the tool of choice. I have dabbled in Azure EventHubs as well, but if you are into open-source data streaming, Apache Kafka will take you where you need to be for data lakes and for the amount of data that is streamed in the cybersecurity industry that my company is in. Without Apache Kafka, there is no way that my company's products could handle the volume of data that we process for our customers.

Pros

  • Data streaming is really second to none.
  • Scaling: done right, Apache Kafka is a workhorse.
  • Ease of administration, although you cannot really compare it to Azure EventHubs; that would be comparing apples and oranges.

Cons

  • The web UI has not really changed in years. The UX has been refreshed, but a more streamlined built-in UX, instead of many third-party web UX tools, would be most welcome.
  • Webhooks can still be tricky to troubleshoot at times.
  • Getting CLI monitoring right involves a learning curve.

Most Important Features

  • A well-known, established set of tools from setup to administration.
  • Scalability.
  • Fit for both on-prem and cloud-based use cases.

Return on Investment

  • Being an open-source tool, Apache Kafka is invaluable to my company's product. I cannot imagine how much it would cost if we were using Amazon Kinesis or Azure EventHubs.
  • The negative part is that in the event of Apache Kafka failures, troubleshooting can really be a pain. But given enough exposure to its inner workings, Apache Kafka still comes out OK.
  • Having used Apache Kafka for years in this company, I can only say that without Apache Kafka, my company would not be cost-efficient, and our product would be much costlier to sell to customers if we were paying for Azure EventHubs or Amazon Kinesis on top.

Alternatives Considered

RabbitMQ

Other Software Used

Amazon Elastic Kubernetes Service (EKS), Apache Spark, Amazon Elastic Compute Cloud (EC2)

Kafka for tracking changes

Use Cases and Deployment Scope

We use Apache Kafka to stream order information across systems. An order may go through several updates during its lifecycle. These updates need to be communicated to the systems in near real time, and we rely on Kafka for this. Our business use case is to take these orders up with the insurance companies for approval, so the order information needs to be up to date. Kafka has been excellent at doing this so far.

Pros

  • Receiving messages from the publisher and delivering them to the consumer in FIFO order
  • Error handling using a Dead Letter Queue when a message cannot be consumed on the consumer end
  • Fault tolerance
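
The Dead Letter Queue handling mentioned above is usually a pattern implemented on the consumer side (or by a framework) rather than something the broker does by itself: a record that cannot be processed is set aside, typically in a separate topic, and the consumer moves on. A minimal in-memory sketch of that flow follows. It is not Kafka client code, and `consume_with_dlq`, `validate_order`, and the order fields are hypothetical.

```python
def consume_with_dlq(records, handler):
    """Process records in order; route failures to a dead letter list."""
    processed, dead_letters = [], []
    for offset, record in enumerate(records):
        try:
            processed.append(handler(record))
        except Exception as exc:
            # Keep the failed record plus enough context to investigate later.
            dead_letters.append(
                {"offset": offset, "record": record, "error": str(exc)}
            )
    return processed, dead_letters


def validate_order(order):
    # Hypothetical handler: accept only known order statuses.
    if order.get("status") not in {"created", "updated", "approved"}:
        raise ValueError(f"unknown status: {order.get('status')}")
    return order["id"]


updates = [
    {"id": 1, "status": "created"},
    {"id": 1, "status": "???"},
    {"id": 2, "status": "approved"},
]
processed, dead_letters = consume_with_dlq(updates, validate_order)
```

The point of the design is that one bad record does not block the records behind it, while the failure is still captured for separate handling.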

Cons

  • Sometimes it becomes difficult to monitor our Kafka deployments. We've been able to overcome it largely using AWS MSK, a managed service for Apache Kafka, but a separate monitoring dashboard would have been great.
  • The process for local deployment of Kafka could be simplified, and a user interface would help to get visibility into the different topics and the messages being processed.
  • The learning curve around creating brokers and topics could be reduced.

Most Important Features

  • High throughput
  • Low latency
  • Fault tolerance

Return on Investment

  • We are able to submit orders to the insurance companies with almost 100% accuracy because we receive Kafka updates in almost real time
  • We are notified of error scenarios separately through our Dead Letter Queue implementation, so that we can handle those cases.
  • Some engineering effort is spent on maintaining Kafka.

Other Software Used

Prometheus, Apache ZooKeeper, Grafana

A Deep Dive into the Power and Potential of Apache Kafka

Use Cases and Deployment Scope

We use Apache Kafka as an event bus for all our async activities and microservice communication, such as sending emails, SMS, and notifications between services and consumers, and for event and data processing.

Pros

  • Event driven architectures
  • Any use case which requires async data processing
  • Any use case that involves producing and consuming the same data to build business-specific processing

Cons

  • ZooKeeper service configuration could be simplified
  • Data logging needs to be secured
  • Restarting and overall management need to be improved

Most Important Features

  • Event-Driven Architecture
  • Super fast processing capability & data retention
  • Fault Tolerance & High Volume Processing

Return on Investment

  • Faster implementation of business logic has improved productivity by 30%
  • Clean Architecture leads to fewer bugs and improved business

Alternatives Considered

Amazon Kinesis, Amazon Simple Queue Service (SQS) and RabbitMQ

Other Software Used

Amazon Aurora, Amazon Athena, NGINX, Apache Airflow

Apache Kafka is awesome! Tricky sometimes, but we love it!

Pros

  • The pub/sub model
  • Quick data transfer - regardless of volume (if you have enough resources)
  • Ability to transfer large amounts of data consistently (non-binary)
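
For readers new to the pub/sub model praised above, the core idea in Kafka is an append-only log that each consumer group reads at its own pace, tracked by an offset. The sketch below is a simplified in-memory model of that idea, assuming a single partition. The `Topic` class and the group names are hypothetical and are not part of any Kafka client library.

```python
from collections import defaultdict


class Topic:
    """Toy single-partition topic: an append-only log plus per-group offsets."""

    def __init__(self):
        self.log = []                    # messages in publish order
        self.offsets = defaultdict(int)  # consumer group -> next offset to read

    def publish(self, message):
        self.log.append(message)

    def poll(self, group):
        # Return everything this group has not seen yet, then advance its offset.
        start = self.offsets[group]
        batch = self.log[start:]
        self.offsets[group] = len(self.log)
        return batch


topic = Topic()
topic.publish("user-signed-up")
topic.publish("order-placed")
email_batch = topic.poll("email-service")   # first read for this group
topic.publish("order-shipped")
email_next = topic.poll("email-service")    # only the new message
audit_batch = topic.poll("audit-service")   # a different group starts from the beginning
```

Because each group keeps its own offset, adding a new subscriber does not affect existing ones, which is what makes it easy to let data flow across components.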

Cons

  • The Kafka Tool is a community-made Java application that looks and feels like it is from the past century.
  • Logging can be confusing. This certainly shows when we have to do troubleshooting.
  • Hybrid scenarios: pub/sub where some services run inside a Kubernetes cluster and some outside. There are ~3 options then, but only 2 (the harder ones) are production-safe.

Return on Investment

  • Positive: Get a quick and reliable pub/sub model implemented - data across components flows easily.
  • Positive: it's scalable so we can develop small and scale for real-world scenarios
  • Negative: it's easy to get into a confusing situation if you are not experienced yet or something strange has happened (rare, but it happens). Troubleshooting such situations can take time and effort.

Alternatives Considered

Redis and Enterprise Fluentd

Other Software Used

Microsoft Teams, Jira Software, Atlassian Confluence