I’ve been teaching Kafka at companies that don’t have, and won’t ever have, what you’d call textbook Big Data problems. As a result, my students ask me whether Kafka is appropriate for their use cases. Put another way: is Kafka only a Big Data tool?
For most Big Data technologies, the answer is a clear pass/fail: if you don’t have a Big Data problem now and won’t have one in the future, don’t use technologies like Apache Hadoop or Apache Spark, because their technical and operational overhead immediately negates any other benefits. Using Big Data tools on small data isn’t just massive overkill; it wastes a lot of time and money.
For Kafka, it’s different. I define Kafka as a distributed publish/subscribe system, and companies without clear Big Data problems are getting real value from it because they can take advantage of Kafka’s other interesting features.
Here are some of the pros I see for using Kafka with small data:
- All data can be replicated to more than one computer
- Kafka removes single points of failure for the brokers
- Kafka removes single points of failure for consumers with consumer groups
- Consumers can move freely through the commit log and go back in time
- Consumers don’t miss data during downtime because the data is retained on the brokers
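The replay property is worth illustrating, because it’s what separates Kafka from a fire-and-forget message queue. Below is a minimal in-memory sketch of the commit-log idea; the `CommitLog` class is hypothetical (not part of any Kafka client API), but it captures why a consumer can rewind and re-read: records are appended at increasing offsets and retained, not deleted on read.

```python
# Minimal in-memory sketch of Kafka's commit-log/replay idea.
# The CommitLog class is hypothetical -- it is not a Kafka API --
# but it models why consumers can "go back in time."

class CommitLog:
    def __init__(self):
        self._records = []           # records are retained, not deleted on read

    def append(self, record):
        offset = len(self._records)  # each record gets a monotonically increasing offset
        self._records.append(record)
        return offset

    def read_from(self, offset):
        # Any consumer can start reading from any retained offset.
        return self._records[offset:]

log = CommitLog()
for event in ["signup", "login", "purchase"]:
    log.append(event)

# A consumer that was down during the first two events still sees them:
print(log.read_from(0))   # ['signup', 'login', 'purchase']
# A consumer can also rewind and reprocess only the tail:
print(log.read_from(2))   # ['purchase']
```

In real Kafka, the same rewind is done by seeking a consumer to an earlier offset; the broker keeps the data for its configured retention period regardless of who has read it.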
Here are some of the cons I see for using Kafka compared to a traditional small data pub/sub:
- The programmatic API is more complex than most pub/sub APIs
- It’s conceptually more complex (e.g. partitions and offsets) than most alternatives
- Ordering is no longer global; it’s only guaranteed within a partition
- Consumer groups need to handle state transitions (e.g. rebalances) when failures occur
- Fewer people have Kafka skills, so you’ll probably need to train your team
- Operationally, there are more processes to monitor
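The ordering caveat is the one that most often surprises newcomers. The sketch below is my own simplified model, not Kafka’s actual partitioner (real producers hash the record key with murmur2; I use `zlib.crc32` as a deterministic stand-in): records with the same key always land in the same partition, so per-key order survives, but there is no single ordering across partitions.

```python
import zlib

# Simplified model of key-based partitioning. Kafka producers hash the
# record key (murmur2) to pick a partition; zlib.crc32 is a deterministic
# stand-in here, and the partition dict is a hypothetical illustration.

NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}

def send(key, value):
    # Same key -> same partition, so ordering per key is preserved.
    p = zlib.crc32(key.encode()) % NUM_PARTITIONS
    partitions[p].append((key, value))
    return p

for v in ["a-1", "a-2", "a-3"]:
    send("user-a", v)
for v in ["b-1", "b-2"]:
    send("user-b", v)

# Within user-a's partition, user-a's events are still in send order:
p_a = zlib.crc32(b"user-a") % NUM_PARTITIONS
a_events = [v for k, v in partitions[p_a] if k == "user-a"]
print(a_events)   # ['a-1', 'a-2', 'a-3'] -- per-key order survives
# But no single global ordering exists across all partitions.
```

If your application truly needs global ordering, you’re limited to a single partition, which gives up most of Kafka’s parallelism.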
With these pros and cons in mind, you can choose between Kafka and your small data pub/sub of choice. If the pros are compelling and outweigh the cons, start looking at Kafka. If the cons outweigh the pros, you’re probably better off with your small data pub/sub.