When You Have the Wrong Team for Big Data

In my book, Data Engineering Teams, I talk about the right skills and people for a data engineering team. The right skills and people are incredibly important to the success, or failure, of a Big Data project. Sometimes it’s easier to understand this point with...

Integration Testing for Kafka

We’re creating more and more complicated data pipelines and systems with Kafka. These interactions are becoming even more complex as we create microservices. As we create these complex systems, we aren’t thinking about how to test, debug, or fix them. These 3 parts...

Two Halves Don’t Make a Whole

In Chapter 3 of my Data Engineering Teams book, I show you how to do a skill gap analysis. During the analysis of the team, you either say the person has the skill or not. It’s a very binary decision. Some people have written me asking if it can be a fraction. Instead...

Apache Kafka and Amazon Kinesis

This post focuses on the key differences between Apache Kafka and Amazon Kinesis that a Data Engineer or Architect needs to know. Cloud vs DIY: Some of the contenders for Big Data messaging systems are Apache Kafka, Amazon Kinesis, and Google Cloud Pub/Sub (discussed in...

This is Useless (Without Use Cases)

Sometimes I’ll write a post and the comments will say something to the effect of “this is useless.” Other times I’ll be finishing up a class and a student will ask me why I didn’t cover what they’re trying to do. I’ve written example code and people will ask me why...