Medium Data

Most companies aren’t experiencing Big Data or small data problems. They’re experiencing a witching hour of sorts. This is a point in their growth where their data is too big for small data and too small for Big Data. As I’m teaching at companies, I’m finding as much as...

Beam 2.0 Q and A

Apache Beam just had its first API-stable release. Now that we have an API-stable release, I want to give an update on what’s changed in the Beam ecosystem. I want to highlight the growth of Beam as a project and the increased usage of Beam in pre-production/development or...

How to Evaluate an Open Source Product

Open source is a great way to solve problems. Mostly we focus on the open source project from technical and architectural points of view. In this post, I’m going to talk about it from a business point of view. Sometimes you’re looking through 3-10 different open source...

Kafka Topic Design Checklist

Designing data for consumption in a Kafka topic requires more forethought. Instead of the messages being consumed point to point, there are many different consumers. You will need to decide on: Name, Schema, Contents, Key/Ordering, Number of Partitions, Number of...