As I’ve worked with software teams, I’ve found some interesting views on distributed systems. Some teams think they’re creators of distributed systems. They usually aren’t.
I think there are three main groups of teams that interact with distributed systems: users of end data products, users of existing distributed system frameworks, and creators of distributed system frameworks.
These distinctions make a big difference in how a team interacts with distributed systems. For example, a team that uses end data products will fail if they try to create their own distributed system. This is one of the more common ways I’ve seen teams fail with Big Data.
Users of End Data Products
Users of end data products are the people who work with already created data pipelines and data products. These teams may be DBA/SQL-focused teams or software engineering teams. The difficult parts of creating the distributed system are done for them. They’re given the data in an already usable form.
Users of Existing Distributed System Frameworks
Users of existing distributed system frameworks are the people who use open source or other distributed systems to create data pipelines and data products. They’re using existing technologies like Apache Spark, Apache Hadoop, and Apache Kafka.
Creators of Distributed System Frameworks
Creators of distributed system frameworks are the people who create new distributed systems or improve existing distributed system frameworks. They’re creating everything themselves. This includes writing schedulers, resource managers, and harnesses.
Confused Teams
Sometimes teams get confused about their core competencies. An end data product team will think they’re users of distributed system frameworks. A team that uses existing distributed system frameworks will think they can create their own distributed system. Both of these scenarios will lead to failure.
I’ve written about the increase in complexity when using Big Data. An end data product team will experience a 10x increase in complexity when trying to use a Big Data framework. For most teams, this will lead to failure. They’ll need more guidance and mentoring to get through their Big Data journey.
That leads me to a somewhat common issue: teams that think they can create their own distributed system. There are all sorts of failures wrapped up in creating your own distributed system. This mostly stems from the fact that you’re probably not a distributed systems engineer. There are very few people with the computer science, system design, and operational understanding to create a distributed system from scratch.
Creating your own distributed system may sound like a good idea initially. The thinking goes: “We’ll write our own that does exactly what we want.” Except:
- You will have to spend the time to write it
- Debugging and testing a distributed system is tough
- There are so many unknown unknowns that only time and usage reveal
- The operations team won’t be able to leverage existing knowledge
- Any operational issue will be escalated to the development team
- The development team will spend their time debugging their distributed system instead of creating new features
Do yourself and your team a favor. Take an honest look at your abilities before going down one of these routes. This will save you all kinds of time, money, and heartache. Using the wrong team for the job is always a bad idea.