How good a Big Data strategy can be defined by someone that doesn’t know the technology behind it?

Today’s blog post comes from a question from a subscriber to my mailing list. The question comes from André M:

How good a Big Data strategy can be defined by someone that doesn’t know the technology behind it? (By knowing I don’t mean being able to configure it, but at least knowing its features, context, if it’s growing in market share, how it can be integrated with other tools).

It’s great that you realized that a Big Data strategy is different from a small data strategy. If you go into a Big Data strategy with a small data mindset, you will fail, and that’s a common cause of failure.

I had a company reach out for an architecture review on their project. The manager had put their architecture together and didn’t know about Big Data technologies. He had researched on the internet and had asked coworkers what technologies to use.

He showed me their architecture diagram. It was generally wired together correctly, but was really devoid of any understanding of why or how.

I started asking him why he chose one technology or another. He didn’t know why; he would just say that’s what he’d read or what had been recommended. I asked about the tradeoffs of one technology versus another; once again, he didn’t know. I asked him if he really understood the use cases well enough to choose the right technology, or if he chose the technology first and would hope the use cases would work. He chose the technologies first and would have to hope things would work.

The diagram he showed me was incredibly advanced. Only a veteran data engineering team could accomplish that, and only after much development time. I started asking him about the team and its makeup. It was very obvious that the team had a very, very low probability of success. They had never done distributed systems, or Java for that matter.

I tell this story because that diagram existed as slideware and wouldn’t ever come to fruition. It was something he presented to me and other higher-ups. He had no real understanding of how difficult the project would be or how low a probability his team had of actually accomplishing it. I always talk about qualified Data Engineers. These are the people with the technical understanding to give a project like this a high probability of success.

To answer your question more directly, unless a problem is very, very simple, I think a person needs to have an understanding of Big Data technologies.

Ideally, this team would have created the project and diagram a different way. The manager would have attended my Business of Big Data management class and the engineers would have attended my Professional Data Engineering class. Both sides would have a much better idea of what’s possible and what’s difficult.

Other companies that have taken this route are more successful. The manager would have focused on the use cases and business value for the project. The Data Engineers would have focused on choosing the right technologies for the use case. The manager isn’t promising something impossible and the engineers are equipped with the knowledge to be successful. That’s how successful Big Data projects are created.