As the VP of Data & Analytics at Andela, I spend a lot of time working closely with our data team to build interpretable predictive models that capture customer behavior. These models inform key decisions about how to optimize the customer interactions that ultimately drive the business forward.
You probably already know that building a successful predictive model is a multi-pronged process: it requires tapping into your organization's historical data, leveraging the expertise of individual contributors and external data sources, processing day-to-day requests from executive leadership teams, and conducting original research. Perhaps the most exciting part of this process is the original research, because of the untapped insight it can provide.
In conducting that research, I have spoken to hundreds of data engineers, data scientists, and data executives at the vice president and C-suite level. Their feedback has helped me uncover three common pitfalls that prevent many teams from getting value out of their data. I will discuss each here, along with a recommended way to address it.
Pitfall #1: Strictly adhering to the Inmon or Kimball data warehouse model
Why it’s a problem
Many data engineers today staunchly adhere to the Inmon or Kimball approach when creating a data warehouse. Neither method is “wrong,” but neither has been updated to reflect how much organizations now depend on getting business insights quickly.
Insisting that data fit one of these rigid frameworks ultimately does a disservice to most organizations: projects can take a long time to implement and lack the agility required to stay competitive.
The focus should turn to a more modern data architecture approach, where fast iteration and stakeholder needs are prioritized over a model etched in stone. One strong example is a data lake architecture, which lets you build a model on queryable data (both structured and unstructured) that is centralized in one location. This approach makes it easier to generate the quick insights required to stay ahead in today’s fast-paced business environment.
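To make the idea of centralized, queryable structured and unstructured data a little more concrete, here is a minimal sketch. It is not a real data lake: an in-memory list of raw JSON strings stands in for object storage, and the record shapes, field names, and helper functions (RAW_LAKE, read_records, customer_view) are all hypothetical. The point it illustrates is that records are stored as-is and structure is applied only when a question is asked.

```python
import json

# Hypothetical stand-in for a centralized store: raw records are kept as-is,
# whether they are structured (orders) or free text (support notes).
RAW_LAKE = [
    json.dumps({"type": "order", "customer_id": "c1", "amount": 120.0}),
    json.dumps({"type": "order", "customer_id": "c2", "amount": 80.0}),
    json.dumps({"type": "support_note", "customer_id": "c1",
                "text": "Asked about bulk pricing"}),
]


def read_records(lake, record_type):
    """Parse raw records at query time and keep only the requested type."""
    for line in lake:
        record = json.loads(line)
        if record.get("type") == record_type:
            yield record


def customer_view(lake, customer_id):
    """Combine structured and unstructured records for one customer."""
    orders = [r for r in read_records(lake, "order")
              if r["customer_id"] == customer_id]
    notes = [r["text"] for r in read_records(lake, "support_note")
             if r["customer_id"] == customer_id]
    return {
        "customer_id": customer_id,
        "total_spend": sum(o["amount"] for o in orders),
        "notes": notes,
    }


if __name__ == "__main__":
    print(customer_view(RAW_LAKE, "c1"))
```

Because nothing forces the records into a fixed schema up front, a new record type can be added to the store without reworking the existing queries, which is the kind of fast iteration described above.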
Pitfall #2: Insisting you’re solely data-driven
Why it’s a problem
Recruiters and hiring managers are obsessed with the term “data-driven.” That buzzword drives so many conversations forward, but it’s ultimately misleading, and can actually cause distrust between stakeholders and board members/investors.
At the end of the day, data itself can’t make decisions or predictions; it’s up to the people interpreting the data to do that. Data is imperfect, and we as data professionals need to respect its imperfection. We can’t just point to a series of graphs and expect them to serve as a blanket explanation for every business decision. Without context, they’re meaningless.
That’s why it’s important to remember that the narrative surrounding data is as important as the data itself. Without a story to go along with your numbers, you won’t have a compelling case for change in your organization.
Instead of calling your organization data-driven, think of it as data-informed. Data teams need to be agile and respond accordingly to incoming data, but they also need to build solid relationships with stakeholders and build trust and consensus to drive change. Crafting a compelling narrative to go along with this data is a critical step in the consensus-building process, so data teams should treat this as a higher priority.
Pitfall #3: Not looking at the full picture when making data-informed decisions
Why it’s a problem
Too often, I see data teams basing a disproportionate share of their decisions on insights derived from well-correlated but oversimplified machine learning problems, at the expense of more nuanced and accurate answers to the problems they are trying to solve. For example, a data scientist who builds a customer lifetime value (LTV) or customer segmentation model solely around customer responsiveness to Google Ads would be doing themselves a disservice. Yes, this would be a clean and simple model, but it wouldn’t show the complete picture.
Ultimately, this same data scientist trying to build an accurate customer LTV model would be much better served evaluating historic customer information in tandem with data from external sources targeting similar industries and user demographics. While it’s true that this will inevitably lead to a messier data model, it’ll ultimately also be more accurate and beneficial to the organization.
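A toy comparison shows how the two approaches can disagree. This is only a sketch under stated assumptions: the customers, the numbers, the external retention benchmark, and the function names are all made up for illustration, and the blended estimate uses a deliberately simplified heuristic (average order value times orders per year times expected years retained) rather than any particular production LTV model.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    segment: str
    ad_response_rate: float  # share of ad impressions that led to a click
    avg_order_value: float   # from historical order data
    orders_per_year: float   # from historical order data


# Hypothetical external benchmark: expected years retained, by segment.
EXTERNAL_RETENTION_YEARS = {"smb": 2.0, "enterprise": 5.0}


def ltv_ads_only(customer, value_per_response=1000.0):
    """Clean but narrow: score a customer from ad responsiveness alone."""
    return customer.ad_response_rate * value_per_response


def ltv_blended(customer, benchmarks, default_years=1.0):
    """Messier: combine historical behavior with an external benchmark."""
    years = benchmarks.get(customer.segment, default_years)
    return customer.avg_order_value * customer.orders_per_year * years


# Illustrative, made-up customers.
customers = [
    Customer("a", "smb", 0.30, 50.0, 4.0),
    Customer("b", "enterprise", 0.05, 400.0, 2.0),
]

if __name__ == "__main__":
    for c in customers:
        print(c.customer_id, ltv_ads_only(c),
              ltv_blended(c, EXTERNAL_RETENTION_YEARS))
```

In this made-up data, the ads-only score ranks customer "a" above customer "b", while the blended estimate reverses the order. Which ranking is right depends on real data, but the example shows how a single-signal model can hide information the business already has.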
The truth? No data structure or model is perfect
Data is constantly evolving, and there isn’t a single perfect data model or structure. The key is figuring out when to use a specific approach. Settling for just one strategy may have gotten you where you are, but it won’t get you where you want to be.
So if I have any closing words of wisdom for you, it is to embrace a more holistic, data-informed business model to gain an accurate picture of what is taking place in your business and with your customers. It will not only help you deal with the issues I discussed above but also enable you to better identify future business hurdles, along with effective strategies to overcome them.