Design Thinking and Data Science

Design Thinking Improves Your Data Science

New ways to understand your problem and increase your project’s impact

Design Thinking helps you organize thoughts to create optimal solutions.
At the beginning of a project, we, the data scientists, ask questions about the problem statement and the data. The stakeholders usually respond with answers, concrete or not, about their problem. This usually goes on for a couple of meetings. Then, with the requirements firmly in hand, we begin wrangling data, building models, and so on.

But… What if the problem statement you are working on is not the real problem you are asked to solve? What if, in the end, and after a lot of work, you show the stakeholders your results, and they are underwhelmed or, even worse, never use them? In this article, I will cover some ways to avoid these outcomes using Design Thinking.

How Does This Impact a Data Scientist?

One of the biggest mistakes that a data scientist can make is jumping straight to the development of the project and not spending enough time understanding the true goals that have the biggest impact.

Sometimes, these goals are easy to identify, and complications are easy to see coming. This is usually true if the data scientist is familiar with the problem space and/or is integrated into the business processes.

Many data scientists come in as a “third party” to help drive the strategic goals of the group. This means that there is a high risk of misunderstanding the problem and creating a solution that doesn’t live up to stakeholder expectations or, worse, creating the wrong solution. If you’re involved in these types of projects, then this article is for you. There is a framework that we can use to help mitigate this risk before even starting the data science work. This framework is Design Thinking.

What is Design Thinking?

According to the Interaction Design Foundation — “Design thinking is an iterative process in which you seek to understand your users, challenge assumptions, redefine problems and create innovative solutions which you can prototype and test.”

This enables us as data scientists to:

  • Figure out the real problem we are trying to solve
  • Understand if we even need a model
  • Make sure what is in our stakeholder’s head is what we are going to provide
  • Know how the model will impact the users/business

There are several formal Design Thinking based processes out there, such as a design sprint, and you can choose to follow them if you wish. However, Design Thinking to me is not only a formal process but also a framework for how to think and how to build an understanding of who/what you are working with. This means that the Design Thinking structure can change depending on the audience, business, and problem.

General Phases of Design Thinking

Usually, Design Thinking is considered an iterative path of empathizing, defining, ideating, prototyping, and testing. Here the emphasis is on the word iterative. Although the steps I am reviewing are shown linearly, each step can be performed again as needed. Many times, I repeat steps to gather more information and further refine my understanding of the problem, user, or stakeholder.

The general definition of each step is:

  • Empathize: Build up an understanding of your customer and the problems they are facing.
  • Define: Define the problem(s) that need to be solved.
  • Ideate: Think about all the different ways that the problem can be solved. Any idea can be a good idea. Afterward, narrow down to the best idea(s).
  • Prototype: Develop the simplest possible working model of what you would like to build.
  • Test: Test the idea with your stakeholders. Get their feedback.

Let’s review how you can use each step to start your next project.


Empathize

As data scientists, our first instinct is to understand the data we are going to use to solve our problems. However, we also need to look beyond the data to the people involved in the problem. We can have all of the data in the world, but if we do not know how users or stakeholders interact with the product, and do not understand it in their terms, we cannot possibly build a solution that fully solves their problem. My favorite saying here is that we need to “fall in love with the problem”. Falling in love with the problem allows us to gain empathy for what our customers/stakeholders are going through.

There are several ways that we can gain empathy:

Stakeholder Interviews

  • Spend an hour (or more) in 1–1 meetings with stakeholders.
  • Learn the language, jargon, and methods that your stakeholders use today and where they see the gap.
  • Learn about how they are currently doing the work today or what they currently see.
  • Gather their ideas of how they would want to use the product if they had a magic wand to make everything better.

Product Exploration

  • Get to know the product you are working with if a version of it exists today
  • Get a user account in a test environment or a read-only account in a production environment to test out the product and become familiar with how a new user would interact with the product.
  • Use your understanding of data to see how your solution will impact the customers of your solution.
  • As a data scientist, you have a very useful lens to identify additional possible improvements or opportunities that others might not see.

Discussions with Leadership, Product Managers, and Team Leads

  • Inquire how leadership is using the product/data/system today to drive their business goals (if they are). Identify how the solution will or could contribute to those outcomes. Note: If the solution won’t contribute to the business goals, is it something that the stakeholders want or need?
  • If leadership is not using the outcome to drive any business goals, figure out what requirements there are to add model metrics to the process. Note: It might be possible to create a new metric that contributes to the overall goals of the business.


Define

Once the context and understanding are gathered, we need to distill it all down into the problem(s) that need to be solved. The define phase should be done with your stakeholders, and it will encourage productive conversations about the problems that need to be addressed. These conversations also ensure alignment on what the problem means to both customers/stakeholders and data scientists.

Turn fuzzy user feedback into discrete problems in the “Define” phase.

The most important aspects of this are:

  1. Define the problem in the terms and language that your stakeholders use
  2. Use the knowledge from the interviews to inform what problem statements should be generated

Now, there is a risk that the problem statements might morph into vision statements. The way to mitigate this is to ensure that your defined problems are manageable and actionable. This is especially important if you are working with people who are used to setting the vision for the group as that vision-setting might come most naturally to them.

The opposite can be true as well — a problem statement can get locked into a framework of the problems of today. This can happen if your audience is composed of those that are in the weeds and doing the work on a day-to-day basis. It is important to make sure the problem statement is a stretch goal and that we do not have to consider the frameworks, limitations, or structures of today. They might not be useful or even exist after the project is complete.

There are several ways to define a problem statement, but the two I use most are “How Might We” (HMW) statements and the use of the five W’s, which are discussed below.

How Might We “HMW” Statements

My favorite way to generate problem statements is to use HMW statements. In this process, everyone writes down problem statements starting with the phrase “How might we…”. They are usually generated individually and then voted on by the group to obtain the best problem statement. HMW statements are written positively to make sure we remember how the user should feel.

Some examples are:

  • How might we… redesign the car buying experience so people know how much their car is worth anytime?
  • How might we… make users feel confident their data is secured?

Make sure that the HMW statements do not suggest an outcome — the goal is not to figure out how to solve the problem at this point but identify what the problem is. Here are some examples of HMW statements rewritten from the previous examples that suggest an outcome:

  • How might we… build a model to predict a car’s value from its metadata?
  • How might we… create a data locking system to secure our data?

See how easy it is to slip into suggesting an outcome? We all have ideas, but we want to make sure we are aligned on what the problem is before we start thinking about how we are going to solve it. Sometimes, you may find that the problem you were asked to solve can be solved with something other than a model.

Now, the goal of this process is to come up with one or many approaches. The people involved also need to understand this path is iterative and that the group is on a journey to find the right problem and the right solution — knowing that ideas and requirements can morph and change just as the business does.

The Five W’s

If you are tight on time, the five W’s allow for a quick breakdown of problem statements into a single discrete problem statement. The five W’s are Who, What, When, Where, and Why. In this case, ask yourself and the team (probably with some debate):

  • Who is going to use the outcome?
  • What do they need to do with it?
  • When are they going to use it? (At what point in their process?)
  • Where, and on what platform, will they use it?
  • Why would they use this outcome? (Is it important?)
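
One way to keep the answers organized is a small record with one field per W. The sketch below is only an illustration: the `FiveWs` class, its method names, and the sample answers are hypothetical and not part of any formal Design Thinking tooling.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class FiveWs:
    """Answers to the five W's for one candidate problem statement."""

    who: str    # who is going to use the outcome
    what: str   # what they need to do with it
    when: str   # at what point in their process they use it
    where: str  # where, and on what platform, they use it
    why: str    # why the outcome matters to them

    def unanswered(self) -> list[str]:
        """Return the names of any W's that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def as_statement(self) -> str:
        """Combine the five answers into one discrete problem statement."""
        return (
            f"{self.who} need to {self.what} {self.when}, "
            f"{self.where}, so that {self.why}."
        )


# Hypothetical answers, loosely based on the car-value example above.
draft = FiveWs(
    who="Car owners",
    what="see what their car is worth",
    when="before they start talking to a dealer",
    where="on their phone",
    why="they can negotiate with confidence",
)

print(draft.unanswered())
print(draft.as_statement())
```

A non-empty result from `unanswered()` is a useful prompt for the team debate: a blank W usually means the problem is not yet understood well enough to move on.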

Once you have an HMW statement or have answered the five W’s, you will have at least one problem statement that the team can tackle. After voting on and selecting the most important problem statement, the next step is to think of all of the ways that the problem can be solved. This idea generation step of the Design Thinking process is called ideation.


Ideate

Now that you have your problem statement, you will need to think about the different ways the problem can be solved. In this step, any idea is a good idea, and the focus is on quantity over quality. Do this step with your stakeholders if possible, so you can receive immediate feedback on the ideas. Use this as a time to merge the technical speak into a non-technical story.

When you are ideating, try to discuss existing solutions that have addressed similar problem statements before. These could come from the industry you are operating in or from different industries. These types of conversations can spark even more interesting ideas and inspirational ways to combine existing or new data with problem statements. For example, if you’re trying to figure out the optimal place for truckers to stop to sleep, could you build an Airbnb-like model that lets drivers stay in people’s homes?

Using a whiteboard, PowerPoint, or Google Sheets to keep track of ideas can be beneficial. Come up with as many ideas as possible and prioritize them based on complexity, time to complete, and impact. For this, I like to keep in mind the impact to effort matrix from the Lightning Decision Jam.

Impact to Effort Matrix from the Lightning Decision Jam
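
If the ideas are tracked in a spreadsheet, the same prioritization can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the 1 to 5 scoring scale, the midpoint of 3, the quadrant labels, and the sample ideas are all hypothetical choices made for this example, and the scores themselves should come from the group.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    impact: int  # 1 (low) to 5 (high), scored by the group (assumed scale)
    effort: int  # 1 (low) to 5 (high), scored by the group (assumed scale)


def quadrant(idea: Idea, midpoint: int = 3) -> str:
    """Place an idea in one of four impact/effort quadrants."""
    high_impact = idea.impact >= midpoint
    high_effort = idea.effort >= midpoint
    if high_impact and not high_effort:
        return "do now"
    if high_impact and high_effort:
        return "make a project"
    if not high_impact and not high_effort:
        return "make a task"
    return "forget for now"


# Hypothetical ideas for the trucker rest-stop example.
ideas = [
    Idea("Rank existing truck stops by driver ratings", impact=4, effort=2),
    Idea("Home-sharing marketplace for drivers", impact=5, effort=5),
    Idea("Weekly email of popular stops", impact=2, effort=1),
    Idea("Custom in-cab hardware", impact=2, effort=5),
]

# Highest impact first; ties are broken by lowest effort.
ranked = sorted(ideas, key=lambda idea: (-idea.impact, idea.effort))
for idea in ranked:
    print(f"{quadrant(idea):>15}  {idea.name}")
```

The exact thresholds matter less than the conversation they trigger: an idea that lands on the border between two quadrants is usually worth a second round of discussion with the stakeholders.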


Prototype

Once you have your list of ideas for how you would solve the problem, the next step is to build a prototype.

Come up with a couple of prototypes that are quick to build.

Now, we are not talking about something that is a fully functional application. The goal here is to build something that looks like it would work but embraces the “fake it ’til you make it” mentality.

Most importantly, this outcome is NOT a model design. It should be the output of whatever model you are looking to build. In the end, your data science solution is probably going to reach users through some sort of interface: a dashboard, a notification in an app, or an API response. Here, we are looking to build out what the end result could look like, not an actual fully functional outcome.

This prototype could be a JSON file of what the model would return, a PowerPoint of how the app could look, or a hand-drawn picture of the graph that the model would generate. There is no “best” way to build the prototype. I usually go with the approach that allows me to create a couple of realistic-looking prototypes in a few hours.
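
As an illustration of the JSON option, the sketch below hand-writes the response a car-value model might return, with no model behind it. Every field name and value here is invented for the example (including the `is_mock` flag and the `render_mock` helper); the point is only that stakeholders can react to a realistic-looking output before any modeling starts.

```python
import json

# Hypothetical payload: all field names and values are made up for illustration.
mock_response = {
    "is_mock": True,  # makes it obvious that no real model produced this
    "vehicle": {"make": "ExampleMake", "model": "ExampleModel", "year": 2018},
    "estimated_value": {"amount": 15400, "currency": "USD"},
    "value_range": {"low": 14200, "high": 16600},
    "explanation": "Similar cars in your area sold in this range last month.",
}


def render_mock(payload: dict) -> str:
    """Serialize the fake model output as readable JSON for a review session."""
    return json.dumps(payload, indent=2)


print(render_mock(mock_response))
```

Questions such as “why is there a range?” or “what does the explanation mean?” are exactly the feedback this kind of prototype is meant to surface.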


Test

Once you have one or more prototypes, test them by showing them to the stakeholders and collecting feedback about the design. This is an important time for feedback and alignment. If what you’re looking to create doesn’t meet with the stakeholder’s approval, this is usually when the discrepancy comes out. Anything the stakeholder notices that is off or could be improved should be added to the requirements for the final model.

Some important considerations when collecting feedback about the prototype are:

Answer Questions without Leading the Feedback

Try not to directly answer a question like “What is this metric supposed to be?” Instead of answering with “that’s the number of reservations made per hour”, ask what the stakeholder or user thinks it is.

For example, respond to “What is this metric supposed to be?” with “What do you expect this metric to mean if anything?”

This gives you greater insight into what is driving the questions. Could it be that the feature is not clear enough or is it that an outcome is not useful or as expected?

Silence is OK

Be OK with silence — let the stakeholder or user play with or study the result. Try not to guide the user through your prototype. If your solution is supposed to be self-explanatory, then the user should hopefully be able to figure out what the solution is telling them. It’s OK if there are quiet spells as a user studies the prototype.

After the Test

Once we have feedback on the simple prototype, the real work begins.

With a clear outcome in mind, the assurance that you can think in terms of the customer, and the confidence that you will provide a solution they will value, you can start building the real solution.

The benefit of the Design Thinking process is that you can have a clear outcome of what needs to be done. In addition, your stakeholders feel connected to your process. They have the confidence that you are building something that will benefit them. They trust that you understand their point of view, where their opportunities are, and that you can identify where additional opportunities might be.