In my previous post, I described what I believe are some of the root causes of common problems for data science teams. So what do we do?

The problems facing data science are largely a function of the newness of data science groups to companies and other large organizations. Software development was once in a similar position. One way software teams addressed these problems was with the Agile approach. I believe we can borrow from this to address the challenges data science faces today.

For our purposes, I will define Agile as an iterative process focused on delivering maximum value to the end user. The end user for a data scientist is anyone who consumes the result of what we do.

As data scientists, we’re familiar with the idea of iteration - we often go through an iterative process during analysis. The essence of Agile is to put the end user more centrally within our iterations.

There are three specific Agile practices that can help address the problems data science groups face:

  • User Stories
  • Vertical Slicing
  • Stakeholder Reviews

User Stories

User stories are a way of describing work that puts the focus on the value to the end user. They encourage conversations between data scientists and stakeholders that help develop a shared understanding of what the impact will be.

It is natural for us as data scientists to describe our work in terms of what we plan to do. User stories give us a framework for shifting to why. The diagram below shows this contrast for an example from the TransLoc data science team. On the left is the way we would initially have described the work as data scientists. On the right is the user story for this work. It follows the traditional form of user stories, which is:

As a ____, I need to ____, so that I can ____.

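[Diagram: the data scientists' initial description of the work (left) next to the user story for the same work (right)]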

In my previous post I described one of the problems for data science groups as marginality, or the perception that what we do is tangential to the core work of the organization. The practice of user stories can address this by helping us make sure that we are addressing critical needs, and that people understand that this is the intent of our work.

An important point about user stories is that they are not just elaborate wordsmithing. In addition to reframing our communication and conversations with others, they also help reorient our thinking as data scientists. We may have envisioned a technical path to solving a problem, and never questioned our choices. User stories force us to think about the value we are trying to deliver and consider the range of ways we might achieve this. Often there may be simpler approaches that could have the same impact on the end user.

Vertical Slicing

Another incredibly valuable Agile concept is vertical slicing, which allows us to get feedback from the end user early and often. It also enables us to “maximize the work not done” by applying our resources in small increments and only continuing to work on projects that continue to deliver value. If you’ve ever done a lot of work to create a great result that no one used, this is the approach for you!

The traditional workflow is like a layer cake: we build a project layer by layer, and when it’s done and frosted, we deliver the results. Vertical slicing focuses on changing this in two ways:

  1. Vertical: tackle the work in a way that gets all the way from the bottom layer to the top, so that actual value is produced for the end user
  2. Slice: narrow the scope to the least amount of work that can deliver some value

The mantra is: find the thinnest vertical slice. The end result of this practice is that we can get feedback with as little work as possible, and determine whether to continue with the next vertical slice or devote resources elsewhere. It can be surprising how much more information we get about what’s valuable from putting a tangible output in front of users instead of just talking about what we’re planning to produce. This combats the problem of mystery by providing transparency into what data scientists produce in a way that is easy for people to understand, because it’s oriented around what they get from us as end users.

Stakeholder Reviews

The final Agile practice that brings together user stories and vertical slicing is stakeholder reviews. This means getting all the stakeholders of the data science group together to review the vertical slices that have been completed, and give input into the prioritization of what to do next. There are three main steps to a stakeholder review:

  1. Show the work that has been completed (and why it was done that way)
  2. Propose the work to do next in terms of user stories and vertical slices (and why these are the next priorities)
  3. Have a conversation about 1 and 2, and consider reshuffling priorities accordingly

Stakeholder reviews are best done on a regular cadence (at TransLoc we’re doing these for data science once a month) - this creates momentum and stakeholder engagement as they understand data science impact more and more over time. Stakeholder reviews address the problem of misalignment by exposing all stakeholders to the variety of ways data science work is having an impact, and by encouraging various stakeholders to talk to each other to resolve competing priorities for using data science resources.

Changing the Conversation

The challenges we face at this early stage of establishing data science groups will not be solved overnight, but I believe these Agile practices are one approach to helping us dissolve these barriers over time. By focusing on the continuous delivery of value to end users, we draw in the stakeholders in our organization and, in the process, learn more about how we can maximize our impact.
