Five dysfunctions of ‘democratised’ research. Part 4 – Quantitative fallacies

This is the fourth in a series of posts examining some of the most common and most problematic issues we need to consider when looking to scale research in organisations. You can start with the first post in this series here.

Here are five common dysfunctions that we are contending with.

  1. Teams are incentivised to move quickly and ship, care less about reliable and valid research
  2. Researching within our silos leads to false positives
  3. Research as a weapon (validate or die)
  4. Quantitative fallacies
  5. Stunted capability

In this post, we’re looking at what happens when we place our faith in numbers above all other kinds of evidence.

Dysfunction #4 – Quantitative fallacies

I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science, whatever the matter may be. – Lord Kelvin

I fear many people feel like Lord Kelvin.

There seems to be an intuition that knowledge which cannot be expressed numerically is less satisfactory than knowledge that fits into a graph. For an assertion to be taken seriously, it must have a number associated with it. Everything else is anecdote.

Perhaps we have inherited this from finance. Finance is, after all, the master of presenting future fictions with bold numbers and graphs – and its authority appears to be rarely challenged.

Organisations love quantitative research because it is fast and feels definitive.

Smash out a survey, launch an experiment, categorise customer feedback by keyword, look at the product analytics. Somehow, numbers just feel more reliable. More trustworthy.

The McNamara fallacy (also known as the quantitative fallacy), named for Robert McNamara, the US Secretary of Defense from 1961 to 1968, involves making a decision based solely on quantitative observations (or metrics) and ignoring all others. The reason given is often that these other observations cannot be proven.

–  Daniel Yankelovich “Corporate Priorities: A continuing study of the new demands on business.” (1972)

In my experience, a boldly presented number is much less likely to be challenged than an assertion backed up by more qualitative evidence. Yet surprisingly few people seem to be inclined (or able) to ensure that the work done to establish that number has any rigour.

Take surveys. How many organisations take the time to do cognitive interviewing to ensure that the data collected in the survey is valid and reliable? Very few. Most don’t know it is even something you should do, and the others don’t want to spend the time.

Do we just have blind faith that our survey respondents will make sense of the questions the same way we do? Or do we actually not care so much about the validity? We just want an answer. A definitive-sounding answer. Some data to show that we are evidence-based.

How many teams, when A/B testing two versions of a design using unmoderated research, watch the videos to make sure that people really did complete the task in a way that could be considered an adequate user experience? To check that the people who undertook the research bear any resemblance to who they said they were in the screener? To ensure that the things they say and the scores they give make sense when compared with the experience they actually had?
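Nothing replaces actually watching the sessions, but a simple triage pass over the exported session data can at least point a human reviewer at the most suspect ones first. Here is a minimal sketch in Python; every column name and threshold is invented for illustration and does not come from any particular testing tool.

    import pandas as pd

    # Hypothetical export from an unmoderated testing tool.
    # Column names and thresholds are invented for illustration, not any vendor's schema.
    sessions = pd.DataFrame({
        "participant": ["p1", "p2", "p3", "p4"],
        "task_seconds": [14, 210, 95, 9],                     # time spent on the task
        "reached_success_page": [False, True, True, False],   # did they actually finish?
        "self_reported_ease": [5, 4, 2, 5],                   # 1-5 rating: "I completed this easily"
        "screener_role": ["admin", "admin", "end user", "admin"],
        "in_study_role": ["end user", "admin", "end user", "end user"],
    })

    # Sessions worth a human review before their scores are trusted.
    too_fast = sessions["task_seconds"] < 30
    rating_vs_reality = (sessions["self_reported_ease"] >= 4) & ~sessions["reached_success_page"]
    screener_mismatch = sessions["screener_role"] != sessions["in_study_role"]

    sessions["flags"] = (
        too_fast.map({True: "implausibly fast; ", False: ""})
        + rating_vs_reality.map({True: "high rating but no task success; ", False: ""})
        + screener_mismatch.map({True: "screener mismatch", False: ""})
    )

    print(sessions.loc[sessions["flags"] != "", ["participant", "flags"]])

This doesn’t remove the need to watch the recordings; it just makes it harder to quietly skip the step.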

All sounds a bit time-consuming, doesn’t it, when all you really want is data to tell you what to do. To take the decision out of your hands.

We’ve managed to convince ourselves that with a large enough volume of respondents, these problems go away. But the fact is, these numbers can easily be completely misleading. People don’t understand the survey question and answer anyway – to get the incentive, to find out what other questions you’re asking, or simply because some of us are completists.
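A minimal simulation, with invented numbers, shows why volume doesn’t save us: if a fixed share of respondents misreads a question and answers ‘yes’ regardless, a bigger sample only makes the wrong estimate more precise.

    import random

    random.seed(1)

    TRUE_DEMAND = 0.20     # share of users who genuinely want the feature (invented)
    MISREAD_RATE = 0.30    # share who misread the question and answer "yes" anyway (invented)

    def run_survey(n: int) -> float:
        """Simulate n responses to 'Would you use this feature?' with systematic misreading."""
        yes = 0
        for _ in range(n):
            if random.random() < MISREAD_RATE:
                yes += 1                      # misreaders say yes regardless of need
            elif random.random() < TRUE_DEMAND:
                yes += 1                      # genuine demand
        return yes / n

    for n in (50, 500, 5000, 50000):
        print(f"n = {n:>6}: observed demand = {run_survey(n):.3f}  (true demand = {TRUE_DEMAND})")

As n grows, the estimate settles around 0.44, nowhere near the true 0.20. More respondents buy precision, not validity – a systematic misunderstanding doesn’t average out.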

Recently my team did some survey testing – we were testing a feature prioritisation survey (not my favourite). We observed people who told us they didn’t understand a feature as it was described in the survey. It sounded cool, though, so they went on to prioritise it highly against the other features regardless.

How often does this happen? No one knows.

The first step is to measure whatever can be easily measured. This is OK as far as it goes.

The second step is to disregard that which can’t be easily measured or to give it an arbitrary quantitative value. This is artificial and misleading.

The third step is to presume that what can’t be measured easily really isn’t important. This is blindness. The fourth step is to say that what can’t be easily measured really doesn’t exist.

This is suicide.

–  Daniel Yankelovich “Corporate Priorities: A continuing study of the new demands on business.” (1972)

There are multiple, related quantitative fallacies.

Some, like McNamara and Lord Kelvin, believe that quantitative data is simply superior. Others are more subtle – they trust that the trade-off made for speed and convenience does not do dangerous damage to validity and reliability. Still other fallacies result from a lack of experience and ability in defending qualitative data and critiquing quantitative methods.

The fastest and most ‘definitive’-sounding methodologies (and the tools that enable them) have never been more popular. While it is encouraging that more and more people are keen to take a more human-centred approach to product design, experienced researchers need to intervene to make sure that these methods are being used, and critiqued, appropriately.

We need to ensure that our organisations don’t over-index on rapid, quantitative methods just because they play well with senior leadership. And when we do use these methods, we need to maintain a high enough quality standard that we can genuinely stand behind the numbers and believe they have some reliability and validity.

You can read about the next dysfunction here.

Five dysfunctions of ‘democratised’ research. Part 3 – Research as a weapon

This is the third in a series of posts examining some of the most common and most problematic issues we need to consider when looking to scale research in organisations. You can start with the first post in this series here.

Here are five common dysfunctions that we are contending with.

  1. Teams are incentivised to move quickly and ship, care less about reliable and valid research
  2. Researching within our silos leads to false positives
  3. Research as a weapon (validate or die)
  4. Quantitative fallacies
  5. Stunted capability

In this post, we’re looking at what happens when research is ‘weaponised’ in teams.

Dysfunction #3 – Research as a weapon (validate or die)

Over-reliance on research, without care for the quality of that research, can also be a symptom of another problem in our organisations – a lack of trust between disciplines in a cross-functional team.

In particular, the relationship between design and product management can have a substantial impact on the way that research is used in product teams. If the relationship is strong, aligned and productive, research is often used to support real learning in the team. But where the relationship is less healthy, it is not uncommon to see research emerge as a form of weaponry.

[Image: XKCD comic about a relationship that has declined because one partner graphs everything. © XKCD]

Winning wars with research

How does research become weaponry? When it is being used primarily for the purpose of winning the argument in the team.

Using research as evidence for decision making is good practice, but as we have observed in earlier dysfunctions, the framing of the research is crucial to ensuring that the evidence is reliable and valid. Research that is being done to ‘prove’ or ‘validate’ can often have the same risk of false positives that comes from the silo dysfunction.

This is because the research will often be too tightly focussed on the solution in question, with little or no interest from the team in the broader context. This lack of realistic context can lead teams to believe that solutions are more successful than they will ultimately turn out to be in the real context of use.

Data as a crutch for design communications

Another reason to see research being used as weaponry is to compensate for a lack of confidence or ability in discussing the design decisions that have been made. Jen Vandagriff, who I’m very fortunate to work with at Atlassian, refers to this as having a ‘Leaky Design Gut’.

Here we see research ‘data’ being used instead of (not as well as) the designer being able to explain why they have made their design decisions. Much as I love research, it is foolish to believe that every design decision needs to be evidenced with primary research conducted specifically for this purpose. Much is already known, for example, about design decisions that can enhance or detract from the usability of a system.

In a team where the designer is able to articulate the rationale and objectives for their design decisions, and there is trust and respect amongst team members, the need to ‘test and prove’ every decision is reduced.

Validation can stunt learning

Feeling the need to ‘prove’ every design decision quickly leads to a validation mindset – thinking, ‘I must demonstrate that what I am proposing is the right thing, the best thing. I must win arguments in my team with data.’

Before going straight to ‘research as validation’, it is worth considering whether supporting designers to grow their ability to be more deliberate in how they make and communicate their design decisions could be a more efficient way to resolve this challenge.

Sometimes it is entirely the right thing to run research to help understand whether a proposed approach is successful or not. The challenge is to ensure that we avoid our other dysfunctions as we do this research. And to make sure that this doesn’t become the primary role of research in the team – to validate and settle arguments. Rather, it should be part of a ‘balanced diet’ of research in the team.

If we focus entirely on validation and ‘proof’, we risk moving away from a learning, discovery mindset. We come to prefer the leanest and most definitive-seeming practices. A/B testing prototypes and the creation of scorecards are common outputs of this mindset. We’re incentivised to ignore any flaws in the validity of the method if we’re able to generate data that proves our point.

Alignment over evidence

Often this behaviour comes from a good place. A place where teams are frustrated with constant wheel-spinning because everyone has an opinion. Where the team is trying to move away from opinion-based decision making, where either the loudest voice always wins or the team feels frustrated by its inability to make decisions and move forward. Using research to address these frustrations does make sense and should be encouraged.

Validation research can provide short-term results to help move teams forward, but it can reinforce a combative relationship between designers and product managers. Often this relationship stems from a lack of alignment around the real problems the team is setting out to solve. Investing in more ‘discovery’ research, done collaboratively as a ‘team sport’, can be incredibly powerful in creating a shared purpose across the team and promoting a more constructive and supportive teamwork environment.

Support from an experienced researcher with sufficient seniority can help the team avoid the common pitfall of seeking the fastest and most definitive ‘result’, and instead achieve a shared understanding of both the problem and the preferred solution. Here the practice of research, done collaboratively as a team, can help not only to inform more confident decision making, but also to heal some tensions in the team by bringing it together around a shared purpose – solving real problems for their customers or users.

You can read about the fourth dysfunction here.

Five dysfunctions of ‘democratised’ research. Part 2 – Researching in our silos leads to false positives

This is the second in a series of posts examining some of the systemic problems that organisations tend to rub up against as they seek to ‘scale’ their research activity. We are looking particularly at ‘dysfunctions’ that can result in, at best, ineffective work and, at worst, misleading and risky outcomes. You can start with the first post in this series here.

Here are five common dysfunctions that we are contending with.

  1. Teams are incentivised to move quickly and ship, care less about reliable and valid research
  2. Researching within our silos leads to false positives
  3. Research as a weapon (validate or die)
  4. Quantitative fallacies
  5. Stunted capability

In this post, we’re looking at the impact of our organisation structure on research outcomes.

Dysfunction #2 – Researching within our silos leads to false positives

Always design a thing by considering it in its next larger context – a chair in a room, a room in a house, a house in an environment, an environment in a city plan. – Eliel Saarinen

The larger the organisation, the more fragmentation and dependencies you tend to get across teams. Teams are organised by product or platform, and then often by the feature set they work on. Occasionally teams are organised by a user type, and very rarely you find some arranged by user journey.

Even in this complex ecosystem of teams where dependencies are rife, the desire for autonomy remains. We tend to avoid reliance on other teams where possible. We don’t want our own team’s velocity or ability to ship to be slowed by anyone else. In this environment, collaboration between teams is tough: it can be hard to coordinate, and there’s no incentive to take the time and trouble. And this leads to greater focus, which, in theory, is great, except…

Beware the Query Effect

When it comes to research, we know how critical getting the right research question is. Getting the ‘framing’ of the research right is crucial because, as the Query Effect tells us (and as we know from our own personal experience), you can ask people any question you like and you’ll very likely get data in return.

Whenever you do ask users for their opinions, watch out for the query effect:

People can make up an opinion about anything, and they’ll do so if asked. You can thus get users to comment at great length about something that doesn’t matter, and which they wouldn’t have given a second thought to if left to their own devices. – Jakob Nielsen

By focussing our research around the specific thing our team is responsible for, we increase our vulnerability to the query effect.  That little feature is everything to our product team and we want to understand everything our users might think or feel about it, but are we perhaps less inclined to question our team’s own existence in our research?

Researchers are encouraged to keep the focus tight, to not concern themselves with questions or context that the team cannot control or influence.

I like to use a visual illustration of why that is problematic. Take a quick look at the image below. What strange sea creature do we have here, do you think? Looks quite scary, right?

[Image: a scary-looking shadow in the water]

Oh but wait, when you pull back just a little more you realise the story is completely different, and all we have here is a little duck, off for a swim, nothing to worry us at all.

[Image: a duck swimming in the water, with its shadow (no longer scary) below]

How often is our research so tightly framed on the feature our team is interested in that we make this mistake?

We think something is important when actually, in the proper context of the real user need, it is not so important at all. Or conversely, we focus so tightly on something we think is important when what our users care about is just out of frame – just outside the questions we are asking, which they are now so busily and helpfully answering, even though it is not the important thing.

I fear this is one of the most common dysfunctions we see in product teams doing research in the absence of people who are sufficiently experienced, senior and confident to encourage teams to reshape their thinking.

What is the risk?

Research that is focussed too tightly on a product or a feature increases the risk of a false positive result. A false positive is a research result which wrongly indicates that a particular condition or attribute is present.

False positives are problematic for at least two reasons. Firstly they can lead teams to believe that there is a greater success rate or demand for the product or feature they are researching than is actually the case when experienced in a more realistic context. And secondly, they can lead to a lack of trust in research – teams are frustrated because they have done all this research and it didn’t help them to succeed. This is not a good outcome for anyone.

The role of the trained and experienced researcher is not only to have expertise in methodology, but also to help guide teams to set focus at the right level, to avoid misleading ourselves with data. To ensure we not only gather data, but are confident we are gathering data on the things that really matter. Even if that requires us to do research on things our team doesn’t own and cannot fix, or to collaborate with others in our organisation. In many cases, that additional scope and effort can be essential to achieving a valid outcome that teams can trust and use to move forward.

You can read about the third dysfunction here.

Five dysfunctions of ‘democratised’ research. Part 1 – Speed trumps validity

The good news is that more and more organisations are embracing research in product teams. Whether it is product managers doing customer interviews or designers doing usability tests, and everything in between – it is now fairly simple to come up with a compelling argument that research is a thing we should probably be doing.

So we move on to the second generation question. How do we scale this user centred behaviour?

Depending on where in the world you are – and your access to resources – your answer is usually to hire more researchers and/or to have other people in the team (often designers and product managers) do the research. This is often known as ‘democratising research’.

Almost certainly this is the time that an organisation starts looking to hire designers and product managers with a ‘background in research’ and to establish some research training programs, interview and report templates and common ways of working.

This all sounds eminently sensible, but there are some fairly structural issues in how we work that can undermine our best intentions. At best, they can render our research wasteful and inefficient, and at worst they can introduce significant risks into the decisions our teams make.

Each of these is a systemic issue, and anyone doing research as part of a cross-functional product team is likely to be affected.

So, let’s assume that people doing research have had adequate training on basic generative and evaluative research methods – here are five common dysfunctions that we will need to contend with.

  1. Teams are incentivised to move quickly and ship, care less about reliable and valid research
  2. Researching within our silos leads to false positives
  3. Research as a weapon (validate or die)
  4. Quantitative fallacies
  5. Stunted capability

Here we will start with the first, which is one that many will find familiar.

Dysfunction #1.
Teams are incentivised to move quickly and ship, care less about reliable and valid research

The most popular research tools are not the ones that promise the most reliable or valid research outcomes, but those that promise the fastest turnaround. One well known solution promises:

Make high-confidence decisions based on real customer insight, without delaying the project. You don’t have to be a trained researcher, and there’s no need to watch hours of video.

It sounds so appealing, and it is a promise that a lot of teams want to buy. Speed to ship, or velocity, is often a key performance indicator for teams. It’s not a coincidence that people usually start with ‘build’ and rush to an MVP when talking about the ‘learn, build, measure’ cycle.

Recruitment trade-offs made for speed

The challenge is that doing research at pace requires us to trade off characteristics that are important to the reliability and validity of research.

One of the most time-consuming aspects of research is recruiting participants who represent the different attributes that are important for understanding the user needs the product seeks to meet. The validity of the research is constrained by the quality of the participant recruitment.

What do we mean by validity? In the simplest terms, it is a measure of how well our research captures what we intend it to capture.

Most of the speedy research methods – whether that’s guerrilla research at the coffee shop or using an online tool – tend to compromise on participant recruitment. Either you just take whoever you can get from the coffee shop that morning, or you recruit from a panel of participants online and trust that they are who they say they are and that they won’t just tell you nice things so you don’t give them a low star rating and they get to keep this income source.

There are many kinds of shortcuts to be taken around recruiting – diversity of participants, ‘realistic-ness’ of participants or number of participants, to name a few. Expect to see some or all of these shortcuts in operation in product teams where speed to ship is the primary goal.

Being fast and scrappy can be a great way to do some research work, but in many teams the only kind of research they are doing is whatever is fastest. This is like eating McDonalds for every meal because you’re optimising for speed… and we all know how that works out.

Teams are trading off research validity for speed every day. Everyone in the organisation understands the value of getting something shipped, and this is often measured and rewarded. Not so many people understand the risks associated with making speed-related trade-offs in research.

What is the risk?

Misleading insights from research can send a team in the wrong direction. That can lead a team to spend time creating and shipping work that does not improve their users’ experience or meet their users’ needs. Work that does not increase the desirability or necessity of their product, and thereby negatively impacts their productivity and the profitability of their organisation.

Does this mean that speed to ship is bad? Should all research be of an ‘academic standard’?

No.

Testing to identify some of the larger usability issues can often be done with small participant numbers and less care to find ‘realistic’ respondents. But if the work that results from your research findings is going to take more than one person more than a week to implement, it might be worth increasing the robustness of your research methodology to increase confidence that this effort is well spent.
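One commonly cited rule of thumb behind ‘small participant numbers’ is the problem-discovery model popularised by Nielsen and Landauer: the chance of seeing a problem at least once in a study is 1 − (1 − p)^n, where p is how often the problem affects any single participant and n is the number of participants. A quick sketch, with illustrative p values:

    # Problem-discovery model popularised by Nielsen and Landauer:
    # probability of observing a problem at least once with n participants,
    # where p is the chance that any single participant hits the problem.
    def chance_of_seeing(p: float, n: int) -> float:
        return 1 - (1 - p) ** n

    for p in (0.31, 0.10, 0.01):   # frequent, moderate and rare problems (illustrative values)
        results = ", ".join(f"n={n}: {chance_of_seeing(p, n):.0%}" for n in (3, 5, 10, 20))
        print(f"problem affecting {p:.0%} of users -> {results}")

A problem that hits roughly a third of users will almost certainly show up with five participants; a problem that hits one user in a hundred probably won’t show up with twenty. Which is exactly the point: small, fast studies are fine for finding the big issues, but not for claims that need a trustworthy number.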

People doing research need to be clear with their teams about the level of confidence they have in the research findings (it is fine for some research to result in hunches rather than certainty, as long as that is clearly communicated). And teams should plan to ensure they are using a healthy diet of both fast and more robust research approaches.

Organisations need to ensure they have someone sufficiently senior asking questions (and able to critique the answers) – not just about the existence of data from user research, but also looking under the hood to evaluate the trade-offs being made and, as a result, the level of confidence and trust we should place in the insights and claims.

You can read about the second dysfunction here.