While handling, (re)viewing, and digging into data we’ve come to ask ourselves: Does user research data validity expire? While it’s quite easy to work with recent data or data we’ve gathered ourselves – it’s rather hard to work with older resources. So, what can we do to keep research data actionable?
“User research data is like milk, the longer it stands, the more likely it will go sour.” A few months ago at the great MuC Conference 2020, we had a Barcamp where this metaphor came up.* The quote shows that it’s not clear how well user research data ages.
Though milk is a beautiful metaphor, we partly disagree with it. User research data does not necessarily expire. Hence, in the following, we will outline the dimensions of user research data validity. Let’s see on which factors validity and preservability of data depend.
*(Unfortunately, in the session nobody knew the author – so feel free to send us a reference to give the proper credits here.)
What is user research data validity?
The three types of user research data
First, let’s clarify which kinds of user research data exist. We consider three types:
- Raw data: interview recordings, notes, survey answers, highlighted phrases, quotes, or observations.
- Insights: learnings and interpretations derived from the raw data, e.g. from previous user testing sessions.
- Presentations: management summaries and communicated recommendations, used mainly for communication and aggregation.
What is invalid user research data?
Imagine being a user researcher working for a major music label and being responsible for target group insights. Having collected a lot of user research data in the early 2000s, you've derived an insight saying that the most common way to listen to music is CDs. This insight is supported by several facts extracted from user interviews about their listening habits.
The finding is not false, but looking at the year 2020, it’s easy to see it’s not valid anymore.
As you can see, there are expiration factors for data. But not all of them are as simple to spot as in our example. Thus, before using it, you have to check whether the "milk", i.e. your user research data, is still "good", i.e. valid.
How to validate user research data?
To explain that, let's look at it from the perspective of a user researcher: User research data is invalid if it provides no value for a research question and the actions that follow from it. Or, even worse, if it damages your results.
In practice, it may be difficult to assess whether this is the case. There is no silver bullet, because evaluating user research data does not work by making a binary decision.
Therefore, let’s keep it simple. User research data is valid if both of the following statements are true:
- You can draw benefits from it for your current research, be it content, method learnings, or hypotheses for future research. If this is not the case, remember: as long as it's not obviously invalid, user research data can also be beneficial for future research topics.
- You can access the relevant data and connections, so there is enough information for gaining a proper view of the context.
What leads to value loss of user research data?
Referring to the metaphor from the beginning: What is it that makes the milk go sour?
While in our example it is the lactic acid bacteria that make the milk go sour, in user research there are different kinds of “destructive” actions that lead to validity loss of user research data. These actions appear in the following dimensions:
- active – e.g. deleting or unlinking data
- passive – outdated data due to changes (surroundings, or the users’ knowledge and skills, e.g. regarding new technologies)
- procedural – data processing and the therefore used methodology
- external – changes you cannot influence, e.g. operating system updates, regulatory laws
- internal – changes due to decisions made, e.g. company strategy pivots
Part of each study or information collection are the "assets" connected to it. The term "assets" covers concrete things like a product, a certain feature, a task, or the users themselves.
Product iterations or a strategic (re)orientation of the target group change these “assets of interest”. And when an asset changes, this may affect the data connected with it.
In user research, you derive insights from facts. Facts are objective bits of information you’ve gained from experiments you’ve conducted.
One of the things that supports the validity of a fact is its links to context, for example, the situation in which a user expressed a certain desire. Without any context, a quote is worth nothing. But even if you are familiar with the current context, you may encounter major context changes in the future. Consequently, the value of the data declines.
A typical example of evolving information could be the changing surroundings of your target group as shown by the current COVID-19 pandemic. Within a few months, online shopping behavior changed massively. Therefore, especially when working with old data, it’s crucial to check whether the data’s context is still valid.
A user researcher is not a machine. There will always be (un)conscious influence on statements and interpretations, caused by biases. Biased information that finds its way into research data is often invalid input. So, there should at least be labels that enable the researcher to identify this kind of input.
Missing level of detail
To understand your collected user research data you need to know enough details about it. This strongly correlates with the availability of contextual information.
Conducting user research is always a trade-off between gaining as much information as possible and putting an appropriate amount of effort into the documentation process.
In general, a full recording of an interview or interaction has one of the highest detail levels. That is the data format with the smallest possible loss of information, but also with the highest documentation effort.
In a wonderful presentation about insights, held at a meet-up, Nikki Anderson from User Research Academy talked about the value of having users recorded on video. According to her, videos make it easier to put ourselves in the users’ shoes.
Based on her statement, it’s quite clear: We lose opportunities to derive insights if there is only a written transcript of the original data.
Depending on how you handle data, this could be a reason why older insights might be difficult to review. Therefore, before deleting original video data (e.g. because of PII commitments) it’s important to document the relevant context, observations, and actions to at least keep the validity of the major facts.
A similar issue is sloppy note-taking: when notes, e.g. those taken during an interview, are incomplete, you end up with an inconsistent note collection. Thus, important pieces for confirming the relevance of an insight might be missing.
Losing implicit information
You might wonder why the detail level of user research data can decline over time.
On the one hand, humans understand many things implicitly. This means that during a research session, a researcher has the full amount of related information at hand but does not write everything down.
On the other hand, humans' ability to remember things is limited. Things like the time of day, the mood, the product version, or individual quotes don't make it into the final notes.
Since some pieces of knowledge remain implicit, they are hard to recall later. Thus, in those cases, it’s not possible to confirm validity anymore, even though the data itself might still be valid.
Obligatory data manipulation
Sometimes, losing detail is intentional, for privacy reasons. This refers to the anonymization of recorded data as well as the de-linking of participant pool information. De-linking means replacing person-related data with abstracted, non-personal information.
Video data in particular must be handled with care because it contains personally identifiable information (PII): camera images, voice recordings, or information shown on the test user's screen. According to privacy regulations and out of respect for the test users, the recordings must be anonymized or deleted after a certain time.
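To make de-linking concrete, here is a minimal sketch of how person-related fields could be replaced with stable pseudonyms, so that records from the same participant stay linkable without exposing their identity. The field names and the salt are hypothetical illustrations, not part of any specific tool:

```python
import hashlib

def delink(record, pii_fields, salt="project-secret"):
    """Replace person-related fields with stable pseudonyms.

    A salted hash keeps records from the same participant linkable
    without storing the participant's actual identity.
    """
    cleaned = dict(record)
    for field in pii_fields:
        if field in cleaned:
            token = hashlib.sha256((salt + str(cleaned[field])).encode()).hexdigest()[:8]
            cleaned[field] = f"participant-{token}"
    return cleaned

note = {"name": "Jane Doe", "email": "jane@example.com",
        "quote": "I only listen to music via streaming now."}
anonymized = delink(note, ["name", "email"])
```

Because the pseudonym is derived deterministically, two de-linked notes from the same person still match up, which preserves the connections that validity checks depend on.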
Missing access to user research data
A major issue when working with old user research data is to comprehend the connections between different bits of information. Or, speaking in terms of Atomic UX research, a common user research framework: It can be hard to trace the links between facts and insights back. This problem especially appears when using tools that don't have functionalities for accessing, filtering, and visualizing user research data.
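As a rough illustration of such links, facts and insights can be stored with explicit references to each other, so an insight can always be traced back to its supporting evidence. This is our own minimal sketch of an Atomic-UX-research-style data model, not the schema of any particular tool:

```python
from dataclasses import dataclass, field

@dataclass
class Fact:
    id: str
    text: str
    source: str                                   # e.g. interview recording or survey ID
    context: dict = field(default_factory=dict)   # situation, product version, date, ...

@dataclass
class Insight:
    id: str
    statement: str
    fact_ids: list                                # links back to the supporting facts

facts = {
    "f1": Fact("f1", "User burns CDs for the car", "interview-2003-07", {"year": 2003}),
}
insight = Insight("i1", "CDs are the most common way to listen to music", ["f1"])

def supporting_facts(insight, facts):
    """Trace an insight back to its facts; missing links signal lost validity."""
    return [facts[fid] for fid in insight.fact_ids if fid in facts]
```

If `supporting_facts` returns fewer facts than the insight references, a link has been broken, which is exactly the kind of value loss described above.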
How can you preserve user research data?
Does this all sound scary? It doesn't have to be. Next, we want to show you how to preserve your user research data.
Previously, we’ve explained five things that compromise your user research data validity. So, here are our solutions:
- Set up storage for user research data, so you can work with your data in a sustainable way in the long-term.
- Use a proper link structure when connecting facts.
- Be aware of biases. The basic principle is to trust your colleagues with what they contribute. Nevertheless, remove strong biases or personal opinions from your user research data storage.
- Keep your expired user research data actionable: Extract what is still valid, collect meta information, and derive future research questions.
- Implement a user research data validity checklist in your workflow, so that only valid data makes it into storage.
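Such a checklist could be sketched as a simple gate that checks the two validity statements from earlier before data enters the repository. The entry fields below are our own assumptions for illustration:

```python
def is_valid(entry):
    """Minimal validity gate: both criteria from above must hold."""
    # Statement 1: the data provides benefits (content, method learnings, hypotheses).
    provides_benefit = bool(entry.get("benefits"))
    # Statement 2: enough information is accessible to reconstruct the context.
    has_context = bool(entry.get("source")) and bool(entry.get("context"))
    return provides_benefit and has_context

entry = {"benefits": ["hypothesis for follow-up study"],
         "source": "interview-2020-11",
         "context": {"product_version": "2.3"}}
```

In practice, the checks would of course be richer (bias labels, asset references, expiry dates), but even a gate this simple prevents context-free data from silently entering storage.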
Store your user research data!
We admit it is natural for user research data to lose relevance over time. But even outdated user research data is still valuable because you can learn something from it. And from what you’ve learned you can derive new research questions – it’s a sustainable process!
As you may have noticed, the validity of user research data is strongly tied to the benefit of storing it in the right way. If you take the right precautions, you'll have a super fridge for your milk that prevents it from going sour, or at least slows the process down.
The proper way to face this is to use a user research repository as the central tool. This allows you to handle complex data and to support your UX team’s workflow. With consider.ly, we’ve built such a tool. During this process, we’ve stumbled upon the issues mentioned here and wanted to share how we improve user research data validity.