3 Ideas that will completely change how you think about data
There is no inherent value in any piece of data because all information is meaningless in itself. Why? Because information doesn’t tell you what to do.
— Beau Lotto
As I write this I’m sitting in a small conference room on the second floor of an office building. The view from the windows is a paved courtyard down below roughly 25 feet from the building with some tables, chairs, and well-manicured landscaping. I can see that the sun is shining, and it looks like a lovely day. Based on that data, should I go work outdoors? Consider your answer, and we’ll come back to the question later.
If you are a designer, engineer, or in any role that creates things, you probably hear a lot about “big data” and being “data driven.” The assumption is that data equals insight and direction. But does it? Data, any data, in any amount, brings with it problems that make it very dangerous to rely on alone. Let’s consider a few of them. First, data is just information and, on its own, does not represent objective reality. Next, whatever data you have is never, ever complete. Finally, getting more data does not necessarily mean more clarity. Let’s look at each of these in more detail.
Data is not reality
Humans are great at making decisions based on their context and history, but we’re pretty bad at seeing the possibilities beyond that. Here’s an example. Read the text below aloud:
If you read “what are you reading now?” you did what many English readers would do with the same “data.” You did this even though there is not a single English word in that sentence. You’re able to read something meaningful by taking both the context of this article and your history with the English language and filling in the blanks. Also note that I asked you to read the sentence, so you saw the word “reading” in the letters. That priming helped determine the outcome. Not everyone reads it the same way, however. If you were munching on something, or sitting in a restaurant, you might just as easily have read “what are you eating now?” And for anyone who doesn’t read English, the letters would be what they really are: gibberish. The point here is that how we process data is highly contextualized by the individual doing the processing. Often we will come to the same conclusions based on our shared history or context, but just as often we can come to different conclusions from the exact same data for the same reasons.
All data is missing something
Small data, big data, it doesn’t matter. All data is incomplete at some level. To demonstrate, let’s imagine you have to create a software product and decide that the best way to focus your work is to create a profile of your intended customer. Your expectation is that the profile will provide some insight as to what to build. You create a “persona” named Linda from data you collected.
- A female
- 31 years old
- A Philosophy major
- Is deeply concerned with issues of discrimination and social justice
- As a student, participated in anti-nuclear demonstrations.
While this data might be useful for a profile, almost no one would call this a complete view of a person, let alone a population. That being the case, based on the data given, which of the following scenarios is more probable?
- Linda is a bank teller
- Linda is a bank teller also active in women’s rights
If you are like more than 80% of the people confronted with the Linda problem, you say that scenario 2 is more likely to fit Linda. That response, however, violates the logic of probability. If the question is which is more likely, then between scenario 1 and scenario 2 the answer has to be scenario 1, because the set of feminist bank tellers is contained in the set of bank tellers, but the reverse is not true. So it’s more likely that Linda is a bank teller than a feminist bank teller. Why do we make this mistake? Without getting into all the behavioral psychology, the basic idea is that scenario 2 tells a better story, so we prefer it. Put another way, there is a lot missing in this data set, so our brains take what is there and fill in the rest. The apparent specificity allows us to construct a story that seems to make the most sense, but the logical reality is the opposite. In this case most people are just ignoring an obvious logical error, but you might imagine even more data about Linda (how she dresses, where she lives, who she associates with, and so on) that might lead you to build an even more complete profile of her. But that profile would still be missing information and could very well be wholly inaccurate in terms of understanding what product a real customer actually needs.
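For readers who like to see the arithmetic, here is a minimal sketch of why the two-part scenario can never be the more probable one. The function name and all of the numbers are made up for illustration; they are not drawn from the Linda study or from any real population. The only rule in play is that the chance of two things both being true equals the chance of the first multiplied by the chance of the second given the first, and that second factor can never exceed 1.

```python
def conjunction_probability(p_teller, p_activist_given_teller):
    """Return P(teller AND activist) = P(teller) * P(activist | teller)."""
    for p in (p_teller, p_activist_given_teller):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_teller * p_activist_given_teller


# Hypothetical values, assumed purely for illustration.
P_TELLER = 0.02  # assumed chance that someone like Linda is a bank teller

# Even if we believe an activist streak is almost certain, the combined
# scenario never becomes more likely than "bank teller" on its own.
for p_given in (0.10, 0.50, 0.95, 1.00):
    p_both = conjunction_probability(P_TELLER, p_given)
    print(f"P(activist | teller) = {p_given:.2f}: "
          f"P(teller and activist) = {p_both:.4f}, P(teller) = {P_TELLER:.4f}")
```

However strongly the profile suggests activism, multiplying by a number no greater than 1 can only shrink or preserve the first probability. The better story is never the safer bet.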
More data, less clarity?
Data alone can also have the effect of clouding our ability to see creative solutions, even to simple problems. Here’s an example:
You need to attach a candle to a wall (a cork board) and light it in such a way that candle wax won’t drip onto a table directly below.
To do so, you may only use the following along with the candle:
- a book of matches
- a box of thumbtacks
Any ideas? Let me give you more data, say a “bigger data” version of the problem, and see if that helps:
This little experiment, known as Duncker’s candle problem, has been tested on a variety of subjects all over the world, and while they come up with a lot of creative ideas most don’t solve the problem. In the rare cases where they do, the solution is usually terribly complex or inefficient.
The best (and simplest) solution is to empty the box and tack the box to the wall to hold the candle. So simple, right? But that is not what most people come up with, at least not right away. The description of the problem is pretty limited, but it seems as if providing a picture in addition to the description doesn’t provide more help, and may even reduce the ability to find the solution. What is happening here? First, the problem statement is that we have to attach the candle to the wall, and we have a preconception that thumbtacks are used for attaching stuff to walls. In addition, the description and corresponding picture are establishing that the box is a container for the tacks. These descriptions create biases about the objects that are not easy for most people to overcome, making it very difficult to see other ways the materials can be used.
Over the years researchers have tried different ways to improve the ability to see the solution more quickly. Some ways that have worked include changing the description of the available items to:
- a book of matches
- a box
- thumbtacks
In a similar vein the picture is changed to:
These subtle but important changes make a big difference, and increase the chances that participants will find the solution, or find it more quickly. In essence, they improve creativity. Why? This second description and picture help remove some of the bias noted above, and allow us to see more clearly that the box could be used as a shelf. “A box and thumbtacks” is a different way of looking at the data than “a box of thumbtacks.” Voilà!
The bad news in all this is that data by itself is at best meaningless, and at its worst, misleading. In most cases it will tell you very little or nothing about what to do. Unfortunately, that’s not how many professionals treat the data they are given. I often hear colleagues, in the midst of needing a design or business decision, ask “what does the data tell us to do?” The real answer: not much.
If we stopped here things might look bad for our data-rich future, but it’s not hopeless. Here are three ways you can approach data that will enrich your creativity and enable you to use the information you get in very powerful ways.
1. Experiment with the recipe
I work in a group that creates things that may be used by half a billion people. I manage the data science team for that group and we are increasingly getting requests for more data about our business.
From my experience, there’s one piece I’ve left out of this discussion that makes all the data we have, and will continue to collect, useful. That piece is you and your creativity.
See, data is meaningless only if we are expecting objective truth from it without factoring in our perceptions and assumptions and getting past those with our creativity. What I mean by creativity in this context is the process of asking questions and experimenting. Creativity allows us to take the data we have, question our starting assumptions about what the data is telling us, and experiment until we make something useful out of it. The title of this article is about not being data driven, with “driven” being the key word.
The idea here is that we should use data as information, not as insight. Put another way, it’s not about the ingredients, it’s about the cook. Ingredients alone don’t make a meal (at least, not a good one). And even great recipes don’t come without a lot of experimentation and failed attempts by the people who create them. In the same way, the human part of the data pipeline is the most valuable part, and this is especially true for those of us in creative or innovative fields. For data to support truly creative or innovative outcomes, we must allow it to inform us of the facts so we can ask questions and experiment with the “adjacent possible” to discover the insights and potential that the raw data doesn’t provide. This is true for the following reasons:
- Experimentation leaves many possibilities open
- Experimenters expect, and even celebrate, failure and uncertainty
- Experimentation keeps the process open to change and adaptable to discoveries
Experimentation is a bit of a cold, clinical word, but you could also use the words exploration or even play. Experimentation supports the idea that there are no preconceived outcomes, leaving open many possible results. For this to happen, we have to start with the idea that “success” can come in many forms, or even no form at all. That means when you improvise with the recipe you may completely fail at making a dish you want to eat, or you may invent a whole new cuisine!
2. Question everything
Experimentation and play are ways to explore new possibilities. The best way to put exploration into practice is to start with questions. To put some of the ideas above to the test, go back to the candle problem and see how you might take the data and question it to come up with new possibilities. For example, given the candle, the wall, the box, and the thumbtacks, I might ask some of the following:
- What would happen if I removed an item from the list? Would that help me in any way?
- What if I turned everything upside down? Would that make a difference?
- What would I see if I take all the matchsticks out of the matchbook?
- What if I took the thumbtacks out of the box?
- What if I tried to stick everything to the wall with the tacks?
This is just a small sample of what you can probably see is a large set of questions I could ask about the candle problem data. And the last two questions in my short list start to get toward a possible solution, as I’m changing the idea that the box must hold the tacks while still using the tacks for their intended purpose. It becomes an almost magical transformation of my thinking. And this is what I am able to do alone, but there’s even more magic when I include others.
3. Think inclusively
I mentioned the “adjacent possible” above. For most of us, our creativity allows us to explore not all possible outcomes, but only a small portion of what is possible, limited by our history, biases, and perspectives. This is how our brains evolved. We create memories throughout our lives and draw from those memories, or our “history”, when we need to make decisions about the future (any future, immediate or long-term). This is why we interpret data differently. We only have our own history to draw from, and everyone’s history differs, slightly or vastly, from everyone else’s. The more diverse a person’s history, the more adjacent possibles she has to draw from, but the number of possibilities is still limited, because one person’s brain can only hold so much.
Enter the diverse team. The more a team consists of a diversity of backgrounds, perspectives, culture, education and even professions, the more diverse the adjacent possibles the team will bring to any given problem or set of information. Data, rather than being the driver of creativity, brings opportunities for different perceptions, ideas and, most importantly, questions. The more homogeneous a team is the more efficient it might be, but it’s almost certain to be less creative, and creativity is what you desperately need when solving difficult problems.
While diversity is not a magic bullet — teams must be willing to get out of their comfort zones and embrace their differences — diverse teams are generally smarter than homogeneous ones.
Being creative in a data-rich world
Data is becoming an increasingly important part of our personal lives, our businesses, and our work. Those of us who spend our days solving difficult problems will rely on this data as a tool to help us understand our world and do new things. But data should not drive us. It should be a signal from the wider world that we use to help answer questions and ask new ones. The insights must come from us.
Here we’ve explored some reasons why relying on data as a driver is a bad idea. But we’ve also looked at ways to turn information into creation:
- Acknowledge that we, and those we work with, bring our own history to any given set of data, biasing our judgement.
- Experiment, explore and even play with the data through questions.
- Bring diverse points of view and unique perspectives to problems, getting as many “adjacent possibles” as we can.
So next time you’re faced with a “data-driven” scenario do this: instead of looking for the answers the data provides, look for the questions it generates.
As to whether I decided to go work out in the courtyard? Well, I left out a critical piece of data. It’s early spring here in the Pacific Northwest, and it’s only 49 degrees outside. Though the sun was tempting, before heading out I wondered how cold it was, and that was definitely a question worth asking.