What do you think is the ONE variable that can completely derail your data-decision efforts? No matter how good your skills or how advanced your model?
Here are a few hints: it’s a variable that is ever-present, regardless of the type of data you have, the quality of your data team, or the state-of-the-art-ness of your model. Several factors that are likely to affect this variable are used to group data points for this variable into segments. However, the effect of a small change in factor x has such a varied set of changes on y, even within the same segment, that no level of mathematics could reliably predict the impact of x on y (mandatory disclaimer with AI growing as fast as it is: *yet).
If you guessed human behaviour, then this post could definitely have been a Tweet for you. If you can’t yet relate, let us break down our thought process…
Our Thought Process
- It’s a variable that is ever-present: can you think of one company that is 100% run by machines or has 0 human customers? No? So this is a given.
- Several factors that are likely to affect this variable are used to group data points for this variable into segments: we can and do separate people based on age, country, gender and plenty of other factors. This holds too.
- The effect of a small change in one factor x has such a varied set of changes on y: say you’ve segmented your customers into groups based on age. Can you predict how a 24-year-old would behave when shown marketing campaign x? Would the behaviour be the same in the 24-year-old today vs tomorrow? Or between two 24-year-olds today? Thought so.
You might say that “well, in aggregate, human behaviour CAN be modelled”. That’s what economists have been telling us, and we’d have to agree, but under one condition: when you have enough people that behave in the same way.
More elaboration required? Well, if you have enough 24-year-olds that have the exact same habits, day-in, day-out; if they do the exact same thing from the minute they wake up till the minute they sleep; have the same desires today vs tomorrow; essentially have the same “utility” for stimulus x, and you do some random testing for the stimulus on a sample of this group, then yes, you could predict behaviour and the results would be fairly accurate. Two problems with this:
a) Do you yourself do the exact same thing, day-in, day-out, from the minute you wake up till the minute you sleep? Our Simulation Overlords have programmed us better than that, haven’t they?
b) Do you have the same utility for a car as your father? Or as your best friend who you grew up with? Nope.
Why It Matters
Turns out understanding and accepting that human behaviour is fundamentally unpredictable is the key to making better decisions.
Since no company is 100% run by machines or has 0 human customers, any decision-making process or model will have to interact with humans at some point. But since human behaviour can’t be reliably modelled, no model will ever be able to fully account for how a human will behave as a result of a particular decision.
As an example, consider what happens when your state-of-the-art prediction model tells you that tomorrow is going to be a bad day. The last time this happened, you went to your boss after she’d had her morning coffee, because she says she can’t think straight without it, and she acted on the warning. So you do the same this time too. Unexpectedly, she fires you for not telling her sooner, because it’s now too late for her to decide what to do about it. What changed? If you know the answer, model it immediately, because that’s the million-dollar question!
All this while, your prediction has not changed: tomorrow will be a bad day. One day a decision was made to do something about it, another day it wasn’t. Facts don’t change, humans do.
What do you do then?
You might not be able to predict human behaviour, but you can measure some factors that indicate a potential change. For example, you could measure Daily Active Users to see whether the decisions you are making about your e-commerce business are leading to an increase in the user base, or use Cohort Analysis to see whether those decisions have led to higher user retention over time. In the end, what you can do is what we always say you should do (see below)…
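To make those two measurements a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `events` log is made-up data, and `daily_active_users` and `cohort_retention` are hypothetical helper names, not functions from any particular analytics tool. Daily Active Users is counted as the number of distinct users seen per day, and a cohort is taken to be everyone whose first activity fell on the same day.

```python
from collections import defaultdict
from datetime import date

# Hypothetical activity log: (user_id, day on which the user was active).
events = [
    ("u1", date(2024, 1, 1)),
    ("u2", date(2024, 1, 1)),
    ("u1", date(2024, 1, 2)),
    ("u3", date(2024, 1, 2)),
    ("u1", date(2024, 1, 3)),
    ("u2", date(2024, 1, 3)),
    ("u3", date(2024, 1, 3)),
]


def daily_active_users(events):
    """Return {day: number of distinct users active on that day}."""
    users_by_day = defaultdict(set)
    for user_id, day in events:
        users_by_day[day].add(user_id)
    return {day: len(users) for day, users in sorted(users_by_day.items())}


def cohort_retention(events):
    """Return {(cohort_day, days_since_first_seen): share of the cohort active}.

    A user's cohort is the first day they appear in the log (an assumption;
    signup date or first purchase would work just as well).
    """
    # Find each user's first active day.
    first_seen = {}
    for user_id, day in sorted(events, key=lambda event: event[1]):
        first_seen.setdefault(user_id, day)

    # Count how many users belong to each cohort.
    cohort_sizes = defaultdict(int)
    for cohort_day in first_seen.values():
        cohort_sizes[cohort_day] += 1

    # Record which users of each cohort were active N days after joining.
    active = defaultdict(set)
    for user_id, day in events:
        cohort_day = first_seen[user_id]
        offset = (day - cohort_day).days
        active[(cohort_day, offset)].add(user_id)

    return {
        key: len(users) / cohort_sizes[key[0]]
        for key, users in sorted(active.items())
    }


if __name__ == "__main__":
    for day, count in daily_active_users(events).items():
        print(day, "DAU:", count)
    for (cohort_day, offset), share in cohort_retention(events).items():
        print("cohort", cohort_day, "day", offset, "retention:", share)
```

None of this predicts what any one person will do. It only tells you, after the fact, whether a decision seems to have moved the numbers, so that you can make the next one.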
Keep Data. Decisions. Repeat-ing,