The data, the judgement, and the decision

Posted in Analytics Strategy, Predictive Analytics

The data is imperfect. Judgement is imperfect. Decisions are imperfect.

The question isn’t about perfection.

It’s about progression.

What becomes true if we focus on progression?

Credit goes to Matt Gershoff for inspiring this post. A remark he made at a recent #miToronto grabbed me. To paraphrase, he said that when you stop obsessing over which model is right or wrong, because all models are wrong by definition, and start focusing on just making it better, you get a lot further.

He used the term liberating.

And it is.

The Data

Reality is flawed, and as a direct result, it generates data that is flawed.

The machine housed in your skull is a serious piece of technology, but there are all sorts of problems with it. It does the best it can given the laws of physics and the caloric budget of the planet. We get some good performance from it. But it does all sorts of things that are sub-optimal.

The number of data points that we’re generating and having recorded in machine-readable form, as a society, is increasing.

And it’s not surprising that there are a lot of flaws in the data we produce.

The number and variety of tools that we have access to, or can buy, to generate data about data about data, has never been greater. Analytics, DMP, testing, log files, lots of data exhaust (under every server and every stone), session playback, the Python stack, the R stack, the PowerPoint, the email, and on and on and on.

And it’s all defective in some way.

It’s defective all the way down to reality itself.

The HTTP cookie is an extremely coarse technology for understanding people. Even the concept of a session has defects. Even the humble pageload and video initiation have problems.

Sampling still generates all sorts of issues, and a lot of grief, two hundred years after it was invented. And yet the law of large numbers and statements about the null hypothesis are hard-baked into the physical structure of our universe. As a sample shrinks, our ability to make precise statements about the whole diminishes. Gauss discovered things. Balls exist and come out of urns in a certain way. Two dice rolled over very long periods create a curve that may be extremely displeasing to many. Welcome to natural laws that form the basis of our reality. Feel free to rage all you want. It’s totally ineffective but may make you feel better.
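For anyone who wants to see the dice and the shrinking sample for themselves, here is a minimal sketch in Python. It is an illustration only: the function names, the seed, and the sample sizes (10 and 1,000 rolls, repeated 500 times) are assumptions picked for the example, not anything from a real analysis. It counts the exact ways two dice can sum, then checks how much the average of a sample wobbles when the sample is small versus large.

```python
import random
import statistics
from collections import Counter

# Exact curve: how many of the 36 equally likely outcomes give each sum.
exact_counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))

def roll_two_dice(rng):
    """Sum of two fair six-sided dice."""
    return rng.randint(1, 6) + rng.randint(1, 6)

def sample_mean(n, rng):
    """Average of n rolls of two dice."""
    return statistics.fmean([roll_two_dice(rng) for _ in range(n)])

def spread_of_sample_means(n, trials, rng):
    """How much the sample average varies across repeated samples of size n."""
    return statistics.pstdev([sample_mean(n, rng) for _ in range(trials)])

rng = random.Random(42)  # fixed seed (arbitrary) so the run is repeatable
spread_small = spread_of_sample_means(10, 500, rng)
spread_large = spread_of_sample_means(1000, 500, rng)

for total in sorted(exact_counts):
    print(f"{total:2d} {'#' * exact_counts[total]}")
print(f"spread of the average with n=10:   {spread_small:.3f}")
print(f"spread of the average with n=1000: {spread_large:.3f}")
```

The bars peak at 7 and fall away on both sides, and the average from 10 rolls moves around roughly ten times more than the average from 1,000 rolls. That is the square-root relationship between sample size and precision, and no amount of rage changes it.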

Interviews generate data. Much of it very difficult to interpret. And worse, my own brain can trick me into believing things about interviews that aren’t true. And worse still, sometimes an interviewee isn’t even aware of something in a specific way, and I’ve modified their state by the question, introducing error into subsequent questions. (An unintentional push poll indeed.)

In many ways, this is where a lot of discussions just end in analytics and management circles.

“WELL — the tools really limit what we can find out – so I’ll just sit here and complain at you about the vendor community.”

That really isn’t a way forward.

The lumps in the data are driven by the lumps in reality.

The Judgement

Judgement is defective. And often it’s just because of the way the brain works.

It’s not perfect.

We’re composed of a lot of knowledge and experience. We all have preferences and expectations. And we’re always subject to the constraint of time.

Judgement could be made better with data. If you use it. And use it well. A good example is the 40/70 rule: if you make a decision before 40% of the data comes in, you’ll probably miss the mark. If you wait until after 70% of the data comes in, you will have waited too long and the decision will have reduced effect.
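The 40/70 rule is simple enough to write down as a few lines of Python. This is a rough sketch, not a decision engine: the function name and the labels are made up for illustration, and the hard part, estimating what fraction of the relevant data you actually have, is assumed to be handed in as a number between 0 and 1.

```python
def decision_window(fraction_known, lower=0.40, upper=0.70):
    """Classify a decision's timing under the 40/70 rule of thumb.

    fraction_known is an assumed estimate (0.0 to 1.0) of how much of
    the relevant data has come in. Producing that estimate is itself a
    judgement call.
    """
    if not 0.0 <= fraction_known <= 1.0:
        raise ValueError("fraction_known must be between 0 and 1")
    if fraction_known < lower:
        return "too early"   # likely to miss the mark
    if fraction_known > upper:
        return "too late"    # the decision will have reduced effect
    return "decide"

for fraction in (0.25, 0.55, 0.90):
    print(f"{fraction:.0%} of the data in: {decision_window(fraction)}")
```

The thresholds are parameters on purpose. How much data is enough depends on the decision, which is where judgement comes back in.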

The models we use to inform our judgement are also defective. By definition a model isn’t a perfect simulation of reality. All models have to be wrong in some way. But judgement that’s informed by a model could be better than judgement informed by a dice roll.

If the model doesn’t exist, invent the model.

If the data doesn’t exist, invent the data.

Use judgement to inform just how much data you really need to make a decision.

The Decision

Progress relies on good decisions.

Under the constraint of a defective reality, a defective brain, defective data, and defective models, you make the best decision you can.

And in general, if somebody is on the trajectory of learning, of making better decisions, systematically, that’s a good thing.

That’s like interest compounding over time.

It doesn’t need to be optimal. It doesn’t need to be perfect. It just needs to be better.

Progress is progress.