The Creativity of Data Science for Creative Content

Posted on October 5, 2015. Edited on August 7, 2021, by Christopher Berry

Let’s start with a story.

Daan did a traditional fast follow. He calls it Netherflix. His story is: “It’s like Netflix…for The Netherlands!” At first, he buys rights on the cheap, pays for digital subtitling, and has a successful kickoff. He gets to 10% household penetration, or roughly 700,000 subscribers, with an annualized gross revenue of about 60 million euros. The strength of the euro lets him raid the Anglosphere, and he can reliably stock 10,000 hours of content [1].

He gets through the struggle of getting his stack to deliver content and minimize churn. He’s able to host and deliver 10,000 hours reliably, in spite of supporting video players across 11 different front-end platforms and the costs of hosting, encoding, and delivering video. Subscribers successfully find something to watch (defined as an initiation event with video time spent in excess of 6 minutes) in 99 sessions out of 100. That is just fantastic, because it holds churn [2] below 1%. All the while, his top-line audience penetration growth is flattening out as he reaches the inflexion point on the curve [3].
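The “found something to watch” measure above can be sketched in a few lines. This is only an illustration of the definition given in the story (a session succeeds when at least one play runs longer than 6 minutes). The `Play` record, its field names, and the sample data are assumptions made for the example, not anything from Daan’s stack.

```python
from dataclasses import dataclass

# Threshold from the story: a play must exceed 6 minutes of video time.
SUCCESS_SECONDS = 6 * 60


@dataclass
class Play:
    session_id: str         # hypothetical field name
    seconds_watched: float  # hypothetical field name


def session_success_rate(plays, all_session_ids):
    """Share of sessions with at least one play longer than the threshold."""
    sessions = set(all_session_ids)
    if not sessions:
        return 0.0
    successful = {p.session_id for p in plays if p.seconds_watched > SUCCESS_SECONDS}
    return len(successful & sessions) / len(sessions)


# Made-up sample: s1 and s3 succeed, s2 gives up early, s4 never presses play.
plays = [
    Play("s1", 45), Play("s1", 1320),
    Play("s2", 200),
    Play("s3", 2700),
]
rate = session_success_rate(plays, ["s1", "s2", "s3", "s4"])
print(f"{rate:.2f}")  # 0.50
```

Sessions with no play at all count against the rate, which is the point of the measure: browsing without watching is the failure case.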

But he doesn’t live happily ever after.

Others see his success and drive up the licensing cost of his 10,000 hours of content.

Daan’s goal was never to have all of the content. It wasn’t to control an indeterminate number of hours of content. His goal is to get as many subscribers as possible at sustainable churn, under the constraint of paid budget.

For Daan, securing his revenue stream is found in selective exclusivity.

To understand why, you need some idea about the math.

We can abstract out Daan’s business model with a system of equations. The big Y for Daan is valuation. To get a huge valuation, he needs to maintain strong subscriber growth, while minimizing churn, under the constraint of cost.
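The post does not spell the equations out, so here is one minimal sketch of what such a system might look like, in Python. The subscriber count and revenue come from the story. Everything else is an assumption for illustration: that churn is measured monthly, the number of gross sign-ups per month, and the two content-cost figures. The function names are hypothetical.

```python
def simulate_subscribers(start_subs, monthly_churn, monthly_gross_adds, months):
    """Roll the base forward: each month, survivors plus new sign-ups."""
    subs = float(start_subs)
    path = [subs]
    for _ in range(months):
        subs = subs * (1.0 - monthly_churn) + monthly_gross_adds
        path.append(subs)
    return path


def annual_contribution(subs, monthly_price, annual_content_cost):
    """Annualized subscription revenue minus content spend (other costs ignored)."""
    return subs * monthly_price * 12 - annual_content_cost


# Figures from the story.
START_SUBS = 700_000
ANNUAL_REVENUE = 60_000_000
MONTHLY_PRICE = ANNUAL_REVENUE / (START_SUBS * 12)  # implied, about 7.14 euros

# Assumptions for illustration only.
GROSS_ADDS = 7_000  # new sign-ups per month
steady = simulate_subscribers(START_SUBS, 0.01, GROSS_ADDS, 12)  # 1% monthly churn
leaky = simulate_subscribers(START_SUBS, 0.02, GROSS_ADDS, 12)   # 2% monthly churn
print(round(steady[-1]), round(leaky[-1]))

# Same base, two hypothetical licensing bills.
for content_cost in (30_000_000, 45_000_000):
    print(content_cost, round(annual_contribution(steady[-1], MONTHLY_PRICE, content_cost)))
```

Under these assumptions, 7,000 sign-ups a month exactly replaces what 1% churn removes, so the base holds. Doubling churn or raising the content bill each eats into the result, which is the tension the rest of the story turns on.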

There are a million factors that Daan can tweak, but the biggest subset balances content consumption against audience segment. Balancing the assortment of content against the subscriber base greatly assists in minimizing churn and in maximizing word-of-mouth referral and imitator subscriber growth.

One hypothetical abstraction (and compression) example might be:

Daan has a mental image of what a binger, catalyst, pecker, kiddie, and atrisker is by their associated behavior, and what they mean to his business [4]. He has empathy with his vast subscriber base through segmentation. There are billions of potential behavioural segmentations, some valuable at the microscale, some at the mesoscale, and some at the macroscale, and arguably, Daan has a fully integrated stack that assists with that. The overlapping of specific titles against those segments can assist, greatly, in minimizing churn and managing a portfolio of subscribers. That portfolio of subscribers must be balanced against the behaviours of a portfolio of content.
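To make the overlap of titles against segments concrete, here is a small sketch. It takes the segment labels as given (the post does not define them, and neither does this code) and computes, for each segment, the share of its subscribers who watched each title. The subscriber IDs, titles, and the function name `segment_title_reach` are all made up for the example.

```python
from collections import defaultdict


def segment_title_reach(segment_of, views):
    """Map (segment, title) to the share of that segment's subscribers who watched it.

    segment_of: dict of subscriber id -> segment label
    views: iterable of (subscriber id, title) pairs
    """
    members = defaultdict(set)
    for sub, seg in segment_of.items():
        members[seg].add(sub)

    watchers = defaultdict(set)  # (segment, title) -> subscribers who watched
    for sub, title in views:
        seg = segment_of.get(sub)
        if seg is not None:  # ignore views from unsegmented subscribers
            watchers[(seg, title)].add(sub)

    return {key: len(subs) / len(members[key[0]]) for key, subs in watchers.items()}


# Hypothetical data using the segment names from the text.
segment_of = {"a": "binger", "b": "binger", "c": "kiddie", "d": "atrisker"}
views = [("a", "Title X"), ("b", "Title X"), ("a", "Title Y"),
         ("c", "Title Z"), ("c", "Title Z")]

reach = segment_title_reach(segment_of, views)
print(reach[("binger", "Title X")])  # 1.0
print(reach[("binger", "Title Y")])  # 0.5

# Segments that no title in the library is reaching at all.
uncovered = set(segment_of.values()) - {seg for seg, _ in reach}
print(uncovered)  # {'atrisker'}
```

Repeat views by the same subscriber count once, so the table measures reach rather than volume. A segment that shows up in `uncovered` is one the current assortment is not serving.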

Catalogue companies excel at that, and the math, and associated technology, is largely known in the literature, and some of it has even been executed commercially.

What’s new-ish is the science of creative.

All of those titles have features. You may be a fan of the way that Netflix generates so many lists of titles based on feature extraction. “Movies with a strong female lead” and “Cool moustaches” are two such examples. Thanks in part to IMDB and Netflix, we know more about movie feature extraction than almost anything else.

Now, consider the application of feature extraction in the context of recommending to Netherflix what it should buy next and, one step further, what it should produce next.

Daan’s matrix of content along the top / subscribers along the side reveals all sorts of competitive advantages unique to his base. He may observe, for instance, that the Canadian show “Murdoch Mysteries” is especially popular with a subset of his base that also happens to watch “detective series from historical periods”, such as Poirot and Sherlock Holmes, and “movies featuring Angela Lansbury”, but not “American procedural detective series”. This represents a set of 50,000 subscribers, and, since he has demographic information about those people using a census/postal code lookup, he overestimates that there’s another 500,000 subscribers out there that he could attract.
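One simple way to surface a pattern like that from the matrix is lift: how much more often fans of a seed title watch a given feature tag than the base as a whole does. The sketch below is an assumption about how one might do it, not a description of any real recommender. The tags echo the ones in the text, and the six subscribers and the function name `tag_lift` are invented.

```python
def tag_lift(watched_tags, seed_fans):
    """Lift of each tag among fans of a seed title versus the whole base.

    watched_tags: dict of subscriber id -> set of feature tags they watch
    seed_fans: set of subscriber ids who watch the seed title
    Lift above 1 means fans over-index on the tag; below 1, they under-index.
    """
    fans = [s for s in watched_tags if s in seed_fans]
    if not fans:
        return {}
    total = len(watched_tags)
    all_tags = set().union(*watched_tags.values())
    lifts = {}
    for tag in all_tags:
        base_rate = sum(tag in tags for tags in watched_tags.values()) / total
        fan_rate = sum(tag in watched_tags[s] for s in fans) / len(fans)
        lifts[tag] = fan_rate / base_rate  # base_rate > 0: someone watches the tag
    return lifts


# Invented base of six subscribers; s1, s2 and s6 watch the seed title.
watched = {
    "s1": {"historical detective", "Angela Lansbury"},
    "s2": {"historical detective"},
    "s3": {"American procedural"},
    "s4": {"American procedural", "Angela Lansbury"},
    "s5": {"American procedural"},
    "s6": {"historical detective", "Angela Lansbury"},
}
murdoch_fans = {"s1", "s2", "s6"}

lifts = tag_lift(watched, murdoch_fans)
for tag, lift in sorted(lifts.items(), key=lambda kv: -kv[1]):
    print(f"{tag}: {lift:.2f}")
```

In this toy base the fans over-index on historical detective series and Angela Lansbury, and never touch American procedurals. On real data the tags themselves would come out of feature extraction, and small segments would need some smoothing before the ratios could be trusted.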

This forms the basis of an insight. He can take the risk of ordering a 26-part series starring Angela Lansbury set in historical New Amsterdam. With generous subsidies and a 0% taxation rate on royalties, he can afford to make a bet on great writing and plan out a long story arc rooted in the observation that longer story arcs generate better creative [5].

Alternatively, he could select from hundreds of thousands of potential insights from the content-subscriber graph. It doesn’t have to be something nearly as stereotypically contrived as the hypothetical mentioned above.

Daan secures about 25 hours of exclusive content that he reckons will reduce churn among an existing segment, attract new subscribers, and act as a temporary hedge against the competitive pressure of new entrants into the Dutch OTT market.

He is making use of knowledge about the market, and about the creative process, to place bets on the output of the creative process. Those 25 hours represent a tiny proportion of the 10,000 hours in his inventory. Those 25 hours, however, are instrumental in generating a brand and a moat around a portion of his subscriber base.

The information he draws from the library can be deployed in a way that generates more predictable results from what is a hard-to-predict process.

The performance of creative is predictable to an extent. It just requires some creativity on the part of data scientists and a creative approach to feature extraction from both the behavioural set and the content set.

One core reason for the divergence between broadcast television content and cable/OTT content is what each business knows and what it traditionally (institutionally) does with that knowledge. Netflix has a fundamentally different understanding of its market for content than the broadcasters have of theirs. The divergence among audiences in valuing divergence itself, first kicked off by the 500-channel universe, is amplified by the 7.3 billion channel universe.

The reason why shows like The Big Bang Theory persist in the environment is because a lot of people who watch broadcast television watch The Big Bang Theory. A show that you’re “clearly not smart enough to get” is popular amongst people who happen to watch broadcast television. I suppose, too, that you’re not “smart enough to get” procedurals like NCIS, NCIS New Orleans, NCIS Los Angeles, CSI, CSI Miami, CSI New York. Those audiences demand to be blasted with safe, predictable, procedural content. And in response, broadcasters blast them with that content, sometimes to the breaking point [6]. To be extremely normative, shows like 30 Rock and Unbreakable are too smart for broadcast television audiences.

These patterns in broadcast television are predictable to the point of parody because the tastes of broadcast audiences are fairly predictable [7]. The business model is failing in part because broadcast audiences are less valuable to the primary customer of broadcast audiences: the marketers [8].

The business model has also alienated several audience segments. Why give a show a chance if it’s just going to get axed? FOX’s reputation for cancelling shows early has driven several Canadian re-broadcasters to rebrand its shows as Canadian so as to avoid the stigma.

In contrast, people don’t stream House of Cards for a safe, predictable, experience. Challenging content, like that found in Sense8 or Game of Thrones, is expected by non-broadcast subscribers. To put a very fine point on it, a broadcast is something that happens to you. A stream is something you expend cognitive load seeking and playing. There is something inherent in the language distinguishing an audience from a segment of subscribers.

When the primary consumer of a portfolio of content is also the customer, tremendous opportunities are unleashed. This is the best part.

And this means the best is yet to come for the creativity of data science in creative industries, creatives, and audiences.

The pilot factory hasn’t been great for audiences. The broadcast analytics environment hasn’t been great for producers. I don’t think either has been great for creatives.

Netflix, HBO, and Showtime have all demonstrated that investing in storytelling (writing) has returns. However, the return on some of those investments may not be nearly as huge as anticipated, and there could be a few corrections in a few variables [9]. There is an explosion of experimentation in formats, styles, scripts, and storytelling. In many ways, data science is enabling greater creative risk-taking in some areas by significantly reducing the business risk in other dimensions.

There are opportunities for more creatives to find audiences for their work, and for subscribers to spend less time being more entertained. Such approaches may mandate a new form of ArtScience to emerge. In particular, it could be very interesting to appeal to multiple audience segments that may not traditionally have much in common, and entertain them simultaneously. The business justification is present. Certain national-cultural justifications may also present themselves. It’s the promise of the unknown, and undiscovered content approaches, that is invigorating.

It minimizes the risk (uncertainty times consequences) for Daan while it enriches the subscribers and the shareholders. What a great outcome.

[1] Read the Sony emails leak for more insight about these mechanics.

[2] One Netflix engineer mentioned this causal relationship in a talk at RecSys 2014; the claim hasn’t been validated in any marketing science literature.

[3] It’s not flattening because the churners are telling everybody that they can’t find anything to watch, it’s flattening because he’s approaching the mid-point of total market penetration.

[4] Contextualizing audience matrices by behaviours rather than demographics (a/s/l) is a real advantage because it enables a different order of strategic decisions about the interaction of markets and content. There’s no comparison between a traditional TV matrix and a behavioural segmentation.

[5] If you believe, as many creatives seem to, that superior recognition is correlated with superior creative.

[6] Family Feud (1976), Who Wants To Be A Millionaire (1999), and the NBC Law and Order spinoffs (2000s) are pretty good examples of saturation surges past the breaking point.

[7] The formula for “Middle America Family Sitcom” has remained essentially static: Roseanne, Malcolm In The Middle, and The Middle all have minor localization tweaks; Modern Family is Roseanne for the 1%; Fresh Off The Boat is The Middle for the 20%. These variations are much less risky from a spec sheet perspective, while each specific variation is focus-group polished to be just divergent enough to stand out from the pack. I have to believe that the reason why the Middle America Family Sitcom persists for so many seasons

[8] As a marketer, pushing, say, a car brand, I’m not necessarily as attracted to The Price Is Right or NCIS to hit the 55+F demographic, because I already get a lot of those people for free as I pay top dollar to be on The Big Bang Theory on Monday. The people consuming the most broadcast television are aging. There are more of them, sure, but their tastes for brands are made and the effective frequency among them isn’t nearly as effective. Horrendous, but in-the-demo performance matters to marketers.

[9] AMC’s treatment of The Walking Dead franchise during Season 2 is an example of a darker future.