Tuesday, October 19, 2021

Don’t count on Facebook to calculate your odds | A slight sceptic

I do not work for Facebook. I’m not even an active member of Facebook. But sometimes I can’t help but take part in their activities, such as the part they played in trying to “save the world”.

The thing that really hit me was their very poor use of statistics. That’s not the fault of the statistics themselves, as they’re generally quite good and detailed. But to tell fact from speculation, it is crucial to grasp the central importance of guesstimates in all types of probability assessment.

A guesstimate is a range of values that don’t fall within a more exact scientific term, eg factor to pass 6. The first value is the highest one and a range of first values comes from there. Once we have that, we can make guesstimates on a series of numbers from the range. But we never actually know precisely what any given multiple will be, because each variable has its own probability. So it is all right to speculate about a range of numbers from what is always assumed to be a first value. But the more precise the multiple, the more precise the guesstimate becomes. In fact, saying: “S#!W! five-a-day people,” almost precisely means: “Let’s say five-a-day people and avoid guesstimates.”

For Facebook, it was key that their equation was not fully scientific. Very fine mathematical conventions mean you cannot use a range of values to throw out a known approximation from your expected outcomes, but assuming that the number that represents probability is commonly accepted, you can still speculate about it.

So their predicted likelihood was “13”. Although they didn’t explicitly say so, they were probably assuming that there were lots of people who constantly had their photos taken in a better place (12) and that there was a lot of distance between them (10). But predicting any number based on expectations of one’s advantage or disadvantage against others would come up with an incorrect probability as large as you can imagine.

So Facebook probably assumed that they were going to have people liking their page at a rate of about 1,000 per second. Not possible. There are as many as 10 million members, at last count, and even Facebook’s non-profit subsidiary has a mission to get one billion. The current standing queue to join the service is now more than 12 times that. Therefore, the number has to be quite large. I haven’t checked, but my bet is that it’s much larger than the usual multiple of 1,000. I’d say it’d be double the power of the basic X where multiplication is zero (1) times a million, so 2 × 2 = 10^10. Yet even the most-watched series would only get 1^1m likes.
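For anyone who wants to check the rate for themselves, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (1,000 likes per second, 10 million members, a target of one billion). The constant rate and the idea that each member likes the page exactly once are my own simplifying assumptions, not anything Facebook has stated.

```python
# Figures quoted in the paragraph above.
LIKES_PER_SECOND = 1_000
MEMBERS = 10_000_000
TARGET_LIKES = 1_000_000_000


def seconds_to_reach(total_likes: int, rate_per_second: int) -> float:
    """Time in seconds to collect total_likes at a constant rate (assumed)."""
    return total_likes / rate_per_second


# Assumption: each member can like the page only once.
hours_to_exhaust_members = seconds_to_reach(MEMBERS, LIKES_PER_SECOND) / 3600
days_to_reach_target = seconds_to_reach(TARGET_LIKES, LIKES_PER_SECOND) / 86_400

print(f"Every member has liked the page after {hours_to_exhaust_members:.1f} hours")
print(f"One billion likes would take {days_to_reach_target:.1f} days")
```

On those assumptions the whole membership is used up in under three hours, which is one way to see why a sustained 1,000 per second cannot hold.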

So Facebook had probably made a guesstimate of 20^10m likes – although, if they’re doing this for their videos, that’s being pretty generous about the power of videos.

When we start speculating, in every case, we’re working with a range of conceivable possibilities, and some of those range as far away as the small print. It’s all fine if the range of theories we’re throwing out lies within the accepted – but legal – use of ideas like standard product naming conventions, size, and so on. But there are the concepts we try to redefine on the basis of numbers like randomness, multiplexing, and random discrimination (or statistically as-yet unexplained different levels of associated information).

Extravagant propositions have an impossible ending, but the nonsense does not have to be perfect, and indeed the laws of probability are very good at providing us with other alternatives for where to stop.

Statistics has always been a strange discipline. Lots of innovations, big data in particular, are never going to go quite as well as they’ve done so far, and so they still need improvement. Averages are not our friends. For example, the fact that you have higher confidence in risk estimates if you also spend more time at job interview HQs than I do really suggests that I should be the better bet than you. If I know that I spend more than twice as much time at interview HQs as you do (albeit over a ten-year period), why should that change your future likelihood of being hired? The fact that my knowledge of the risks is so clearly superior to yours (at a 10-year interval) seems to suggest that it would.

Still, in the end, Facebook thought they’d have one billion “likes” by the end of 2018, and we’ve got one and two years to go! Counting down, can’t I count on you?
