1) Don't let the drive for a narrative overcome your priors; aka the slow, sad decay of fivethirtyeight.
fivethirtyeight.com/features…

3) Sample sizes, mostly.
The sample sizes so far are pretty small -- around 60 3rd down attempts per team. The difference between the Ravens and Chargers completion % is about 14 percentage points (35% vs 49%); the standard deviation due to raw luck here is ~40%/sqrt(60)*sqrt(2) ~ 7% (the sqrt(2) is there because it's a difference between two teams).
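For anyone who wants to check the arithmetic, here is a minimal sketch of that back-of-envelope calculation. It takes the thread's numbers at face value (about 60 attempts per team, a per-attempt spread of roughly 40%, rates of 35% and 49%); the function and variable names are made up for illustration.

```python
import math

# Numbers quoted in the thread, taken at face value (not independently verified).
attempts_per_team = 60       # roughly 60 third-down attempts per team
per_attempt_sd = 0.40        # the thread's rough per-attempt spread
observed_gap = 0.49 - 0.35   # difference between the two quoted rates


def sd_of_difference(sd_per_attempt, n):
    """SD of the gap between two independent rates, each averaged over n attempts."""
    sd_one_team = sd_per_attempt / math.sqrt(n)
    # Variances of independent quantities add, so the SD grows by sqrt(2).
    return sd_one_team * math.sqrt(2)


sd_gap = sd_of_difference(per_attempt_sd, attempts_per_team)
z = observed_gap / sd_gap
print(f"sd of gap: {sd_gap:.3f}, gap in sds: {z:.1f}")
```

Under those assumptions the gap comes out a bit under 2 standard deviations, which is the "2 standard deviation effect" the thread goes on to discuss.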

11:19 AM · Oct 16, 2021

4) So this is a 2 standard deviation effect. Big, right?
Not really. There are 15 games this week; just choosing the largest discrepancy from those games would yield a result this large on average.
And looking specifically at this game, there are a ton of possible factors.
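The "largest of 15" point is easy to sanity-check with a quick simulation. This is a rough sketch, assuming one comparison per game, that the 15 games are independent, and that each game's discrepancy is pure noise measured in standard deviations (a standard normal). None of that is exactly true of real games, and the function name is hypothetical.

```python
import random
import statistics


def mean_largest_abs_z(n_games=15, n_trials=20000, seed=0):
    """Average size of the biggest |z| among n_games pure-noise comparisons."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    maxima = [
        max(abs(rng.gauss(0.0, 1.0)) for _ in range(n_games))
        for _ in range(n_trials)
    ]
    return statistics.fmean(maxima)


typical_max = mean_largest_abs_z()
print(f"typical largest discrepancy across 15 games: {typical_max:.2f} sds")
```

If the simulated value lands near 2, then picking the most extreme matchup of the week produces a "2 standard deviation effect" from noise alone.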

8) That's what teams hire departments to try to do--come up with actionable analyses that are better than public ones--and combine them with proprietary data and tracking.
It's not impossible--it's just hard. To write great content, you have to be great at it, and try really hard.

10) Over time, like so many things, fivethirtyeight's content has become a random mix of smart-but-not-smart-enough takes and models that aren't quite good enough to beat common sense.
See, e.g., their 2020 election model--not terrible, but worse than "meh, < 50%, but at least 25%".

11) It's been 8 years since fivethirtyeight's presidential election model was more accurate than the in-house models I helped build, and the rate of real groaners has skyrocketed (fivethirtyeight.com/features…).
The sports journalism, meanwhile, has been even worse.

