• 1 Post
  • 752 Comments
Joined 1 year ago
Cake day: October 9th, 2023












  • First, we need to distinguish Silver’s state-by-state prediction from his “win probability”. The former was pretty unremarkable in 2016, and I think we can agree that, like everyone else, he incorrectly predicted WI, MI, and PA.

    However, his win probability is a different algorithm. It considers alternate scenarios, e.g. Trump wins Pennsylvania but loses Michigan. It somehow finds the probability of each scenario, and somehow calculates a total probability of winning. This does not correspond to one specific set of states that Silver thinks Trump will win. In 2016, it came up with a 28% probability of Trump winning.

    You say that’s not “getting it wrong”. In that case, what would count as “getting it wrong”? Are we just supposed to have blind faith that Silver’s probability calculation, and all its underlying assumptions, are correct? Because when the candidate with a higher win probability wins, that validates Silver’s model. And when that candidate loses, that “is not evidence of an issue with the model”. Heads I win, tails don’t count.

    If I built a model with different assumptions and came up with a 72% probability of Trump winning in 2016, that differs from Silver’s result. Does that mean that I “got it wrong”? If neither of us got it wrong, what does it mean that Trump’s probability of winning is simultaneously 28% and 72%?

    And if there is no way for us to tell, even in retrospect, whether 28% is wrong or 72% is wrong or both are wrong, if both are equally compatible with the reality of Trump winning, then why pay any attention to those numbers at all?
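To make the "alternate scenarios" idea concrete, here is a toy sketch of one way a total win probability can be built up from per-state probabilities. This is not Silver's method, which is proprietary. The per-state probabilities, the `BASE_VOTES` figure, and the independence between states are all assumptions made up for illustration; only the 2016 electoral-vote counts for WI, MI, and PA are real.

```python
from itertools import product

# Hypothetical toy inputs; NOT Silver's numbers or his method.
STATES = {
    # state: (2016 electoral votes, assumed probability the candidate wins it)
    "WI": (10, 0.30),
    "MI": (16, 0.30),
    "PA": (20, 0.30),
}
BASE_VOTES = 250  # assumed electoral votes already won elsewhere (made up)
NEEDED = 270      # electoral votes required to win


def win_probability(states, base_votes, needed):
    """Add up the probability of every win/lose scenario that reaches `needed`.

    Assumes state outcomes are independent, which is a simplification.
    """
    names = list(states)
    total = 0.0
    # Enumerate every scenario, e.g. "wins PA, loses MI, loses WI".
    for outcome in product([True, False], repeat=len(names)):
        prob = 1.0
        votes = base_votes
        for name, won in zip(names, outcome):
            electoral_votes, p = states[name]
            prob *= p if won else 1.0 - p
            if won:
                votes += electoral_votes
        if votes >= needed:
            total += prob
    return total


print(round(win_probability(STATES, BASE_VOTES, NEEDED), 3))
```

Change any of the assumed inputs, or drop the independence assumption, and the output changes. Two models fed the same polls can therefore produce very different win probabilities.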


  • We are talking about testing a model in the real world. When you evaluate a model, you also evaluate the assumptions made by the model.

    Let’s consider a similar example. You are at a carnival. You hand a coin to a carny. He offers to pay you $100 if he flips heads. If he flips tails then you owe him $1.

    You: The coin I gave him was unweighted so the odds are 50-50. This bet will pay off.

    Your spouse: He’s a carny. You’re going to lose every time.

    The coin is flipped, and it’s tails. Who had the better prediction?

    You maintain you had the better prediction because you know you gave him an unweighted coin. So you hand him a dollar to repeat the trial. You end up losing $50 without winning once.

    You finally reconsider your assumptions. Perhaps the carny switched the coin. Perhaps the carny knows how to control the coin in the air. If it turns out that your assumptions were violated, then your spouse’s original prediction was better than yours: you’re going to lose every time.

    Likewise, in order to evaluate Silver’s model we need to consider the possibility that his model’s many assumptions may contain flaws. Especially if his prediction, like yours in this example, differs sharply from real-world outcomes. If the assumptions are flawed, then the prediction could well be flawed too.
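A quick back-of-the-envelope check on the carnival example, assuming the flips are independent. The function names are just illustrative helpers. It computes what the "fair coin" assumption predicts for the bet, and how likely 50 tails in a row would be if that assumption held.

```python
def expected_payoff(p_heads, win=100.0, loss=1.0):
    """Average payoff per flip: win $100 on heads, lose $1 on tails."""
    return p_heads * win - (1.0 - p_heads) * loss


def prob_all_tails(n_flips, p_heads=0.5):
    """Probability of n_flips tails in a row, assuming independent flips."""
    return (1.0 - p_heads) ** n_flips


# Under the 50-50 assumption the bet looks great on paper...
print(expected_payoff(0.5))
# ...but 50 straight losses would then be astronomically unlikely.
print(prob_all_tails(50))
```

Under the fair-coin assumption the second number is below one in a quadrillion. After 50 straight losses, the assumption is the thing to doubt, not the luck.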



  • You are describing how to evaluate polling methods. And I agree: you do this by comparing an actual election outcome (eg statewide vote totals) to the results of your polling method.

    But I am not talking about polling methods, I am talking about Silver’s win probability. This is some proprietary method that takes other people’s polls as input (Silver is not a pollster) and outputs a number, like 28%. There are many possible ways to combine the poll results, giving different win probabilities. How do we evaluate Silver’s method, separately from the polls?

    I think the answer is basically the same: we compare it to an actual election outcome. Silver said Trump had a 28% win probability in 2016, which means he should win 28% of the time. The actual election outcome is that Trump won 100% of his 2016 elections. So as best as we can tell, Silver’s win probability was quite inaccurate.

    Now, if we could rerun the 2016 election, maybe his estimate would look better over multiple trials. But we can’t do that; all we can ever do is compare 28% to 100%.
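The "rerun the election" thought experiment can be sketched as a simulation. It assumes, purely hypothetically, that each rerun is an independent draw with the stated 28% win probability. The function name and the seed are arbitrary choices for illustration.

```python
import random


def observed_win_rate(p_win, n_trials, seed=0):
    """Fraction of simulated reruns won, given an assumed true win probability."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    wins = sum(rng.random() < p_win for _ in range(n_trials))
    return wins / n_trials


# One election: the observed rate can only ever be 0.0 or 1.0.
print(observed_win_rate(0.28, 1))
# Many hypothetical reruns: the observed rate settles near the stated 28%.
print(observed_win_rate(0.28, 100_000))
```

With a single trial the observed rate is always 0% or 100%, whatever the stated probability was. That is the situation we are stuck in with one real election.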





  • If the weather forecast said there was a 28% chance of rain tomorrow, and then tomorrow it rained, would you say the forecast was wrong?

    Is it possible for the forecast to be wrong?

    I think so. If you look at all the times the forecast predicts a 28% chance of rain, then it should rain on 28% of those days. If it rained, say, on half the days that the forecast gave a 28% chance of rain then the forecast would be wrong.

    With Silver, the same principle applies. Clinton should win at least 50% of the 2016 elections where she has at least a 50% chance of winning. She didn’t.

    If Silver kept the same model over multiple elections, then we could look at his probabilities in finer detail. But he doesn’t.
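The weather-forecast check described above can be sketched in a few lines: group past forecasts by the probability they stated, then compare each group with how often the event actually happened. The `history` data below is invented for illustration, and `calibration_table` is a hypothetical helper, not any forecaster's actual tooling.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Map each stated probability to the observed frequency of the event.

    `forecasts` is an iterable of (stated_probability, event_happened) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # stated prob -> [events, total]
    for prob, happened in forecasts:
        counts[prob][0] += int(happened)
        counts[prob][1] += 1
    return {
        prob: events / total
        for prob, (events, total) in sorted(counts.items())
    }


# Made-up record: it rained on half of the "28% chance" days,
# and on 7 of the 10 "70% chance" days.
history = (
    [(0.28, True)] * 10 + [(0.28, False)] * 10
    + [(0.70, True)] * 7 + [(0.70, False)] * 3
)

for stated, observed in calibration_table(history).items():
    print(f"forecast {stated:.0%} -> observed {observed:.0%}")
```

In this invented record the 70% forecasts hold up, while the 28% forecasts are off because it rained on half of those days. The check only works when there are many forecasts from the same method to group together.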