Why Do Most Wild Card Series End in Sweeps?

A simple mathematical explanation for why we don't see more Game Threes

Sep 30, 2024

In 2022, Major League Baseball added a new round to the postseason. The introduction of the Wild Card Series expanded the playoffs from 10 to 12 teams and replaced the winner-take-all play-in games between non-division-winners with a quartet of three-game series, while the top two seeds in each league get a bye.

As critical as I have been of MLB’s constant meddling with the rules, I think the league got this one right. The new format expands the playoff field, letting two more fanbases experience October baseball, without cheapening the tournament. It rewards teams for regular-season dominance with a first-round bye. And it replaces the two single-elimination Wild Card games, which felt out of step with the established rhythms of the sport, with four three-game series that mirror the basic units of the MLB schedule while still promising a disproportionate number of Game-7-like situations, in which both teams must win to stay alive.

Yet the last part hasn’t worked out so far. Of the eight Wild Card series that have been played since the new format was introduced two seasons ago, only one has gone to a decisive Game 3. The other seven were two-game sweeps — including all four last year, creating an awkward hole on the third day of the postseason schedule where the climactic do-or-die games would have been.

Intuitively, a best-of-three matchup between two good teams seems like it should take all three games. But what if I told you that the odds of a sweep are actually higher than the chances of the series going to a decisive Game 3? It’s true — and it takes only some simple algebra to prove it.

Let p be the probability that the better team wins each individual game. We make the simplifying assumptions that p is constant for a matchup between these clubs and that each game's outcome is independent of the others, understanding that real-life baseball is more nuanced than that.[1] By definition, p must take some value between 0.5 and 1, exclusive.

We next set q as the probability that the lesser team will win each game. This means q is equal to 1 - p, and must be between 0 and 0.5, exclusive.

This gives us the following constraints:

\(p + q = 1\)

and

\(0 < q < .5 < p < 1\)

There are four possible outcomes for the first two games of the series:

1. Better team wins both: expressed as pp
2. Better team wins Game 1 and loses Game 2: expressed as pq
3. Better team loses Game 1 and wins Game 2: expressed as qp
4. Better team loses both: expressed as qq

How can we prove that a two-game sweep (Scenarios #1 and #4) is more likely than a three-game deadlock (Scenarios #2 and #3)? Let’s start by expressing them mathematically:

\(pp + qq > pq + qp\)

This immediately reduces to:

\(p^2 + q^2 > 2pq\)

We can substitute the equivalent value of 1 - p for q:

\(p^2 + (1 - p)^2 > 2p(1-p)\)

which simplifies as follows:

\(2p^2 - 2p + 1 > 2p - 2p^2\)

\(4p^2 - 4p + 1 > 0\)

\(p - p^2 < .25\)

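The expression \(4p^2 - 4p + 1\) is the perfect square \((2p - 1)^2\), so this statement is true for every value of p except exactly .5. Since we defined p to be greater than .5, the inequality always holds.[2]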
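
As a quick numerical sanity check on the algebra, here is a minimal sketch that evaluates both sides of the inequality over a grid of values of p. The function name `first_two_games` and the grid are my own illustrative choices, and the code assumes, as the argument above does, a constant per-game probability and independent games.

```python
def first_two_games(p):
    """Return (P(sweep), P(series reaches Game 3)) for a best-of-three.

    Assumes a constant per-game win probability p for the better team
    and independent games, as in the text.
    """
    q = 1 - p
    sweep = p * p + q * q       # pp or qq
    game_three = 2 * p * q      # pq or qp
    return sweep, game_three


# Illustrative grid: p = 0.51, 0.59, ..., 0.99
for step in range(51, 100, 8):
    p = step / 100
    sweep, game_three = first_two_games(p)
    print(f"p = {p:.2f}: sweep = {sweep:.4f}, Game 3 = {game_three:.4f}")
```

The gap between the two columns is \((2p - 1)^2\), so it is tiny for closely matched teams and widens as p grows.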

While we’re here, we can also prove the related concept that the better team is more likely to win the Wild Card series in two games than three. This perhaps-counterintuitive phenomenon is a common discussion point around the World Series, when forecasting models’ modal outcomes often have the favorite winning in six games instead of the full seven. It makes sense if you think about the specific paths the scenarios imply: saying the better team will win in seven games means either that you expect them to win three of the first five games, then lose Game 6 (despite their general advantage) before coming back to win Game 7; or that they first cede a 3-2 advantage to the underdogs, then sweep Games 6 and 7.
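
The path-counting logic for the World Series can be sketched the same way. The helper below (`favorite_wins_in` is a hypothetical name, and p = 0.55 is an illustrative value rather than anything from a real forecasting model) computes the chance that the favorite clinches in exactly n games: they must win the final game and exactly three of the games before it. It assumes a constant per-game probability and independent games.

```python
from math import comb


def favorite_wins_in(n_games, p, wins_needed=4):
    """P(favorite clinches the series in exactly n_games).

    The favorite must win the clinching game and exactly
    wins_needed - 1 of the games before it. Assumes a constant
    per-game probability p and independent games.
    """
    q = 1 - p
    prior_games = n_games - 1
    prior_wins = wins_needed - 1
    prior_losses = prior_games - prior_wins
    return comb(prior_games, prior_wins) * p**prior_wins * q**prior_losses * p


p = 0.55  # illustrative value, not taken from any forecasting model
for n in range(4, 8):
    print(f"Favorite in {n}: {favorite_wins_in(n, p):.4f}")
```

Under these assumptions the ratio of the seven-game probability to the six-game probability is 2q, which is below 1 whenever p > .5, so the favorite in six always beats the favorite in seven. Passing `wins_needed=2` gives the best-of-three case.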

There are three permutations of how the favored team can win a best-of-three series:

1. They win Games 1 and 2: expressed using the foregoing variables as pp
2. They win Game 1, then lose Game 2, then win Game 3: expressed as pqp
3. They lose Game 1, then win Games 2 and 3: expressed as qpp

Predicting the first scenario feels arrogant, but the other paths are subtly narrower than you think. The favorite-in-three situations require a loss in either Game 1 or Game 2, respectively, which we know is less likely than the alternative for each individual matchup. Further, the favored team then has to win Game 3, which is probable but not assured. (Note that the odds of a sweep are not conditional on who would win an unnecessary Game 3.)

The assertion is thus that:

\(pp > pqp + qpp\)

which simplifies to:

\(p^2 > 2p^2q\)

and finally reduces to:

\(1 > 2q\)

\(q < .5 \)

which of course is true based on how we defined q.

We can use similar logic to show that the underdog team is more likely to win in three games (qpq or pqq) than two (qq). In this case the stacked conditional probabilities are mitigated by the tautological assumption that the better team is more likely to win each game:

\(qpq + pqq > q^2\)

\(2pq^2 > q^2\)

\(2p > 1\)

\(p > .5\)

Finally, for completeness, we can also prove that the better team is more likely to win in three games than the underdogs are:

\(pqp + qpp > qpq + pqq\)

\(2p^2q > 2pq^2\)

\(p^2q > pq^2\)

\(p > q\)

This lets us flesh out the full chain of possible outcomes, in order of likelihood:

\(P(FavoriteIn2) > P(FavoriteIn3) > P(UnderdogIn3) > P(UnderdogIn2)\)

\(p^2 > 2p^2q > 2pq^2 > q^2\)
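
To double-check the chain without any algebra, here is a rough sketch that enumerates all eight win/loss sequences for three games and totals the four outcomes. The function name `series_outcomes` and the 'F'/'U' labels are my own, and the constant-p, independent-games assumptions carry over from above.

```python
from itertools import product


def series_outcomes(p):
    """Enumerate every best-of-three path and total the four outcomes.

    'F' marks a game won by the favorite, 'U' one won by the underdog.
    Assumes a constant per-game probability p and independent games.
    """
    q = 1 - p
    totals = {"FavoriteIn2": 0.0, "FavoriteIn3": 0.0,
              "UnderdogIn3": 0.0, "UnderdogIn2": 0.0}
    for games in product("FU", repeat=3):
        prob = 1.0
        for g in games:
            prob *= p if g == "F" else q
        # The series ends after two games if the same team wins both;
        # the unplayed Game 3 then does not affect who advanced.
        if games[0] == games[1]:
            winner, length = games[0], 2
        else:
            winner, length = games[2], 3
        key = ("Favorite" if winner == "F" else "Underdog") + f"In{length}"
        totals[key] += prob
    return totals


for outcome, prob in series_outcomes(0.6).items():
    print(f"{outcome}: {prob:.3f}")
```

With the illustrative p = 0.6, the four totals come out to 0.360, 0.288, 0.192, and 0.160, in the same order as the chain above.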

To be fair, this mathematical quirk doesn’t explain the heretofore near-total dearth of Game 3s. The probability of a sweep increases along with p, and the differences between playoff-quality MLB teams don’t move the needle much. Making the aggressive assumption that the better team had a 60 percent chance of winning each game — equivalent to a 97-win pace, while facing an opponent good enough to earn a postseason berth — we would expect to see a sweep 52 percent of the time, a far cry from the 87.5 percent we’ve seen in practice. The boring but much better answer is that weird things happen in small samples.
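
For anyone who trusts dice more than algebra, the 52 percent figure can also be checked by simulation. This is a Monte Carlo sketch under the same simplifying assumptions (constant p, independent games). The function name `simulate_sweep_rate`, the number of simulated series, and the seed are arbitrary choices of mine.

```python
import random


def simulate_sweep_rate(p, n_series=200_000, seed=2024):
    """Estimate how often a best-of-three ends in two games.

    Monte Carlo sketch: each game is an independent coin flip that the
    better team wins with probability p.
    """
    rng = random.Random(seed)
    sweeps = 0
    for _ in range(n_series):
        first = rng.random() < p    # True if the better team wins Game 1
        second = rng.random() < p   # True if the better team wins Game 2
        if first == second:         # same winner twice means no Game 3
            sweeps += 1
    return sweeps / n_series


p = 0.6
print(f"simulated sweep rate: {simulate_sweep_rate(p):.3f}")
print(f"exact p^2 + q^2:      {p**2 + (1 - p)**2:.3f}")
```

The simulated rate should land close to the exact value, with the small remaining gap being sampling noise.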

Still, the scale of the noise is a function of the underlying probability. Using the 52 percent sweep probability from the previous paragraph, the odds of seeing fewer than two Game 3s in eight Wild Card series are about 1 in 22 — so, surprising, but more common than rolling snake eyes. Yet if the underlying probability of a sweep were 1 in 3, which is probably what my intuitive guess would be if I weren’t thinking about the math, the chances of seeing at least seven sweeps in eight matchups would plummet to just 1 in 386.
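
Both of those odds are binomial tail probabilities: the chance of at least seven "successes" (sweeps) in eight independent series. Here is a short sketch of the arithmetic, where `prob_at_least` is a helper name of my own and each series is assumed to be an independent trial with the same sweep probability.

```python
from math import comb


def prob_at_least(k_min, n, prob):
    """P(at least k_min successes in n independent trials)."""
    return sum(comb(n, k) * prob**k * (1 - prob)**(n - k)
               for k in range(k_min, n + 1))


for sweep_prob in (0.52, 1 / 3):
    tail = prob_at_least(7, 8, sweep_prob)
    print(f"sweep prob {sweep_prob:.3f}: P(7+ sweeps in 8) = {tail:.4f} "
          f"(about 1 in {1 / tail:.0f})")
```

This reproduces the roughly 1-in-22 and 1-in-386 figures quoted above.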

As a baseball fan, I’m rooting for a full slate of Wild Card Game 3s this week. But as a numbers guy, I’m not counting on it. Happy October!

[1] For example, the starting pitchers have outsized impacts on the odds of a given game, so the specific probabilities (or even which team is favored) may change over the course of a playoff series. On the other hand, because home-field advantage is a constant for the higher-seeded team instead of alternating like later in the postseason, the Wild Card round is the least-bad time to make such a generalization.

[2] In the extremely unlikely event that p = .5 (implying that the teams project to be exactly evenly matched after accounting for home-field advantage, probable pitchers, etc.), modifying the foregoing logic will show that it’s 50/50 whether a series goes two games or three.
