Thanks to Ryan Thibodaux’s tracker, we can use statistical probability methods to project the final vote totals for our favorite players.
Every other day (my high school uses a block schedule), I sit in AP Statistics. It’s no surprise that I love this class, considering how much I enjoy baseball statistics and their analysis. I wouldn’t be writing for Beyond The Box Score if I didn’t love stats.
Since the first day of school, I have been hoping to learn something through my AP Statistics class that I could use directly in my baseball writing.
When Ryan Thibodaux began to collect this year’s Baseball Hall of Fame ballots (as he has done every year since 2014), I realized that I could use my limited AP Statistics knowledge to build a model to predict the final vote totals — and with that, the odds of election — for each player in the Class of 2019.
I didn’t do this on my own, however. My teacher, Mr. Grossman (give him a follow on Twitter if you so please; he tweets some good content), helped me finalize, and hopefully perfect, my model with his own tips. We spent some time after school writing formulas and deriving standard deviations on the board. Once we had settled on a model that we thought could work decently well, I entered all of the formulas into a spreadsheet and have kept it updated since.
I’ll explain more of the methodology behind the model in a future post (hopefully tomorrow or this weekend), but in short, I expect it to work relatively well. When I applied the same algorithms to the Class of 2018, the model's projections were off by an average of 2.7 percent. I caution you against assuming that I will have similar results for the Class of 2019. I built this model based on what worked last year. Every year is different, and things may happen that cause my overall predictions to fluctuate rather wildly. I obviously would love to fall within a 3 percent margin of error, but sometimes that’s not how life works.
With all of this said, here are my projections for the Baseball Hall of Fame Class of 2019, based on the first 50 ballots that Thibodaux collected. (I know Thibodaux has since collected many more ballots, but I’ll explain in the methodology post why keeping this capped at 50 is important.)
There are a few interesting takeaways here. For one, my model expects Mariano Rivera and Roy Halladay to be shoo-ins for the Hall. This is no surprise to anyone: Rivera is the greatest relief pitcher of all time, and Halladay was one of the most dominant pitchers of his era. It also expects Edgar Martinez to be inducted, surely a welcome sight to the many on Baseball Twitter who were clamoring for his induction long before he reached his final year on the ballot.
Martinez, who was trending at 90 percent of the vote at the 50-ballot mark, is expected to be significantly hurt by private ballots, making him a much closer call than the early returns would suggest. It is entirely possible that Martinez makes significant gains among private voters who weigh him more heavily in his final year, but it is hard to know how much of a bump he will receive.
So far, Edgar has gained a net eight votes through 76 ballots. At this pace, he would gain about 43 votes and would be elected to the Hall of Fame with 82.5 percent of the vote. The question, of course, is whether he will be able to gain votes at the same pace among private voters. I do not know the answer, but I would be willing to bet that he will do well enough among private voters to give him more than a 64 percent chance of being elected. My model is incredibly conservative in that regard, so the fact that Edgar still has this high a chance, even with the model expecting zero gains among private voters, bodes well for his future.
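For anyone who wants to check that back-of-the-envelope pace, here is a minimal Python sketch. Two inputs are my assumptions rather than figures quoted above: an electorate of 412 total ballots and Martinez’s 297 votes in 2018. The function name is hypothetical, and this is only the simple pace extrapolation from this paragraph, not the model itself.

```python
def extrapolate_gain(net_gain, ballots_seen, total_ballots, prior_votes):
    """Scale the net gain seen so far up to the full electorate."""
    gain_per_ballot = net_gain / ballots_seen
    # Round to whole votes before computing the projected share.
    projected_gain = round(gain_per_ballot * total_ballots)
    projected_votes = prior_votes + projected_gain
    share = 100 * projected_votes / total_ballots
    return projected_gain, share


# ASSUMPTIONS: 412 total ballots and 297 votes in 2018 are illustrative inputs.
gain, share = extrapolate_gain(net_gain=8, ballots_seen=76,
                               total_ballots=412, prior_votes=297)
print(f"Projected gain: {gain} votes, projected share: {share:.1f}%")
```

Under those assumptions, the sketch lands on a gain of 43 votes and a share of 82.5 percent. A different ballot total would shift both numbers slightly.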
Among the rest of the candidates, Larry Walker is expected to see the biggest gains at +17.1 percentage points from his 2018 total. (Walker is actually projected to see the biggest gains overall.) With just one year left on the ballot after this year, Walker will need even larger gains now in order to sniff election in 2020.
Mike Mussina is also expected to see big gains this election cycle; he is the only candidate besides Rivera, Halladay and Martinez with greater than a 5.6 percent chance of being elected. If he does not make the Hall of Fame this year (my model gives him just a 28.3 percent chance), 2020 is a real possibility. He’ll only be in his seventh year of eligibility next year, so at some point he will be enshrined.
The steroid guys, Barry Bonds and Roger Clemens, are expected to see modest gains of 3.2 and 4.4 percentage points, respectively, but most voters seem pretty set on how they feel about them. I don’t know that they will ever be enshrined, which is a shame.
Lastly, among the candidates who can return for election in 2020, only Lance Berkman (22.8 percent chance he reaches 5 percent of the vote), Roy Oswalt (0.0 percent) and Michael Young (0.0 percent) are in grave danger of falling off the ballot. We’ll need to see big gains the rest of the way for them to get the approximately 21 votes necessary to hang on to a spot on the ballot.
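The 21-vote figure follows from the 5 percent rule. Here is a quick sketch of that arithmetic, assuming roughly 412 total ballots (my placeholder, not an official count); the function name is hypothetical.

```python
def votes_to_stay_on_ballot(total_ballots, threshold_pct=5):
    """Smallest whole number of votes that is at least threshold_pct of ballots."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-total_ballots * threshold_pct // 100)


# ASSUMPTION: 412 total ballots is an illustrative electorate size.
print(votes_to_stay_on_ballot(412))
```

With 412 ballots, 5 percent works out to 20.6 votes, so a candidate needs 21 to stay on.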
If you have any questions about these projections, you can absolutely write in the comments below or send me a Tweet @DevanFink. I hope to have the methodology written up sometime soon, so please be on the lookout for that.
I hope that you enjoy the rest of the Baseball Hall of Fame vote season as much as I have already! Good luck to all the candidates (and fans of said candidates) over the long remaining process. It will be tiresome, but I hope that I was able to give you a little insight into what may happen when results are announced in late January.
Devan Fink is a Featured Writer for Beyond The Box Score. You can follow him on Twitter @DevanFink.