March Madness for data junkies

[Disclaimer: The following post is partly a reprise of one I wrote last year]

March Madness is almost here, and my workplace productivity is bound to suffer a little (don’t worry Kyte crew — I promise I’ll get all my stuff done). Selection Sunday is this weekend, and then it’s all about bracketology. I always look around the Internetz for a little help, and there’s no shortage of resources out there. There are roughly three ways to approach it…

Tap the hive mind


Yahoo Sports has an application called the “Team Ranker” that’s sort of like a Hot-or-Not for evaluating possible matchups. The theory is that the masses will collectively gravitate toward the most likely outcome. The obvious risk is that the Team Ranker application might be dominated by people who know nothing about college basketball and make their picks more or less at random. Imagine the Yahoo Answers kids attacking this one. Yikes.

Fanboys might be a problem too. Duke and UCLA, for example, have a lot of them (and plenty of haters, for that matter), so however viable they are as contenders, I would worry about people expressing their desires instead of their predictions. Finally, the official tournament seeds and rankings are themselves driven in part by a collection of opinions, so even if Yahoo’s Team Ranker is dominated by true college basketball aficionados, I would expect the results to follow the seeds.

Turn to the experts

I’ve done well with this strategy in past tournaments, but it’s not a sure bet. Taken as a whole, the experts tend to follow the seeds, and they inevitably split on all the toss-up games, so you still have to use your gut to a certain extent. The other challenge is that the expert commentary you can find is pretty disjointed. There are a lot of bits and pieces out there – separate breakdowns by region and conference, lots of hypothetical head-to-head matchups and riffs on narrow subjects like “injuries to watch” – so it’s difficult to synthesize it into any kind of cohesive set of picks. That said, the free resources I tend to look at are the obvious ones:

Each of these sites has its stable of pundits who crank out a furious stream of blog posts and articles between the time the field of 64 is announced and the first tip-off. The trick is to sift through the noise and spot the nuggets that can help you. Most of all, I look for predictions – especially whole brackets.

DIY science geekery


This is especially fertile ground for data junkies. Impress your friends by rattling off the latest betting odds or spouting opinions about how the Pomeroy Pythag Model stacks up against the Key Game Play stats model – if you can find any of this info for free. If you’re willing to pay, however, there are all kinds of nifty online tools to play with. One called Bracket Brains lets you dive deep into individual matchups. It costs anywhere from $26.95 to $79.95, although they do offer a free version that gives you a taste. Matchup by matchup, it provides a whole range of parameters you can tinker with to help you make your picks.

You can adjust how heavily you think factors like recent performance, strength of schedule and the Vegas spread will weigh in each matchup. You can look at similar matchups from past tournaments (based on the parameters you set). You can even view a map showing the distance each team will travel to the game venue. As you tinker with the weightings of all these parameters, the projected outcome of the matchup in question changes in real time.
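If you’d rather roll your own, the weighting idea is easy to sketch. I have no idea how Bracket Brains actually does its math, so treat this as a guess at the general shape: the function name, the factor names and every number below are hypothetical, and a plain weighted average is my assumption.

```python
def projected_margin(edges, weights):
    """Blend per-factor edges into one projected margin for team A.

    edges:   factor name -> how many points that factor favors team A by
             (negative means it favors team B). Hypothetical inputs.
    weights: factor name -> how much you trust that factor.
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    # Weighted average: each factor counts in proportion to its weight.
    return sum(weights[name] * edges[name] for name in weights) / total_weight


# Made-up numbers for one matchup; positive favors team A.
edges = {
    "recent_performance": 3.0,
    "strength_of_schedule": -1.0,
    "vegas_spread": 4.5,
}

# Trust everything equally...
even = projected_margin(
    edges,
    {"recent_performance": 1, "strength_of_schedule": 1, "vegas_spread": 1},
)

# ...or lean hard on the Vegas spread and watch the projection move.
vegas_heavy = projected_margin(
    edges,
    {"recent_performance": 1, "strength_of_schedule": 1, "vegas_spread": 4},
)

print(round(even, 2), round(vegas_heavy, 2))
```

Change a weight, rerun, and the projected margin shifts, which is the same tinkering loop in miniature.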

Another tool called Bracket Caster runs simulations based on each team’s past performance and calculated chances of winning against any other team. According to the description, every possible tournament game has been simulated one play at a time and repeated 10,000 times. Using this data, you can run your own simulations of the regional brackets, or look at a high-level analysis of any individual matchup.
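The simulate-it-10,000-times idea is simple enough to try at home. Here’s a minimal sketch, and to be clear it is not Bracket Caster’s method: it flips a weighted coin per game instead of simulating play by play. The team ratings are made up, and using a log5-style formula to turn two ratings into a win probability is my own assumption.

```python
import random


def simulate_region(teams, win_prob, trials=10_000, seed=0):
    """Count how often each team wins a single-elimination bracket.

    teams:    team names in bracket order; length must be a power of two.
    win_prob: function (a, b) -> probability that a beats b.
    """
    rng = random.Random(seed)  # fixed seed so reruns give the same counts
    titles = {team: 0 for team in teams}
    for _ in range(trials):
        alive = list(teams)
        while len(alive) > 1:
            # Pair up neighbors and keep the winner of each game.
            alive = [
                a if rng.random() < win_prob(a, b) else b
                for a, b in zip(alive[::2], alive[1::2])
            ]
        titles[alive[0]] += 1
    return titles


# Hypothetical ratings, roughly "win rate against an average team".
ratings = {"A": 0.90, "B": 0.60, "C": 0.75, "D": 0.70}


def win_prob(a, b):
    """Log5-style head-to-head probability (an assumption, not the tool's model)."""
    pa, pb = ratings[a], ratings[b]
    return pa * (1 - pb) / (pa * (1 - pb) + pb * (1 - pa))


titles = simulate_region(list(ratings), win_prob)
for team, count in sorted(titles.items(), key=lambda item: -item[1]):
    print(team, count / 10_000)
```

The printed fractions are each team’s estimated chance of coming out of this four-team mini-region. Swap in a better `win_prob` and the rest stays the same.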

Finally, one category of basketball statistics – efficiency – has become especially popular as a way to measure any team’s true merit and predict its performance in future games.


A team’s offensive efficiency is defined simply as points scored per 100 possessions. Defensive efficiency is points allowed per 100 possessions. Defining a “possession” is somewhat more complicated, and I’ll spare you the details (go here if you’re interested). Last year, a Sports Illustrated blogger named Luke Winn wrote a compelling examination of just how good a predictor efficiency is (the actual post seems to have moved), which he nicely summed up as follows: “From 2004-07, only two teams outside the top 49 in defensive efficiency made the Elite Eight, and zero teams outside the top 25 made the Final Four.”
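For anyone who wants to run the numbers themselves, here’s a minimal sketch of the arithmetic. The efficiency formula is the definition above. The possession estimate is one commonly used box-score approximation that I’m assuming here (the 0.475 free-throw weight in particular varies by source), and the box score is invented.

```python
def estimate_possessions(fga, oreb, turnovers, fta, ft_weight=0.475):
    """Approximate possessions from box-score totals.

    Assumed estimate: field-goal attempts, minus offensive rebounds,
    plus turnovers, plus a fraction of free-throw attempts.
    """
    return fga - oreb + turnovers + ft_weight * fta


def efficiency(points, possessions):
    """Points per 100 possessions.

    Pass points scored for offensive efficiency,
    points allowed for defensive efficiency.
    """
    return 100.0 * points / possessions


# Invented box score for one game.
poss = estimate_possessions(fga=58, oreb=11, turnovers=13, fta=21)
offense = efficiency(points=75, possessions=poss)
defense = efficiency(points=68, possessions=poss)

print(f"possessions ~ {poss:.1f}")
print(f"offensive efficiency ~ {offense:.1f}, defensive efficiency ~ {defense:.1f}")
```

Higher is better on offense and lower is better on defense, which is why the rankings in that quote are about teams near the top of the defensive list.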

OK, back to work everyone.