Thursday, August 30, 2007

Am I Being Dudded At Supercoach or What Does Champion Data Know About Aussie Rules Anyway?

One of the constant frustrations for my AFL Herald-Sun Supercoach Team has been watching a player rack up a bucketload of possessions on the weekend only to have them score pitifully for my fantasy team come Monday morning. For those unfamiliar with the great time-waster that is AFL Supercoach, you pick a team of 22 players and they score points for you based on their effectiveness. However, the actual formula for a player’s score is a mystery; instead, you are simply told that it is based on a formula devised by a mob called Champion Data. A search of the Champion Data website reveals a few further details – effective long kicks score highly, miscued kicks score badly – but the precise formula remains a trade secret.

This raises the question of whether Champion Data are rewarding or penalizing players in a fair and accurate manner. They claim that their formula has been devised using ‘research into winning and losing factors in AFL games’. This could mean that the higher your Champion Data score, the more likely your team’s score will be higher relative to the other team’s (i.e. your team’s percentage will be higher), or that the higher your score, the more likely it is that your team will win, whether by one point or 100 points. I’m going to assume the former explanation is true, since it seems to make more sense. But how well does it work in practice?

Not that surprisingly, it tends to work pretty well. The figure below compares the actual percentage of each AFL team over the first 21 rounds of the 2007 season to their estimated percentage using their cumulative Champion Data scores. Apart from West Coast, the results are all pretty close.

[Figure: Team Percentages and Cumulative Champion Data Scores Over 2007 AFL Season]


Importantly, the Champion Data scores appear to perform better at predicting a team’s success than simply looking at that team’s number of possessions. The root mean squared error using the Champion Data scores (which is basically just a method of summarizing the difference between a team’s actual percentage and its predicted percentage) is about half of that which you would get if you used the number of possessions. Incidentally, the formula for the Dream Team competition on the AFL website doesn’t do much better at predicting a team’s success than possessions do.

Formula           Root Mean Squared Error
Champion Data     8.1
Possessions       15.7
AFL Dream Team    15.1
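For anyone who wants to see what a root mean squared error involves, here is a minimal Python sketch. The team figures in it are made up purely for illustration (they are not the 2007 numbers), and the function names are my own. It also does not show how a predicted percentage is derived from Champion Data scores, since that step isn't spelled out here.

```python
import math


def afl_percentage(points_for, points_against):
    """AFL ladder percentage: points scored as a percentage of points conceded."""
    return 100.0 * points_for / points_against


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical percentages for three made-up teams (NOT real 2007 data).
actual = [120.0, 100.0, 80.0]
predicted = [117.0, 104.0, 80.0]

print(afl_percentage(2000, 1600))  # a team scoring 2000 and conceding 1600
print(rmse(actual, predicted))
```

A lower figure means the predicted percentages sit closer to the actual ones, which is the sense in which 8.1 beats 15.7.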



Readers of my post Win Score and the Productivity of Basketball Players will know that post also talked about a formula that did pretty well at predicting the success of a (basketball) team, but that I had reservations about how well it did at attributing this success to particular players. So how well does the Champion Data formula do in this respect? My guess is that similar types of problems apply, namely how do you account for the value of defenders who aim to prevent other players from collecting possessions rather than gathering possessions themselves, and how much credit should be given to the player who ultimately puts the score on the board? But I’m fairly satisfied that the Champion Data formula does a better job than simply looking at the raw numbers. On the other hand, if my Supercoach team loses its Grand Final this weekend, well…

Saturday, August 18, 2007

Still I Don't Know How To Get Out of Bed (AKA The Bed Song)

Note: People who have known me for a while will have seen different lyrics to this song. The reason for changing them is simple: I was 18. Those lyrics sucked. These suck less.

Another hazy morning, another half-dreamt song
We will never trust each other while something can go wrong
I can’t forget my feelings or remember what you said
Still I don’t know how to get out of bed

If the end has passed, the beginning must be near
Should I spread across the earth or build my tower here?
There’s too little that I know and too much that I’ve read
Still I don’t know how to get out of bed

Love is cool, love is fine, love is a splintered thing
Love is beyond space and time, love is adrenaline
Love is the question and the answer, but that’s all in my head
Still I don’t know how to get out of bed

Thursday, August 16, 2007

The Vines: Highly Exiled

When Sydneysiders the Vines released their debut album, ‘Highly Evolved’, in 2002, it shot to #3 on the UK charts and #11 on the US charts. Was this considered a notable achievement for Australian music? Some thought so, but others argued that it wasn’t really an Australian album as the band weren’t really based in Australia anymore, and the album was recorded in Los Angeles. At the Australian Recording Industry Awards for that year, the Vines picked up only one gong, losing out to George (ugh!) for best breakthrough talent, Kylie Minogue (urk!) for single of the year, and Silverchair (gag!) for best group and best rock album.

What a load of garbage! Nowadays, nothing is more Australian than getting the hell out of Australia. For any Australian under the age of 30, chances are that half the people he or she knew five years ago are on the other side of the world right now. Far from being an example of three young men turning their back on this country, ‘Highly Evolved’ is the quintessential Australian album of the new century for the very reason that it captures this tendency to pack up and ship off so brilliantly.

Not convinced? Allow me to demonstrate what I mean using a selection of lyrics from the album. Below each tidbit of Vines-speak I’ve added a translation of what Vines’ songwriter and frontman Craig Nicholls really means. Whether you’re the person who has left these shores or been left behind, you’ll recognize in Mr Nicholls’ words our national belief that the grass is always greener in the other hemisphere.

Vines-speak: Heads are down/ And all the people frown/ In the fac-to-reee (in the fac-to-ree)/ I’m so down I put my head around/ Every noose I see (Factory)

Translation: As every Vines fan knows, Craig dropped out of school in tenth grade and ended up working in McDonalds. (Of course, he then sold bundles of records, so I suppose he had the last laugh.) Anyway, bringing home that lousy Australian currency was apparently a bit dispiriting for our budding songwriter. Which leads to this conclusion…

Vines-speak: I’m gonna get freeeeeee/ I’m gonna get freeeeee/ I’m gonna get freeeeee/ Riiiide into the sun (Get Free)

Translation: Craig is outtahere! He takes along Patrick from McDonalds and some drummer and they head off overseas to make a name for themselves. It’s somewhere other than here so it must be better, right? So pick up your guitars, boys, and hit the friendly skies. Those losers back in the suburbs can eat your dust.

Vines-speak: I feel so happy/ So high-ly e-volllved … Dream-ing for something, reac-hing for somethiii-eeeeeng (Highly Evolved)

Translation: Ah yes, they can picture it now. There’s something… and, uh, well, there’s something else… and yeah, something else good… just wait until they touch down, you’ll see…

Vines-speak: I left my hooooommmmee, I left my hoooo-oooooommmmee, yeah yeah… Without my phooooonnnnee, without my phoooo-oooooonnnnee, yeah yeah (Homesick)

Translation: Proof that Craig Nicholls is different to the rest of the human race. Nobody does this.

Vines-speak: It’s 1969 in my heeeeeaaaad!/ I just wanna haaaave no plaaaace to go/ I’m living thru the sound of the deeeeeaaaaad! (1969)

Translation: Craig flew into London and the Beatles weren’t there. And it’s bloody freezing!

Vines-speak: Nothing’s gonna save you (nothing’s gonna save you)/ Nothing’s gonna save you (nothing’s gonna) out theeeerrrrrr-eeeerrrre (Homesick)

Translation: Another reference to the crappy weather in the northern hemisphere. Also Craig has realized that, due to the large number of Aussie expats in London, all the morons who beat him up at school are here as well.

Vines-speak: I have been crying in my sleep/ Cause I don’t know where I’ve been/ I just want to live to see another day/ Hey Hey Hey Hey/ Hey Hey Hey Hey/ Hey Hey Hey Hey/ Hey Hey Hey Hey/ Hey Hey Hey Hey/ Hey Hey Hey Hey/ Hey Hey Hey Hey/ Hey Hey Hey Hey (1969)

Translation: London sucks just as much as Sydney did, so Craig and company have to chant themselves to sleep. They’re coming to the realization that they chose the wrong country. So it’s time to visit that other big brother of a nation across the Atlantic. Let’s see how that works out.

Vines-speak: You know you really oughta/ Move out-ta Californyerrr (Get Free)

Translation: Not so well then. Craig’s miserable at home and he’s miserable abroad, and since time travel is an impossibility he’s going to have to find some place in this world to be. Which leads to this conclusion…

Vines-speak: I really don’t need a chaaaaaa-nge/ I really don’t need what’s miiiiiii-ne/ Out in a country yaaaa-rrrrd/ It-’ll be just fine (Country Yard)

Translation: Of course, when life in the city gets too much, you can always take a sea change into the Aussie country. Craig dreams of going the time-honoured ‘tortured genius’ route and becoming a virtual recluse. But won’t that tie him down? How can he fulfil his yearning to escape? We all know the answer…

Vines-speak: Why should I lose/ When I-’ve got to goooo/ Maaar-y Jaaaaane/ Maaar-y Jaaaaane (Mary Jane)

Translation: In the end, there are always the drugs.

Tuesday, August 7, 2007

Win Score and the Productivity of Basketball Players

In The Wages of Wins: Taking Measure of the Many Myths in Modern Sport, authors David Berri, Martin Schmidt and Stacey Brook construct a metric for determining the productivity of each player in the National Basketball Association, known as Win Score. The Win Score formula is as follows:

Win Score = Points + Possession gained (rebounds, steals) – Possession lost (turnovers, field goal shots, ½ free throws) + ½ Offensive help (assists) + ½ Defensive help (blocks) – ½ Help opponent (fouls)[1]
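As a rough sketch, the formula above can be written as a small Python function. I am assuming here that ‘field goal shots’ and ‘free throws’ mean attempts rather than makes; the function and parameter names are my own, and the stat line in the usage example is invented.

```python
def win_score(points=0, rebounds=0, steals=0, turnovers=0,
              field_goal_attempts=0, free_throw_attempts=0,
              assists=0, blocks=0, fouls=0):
    """Win Score as written out above.

    Assumption: 'field goal shots' and 'free throws' in the formula
    are read as attempts, not makes.
    """
    possession_gained = rebounds + steals
    possession_lost = turnovers + field_goal_attempts + 0.5 * free_throw_attempts
    return (points
            + possession_gained
            - possession_lost
            + 0.5 * assists
            + 0.5 * blocks
            - 0.5 * fouls)


# Invented stat line, for illustration only.
example = win_score(points=20, rebounds=10, steals=2, turnovers=3,
                    field_goal_attempts=15, free_throw_attempts=6,
                    assists=4, blocks=1, fouls=2)
print(example)  # 12.5
```

Note that every missed or made shot costs a full point of Win Score through the attempts term, while every rebound adds a full point.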

They find that this metric does a pretty good job of predicting the number of games that a team will win over the course of the season. A strong point of this method is that it recognizes that the players who score the most points are not always the most productive. For example, a player who scores 30 points but makes only 25 per cent of their field goals is not a very efficient scorer, and may well have harmed their team’s chance of winning.

However, the formula has come in for criticism for allegedly favouring rebounders over scorers. Under the formula, a top scorer like Allen Iverson is considered to be only an average player, while rebounding king Dennis Rodman is considered to be a superstar on par with Michael Jordan. Now I don’t subscribe to the theory that scoring should be seen as an inherently harder skill to master. As far as I am concerned, a player can be productive through being either a dead-eye shooter, a monster on the boards or a pin-point passer. My main concern is that Win Score doesn’t weigh up these attributes equally.

To see why, imagine a simplified version of basketball in which two players, A and B, take turns shooting from a designated spot on the court. After each shot, two other players, C and D, go up for the rebound. If C rebounds the ball, A takes the next shot, and if D rebounds the ball, B takes the next shot. If A makes a shot, B takes the next shot, and vice versa. Let’s say the statistics at the end of the game are as below:

Player A: 0 for 6, 0 points
Player B: 2 for 3, 4 points
Player C: 4 rebounds
Player D: 3 rebounds
Team E (Players A and C): 0 points
Team F (Players B and D): 4 points
Team F wins.

Now under Win Score, Player C would be the most productive player (a Win Score of 4), followed by Player D (3), Player B (1) and Player A (-6). However, I would argue that Player B is clearly the most productive player. Team F has won because Player B was much more effective at shooting than Player A. If A and B had shot with the same accuracy:

Player A: 2 for 6, 4 points
Player B: 1 for 3, 2 points
Player C: 4 rebounds
Player D: 2 rebounds
Team E: 4 points
Team F: 2 points
Team E wins.

In this case, Player C is the most productive player, and Player D is the least productive. This is because, with A and B being equally effective at scoring, C’s extra rebounds are responsible for getting Team E over the line. (Under Win Score Player C would still be the most productive, but Player D would be more productive than Players A and B.)
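The Win Score figures for both versions of the game can be checked with a few lines of Python. This is only a sketch: in the simplified game the only statistics are points, shots taken and rebounds, so the formula collapses to points plus rebounds minus shots. The function and variable names are my own.

```python
def simple_win_score(points, shots_taken, rebounds):
    """Win Score restricted to the three stats that exist in the toy game."""
    return points + rebounds - shots_taken


# (points, shots_taken, rebounds) for each player, taken from the two tables.
actual_game = {"A": (0, 6, 0), "B": (4, 3, 0), "C": (0, 0, 4), "D": (0, 0, 3)}
equal_accuracy = {"A": (4, 6, 0), "B": (2, 3, 0), "C": (0, 0, 4), "D": (0, 0, 2)}


def rank(game):
    """Return (player, Win Score) pairs sorted from most to least productive."""
    scores = {player: simple_win_score(*stats) for player, stats in game.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


print(rank(actual_game))     # [('C', 4), ('D', 3), ('B', 1), ('A', -6)]
print(rank(equal_accuracy))  # [('C', 4), ('D', 2), ('B', -1), ('A', -2)]
```

In both versions the two rebounders finish ahead of the two shooters, even though it was the shooting that decided the first game.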

Or to put it another way, if each team performs each skill equally well, nobody should win or lose. The difference between winning and losing is determined by differences in skill levels.

Now of course in a real game of basketball all players are able to both score and rebound, but the key point still remains that a player’s ability to perform a skill should be measured relative to everybody else’s ability to perform that skill. But with Win Score a player’s ability to rebound is overvalued relative to a player’s ability to score. Berri and company do calculate a Position-Adjusted Win Score, which could arguably take account of this fact, but that limits a player in a particular position to a particular role when in reality they may have another role in which they help their team (e.g. a point guard may be an efficient shooter rather than an efficient passer).

Another argument is that Win Score does account for a player’s ability to score relative to other players’ abilities simply because it includes both points scored and field goals missed. However, that assumes each player takes their shots from a designated spot and with the same amount of defensive pressure. Allen Iverson might have made 40 per cent of his shots and Dennis Rodman might have made 55 per cent of his shots, but Iverson may still be a more efficient scorer because if Rodman had the same role as Iverson and was facing the same defensive pressure he might have only made 20 per cent of his shots. Or to put it another way, if Rodman had been a shooter of average skill he might have made 70 per cent of the shots that he had, and he hurt his team by only shooting at 55 per cent. Win Score does not account for this possibility.

Another problem is that Win Score gives full credit to the rebounder for gaining a possession and attributes full blame to the shooter for losing a possession. The latter is probably not so much of a problem because if a shooter should not always receive full blame for losing a possession (for instance, if the rest of his team is standing around twiddling their thumbs and he’s playing one-on-five) he probably should not always receive full credit for scoring the basket either (for instance, if the power forward sets a bone-shattering screen on the defender). But again this is going to overvalue players who happen to be very good at rebounding at the expense of the players who are very good at other skills.

It is difficult to see how any metric could address either of these problems. The first is hard to solve because we do not know the counterfactual, while the second is hard to solve because it relies on intangibles. The Win Score metric is on the right track in working out the key factors that separate winning from losing. The tricky part is deciding who is truly responsible.[2]


[1] This method for writing out the formula was actually used in a PowerPoint presentation by Joe Price and Justin Wolfers.
[2] By my reckoning, the final point margin is determined entirely by a team’s ability relative to the other team to create scoring opportunities and the team’s ability relative to the other team to capitalize on scoring opportunities. The first term would incorporate rebounds, steals and turnovers, while the second term would incorporate shooting percentages and assists. To work out a player’s productivity you would have to work out that player’s contribution to each of those two terms. Er, good luck.