So another year of fantasy basketball is winding to a close. Maybe your team got pounded by injuries; maybe your team had Dirk, Nash, and David Lee and cruised to victory (like mine). There are many different methods out there to look at and evaluate player performance, and there are lots of ranking systems. Sure LeBron was obviously #1, but what about down the list? Do you really trust those pre-rankings? Today I’m going to talk about a method of evaluating the numbers, so hopefully during next year’s draft you can use your 90 seconds scrambling for injury and team information while having some confidence in the numbers to expect.

We are already quite comfortable with using averages in sports stats. LeBron scored 29.9 points per game. Dwight grabbed 13.6 rebounds per 36 minutes. So instead of jumping to PER or RAPM or some other complex analysis, why not go to just the next step with standard deviation? In fantasy, we have the entire population (all players that have logged minutes in an NBA game), and all we really care about is choosing the guy that is better than what the other teams have. Standard Deviation could fit this need!

Well let’s not get ahead of ourselves. For example with Yahoo!, you get all the raw number totals and averages, and even their special “O-Rank” and “Rank”. Why expand beyond that? Well the problem is, when you sort by FG%, or TOV, things start looking strange. Is Marc Gasol’s 58.3% going to help your team more than David Lee’s 55.2%? Just how bad is Dwight Howard’s 60.3% FT shooting going to kill that category? Kinda hard to tell by eyeballing it. Even with the raw numbers: just how much will having Steve Nash on my team dominate the assists category?

Enter: Standardize. If you really don't want to do math, then I’ve still got good news for you: this is all done by ESPN’s Fantasy Basketball Player Rater. In fact, if you are quite satisfied using just the ESPN Player Rater, you probably can stop reading the article now. Here you can see all the Standard Scores in each category, and they are added up to make the final column as a composite score (yes, this makes more sense than adding percentages together randomly *coughHOLLINGERcough*).

For example: LeBron has scored 2033 points as of this post. The league average is 472.5 and the standard deviation is 410.25 pts. So (2033 – 472.5) / 410.25 = 3.8, meaning LeBron is 3.8 standard deviations above the league average. For those not familiar with standard deviations, in a normal distribution a score of 1 puts you above ~84.1% of the population, 2 puts you above ~97.7%, 3 puts you above ~99.9%, and 4+ is outstanding. Isn’t this what you really want to know on draft day? You can find overall contributors at a glance, see which needs you are lacking, and pick up specialists without having to guesstimate from the raw numbers.
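If you want to check that arithmetic yourself, here is a minimal Python sketch. It uses only the numbers quoted above; the function name `z_score` is just an illustrative choice, and the percentile figures assume a normal distribution.

```python
from statistics import NormalDist

def z_score(value: float, mean: float, std_dev: float) -> float:
    """Standard score: how many standard deviations value sits from the mean."""
    return (value - mean) / std_dev

# Figures quoted in the post for LeBron's season point total.
lebron_z = z_score(2033, 472.5, 410.25)
print(f"LeBron points z-score: {lebron_z:.1f}")

# Share of a normally distributed population that falls below each score.
for z in (1, 2, 3):
    share_below = NormalDist().cdf(z)
    print(f"z = {z}: above {share_below:.1%} of the population")
```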

Another benefit of Standardizing is the use of negative standard deviations, so you can see when a player is really hurting your team!

Okay so the bad news here is ESPN only shows the 8 categories. If you’re playing with TOVs, how does that fit in? Also, how do I calculate the FG% and FT% numbers since they aren’t raw numbers?

Well here’s where we start doing things for ourselves. Pick up your scripting language of choice, or start copying and pasting CSV text from basketball-reference/your favorite website. Turnovers are the easy part: since they work as a negative statistic, I simply found all the Standard Scores and then flipped the signs.
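A rough sketch of the turnover trick in Python, assuming you already have season totals in a dictionary. The player names and numbers are made up for illustration, and using the population standard deviation (`pstdev`) is a choice that follows from treating the league as the entire population.

```python
from statistics import mean, pstdev

def standard_scores(values):
    """Z-score each value against the whole list (population std dev)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical season turnover totals, for illustration only.
turnovers = {"Player A": 250, "Player B": 120, "Player C": 40, "Player D": 110}

names = list(turnovers)
# Flip the sign so that fewer turnovers gives a higher (better) score.
tov_scores = [-z for z in standard_scores([turnovers[n] for n in names])]

for name, score in zip(names, tov_scores):
    print(f"{name}: {score:+.2f}")
```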

For FG% and FT%: I personally believe ESPN doesn’t give enough weight to the amount shot. Shouldn’t LeBron shooting 50.0% at 20.2 FGA/g have more impact than Varejao shooting 57.1%, but only 6.4 FGA/g? Well I think so, which is why I subtracted the league average first, then weighted by shots taken before dividing by standard deviation. My FGscore is defined as:

FGscore = (player FG% - league average FG%) * FGA

And I standardize the FGscore (average is already zero, so really I’m just dividing by standard deviation). So LeBron ends up with 2.37, and Varejao with 2.07, not that anyone would think of drafting the latter over the former. But in any case, now we can properly rank players by their FG%, so all the lacktators with a 1.000 FG% filter to the bottom.

Same thing with FT%: Is Nash's 94.1% (2.7 FTA/g) or Carmelo's 83.1% (9.3 FTA/g) helping you win the category more? ESPN puts Nash over ‘Melo, but using my FTscore, Carmelo scores a 2.67 while Nash scores a 2.36. Of course, Durant and Dirk still dominate the category.
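For anyone who wants to try the FGscore/FTscore idea, here is a rough Python sketch: take each player's percentage minus the league percentage, weight by attempts, then divide by the standard deviation. The player totals are made up, `shooting_scores` is a hypothetical helper name, and treating "league average" as total makes over total attempts is an assumption (it is the definition that makes the average score come out to zero).

```python
from statistics import pstdev

def shooting_scores(made, attempted):
    """Attempt-weighted shooting scores, divided by their population std dev.

    Players with zero attempts would need to be filtered out beforehand.
    """
    # Assumption: "league average" = total makes / total attempts.
    league_pct = sum(made) / sum(attempted)
    raw = [(m / a - league_pct) * a for m, a in zip(made, attempted)]
    sigma = pstdev(raw)
    return [r / sigma for r in raw]

# Hypothetical season totals (made, attempted), for illustration only.
players = {
    "High-volume scorer": (820, 1600),
    "Efficient big man": (240, 420),
    "One-shot wonder": (1, 1),
    "Cold shooter": (300, 800),
}
made = [m for m, _ in players.values()]
attempted = [a for _, a in players.values()]
scores = dict(zip(players, shooting_scores(made, attempted)))

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {score:+.2f}")
```

The same function works for free throws by passing FT made and FT attempted instead.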

Okay, I’ve been far too positive towards ESPN. This sounds almost too good to be true. What are the limitations of this method? Like I said, the Z-Score happens to work well since we have the entire population of data. However, a simple glance at the data will show you that we are NOT working with normally distributed data, which is what the percentile interpretation of Standard Scores assumes! Going one step further, I looked at the skew and kurtosis of each category, and they are off the charts, with the worst skew on blocks at 2.2 and the worst kurtosis on FT% at 16.81.
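If you want to run the same check on your own data, skew and kurtosis are just the averaged third and fourth powers of the standard scores. The sketch below is one common way to compute them; whether the figures in this post are plain or "excess" kurtosis is not stated, so the plain (normal = 3) version here is an assumption, and the block totals are invented.

```python
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: 0 for a symmetric distribution."""
    mu, sigma = mean(values), pstdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)

def kurtosis(values):
    """Plain (non-excess) kurtosis: a normal distribution comes out near 3."""
    mu, sigma = mean(values), pstdev(values)
    return sum(((v - mu) / sigma) ** 4 for v in values) / len(values)

# Hypothetical block totals: most players near zero, one dominant shot blocker.
blocks = [0, 1, 2, 2, 3, 4, 5, 8, 12, 20, 35, 60, 190]

print(f"skew:     {skewness(blocks):.2f}")
print(f"kurtosis: {kurtosis(blocks):.2f}")
```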

In simpler terms, this means some standard deviations at the far ends may be inflated more than they should be. For example, Dwight gets a near 6 score in blocks, which statistically should not happen in only ~450 people. It’s like one in a million. So as with all advanced statistics, use them carefully!
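To put a number on "should not happen", you can ask how many players out of ~450 a true normal distribution would put above a given score. This is a sketch under that normal assumption only; `expected_players_above` is a hypothetical helper name.

```python
from statistics import NormalDist

def expected_players_above(z: float, population: int) -> float:
    """Expected count of players scoring above z if scores were truly normal."""
    # By symmetry, P(Z > z) equals P(Z < -z); this avoids rounding error near 1.
    upper_tail = NormalDist().cdf(-z)
    return population * upper_tail

for z in (2, 3, 4, 5, 6):
    print(f"z > {z}: {expected_players_above(z, 450):.2e} players expected")
```

Anything that comes out far below one player is a sign the bell-curve reading of that score is being stretched.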

In addition, I did a Principal Component Analysis (PCA) on the 9 factors. Turns out there’s such a strong negative correlation between Points and Turnovers, and modestly strong correlations between Points and other categories, it’s not even worth bothering looking at the TOV category! Stupid Yahoo!!

So maybe you hate Turnovers. Maybe you hate my FGscore and FTscore. I think it’s also perfectly valid to try and dominate the 6 raw stat categories! It’s very intuitive, and any wins in FG%, FT%, or TOV are just gravy. In fact, I’ve done just this...

Putting it all together: We can analyze total season numbers, per game numbers, or per (36) minute numbers. I’ve proposed looking at standard scores of the 9 categories like Yahoo!, the 8 categories like ESPN, or the 6 raw stat categories. Well since we’re already working with Standard Scores... why not just add composite scores together? I did this with the completed 2008-09 data. So total season numbers help show how much a player contributed, per game numbers account for some injuries and such, and per minute numbers account for varying playing time. Since they’re all standardized now, I just add them together to get a super-composite score, for a really quick look at who did the best (i.e. which players I should

After doing all this, and comparing it to 9, 8, and 6 category, it turns out there’s lots of correlation among them, but the analysis that made overall sense was... 6 category! Are you serious?! After all that work I did messing with TOV and FG% and FT%, you could essentially ignore them?
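The 9-, 8-, and 6-category composites are all the same operation: add up a player's standard scores over whichever categories you choose to count. A minimal sketch, with invented scores for a hypothetical center (the TOV score is assumed to have had its sign flipped already):

```python
NINE_CAT = ("PTS", "3P", "TRB", "AST", "STL", "BLK", "FG%", "FT%", "TOV")
EIGHT_CAT = tuple(c for c in NINE_CAT if c != "TOV")
SIX_CAT = ("PTS", "3P", "TRB", "AST", "STL", "BLK")

def composite(scores, categories):
    """Sum a player's standard scores over the chosen categories."""
    return sum(scores[c] for c in categories)

# Hypothetical standard scores for a center who can't shoot free throws.
big_man = {
    "PTS": 1.5, "3P": -0.5, "TRB": 3.0, "AST": -0.25, "STL": 0.25,
    "BLK": 4.0, "FG%": 2.0, "FT%": -3.5, "TOV": -1.5,
}

for label, cats in (("9-cat", NINE_CAT), ("8-cat", EIGHT_CAT), ("6-cat", SIX_CAT)):
    print(f"{label}: {composite(big_man, cats):+.2f}")
```

Leaving a category out of the tuple is also how you would model punting it.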

Well sorta. Mostly, it’s

Another way to avoid this, and possibly help further normalize the data: just take the top 200 players, or top 100, or whatever, and treat that as your total population, because let’s be honest: no one’s putting Mario West or JamesOn Curry on their fantasy teams. Hell, to simplify things, take the top 4 players and do a pretend 2 person draft. From there, you can see what categories you’re taking, which you’re giving away, and which analysis to use.
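Shrinking the population is a small change in code: pick your top N first, then compute the mean and standard deviation from those players only. The sketch below ranks by a single raw total, which is an assumption (the cutoff could just as well come from a composite score), and the names and numbers are invented.

```python
from statistics import mean, pstdev

def top_n_scores(totals, n):
    """Standard scores computed only against the top-n players by raw total."""
    top = sorted(totals, key=totals.get, reverse=True)[:n]
    values = [totals[name] for name in top]
    mu, sigma = mean(values), pstdev(values)
    return {name: (totals[name] - mu) / sigma for name in top}

# Hypothetical season point totals, for illustration only.
points = {
    "Star": 2000, "Starter A": 1200, "Starter B": 1100,
    "Sixth man": 800, "Bench guy": 150, "Garbage-time guy": 20,
}

draftable = top_n_scores(points, 4)
for name, score in draftable.items():
    print(f"{name:12s} {score:+.2f}")
```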

I hope this gave you some insight into stats and fantasy basketball. Of course, when it comes to injuries and rookies etc., you’re still on your own. The method I presented is highly useful for roto or h2h, and can be expanded or contracted to your liking, despite its limitations. Want to look at the past 3 years combined? The past month only? Go for it. Don’t just trust those pre-built rankings anymore, grab your favorite programming language/spreadsheet/abacus and find those undervalued players and steal picks!

Other random notes: Yes, I graphed a ton of stuff while doing this. Tips I picked up:

- With the top picks, don’t over-value 3pt shooting. It is easy to pick that up later in the draft, or with waivers.
- As I implied before, FG% and FT% are pretty even down the board. Use Standard Scores to slightly suggest one guy over the other. e.g. If you pick up Dwight Howard, concentrate on FG% guys ’cause there’s probably no combination of players you can pick up to make up the FT% column.
- Blocks are sparse, but spread out down the board.
- Of the remaining stats, Steals and Points have the strongest correlation, and Steals is usually a close category (so every little standard score counts!). This is probably why drafting bots tend to eat up all the point guards early.

And finally, in good Basketbawful fashion, how does my ranking compare to ESPN’s for the worst fantasy player of the season so far?

-6.05 Primoz Brezec

-6.05 Jarron Collins

-6.01 Kwame Brown

-5.98 Eddy Curry

-5.97 Lindsey Hunter

For reference, Mario West has a -5.54.

Labels: fantasy sports, guest author, standard deviation, statistics, ugh this sucks I'm just going to use ESPN's Player Rater

Also, if I had to guess two of the worst five fantasy players, I would have probably said Eddy Curry and Kwame Brown. This pleases me to no end.

Primoz Brezec FTL!

Is Lindsey Hunter even still playing at this point?

Say there's Team A and Team B. Both teams have 2 players, one shooting 25% and one shooting 75% each, just like Yahoo! tells us. Who wins the FG% category?

Well the answer is not enough information. Just comparing 25% and 75% to the league average 50% seems to imply the teams are equal, but this may not be the case. Both teams probably did NOT shoot 50%.

Say Team A has someone who shot 300/1200 (25%) and 300/400 (75%). Team B has one who shot 150/600 (25%) and 450/600 (75%). Now which team wins the FG% category?

The answer is Team B, shown by adding the makes and attempts together. Team A shot 600/1600 (37.5%) and Team B shot 600/1200 (50.0%). Using my FGscore method, Team A had scores of (.25 - .50)*(1200) = -300, and (.75 - .50)*(400) = 100, while Team B had -150 and 150. Adding them together, Team A's -200 is less than Team B's 0, so Team A was under average and Team B was spot on. Find the FGscore of everyone in the league, get the stddev, divide, and move on.
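Here is the Team A vs. Team B example as a few lines of Python, using exactly the numbers above and the 50% league average from the example. The function names are just illustrative.

```python
def fg_score(made, attempted, league_pct=0.50):
    """Unstandardized FGscore: (player FG% - league FG%) * attempts."""
    return (made / attempted - league_pct) * attempted

def team_pct(team):
    """Actual team FG%: total makes over total attempts."""
    return sum(m for m, _ in team) / sum(a for _, a in team)

def team_fg_score(team):
    """Sum of the players' FGscores."""
    return sum(fg_score(m, a) for m, a in team)

# (made, attempted) for each player, as given in the example.
team_a = [(300, 1200), (300, 400)]
team_b = [(150, 600), (450, 600)]

for label, team in (("Team A", team_a), ("Team B", team_b)):
    print(f"{label}: FG% = {team_pct(team):.1%}, FGscore sum = {team_fg_score(team):+.0f}")
```

The team with the higher summed FGscore is the team with the higher actual FG%, which is the point of weighting by attempts.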

If anything, from all this, even though the ESPN Player Rater is fairly solid, I hope I got you thinking about its disadvantages, and about why no one ranking system should be outright trusted. I should have added that yes, it is biased towards fellating LeBron James's fantasy impact, as is standard practice at ESPN. (Note how limited the scores are on the negative end, around -6, yet LeBron can get a 17+ score. Skew-tastic!)

Are we assuming a normal distribution? Also, how does the median compare with the mean? Wouldn't we expect more people to be clustered at the bottom? I think that's a Poisson distribution, but I only know the name, not how to use it...I would link to Wikipedia, but it's seriously dense and technical.

I was thinking you could measure individual players' consistency using standard deviations -- let's say Dwight Howard gets an average of 13.1 rpg, but the standard deviation is 4, whereas Carlos Boozer gets 11.2 rpg but the standard deviation is only 1.8 (I'm making the standard deviations up completely, btw). Wouldn't Boozer be more valuable because of consistency? You just win categories, you don't get extra for margin of victory...does this make sense?

relatively normal. (Not exactly, but close enough.)

If you looked at the basketball talents of the entire world, the NBA players would surely rate much higher than most random high schoolers and low level eastern European team players, etc. You would expect a fairly skewed distribution then. However, looking at just the players who are good enough to make it to the NBA, I would assume that there's a lot of "mediocre" talents in the Association ("average" compared to other NBA players, but far and above 99% of the world's population of basketball players). There are several "good" players, and several "bad" players. And of course you have the outliers, such as LeBron James (who is far superior to most NBA players) and of course the extremely untalented by NBA standards lacktators, many of whom are also outliers. But outside of these two far ends of the spectrum, I would think the distribution would be fairly normal. A lot of average guys, a few below average, a few above average, and a handful of really bad and really good players.

In terms of talent the distribution could be normal, but in terms of, say, scoring, it isn't. "Normal" is a fancy way of saying bell curve -- and the per game statistics are skewed pretty hard towards the bottom, so it's not a bell curve.

If scoring was distributed normally, according to the data in the post, you'd expect 2.3% of the players to have scored -348 points or less this season...and as hard as some of them try, you can't really score negative points.

some right-skewing to a total points scored charting of all NBA players, mostly thanks to the bench players getting less time to score, and the revolving door of garbage time guys.

If you must know, here's what I got for 09-10 up to today:

Skew   Kurt.  Category
=====  =====  ========
 0.98   6.41  FG%
 1.30   3.80  3P
-0.33  16.81  FT%
 1.25   4.25  TRB
 2.18   9.28  AST
 1.06   3.88  STL
 2.20   8.90  BLK
-1.00   3.60  TOV
 0.97   3.51  PTS

Also, individual standard deviations would help H2H teams more than roto, but there's so many other factors to week-to-week performance, like number of games, home/away, days rest, etc., that it's not really as helpful as you think.

http://farm5.static.flickr.com/4028/4455139086_58c702356c_o.jpg

Thought that was a nice coincidence. :)

I think that may be the same technique, only using the past 3 games of course, and scaled from 0 to 1...

Victor - Simply add together the scores from whatever categories you want to count up.

BTW, since this post I've expanded the scores to look at popular punt categories, such as FG and TO (Granger), FT and TO (Howard), steals (Lee), or blocks.

for example:

total score = 0.5*points + 0.8*rebounds - 2*TO

I'm wondering if you've come up with a formula to determine the coefficients for each category.