Having recently built a Sports Payrolls data viz comparing payrolls across the major U.S. sports, I wanted to drill down on a specific sport. Being a big baseball fan, I pulled in 2012 baseball salary and payroll data. I also pulled in the last ten years of payroll data by team to compare total payroll, average, median, and standard deviation.
One of the things that surprised me the most is just how difficult it is to get all this data. I went to different sports sites like ESPN, USA Today, and CBS Sports to get everything that I needed. Each site displayed the data a bit differently, with more or less detail depending on the site.
I ended up using data from both CBS Sports and USA Today; however, I was really surprised that CBS Sports didn't list each player's position alongside the salary information by team. I had to merge the salary data with another data set from the same website!
It's just too bad that sports websites don't offer a download data feature. I ended up copying and pasting data from all 30 MLB teams into Excel, cleaning up the data, then joining the tables in Access to produce one dataset that would give me player details, 2012 salary, and position. It was quite an exhausting process! And it took much longer than I thought it would.
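For anyone who would rather script the join than do it in Access, here is a minimal sketch of the same idea in Python. It is not what I actually ran: the field names (`team`, `player`, `salary`, `position`) and the function `join_player_tables` are made up for illustration, and the sample rows are just a couple of figures quoted elsewhere in this post.

```python
def join_player_tables(salary_rows, roster_rows):
    """Attach each player's position to their salary row (a left join)."""
    # Index the roster table by (team, player) so each lookup is quick.
    position_by_key = {
        (row["team"], row["player"]): row["position"] for row in roster_rows
    }
    joined = []
    for row in salary_rows:
        key = (row["team"], row["player"])
        joined.append({
            "team": row["team"],
            "player": row["player"],
            "salary": row["salary"],
            # Keep players missing from the roster table instead of dropping them.
            "position": position_by_key.get(key, "N/A"),
        })
    return joined


# Illustrative rows only; salaries are the rounded figures quoted in this post.
salary_rows = [
    {"team": "Giants", "player": "Buster Posey", "salary": "$615,000"},
    {"team": "Giants", "player": "Barry Zito", "salary": "$19,000,000"},
]
roster_rows = [
    {"team": "Giants", "player": "Buster Posey", "position": "C"},
]

for row in join_player_tables(salary_rows, roster_rows):
    print(row["player"], row["salary"], row["position"])
```

Joining on team plus player name is an assumption on my part; it only works if both tables spell names the same way, which is exactly the kind of cleanup that ate up my time in Excel.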
I really wanted to display all of the Major League Baseball teams on a map. Only one team, the Toronto Blue Jays, is not in the United States. It was neat to be able to see all 2012 payrolls by team plotted on a map. Not surprisingly, the New York Yankees have the highest payroll at $198M. And if you look at the top five paid baseball players, Yankees make up three of them!
One tricky thing with the data is that it was very incomplete. Many baseball players, about fifty-eight in my list, did not have any salary data. In the salary field, it said "Not Available" on CBS Sports' website. I ended up excluding them from the viz, since showing them with a zero salary would be misleading. Some have been called up from the minors, others are in arbitration, and others just do not have any salary data released.
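The exclusion step is easy to sketch in code. The version below assumes salaries arrive as text like "$480,000" and that missing values are the literal string "Not Available"; both the input format and the function names (`parse_salary`, `players_with_salary`) are my own assumptions for illustration, not something the site provides.

```python
def parse_salary(text):
    """Return the salary in whole dollars, or None when no figure is listed."""
    cleaned = text.strip()
    if cleaned == "Not Available":
        return None
    # Strip the dollar sign and thousands separators before converting.
    return int(cleaned.replace("$", "").replace(",", ""))


def players_with_salary(rows):
    """Keep only rows with a real salary, converting it to a number."""
    kept = []
    for row in rows:
        salary = parse_salary(row["salary"])
        if salary is not None:
            kept.append({**row, "salary": salary})
    return kept


# Illustrative rows; the second player is a placeholder, not a real person.
rows = [
    {"player": "Alex Rodriguez", "salary": "$30,000,000"},
    {"player": "Hypothetical Call-Up", "salary": "Not Available"},
]

for row in players_with_salary(rows):
    print(row["player"], row["salary"])
```

Returning None rather than 0 for a missing salary keeps those players from dragging down averages and medians if they ever slip through.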
What I found really interesting is that the difference between the highest paid player and the lowest paid player is huge! The highest paid player is Alex Rodriguez at $30M/year, while the lowest salary is $480K. That seems to be the default salary for many baseball players.
I added drill-downs so you can filter by player, position, age, batting (left, right, or switch hitter), and throwing (left or right). For batting and throwing, some players didn't have any data listed, hence the N/A in the drop-down box. Even that data wasn't fully populated!
Being able to see the top paid players by these filters was quite surprising! Here's what I found:
The youngest player with a listed 2012 salary is Yoenis Cespedes, making $9M at the ripe age of 20! There's another player at age 19, but he didn't have any salary information, so he does not appear in my viz.
Oldest paid player is Omar Vizquel at age 45 making $750K this year.
Highest paid pitcher is Johan Santana making a little over $23M this year with CC Sabathia close behind.
Highest paid relief pitcher is Mariano Rivera at close to $15M this year. Too bad he's injured and out for the season!
Joe Mauer is the highest paid catcher this year at $23M. Buster Posey is a steal at $615K. Had to call out a Giants player!
Highest paid DH is Michael Young of the Texas Rangers making $16M at the age of 35. Nice!
Who's the highest paid switch hitter? It's another Yankee, Mark Teixeira, making over $23M this year.
Highest paid thirty-year-old is Adrian Gonzalez of the Boston Red Sox at close to $22M.
Barry Zito is the highest paid San Francisco Giant at $19M this year. Wow ...
Texas Rangers have the highest median payroll this year at $3.44M. Colorado Rockies are the lowest at $0.48M.
New York Yankees had a total payroll under $100M back in 2000, at $92M. Every year since has been over $100M!!
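If you want to reproduce team-level numbers like the totals and medians above, the calculation is simple once the salaries are numeric. Here is a rough sketch using Python's built-in statistics module. The function name `payroll_summary` and the three-player roster are hypothetical, and I am assuming a population standard deviation (`pstdev`); a sample standard deviation (`stdev`) would give a slightly larger number.

```python
import statistics


def payroll_summary(salaries):
    """Summarize one team's salaries: total, average, median, std deviation."""
    return {
        "total": sum(salaries),
        "average": statistics.mean(salaries),
        "median": statistics.median(salaries),
        # Population standard deviation; an assumption, see the note above.
        "std_dev": statistics.pstdev(salaries),
    }


# A made-up three-player roster, not any real team's payroll.
example_salaries = [480_000, 615_000, 19_000_000]
summary = payroll_summary(example_salaries)

print("Total:  ", summary["total"])
print("Median: ", summary["median"])
print("Average:", round(summary["average"]))
```

Even in this tiny example the average lands far above the median, which is why the median is the more telling number when a few huge contracts skew a team's payroll.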
Feel free to check out the visualization and interact with the data. Check out your favorite team to see how their payroll has stacked up against other teams in the last ten years.
Probably the best part about constructing this viz was being able to show the data visually. It becomes so much easier to explore and it's so much more interesting than just looking at a bunch of numbers.
Hope you enjoy this visualization and please feel free to leave any comments. Thanks!