NCAA Alumni in the NFL

Click either image for the fullscreen version.

This was inspired by a recent Huffington Post article showing the current breakdown of NFL players by college attended. I thought it was really interesting, but wanted to add a little historical context to what they showed. Rather than just the 2014 season, I wanted to go back in time (to 1990) and see who the biggest overall feeder colleges were, and how representation evolved. For the sake of keeping the graph uncluttered, I am only showing the top 20 schools (in terms of alumni player-years in the NFL) over the time period.

I used the school colors instead of a real color scale like ColorBrewer or any of the d3 ordinal color scales. This is really bad visualization practice, but I like that you can see LSU purple or Tar Heel blue and know immediately what school it is. If you are really interested, a version with a proper color scale can be found here.

There aren't a ton of surprises in the list, but it is interesting to see some of the movers and shakers over time (the dwindling representation of Notre Dame and Michigan, and the rise of LSU and Texas). I'm also surprised Miami has such strong NFL representation even recently, since they've fallen off a bit as a program compared to the early 2000s. Notable differences between my list and the HuffPo list include Oregon and Oklahoma, which are high on active players but don't have quite enough cumulative players over the 24-year span to crack this list.

One more note: the 2014 numbers are so low because this data covers ALL rostered players. There is quite a bit of roster churn throughout a season, and historically it's not unusual to see 70+ names for a team in a given season; for 2014 we only have the opening-day rosters so far. You can also see the expansions when the Panthers/Jags and Texans came into the league, which is pretty cool.

My approach:

First, I grabbed a bunch of data of the form year,school,count, where count is the number of players from that school in the NFL that year.

The data has a lot of holes, since a college may have no representation in a given year, and d3's stack layout isn't going to like that (it needs a value for every key at every time period you try to plot). I filled in the key space for all school/year combos in Python:

# Small helper; collections.Counter or defaultdict(int) in the std lib does this too
def increment(d,key,val=1):
    if key in d:
        d[key] += val
    else:
        d[key] = val


with open('raw_file.csv', 'r') as filein:
    all_schools = {}
    schools_by_year = {}  # dict of dictionaries

    first = True
    for line in filein:
        if first:  # skip header
            first = False
            continue

        parsed_line = line.strip().split(',')
        year = parsed_line[0]
        school = parsed_line[1]
        count = int(parsed_line[2])

        increment(all_schools, school,count)
        if year not in schools_by_year:
            schools_by_year[year] = {}
        schools_by_year[year][school] = count

# now loop through all schools and fill by year
for school in all_schools:
    for year in schools_by_year:
        if school not in schools_by_year[year]:
            schools_by_year[year][school] = 0

# write out the new file...
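
For what it's worth, the standard library does cover the increment helper. Here is a minimal sketch of the same gap-filling with collections.Counter and defaultdict, assuming the rows are already parsed into (year, school, count) tuples. The function name fill_gaps and the sample rows are made up for illustration, not taken from my actual script or data.

```python
from collections import Counter, defaultdict


def fill_gaps(rows):
    """rows: iterable of (year, school, count) tuples.

    Returns (totals, by_year), where by_year has an entry for every
    school in every year, with 0 where the input had no row.
    """
    totals = Counter()            # school -> total player-years
    by_year = defaultdict(dict)   # year -> {school: count}

    for year, school, count in rows:
        totals[school] += count   # Counter starts missing keys at 0
        by_year[year][school] = count

    # Fill the holes: every school gets a value in every year
    for year in by_year:
        for school in totals:
            by_year[year].setdefault(school, 0)

    return totals, dict(by_year)


# Hypothetical rows, purely for illustration
sample_rows = [
    ("1990", "Miami (FL)", 30),
    ("1990", "Texas", 12),
    ("1991", "Miami (FL)", 33),
]
totals, by_year = fill_gaps(sample_rows)
```

In this example Texas has no 1991 row in the input, so it ends up with an explicit 0 for 1991.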

I wrote out two sets of counts for the top 20 schools: each school's representation by year, and each school's total over the whole period. The schools with the biggest deltas would also have been interesting, but that will have to wait for another time.
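
The write-out step could look roughly like the sketch below. It assumes a totals dict (school to total) and a filled by_year dict (year to school to count) like the ones built above. The column names date, key, value and school, playeryears are the ones the d3 code further down reads; the function names and the sample numbers are hypothetical.

```python
import csv
import io


def top_schools(totals, by_year, n=20):
    """Pick the n schools with the most player-years and build both tables."""
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
    keep = [school for school, _ in ranked]
    # One (date, key, value) row per year per kept school
    stream_rows = [
        (year, school, by_year[year].get(school, 0))
        for year in sorted(by_year)
        for school in keep
    ]
    return ranked, stream_rows


def to_csv(header, rows):
    """Render rows as CSV text (swap the buffer for a real file to save it)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical numbers, purely for illustration
example_totals = {"A": 5, "B": 9, "C": 1}
example_by_year = {
    "1990": {"A": 2, "B": 4, "C": 1},
    "1991": {"A": 3, "B": 5, "C": 0},
}
ranking, stream = top_schools(example_totals, example_by_year, n=2)
ranking_csv = to_csv(["school", "playeryears"], ranking)
stream_csv = to_csv(["date", "key", "value"], stream)
```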

Now some d3. Since I'm not going to host this as live JavaScript on the site, I can just throw everything in one file, nice and quick, like they do over at bl.ocks.

The HTML:

<!DOCTYPE html>
<head>
    <meta charset="utf-8">
    <style>
       SEE CSS BELOW
    </style>
</head>

<body>

    <script src="http://d3js.org/d3.v3.js"></script>

    <div class="chart"></div>
    <div class="barchart"></div>

    <script>
        SEE SCRIPT BELOW
    </script>

</body>

The CSS:

body {
  font: 15px sans-serif;
}

.chart, .barchart { 
  background: #333333;
}

p {
  font: 12px helvetica;
}

.axis path, .axis line {
  fill: none;
  stroke: #bbb;
  stroke-width: 2px;
  shape-rendering: crispEdges;
}

text {
    font-size: 16px;   
    stroke: none;
    fill: #bbb;
    font-family: Arial;
}

.bartext {
    fill: #ccc;
    font-family: Arial;
    font-weight: bold;
    font-size: 22px;
    stroke: #222;
}

.layer {
    stroke: #555;
}

(I actually used Open Sans, but Arial is nice and basic).

For my JS/d3, my inline script defines some global variables so I can hard-code the school colors and the y-positions of the streamgraph text annotations, then calls two wrapper functions to plot the graphs. My streamgraph file is a CSV with date, key (school), and value (# players); the bar graph file is just school and playeryears.
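
To see why the stack layout wants a value for every key at every date, here is a rough Python sketch of zero-baseline stacking. It is not d3's actual implementation, just the same idea: each layer's y0 is the running total of the layers beneath it at that date. The function name and sample values are made up.

```python
def stack_layers(series, dates):
    """series: {key: {date: value}}, in the order layers should stack.

    Returns {key: [(date, y0, y), ...]}, where y0 is the baseline the
    layer sits on and y is its own height at that date.
    """
    baseline = {d: 0 for d in dates}
    stacked = {}
    for key, values in series.items():
        layer = []
        for d in dates:
            y = values[d]  # a missing date raises KeyError: the "hole" problem
            layer.append((d, baseline[d], y))
            baseline[d] += y
        stacked[key] = layer
    return stacked


# Hypothetical values, purely for illustration
example = {
    "Texas": {"1990": 12, "1991": 14},
    "UCLA": {"1990": 9, "1991": 0},
}
layers = stack_layers(example, ["1990", "1991"])
```

Here UCLA sits on top of Texas, so its baselines are 12 and 14, and the explicit 0 for 1991 keeps the layer defined at that date.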

If there's one neat d3 trick in here, it's probably setting properties through a key lookup (e.g. colors[d.key]) rather than by index, as you usually see in examples.

var colors = {
    'Miami (FL)':'#f47321',
    'Florida State':'#990000',
    'Notre Dame':'#C5B358',
    'Ohio State':'#FF2000',
    'Southern California':'#990000',
    'Tennessee':'#f77f00',
    'Florida':'#FF4A00',
    'Penn State':'#162952',
    'Michigan':'#f5d130',
    'Georgia':'#FF0000',
    'Louisiana State':'#461D7C',
    'Nebraska':'#FF2400',
    'North Carolina':'#6699CC',
    'California':'#191970',
    'Auburn':'#dd550c',
    'Texas':'#CC5500',
    'UCLA':'#536895',
    'Texas A&M':'#500000',
    'Alabama':'#FF0000',
    'Washington':'#363c74' 
};

var ypositions = {
    'Miami (FL)':17,
    'Florida State':50,
    'Notre Dame':90,
    'Ohio State':130,
    'Southern California':160,
    'Tennessee':190,
    'Florida':230,
    'Penn State':265,
    'Michigan':300,
    'Georgia':340,
    'Louisiana State':370,
    'Nebraska':400,
    'North Carolina':430,
    'California':455,
    'Auburn':483,
    'Texas':505,
    'UCLA':526,
    'Texas A&M':550,
    'Alabama':580,
    'Washington':606
};

chart("full_output.csv");
barchart("school_ranking.csv");


function chart(csvpath) {

var margin = {top: 30, right: 50, bottom: 80, left: 80};
var width = 1100 - margin.left - margin.right;
var height = 800 - margin.top - margin.bottom;

var x = d3.time.scale()
    .range([0, width]);

var y = d3.scale.linear()
    .range([height, 0]);

var xAxis = d3.svg.axis()
    .scale(x)
    .orient("bottom")
    .ticks(d3.time.years);

var yAxis = d3.svg.axis()
    .scale(y);

var stack = d3.layout.stack()
    //.offset("wiggle")
    .values(function(d) { return d.values; })
    .x(function(d) { return d.date; })
    .y(function(d) { return d.value; });

var nest = d3.nest()
    .key(function(d) { return d.key; });

var area = d3.svg.area()
    //.interpolate("cardinal")
    .x(function(d) { return x(d.date); })
    .y0(function(d) { return y(d.y0); })
    .y1(function(d) { return y(d.y0 + d.y); });

var svg = d3.select(".chart").append("svg")
    .attr("width", width + margin.left + margin.right)
    .attr("height", height + margin.top + margin.bottom)
  .append("g")
    .attr("transform", "translate(" + margin.left + 
        "," + margin.top + ")");

svg.append("text")
    .attr("x", 30)
    .attr("y", 10)
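    // "center" is not a valid text-anchor value; browsers
    // fall back to "start", so say that explicitly
    .style("text-anchor", "start")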
    .text("Number of Players On NFL Rosters " + 
        "by University Attended and Year");

var graph = d3.csv(csvpath, function(data) {
  data.forEach(function(d) {
    d.date = d3.time.format("%Y").parse(d.date); 
    d.value = +d.value;
  });

  var layers = stack(nest.entries(data));

  x.domain(d3.extent(data, function(d) { 
      return d.date; }));
  y.domain([0, d3.max(data, function(d) { 
      return d.y0 + d.y; })]);

  var streamgroup = svg.selectAll(".layer")
      .data(layers)
    .enter().append("g");

  streamgroup.append("path")
      .attr("class", "layer")
      .attr("d", function(d) { 
          return area(d.values); })
      .style("opacity", 0.75)
      .style("fill", function(d) { 
          return colors[d.key]; });

  streamgroup.append("text")
      .attr("class", "bartext")
        .attr("x", width * 0.54)
        .attr("y", function(d) { 
            return y(ypositions[d.key]); })
        .text(function(d){return d.key;});

  svg.append("g")
      .attr("class", "x axis")
      .attr("transform", "translate(0," + height + ")")
      .call(xAxis)
   .append("text")
      .attr("x", width / 2)
      .attr("y", 50)
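      // "center" is not a valid text-anchor value; browsers
      // fall back to "start", so say that explicitly
      .style("text-anchor", "start")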
      .text("Year");

  svg.append("g")
      .attr("class", "y axis")
      .attr("transform", "translate(" + width + ", 0)")
      .call(yAxis.orient("right"));

  svg.append("g")
      .attr("class", "y axis")
      .call(yAxis.orient("left"));


});
}


function barchart(csvpath) {

var margin = {top: 30, right: 50, bottom: 80, left: 80};
var w = 1100 - margin.left - margin.right;
var h = 800 - margin.top - margin.bottom;

var x = d3.scale.linear()
    .range([0, w]);
x.domain([0, 1000]);

var xAxis = d3.svg.axis()
    .scale(x)
    .orient("bottom");


var graph = d3.csv(csvpath,
  function(data) {

    data.forEach(function(d) {
        d.playeryears = +d.playeryears;
    });

    var barheight = h / data.length;

   var svg = d3.select(".barchart").append("svg")
        .attr("width", w + margin.left + margin.right)
        .attr("height", h + margin.top + margin.bottom)
      .append("g")
        .attr("transform", "translate(" + margin.left 
              + "," + margin.top + ")");

    svg.append("g")
        .attr("class", "x axis")
        .attr("transform", "translate(0," + h + ")")
          .call(xAxis)
        .append("text")
          .attr("x", w / 3)
          .attr("y", 50)
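          // "center" is not a valid text-anchor value; browsers
          // fall back to "start", so say that explicitly
          .style("text-anchor", "start")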
          .text("Total Number of NFL Rostered " +
              "Player-Years, 1990-2014");


    var bar = svg.selectAll(".barchart")
        .data(data)
      .enter().append("g");

    bar.append("rect")
        .attr("class", "bar")
        .attr("x", 0)
        .attr("width", function(d){ 
            return x(d.playeryears); })
        .style("opacity", 0.75)
        .attr("fill", function(d){ 
            return colors[d.school]; })
        .attr("y", function(d, i) { 
            return i * barheight; })
        .attr("height", barheight);

    bar.append("text")
        .attr("class", "bartext")
        .attr("x", 10)
        .attr("y", function(d, i) { 
            return i * barheight; })
        .attr("dy", "1.2em")
        .text(function(d){
            return d.school + ": " + 
            d.playeryears;});

});

}

Then I just saved out the graphs as .png files, and there you have it.