open-ended projects & march madness

By krhendrickson

June 24, 2022

Before I ramble too much, click here for the live dashboard.

While I had done a bit of data visualization in my college years, my real data-viz-in-R journey was springboarded by a class I took in the first year of my master’s at Duke (Data Visualization in R, if you’ll believe it, taught by Dr. Eric Green). We had an open-ended final project to close out the course.

When I was a lil engineering undergrad, I dreaded open-ended assignments, and rarely had them. Engineering isn’t often an open-ended endeavor, at least from an academic and professional point of view (sure, there are the star kids and their creative genius startups, but we don’t like those kids. they stress the rest of us out). For engineering problems, there’s usually a defined end goal, along with an established method for reaching that goal. While the material can be torturously difficult, you’re comforted by a clear understanding of the guidelines and format for your repeated failures.

But then I went to a 9-month UX/UI design bootcamp. Upon reflection, I’d say that developing a toolbox to handle and appreciate open-ended assignments was one of the best things I learned as a design student. I was immensely grateful for that toolbox when I embarked on this data viz final.

UConn and Tennessee are both dominant basketball programs when measured by championship wins.

I wanted my project to be, as the intellectuals say, ‘non-sciencey’. While in a global health program, it can feel like your whole life is revolving around the rates of death from X disease in Y country - or other topics of stomach-churning gravity. When given the option, I wanted to take my friend Maya’s advice and “pick something like the distribution of Skittle flavors”. I spent many afternoons browsing online datasets, waiting for one to speak to me (this is tool no. 1 in the open-ended project toolbox: don’t force inspiration). I could not tell you why I landed on March Madness data from women’s collegiate basketball, except that my professor is a dedicated basketball fan and you shouldn’t underestimate the effect of warm fuzzy familiarity on your grade.

I also decided that I would construct a reactive dashboard with this dataset. There was no practical use-case in mind for that idea - and I probably should have had one - other than me wanting the satisfaction of making a working button. I crossed my fingers that it wouldn’t come up during the final presentation.

But if you plot appearances in Final Four games, UConn’s success is a lot more recent than Tennessee’s.

Making this dashboard was both easier and harder than I thought it would be, a trend that I have found consistent across many coding projects. What happens (to me) is this: a lot of libraries make the basic setup of a visualization quite straightforward, so getting the bare bones of something working takes maybe 45 minutes. (“Wow! I’m so good at this!”) But then, I go about customizing my visualization in ways that I believe should also be quite straightforward. Suddenly, moving one label 2 inches to the right gets me 5 hours and 17 Stack Overflow posts into the debugging wilderness, and I’ve hardly progressed at all. (“How can no one have ever posted about this problem before?!”)

This is when my prof would tell me to make my own Stack Overflow post, but I have stubbornly ignored this advice. Five-hour debugging sessions are where I’m comfortable, thanks, efficiency be damned.

A comparison of win rates between the three colleges in the Triangle area: Duke, UNC, and NC State. I only plotted this because it made Duke look good. Also it gave me a reason to use a slider.

As an example, in the fourth tab of the dashboard, I wanted to make a line graph that filled in with one color when Duke was above average in tournament win rate and with another color when it was below average. The idea here is to have a quick visual indicator for which years Duke had a decent March Madness run. This might make more sense when you see it (graph below). Figuring out how to get the colors to fill in correctly took me as much time as the rest of the entire dashboard combined, but I did get there in the end. I remember conducting my quiet celebration in one of the engineering buildings on a Sunday evening.
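For anyone attempting the same trick, here is a minimal sketch of one way to get a two-color fill in ggplot2. It is not necessarily how the dashboard does it. The data, the `baseline` value, the colors, and the helper names `add_crossings` and `plot_fill` are all made up for illustration. The fiddly part is that a fill only lines up with the line if the data includes the exact points where the line crosses the average, so the sketch interpolates those first.

```r
# Hypothetical data: season-by-season tournament win rate for one team.
# These numbers are invented for illustration, not taken from the dashboard.
duke <- data.frame(
  year = 2010:2015,
  win_rate = c(0.75, 0.50, 0.80, 0.25, 0.60, 0.75)
)
baseline <- 0.55  # assumed average win rate to compare against

# Insert the points where the line crosses the baseline, so each fill
# starts and stops exactly on the line instead of overshooting it.
add_crossings <- function(x, y, baseline) {
  d <- y - baseline
  i <- which(d[-length(d)] * d[-1] < 0)   # segments whose ends straddle the baseline
  t <- d[i] / (d[i] - d[i + 1])           # fraction of the way along each segment
  out <- data.frame(
    x = c(x, x[i] + t * (x[i + 1] - x[i])),
    y = c(y, rep(baseline, length(i)))
  )
  out[order(out$x), ]
}

filled <- add_crossings(duke$year, duke$win_rate, baseline)
filled$avg <- baseline
filled$above <- pmax(filled$y, baseline)  # upper edge of the above-average fill
filled$below <- pmin(filled$y, baseline)  # lower edge of the below-average fill

# Draw one ribbon per color, then the line and the average on top.
# Requires the ggplot2 package; it is only needed when this function is called.
plot_fill <- function(df) {
  ggplot2::ggplot(df, ggplot2::aes(x = x)) +
    ggplot2::geom_ribbon(ggplot2::aes(ymin = avg, ymax = above), fill = "steelblue") +
    ggplot2::geom_ribbon(ggplot2::aes(ymin = below, ymax = avg), fill = "firebrick") +
    ggplot2::geom_line(ggplot2::aes(y = y)) +
    ggplot2::geom_hline(yintercept = df$avg[1], linetype = "dashed")
}
```

Calling `plot_fill(filled)` should draw the chart. Splitting the fill into two clipped ribbons sidesteps the problem of a single ribbon changing color mid-segment, at the cost of a little bookkeeping up front.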

Again, probably would have scratched this idea if it had been mostly red. You can’t make a presentation on Duke basketball sucking and expect to pass your classes, folks. (unless you’re reading this from UNC. then, carry on).

The final hurdle for this baby came when I wanted to publish the dashboard online. I spent a solid chunk of class trying to make a public GitHub page before my professor informed me that what I was doing made no sense. Turns out you have to publish through the Shiny servers if you’re using the Shiny package, or whatever. Humph.
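The short version of why: GitHub Pages only serves static files, while a Shiny app needs a running R session behind it. A hedged sketch of what publishing can look like instead, assuming the app is hosted on shinyapps.io through the rsconnect package (the post doesn’t say which Shiny hosting was used, and `publish_dashboard`, `app_dir`, and `app_name` are placeholder names):

```r
# Hypothetical helper: app_dir and app_name are placeholders, not the real project.
publish_dashboard <- function(app_dir, app_name) {
  # Assumes the rsconnect package is installed and that an account was
  # registered once beforehand with rsconnect::setAccountInfo(), using the
  # name, token, and secret shown in the hosting account's settings.
  rsconnect::deployApp(appDir = app_dir, appName = app_name)
}
```

Nothing runs until the function is called with a real app folder, so it is safe to keep in a project script.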