The Ivory Sofa

A Data Visualization Blog by Kyle Biehle (on twitter @kbiehle2)

Wednesday, March 20, 2013

Is There Life After Fifty for a Songwriter?



The plan was never to be a data guy.  The plan was to be a professional musician and songwriter. In my twenties I was in a rock band called Helen Keller Plaid.  We recorded a couple of albums, played a ton of shows, and did one tour in the early 90's. We got some favorable press, developed a solid local following, and were getting enough college radio airplay across the states to merit quarterly royalty checks in the low two-figures. But when our record label went bankrupt after the release of our second album, we took it as a sign and hung it up.  After four years of trying to "make it" we succumbed to competing pressures:  jobs, relationships, the need to start families and "earn a decent living".   The college degree I secured a few years prior was supposed to just be "something to fall back on".  Eventually "something to fall back on" replaced "something to lean into".    

As a musician - particularly a rock musician - the accepted belief is that your twenties are when you have to get it done. You're vital,  angst-riddled, desperate to be heard, and look good shirtless. It also seems to be the case that most rock songwriters do their best work in their twenties.   That's why it hurt to turn thirty.  Not unlike the glass ceiling, turning thirty to a rock musician is the glass green-room door closing.  I was working a 9 to 5 job, Helen Keller Plaid was two years gone and I was starting over with a new band that I knew didn't have any real magic in it.  It's when I faced the harsh reality that I probably wasn't going to be living in hotel rooms and on tour buses for the rest of my days.  Over the next decade, music took a seat in the way-back. Musical aspirations still smoldered, but commitment and output all but died.  By the time I turned forty, I was in a good place.  I was happily married and my first child was three months old.  And the turning of that decade wasn't tied to any self-imposed musical timeline. Turning forty was painless. 

So now I'm approaching fifty,  twenty years post musical identity crisis, and I wonder if there's any magic left.  The desire to create music has never left, but what I expect to come from it has been tempered by reality. I think there's at least one more album or maybe even a musical in me.  But I am also a data guy now, and the anecdotal data I have acquired over time tells me that fifty year old rock artists don't do very good work. Certainly nothing close to what they achieved in  their twenties. There are obviously exceptions, but I got to thinking about artists' output over time.  Are there examples of artists who not only age gracefully, but who actually get better?  So I grabbed some data.

Evaluating Songwriters' Output by Age

To start to look at this, I pulled a few of the recognized great popular songwriters from the last 50 years and looked at their creative output based on reviews of their studio albums. Because I wanted a measure of their creative ability by age, I looked at original-composition studio albums only - no cover albums and no live albums. I included albums that artists worked on in groups (e.g. Neil Young gets credit for Deja Vu by C,S,N and Y) as well as solo albums. The measurement I used was album ratings from the website RateYourMusic. It's a fan site that lets folks rate albums on a scale of 1 - 5. There are all sorts of potential bias issues in using this system, but it was easy to get. And given that it's an audiophile's site, my sense is that the reviews will conform somewhat to critical thinking, and might even create a more honest evaluation of artist work than if I just grabbed the Rolling Stone reviews over time.

I understand all of the reasons for not comparing artists in this way.  Despite twenty-one Academy Award nominations, Woody Allen never attends the Oscars.  His reason is that art isn't competition -  judging art is so subjective who's to say who or what is best?  After all one man's Poison  is another man's Cream.     Similarly,  Elvis Costello (featured in the viz) is famously credited with saying:  "Writing about music is like dancing about architecture - It's a really stupid thing to want to do."   I agree that using ratings - whether from fans or critics - to judge artistic merit is at best flawed and at worst a fool's exercise.

But I wanted to do it anyway. 

The first three songwriters I looked at were Bob Dylan, David Bowie, and Bruce Springsteen - all in the top twenty in  Paste Magazine's 100 Best Living Songwriters List from 2006.  What I saw from those three gave me hope for what's possible in an artist's creative Second Act.   


No surprise, all realized their creative high-points in their twenties and then fell off in their thirties and forties. But what surprised me was the rise for all three artists in their fifties that continued into their sixties. For Dylan specifically, his output into his seventies has sustained a higher average score than any output since his twenties. Granted, this increase might have more to do with bias on the “RateYourMusic” site where the reviewers are hungry for anything new that’s worth anything (fan idolatry = easy grader), and it might disappear if the measure were a more objective assessment (like aggregated reviews via Metacritic). Bowie's score in his sixties is based on the reviews for one album - "The Next Day" - which was released just a week before this posting. No way to know if those high ratings will hold up over time.

Still it was interesting to see.  So I kept adding songwriters. I stuck mostly to the Paste list, and focused primarily on living songwriters with careers that have spanned at least three decades. I did, however, make a couple of exceptions to this rule.  Neither John Lennon (deceased) nor Joe Henry (personal favorite)  were on the Paste list. I also grabbed a couple of younger songwriters from the list who I was particularly curious about: Beck, Jeff Tweedy, and Jay Farrar. 

With twenty-six songwriters in the mix two facts seem to hold in the aggregate:   The twenties are the high point and the forties are the low point and then there is a gradual trending  up in later decades.  One slight exception is for artists who started recording in their teens. For those early achievers, it's likely that the teens were their worst decade - unless you were Van Morrison, in which case you peaked at twenty.  
     
There are a ton of ways to play with this data and compare, but here are a few interesting callouts.





The Comeback Kid

Brian Wilson's rise from the ashes of  a 1.58 rating for The Beach Boys release "Stars and Stripes" when he was 53 to his acclaimed "Smile" (which garnered a 3.81) when he was 61 was the biggest bounce-back.   Dylan, Joni Mitchell, and Bowie all saw solid returns to form in their 50's from the significant low points that were their 40's as well.















Descend and Level-off

Both Neil Young and David Byrne saw precipitous drops from their twenties to their thirties, then leveled off and were consistent through their fifties. David Byrne's twenties saw the release of the first four ground-breaking Talking Heads records, peaking with the seminal "Remain in Light" when he was 28. He also scored Twyla Tharp's dance project "The Catherine Wheel" and produced "My Life in the Bush of Ghosts" with Brian Eno. Young was even busier in his twenties, releasing twelve albums which included records with CSN&Y, Buffalo Springfield and his top-rated solo album: "After the Gold Rush". Young has been the most prolific member of the collective, being involved in the release of 43 studio albums over 46 years of recording.




The Decliners

Artists that just continued to slide after the heyday of their twenties include Prince, Elvis Costello, Van Morrison, and Pete Townshend.  What is particularly interesting about Townshend is the see-sawing of ratings once he started releasing solo albums in his 30's. The reviews of the Who records were significantly lower than those for his solo records. 










Later Bloomers

Of the twenty-six songwriters examined, I couldn't find an example of someone who consistently improved across decades. This may just owe to fan bias - you're never as good as your first record. But surely there are examples of folks who continue to improve. There were only three whose average rating by decade increased from their twenties to their thirties: Tom Waits, Jeff Tweedy (discussed later), and Joe Henry. Of the three, Henry saw the greatest gain.   His output consistently improved from his twenties to his forties.  His style changed considerably over that time, moving from the alt-country genre of his early records to a more expansive almost jazz-centric style in his more recent offerings.  No coincidence that Waits and Henry are shown  together here. Both are examples of artists who have doggedly explored the boundaries of their creativity and evolved significantly as songwriters across their careers. 





Tweedy vs. Farrar

Being a fan of both songwriters, I was curious to see how their paths compare. The two made up the songwriting duo of Uncle Tupelo, then went their separate ways in 1994.  At the time of the split, it was commonly held that Jay Farrar was the more evolved songwriter.  Both released albums with their new bands in 1995: Farrar released "Trace" with Son Volt and Tweedy released "A.M." with Wilco.   No surprise at the time, Farrar's record was better received.   But at age 28, that record would be Farrar's best showing as a solo artist.  Tweedy, on the other hand,  continued to grow and develop as a songwriter, with each Wilco release improving on the last.   And while Farrar's creative high-water mark was the record  "March 16–20, 1992" which they did together in their twenties as Uncle Tupelo, Tweedy eventually surpassed that level with "Yankee Hotel Foxtrot" at age 34.  Now in their 40's the two seem to be on par  (at least by Rate Your Music's standards). 




So Who's the Best?




That's a ridiculous question to ask, but we can at least quantify some results based on the data we have. The highest mark for an album was Abbey Road at 4.38, which is credited to both Lennon and McCartney. Lennon would have had the highest overall score for any artist if he had steered clear of collaborations with Yoko; however, we are factoring in the "Whole Artist" here. The highest decade of achievement was Lou Reed in his 20's, where he notched an incredible average of 4.18 - most likely owing to the deification of the Velvet Underground by audiophiles everywhere. But Tom Waits is the one artist whose output throughout his career has been consistently good. His average across five decades was 3.83. Unlike his peer group, there has yet to be a significant dip or falling off that most of the other artists with at least four decades worth of releases have experienced. In fact, his lowest rated album was "Foreign Affairs" at 3.39, which he released in his twenties. Since then only one album - the experimental opera "The Black Rider" he made in collaboration with William S. Burroughs - saw an average score of less than 3.65. Now in his sixties, Waits continues to crank out amazing music.

And that gives us soon-to-be fifty year-olds hope that anything is possible. Of course, it helps a little to be a genius like Tom Waits.

~ Kyle Biehle
   March 20, 2013

The Viz

There are two versions of the interactive visualization. For both, I have pre-selected it to display three artists: Dylan, Waits, and Elvis Costello. A large-format, full-screen version of the viz is available here. The small-format version below is split up into two tabs. The first tab lists songwriters across the top, ordered by age. The chart displays each artist's average rating for all of their releases in a particular age decade. The second tab shows all of their albums by year. To access the second tab, you can click on the tab at the top of the viz, or click on an artist picture in the Average by Age chart and be taken to that artist's albums. To clear selections, click on an image twice.
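For anyone who wants to reproduce the "average by age decade" measure outside the viz, here is a minimal sketch in Python. It is an illustration of the idea, not the workbook's actual calculation. The first two rows use ratings and ages quoted in this post; the third row is a made-up placeholder, included only so that one decade has more than one album to average.

```python
from collections import defaultdict

# (album, songwriter's age at release, average fan rating on a 1 - 5 scale)
# Rows 1 and 2 are quoted in the post; row 3 is a HYPOTHETICAL placeholder.
albums = [
    ("Stars and Stripes", 53, 1.58),
    ("Smile", 61, 3.81),
    ("(hypothetical album)", 55, 3.00),
]


def average_by_age_decade(rows):
    """Group albums by the decade of the artist's age and average the ratings."""
    buckets = defaultdict(list)
    for _title, age, rating in rows:
        decade = (age // 10) * 10  # 53 -> 50, 61 -> 60
        buckets[decade].append(rating)
    return {
        decade: round(sum(ratings) / len(ratings), 2)
        for decade, ratings in sorted(buckets.items())
    }


if __name__ == "__main__":
    for decade, avg in average_by_age_decade(albums).items():
        print(f"{decade}s: {avg}")
```

Every album counts equally here regardless of how many fans rated it; weighting by number of ratings would be a different (and arguably fairer) design choice.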





Postscript

Part of the idea for this post came after watching Beck's mesmerizing cover of Bowie's song "Sound and Vision". It got me thinking about the similarities between these two musical shape-shifters and where they each were at different ages in their careers. While I was putting the post together, I shared it with a few friends for feedback. Two friends pointed me to published works that actually tie in rather closely with this idea of when artists create, and both serve as great companion pieces to this one: Malcolm Gladwell's article "Late Bloomers" and Periscopic's terrific interactive visualization "How Old Were They?". The Gladwell article appeared in the New Yorker in 2008 and was also featured in his collection of essays "What the Dog Saw". I actually borrowed the bubble-timeline technique in the "Albums by Artist" view from the Periscopic viz - so thanks for that, Periscopic!

~March 22, 2013

Sunday, August 12, 2012

Olympic Decathlon Gold Medalists

Do decathlon gold medalists share any specific talents? Are the winners of the gold always the fastest sprinters? Or strongest throwers? Are there particular events the champions always excel at?  Or are the champions equally great at all of the events?

To shed some light on the question, I grabbed most of the Olympic gold medal results going back to the fifties, and created the viz below. The table at the top shows the total points (based on 1985 adjusted scoring) for each event for each competitor. The table at the bottom looks at how each event score contributed to the decathlete's overall score, by measuring each score as over/under 10% of the total. The top table can be sorted by any event by clicking on the sort bars in the column-header.
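The over/under measure is simple enough to sketch in a few lines of Python. With ten events, a perfectly balanced decathlete would get exactly 10% of his total from each one, so the measure is just each event's share of the total minus 10%. The event points below are hypothetical numbers for illustration only; they are not taken from the viz or from any real athlete.

```python
# HYPOTHETICAL event points for one decathlete (illustration only).
event_points = {
    "100m": 950, "Long Jump": 930, "Shot Put": 780, "High Jump": 850,
    "400m": 900, "110m Hurdles": 940, "Discus": 760, "Pole Vault": 880,
    "Javelin": 770, "1500m": 700,
}


def over_under(points, baseline=0.10):
    """Return each event's share of the total score minus the 10% baseline."""
    total = sum(points.values())
    return {event: pts / total - baseline for event, pts in points.items()}


if __name__ == "__main__":
    for event, delta in over_under(event_points).items():
        # Positive = the event contributes more than its "fair" tenth.
        print(f"{event:<13} {delta * 100:+.1f} pts vs. 10%")
```

Because the shares always add up to 100%, the over/under values for any athlete sum to zero: strength in one event necessarily shows up as relative weakness somewhere else.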

View a full-screen version here

It looks like most of the winners are pretty good at sprints and long jump and pretty weak at the throws.  One obvious commonality is that none of the decathletes are very good at the 1500M.  The 1500M is the last event of the decathlon.  It's possible that the athletes are universally cooked by the end,  rather than some common shortcoming of physiology.  What's most likely is that by the 10th event, the leaders after the 9th event know the cushion they have over the other competitors to win, and only run the time necessary to win.

The earliest winners - Johnson, Mathias, and Campbell - were consistently bad at high jump and pole vault. But that probably has more to do with changes in technique and equipment over time ( the introduction of the "Fosbury Flop" in 1968 forever altered the high jump landscape).

Looking at the Over/Under table, the most "balanced" athlete is Bruce Jenner, with no event contributing more than 11% or less than 9% to his total score. Ashton Eaton is one of the more lopsided - with runs and jumps making up the largest part of his score, and throws being the weak area for him.



2012 Olympics - Men's 10M Platform Diving Final


The Men’s 10M Platform Diving Final came down to the final dive. Going into the sixth and final round of dives, it was clear who the three medalists would be, but the order was not: less than two-tenths of a point separated the top three divers. After the fifth dive, Tom Daley - the British diving wunderkind from the Beijing Olympics – was in first place. David Boudia from the U.S. and Bo Qui from China were tied for second, just 0.15 point back from Daley. In the final round, Boudia and Qui performed a Back 2 ½ with 2 ½ twists while Daley performed a Reverse 3 ½ tuck. All three divers “hit” their dives - none receiving a score less than 9.0 from any of the seven judges. However, of the seven scores, only three are kept. The top two scores and the bottom two scores are thrown out. The middle three scores are added and multiplied by the degree of difficulty. The "Kept" scores received by the three divers were:

Boudia:  9.5 | 9.5 | 9.5 = 28.5  x 3.6 = 102.6
Qui:  9.5 | 9.5 |  9.0   = 28.0  x 3.6 = 100.8
Daley:  9.5 | 9.0 | 9.0  =  27.5  x 3.3 = 90.75

Boudia’s “Kept” scores totaled 0.5 higher than Qui’s. He had one extra 9.5 to Qui's 9.0. After multiplying by Degree of Difficulty, Boudia bested Qui by less than two points, and it was enough to pass Daley and take the Gold. Although Daley was leading going into the final round, he finished 12 points back of Boudia in third place because his dive had a degree of difficulty that was three tenths of a point less than Qui and Boudia’s dive.
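The scoring rule described above (seven judges, drop the top two and bottom two, sum the middle three, multiply by degree of difficulty) can be sketched in Python as follows. The post only lists the three "Kept" scores per diver, so the full seven-judge panels below are hypothetical: they were chosen so that the middle three match the kept scores and no score is below 9.0.

```python
def dive_score(judge_scores, degree_of_difficulty):
    """Score one dive from a seven-judge panel.

    The two highest and two lowest scores are discarded; the middle
    three are summed and multiplied by the degree of difficulty.
    """
    if len(judge_scores) != 7:
        raise ValueError("expected exactly seven judge scores")
    kept = sorted(judge_scores)[2:5]  # middle three of seven
    return round(sum(kept) * degree_of_difficulty, 2)


# HYPOTHETICAL full panels; only the middle three scores come from the post.
final_round = {
    "Boudia": ([9.0, 9.5, 9.5, 9.5, 9.5, 9.5, 10.0], 3.6),
    "Qui":    ([9.0, 9.0, 9.0, 9.5, 9.5, 9.5, 10.0], 3.6),
    "Daley":  ([9.0, 9.0, 9.0, 9.0, 9.5, 9.5, 9.5], 3.3),
}

if __name__ == "__main__":
    for diver, (scores, dd) in final_round.items():
        print(diver, dive_score(scores, dd))
```

Note how much the degree of difficulty matters: Daley's kept scores are only half a point behind Qui's, but the lower multiplier widens the gap to about ten points on this one dive.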

The competition for medals was actually a four-man race  up until the fourth round.  Chinese diver Yue Lin was in the hunt until he missed his Forward 4 ½ Tuck in the Fourth Round.

The Viz

The viz below provides results by round for the top six divers in the men’s 10M platform. It contains four worksheets, all set up to filter one another. The chart on the left shows the divers by round. The results in the chart can be viewed in two ways: by “Place” or by “Points back from First" using the parameter filter at the top of the chart. When viewing by “points back from first” it’s easy to see the huge gap that forms between the top 3 and the rest of the field at the fourth round.   

The table in the middle provides stats on each diver by round, and is color coded by Round Place. 

The two sheets on the right have consistent sizing and color-encoding but are organized slightly differently. The color is the average judges' score; the length is the total points the divers received on each dive when factoring in Degree of Difficulty (DD). The top chart creates a view of divers by round – it’s easy to see who won which round based on points (length of bar). It's also easy to see how the average score (color) does not always line up with bar length. In the sixth round, the colors for Boudia, Qui, and Daley are almost identical, but the points Daley earned were noticeably less.

The bottom chart lists all of the dives performed by the top six divers and makes it easy to compare who performed best on each dive. You can also see that there is not a great deal of variety in the dives that these elite six are performing. All six did the same back, reverse, inward, and twisting dives.

A full-screen view of the viz is available here

Daley vs. Qui

Qui and Daley performed the exact same six dives. Like prize-fighters trading rounds, Daley and Qui went back and forth. Daley won the Back 3 ½ and Forward 4 ½, Qui won the Inward 3 ½, and they were even on their armstand and reverse dives. But where Qui won it was on the Back 2 ½ with 2 ½ twists. On that dive, Qui averaged a 9.3 and Daley an 8.5, creating a nine-point difference, which was the margin between silver and bronze.



Saturday, July 28, 2012

International Swimming Finals of the Past 13 Years

Here is a companion visualization I put together to go along with my post from last year which looked at the impact of swimsuit technology on world records. This is a pretty handy reference for looking up swimmers who are competing in London this week.

The viz contains all of the medal winners from all of the international competitions - Olympics and World Championships - from the Sydney Olympics in 2000 to the London Games in 2012. There are three components to the viz: a ranking of swimmers by medal count, a table with all of the results and times, and a "hidden" view that plots historical times across meets for a swimmer. The "Dreaded Pies" are sized by medal count. The pie-ranking and the table are ordered by medal count. No surprise - Michael Phelps and Ryan Lochte top the list. Select a swimmer in the pie view or the table and their historical times will appear.

A full-screen version of the viz is available here.

The viz can be filtered a number of different ways including stroke, event, gender, and meet. You can also do wildcard searches for individual swimmers. The filters can be used individually or in combination to see who was the most decorated swimmer by different criteria. For example, to find the most decorated American female freestyler, filtering on Stroke="Freestyle", Gender="Women", and Nationality="United States" reveals that Kate Ziegler has collected the most iron since Sydney.



Sunday, July 22, 2012

The 2012 Tour de France General Classification History by Stage


On Sunday, July 22, Bradley Wiggins became the first Brit to win the Tour de France.   He did it with an amazing supporting cast, including Chris Froome who finished second overall - only three minutes behind Wiggins after more than three weeks of racing. Some think that Froome was strong enough to beat his captain, but he stuck to the script and worked for Wiggins, and Team Sky placed two riders on the podium in Paris.

Sky wasn't the only team with some leader drama at the tour. Cadel Evans, the defending champion on BMC, also had a supporting cast member nipping at his heels. American Tejay Van Garderen placed fifth overall in the GC -  two places ahead of Evans - and took the "Best Young Rider" award.   Like Froome, Van Garderen stuck to the script and supported Evans throughout most of the tour.  However, on Stage 16 in the Pyrenees, when Evans was fighting a stomach bug, Van Garderen picked up three minutes on his captain and leap-frogged him in the GC. 


The Viz

This viz plots all of the riders’ positions in the General Classification across the first 19 Stages of the race.   The riders are color-coded by team. The line chart shows the progress and position of each rider across all the stages.  Position is plotted on the vertical axis and Stage # is on the horizontal. Reference lines have been added to show when “shake-up” stages occurred – either mountain-top finishes or individual time trials.

 There are two lists which can be used to filter the line chart. The "Rider" list on the left (which is stack-ranked by the rider's place after Stage 19) will pull up that rider’s whole team with the selected rider highlighted.  Use the "Ctrl" key to highlight multiple riders at once. The "Team" list on the right will pull up all of the riders from a team.  Those riders that did not finish (the "DNFs") are listed at the bottom of the rider list grouped by the last stage they completed before dropping out.  There is also a "wildcard" search field above the rider list to search for riders by name.


The default line chart view shows riders by their position on the GC. You can also show riders by "Time from First". This ranks the riders by their time gap to the leader, expressed in fractional hours. The y-axis is set to a "logarithmic" view so that there is more separation of riders at the top of the GC. Exact info (rider position and time from first) is available in the mouse-over tooltips. Additionally, you can filter specific stages by using the slider. The Stage labeled "0" is the Prologue.
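To make "fractional hourly time gaps" concrete, here is a small Python sketch of how a gap to the leader can be converted to hours and where it would land on a base-10 log axis. The cumulative times are hypothetical; only the roughly three-minute gap mirrors the Froome example mentioned earlier. How the viz itself places the leader (whose gap is zero, which has no logarithm) is not described in this post, so the sketch simply returns None in that case.

```python
import math


def hours_from_first(rider_seconds, leader_seconds):
    """Gap to the race leader, expressed in fractional hours."""
    return (rider_seconds - leader_seconds) / 3600.0


def log_axis_position(gap_hours):
    """Position on a base-10 log axis; None for the leader (gap of zero)."""
    if gap_hours <= 0:
        return None  # log10(0) is undefined, so the leader needs special handling
    return math.log10(gap_hours)


if __name__ == "__main__":
    leader = 300_000           # HYPOTHETICAL cumulative time in seconds
    rider = leader + 3 * 60    # three minutes behind the leader
    gap = hours_from_first(rider, leader)
    print(f"gap: {gap:.3f} h, log10: {log_axis_position(gap):.2f}")
```

On a log scale, the step from a 3-minute gap to a 30-minute gap takes up the same vertical space as the step from 30 minutes to 5 hours, which is why the top of the GC spreads out.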


Thursday, May 31, 2012

Six Decades of the Billboard Hot 100 Singles Chart




Following up on a previous post, I'm re-purposing Billboard Chart Data I used to create the Billboard Wayback Machine.

This viz looks at the 28,000+ songs that appeared on The Billboard Hot 100 chart between 1950 and 2012. The viz contains two charts: a bar chart and a scatterplot. The bar chart stack ranks artists, and can be viewed in two ways: by total tracks the artist placed in the Hot 100 and by the number of weeks the artist's tracks were on the chart. Artist tracks are grouped and color-coded by ranges of the track's peak position on the chart: #1's, Top 10's, Top 40's, 41-100, and Year #1, which is the top song of the year based on chart success. When an artist (or any mark) on the bar chart is selected, the scatterplot will populate with the tracks associated with the selection. The color coding of tracks by chart range is consistent across the two sheets. Each track's vertical position is determined by the song's annual ranking for all songs in the year (determined by a combination of peak position and weeks on the chart). The horizontal position indicates the year the song was on the chart. The viz is pre-filtered to show Madonna's tracks. The scatterplot resets when another artist is selected from the bar chart.

The filters can be used to refine the view by year, artist, genre, track and chart range. So if you wanted to see which artist had the most Top 10 hits in the 1980's, set years = 1980 - 1989 and Chart Ranges = #1 and 2-10.

Full Screen version is available here.

What is most surprising is how drastically the chart changes when the ranking measure is changed. When "Total Weeks on Chart" is selected, the list doesn't reveal a lot of surprises: Elvis is number one, then Elton John, then Madonna. When the measure is changed to "Hot 100 Tracks", which is the number of tracks the artist has placed in the Hot 100, there is a surprise. The top artist by this measure is The Glee Cast. In less than three years, the Glee Cast has placed over 200 tracks on the Billboard chart. However, only three of those tracks have cracked the top ten. This owes to the nature of how tracks chart now - iTunes downloads are a huge driver. However, without much radio airplay, the Glee tracks are seldom on the chart more than one or two weeks.




Wednesday, April 25, 2012

Does Anyone Care How Hard I Ride?

I have been collecting GPS data on my bike rides for years and have made several attempts over that time to visualize it.   A recent post on Tableau Public authored by Tor Stahl spurred me to dust off a viz I put together about a year ago and update it with my most recent rides.

One of the reasons I've never published my previous bike vizzes is that I really couldn't imagine who besides myself would care how often, hard, high, or far I've ridden my bike.  I built these visualizations to gain more insight into my own riding so that I could  see change  - improvement and backsliding - over time. I doubt anyone cares about my data, but it's possible that others might want to grab the workbook and use it to viz their own bike ride data.




This viz contains all of my rides from the first four months of 2012. All of the rides are in Marin County and most of them were on a geared mountain bike. The color on the dashboard is heart rate, grouped into the 6 heart rate zones. For the bar chart, map, and elevation profile the color is the measured heart rate. In the "Rides by Week" view the color represents average heart rate for the entire ride. In that same view, the size of each mark indicates elevation. I have a few routes that I ride regularly - Camp Tamarancho (Tamo) being my most regular ride. In order to compare performance on a route, I have grouped the "Ride Times" chart by route. The redder the bar, the harder I went - the shorter the bar, the faster I rode. As I move into summer and fall, the time per ride should shrink (bars get shorter), then expand again during the winter as I move into rainy season hibernation. Selecting one ride or group of rides in the "Ride Times" or "Rides by Week" view will filter the map and summary table to show just those rides and will pull up the elevation profile.

A full-screen version of the dashboard is available here

Sunday, October 16, 2011

The Billboard Wayback Machine

When I was in grade school, the top 40 was a huge part of our family fabric. We were in our car a lot back then, and the AM radio was always on. I started buying singles when I was nine years old, and I’m still proud of the fact that my first purchase was “Right Place Wrong Time” by Dr. John. I’m not so quick to share my first album purchase with friends, however. “Right Place Wrong Time” was on the charts the summer of 1973. That was the summer my family moved from Ohio to Illinois. So many memories and feelings were tied up in those few months of starting over - new house, school, neighborhood, friends. Everything was different, except for the songs. The songs were the same. When we made it to Illinois, the new AM radio station – KXOK in St. Louis – was playing all the same songs as the station we left behind in Ohio. One of the first things my two brothers and I did when we arrived at our new home was get our Dad to take us to the Base Exchange (we lived on Scott Air Force Base) and buy the 45’s of our favorite songs. I got Dr. John and my brother Jeff picked up “The Cisco Kid” by War. I think my little brother Brian grabbed "Hocus Pocus" by Focus. We added to our collection over the summer and played those records to death. We'd listen to them all day and go to sleep to them at night, stacking them up ten high on our record player to create our thirty minute evening playlist. When one song finished, the arm would swing back and trigger the next 45 to drop ~Kathunk ~ and the next song would start. The songs from that summer are still magic to me.

A few years ago (years before the appearance of this game) I developed a trivia board game called “HearShot”. It was an audio trivia game, where a sound bite from a popular movie, song, or TV show was played and the players had to answer a "who, what or when" question which mapped to artist, title, or year. For example, the "who, what, when" for this movie quote: “This was No boating accident!” would be: Richard Dreyfuss, “Jaws”, and 1975 respectively. A common complaint first-time players would make before attempting the game was how impossible it was to have to guess the year something came out. Most people were surprised to discover that they were actually quite good at it. If you have a point of reference in your personal history, it’s not hard to conjure when something was popular. And songs especially can trigger memories. The quick hit one can get from a song can be so powerful that it can transport you back in time - like an aural version of Mr. Peabody’s Wayback Machine. To this day, when I hear “Alone Again (Naturally)”, I smell chlorine. The song was huge during the Summer of ’72, and I spent every day of that summer at Rona Hills public pool where they played the radio over the pool loudspeakers.

Hearing a song is a common memory trigger for people, but what about seeing songs? Just reading the titles and the artist names? Would that provide the same impact? And if you could group titles and artists together by a range of dates that had some personal significance, how powerful might that be? Billboard has a tool that allows you to browse the Top Ten songs from any given chart week, but I want to see more than one week, deeper than the Top Ten, and more than ten songs at a time.

This visualization contains the 10,000+ songs that were in the Billboard Top 40 between January 1964 (when the Beatles' "I Want to Hold Your Hand" entered the chart) and March 2011. The view is pre-filtered to display the 98 songs that charted in the summer of 1973. Among them were "The Cisco Kid", "Right Place, Wrong Time" and "Hocus Pocus". It was a jolt to see those songs and others that I had forgotten about, but remembered instantly upon seeing the titles. Just seeing them brought the melodies and lyrics back, and for a brief moment I could remember what it felt like to be nine years old.


The viz allows the user to select a specific range of dates, either by using the sliders or by entering them manually, and see the songs from that period ordered by popularity in the period selected. The numbered ranking is based on performance in the selected date range: chart positions and weeks on the chart. Every song that made it into the Billboard Top 40 in that time period will be displayed. The color coding indicates a track's peak position overall.
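For readers curious how a date-range ranking like this could work, here is a rough Python sketch. The exact formula the viz uses is not spelled out in this post, so the point scheme below (each week in the Top 40 earns 41 minus the chart position) is purely an assumption, picked because it rewards both high positions and long chart runs. The chart rows and song names are hypothetical as well.

```python
from datetime import date

# HYPOTHETICAL weekly chart rows: (chart week, song, position).
weekly_chart = [
    (date(1973, 6, 2), "Song A", 5),
    (date(1973, 6, 9), "Song A", 3),
    (date(1973, 6, 9), "Song B", 1),
    (date(1973, 9, 1), "Song B", 2),
]


def rank_in_range(rows, start, end):
    """Rank songs by points earned inside [start, end].

    ASSUMED scoring: every week in the Top 40 is worth (41 - position),
    so a week at #1 earns 40 points and a week at #40 earns 1.
    """
    points = {}
    for week, title, position in rows:
        if start <= week <= end and 1 <= position <= 40:
            points[title] = points.get(title, 0) + (41 - position)
    return sorted(points, key=lambda title: points[title], reverse=True)


if __name__ == "__main__":
    print(rank_in_range(weekly_chart, date(1973, 6, 1), date(1973, 8, 31)))
    print(rank_in_range(weekly_chart, date(1973, 6, 1), date(1973, 9, 30)))
```

Widening the window changes the order in this toy data, since "Song B" picks up an extra week. The same thing happens in the viz: a song's rank depends on the dates you select, not just on its overall peak.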

Clicking on one of the songs on the list will display a bar chart for the artist, showing all of their Top 40 hits in chronological order, color coded by song peak. Each song makes up a segment of the artist bar - the length of the segment denotes weeks in the Top 40. So the length of the bar is the total weeks the artist was in the Top 40.

There is also a line chart which displays each individual song's progression on the chart from entry point, to peak, to exit: one point for each week in the Top 40. Ctrl-click on the chart list and you can see multiple songs at once. Highlight an artist and you can see all of their songs, color coded by peak chart range, and the spread of their chart paths over time.

The viz can also be filtered by artist or track, allowing the user to select a single artist or song or several at once and compare their chart histories.

The Source of the Billboard Chart Data

The data for the visualization came from the Whitburn Project dataset which I discovered through Infochimps. The dataset was not made available by Infochimps specifically, but the dataset description referenced Andy Baio’s blog Waxy.org where he describes the dataset in detail, and does some analysis on the tracks. The dataset is a labor of love of a group of pop-music enthusiasts who have come together to catalog and preserve America’s pop music history.

From Baio:


“Named after Joel Whitburn and his authoritative Billboard books, the Whitburn Project began in 1998, when a group of 15 collectors pooled their resources to create an MP3 collection of every single in the top 40. The Excel spreadsheets were created to help them verify their collections were complete, with new versions updated and re-uploaded to the newsgroups weekly.”

“For the last ten years, obsessive record collectors in Usenet have been working on the Whitburn Project - a huge undertaking to preserve and share high-quality recordings of every popular song since the 1890s. To assist their efforts, they've created a spreadsheet of 37,000 songs and 112 columns of raw data, including each song's duration, beats-per-minute, songwriters, label, and week-by-week chart

There is some question as to the legality of sharing this data, but I'm hopeful that my re-appropriation of the data here falls under the terms of fair use.

Friday, July 29, 2011

The End of the Swimsuit Wars

(Is it the end of world records, too?)

In 2007, Speedo introduced the first of the "High Tech" full-length body suits - the Fastskin Pro - in time for the 2007 World Aquatics Championships. The suit compressed the swimmer’s body to give it a more streamlined shape in the water, and contained polyurethane panels which significantly reduced drag. In 2008, Speedo improved upon the FS Pro design and came out with the LZR Racer in time for the Beijing Olympics. 25 world records were broken during the Olympics - 23 of them by swimmers wearing the LZR suit.
In the wake of Speedo's success, other manufacturers jumped into the high-tech suit game, hoping to rival Speedo's results. Manufacturers Jaked and Arena came out with suits made entirely of polyurethane. These body suits reduced drag, compressed the body, and also increased the swimmer's buoyancy. Some swimmers started wearing two suits to take advantage of the added buoyancy.

At the 2009 World Aquatics Championships in Rome, 43 world records were broken, but none more spectacularly than the Men's 200m Freestyle mark. In that race Michael Phelps, winner of eight gold medals at the Beijing Olympics, took second to an unheralded German swimmer named Paul Biedermann. Biedermann clocked an astounding 1:42.00, beating Phelps and taking nearly a second off the world record Phelps set in Beijing. Biedermann also shattered the world record in the 400m freestyle at the meet. Biedermann was wearing the Arena X-glide suit, and he admitted that the suit made him faster - perhaps as much as two seconds faster. Biedermann also stated that until FINA - the governing body of international aquatic competition - disallowed the X-glide suit, he was going to wear it. Phelps' coach was outraged by the result and threatened to pull Phelps from international competition until something was done about the high tech suit issue. I’m not sure how upset Phelps' coach was when the records were going to those racers in the Speedo LZR, but it was clear that technology, and swimmers’ access to it, was having a significant impact on the results and the history books. In the 200m Freestyle final in 2009, Phelps was in the fastest suit made by his sponsor Speedo, while Biedermann was in the fastest suit available.

Following the 2009 World Championships, FINA banned High Tech suits from competition - body-length suits were no longer allowed and suit fabric had to be made of textiles (woven fabric).

Turning Back the Clock
The 2011 World Aquatics Championships took place from July 24 - July 31 in Shanghai. It was the first major international meet since the FINA rule change took effect at the end of 2009. Many assumed times would be slower as a result of the suit ban. I was curious to see "visually" what effect the ban would have on times, so I created the interactive viz below. As expected, the times were slower, some returning to the levels of 2005, before the tech suits hit the market. While 43 world records were set in Rome in 2009, only two world records were set in the 2011 World Championships in Shanghai: Ryan Lochte in the 200m individual medley and Sun Yang from China in the 1500m Freestyle. Both were set in the finals. These are the only two world records that have been set since FINA banned the suits 19 months ago. The impact of the suit technology on winning times is undeniable. World records will be incredibly hard to come by for years to come.
And What of the Biedermann/Phelps Rivalry?
Following his success in Rome, Biedermann said: "The suits make a difference. I hope there will be a time when I can beat Michael Phelps without these suits. I hope next year. I hope it's really soon." Biedermann got his chance on Tuesday, July 26, 2011. Both he and Phelps made it to the Men's 200m final in Shanghai - and so did Ryan Lochte. In the Phelps/Biedermann rematch, Phelps beat Biedermann by .09 seconds, but Ryan Lochte beat them both with a time of 1:44.04 to take the gold. As Biedermann predicted, it was a full two seconds slower than his world record time in Rome.
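The "two seconds slower" figure is easy to check. Here is a small Python sketch that converts the two winning times quoted above into seconds and takes the difference; the helper name `to_seconds` is just something I made up for the example.

```python
def to_seconds(clock):
    """Convert a swim time written as 'm:ss.hh' into seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + float(seconds)


rome_2009 = to_seconds("1:42.00")      # world record time from Rome
shanghai_2011 = to_seconds("1:44.04")  # Lochte's winning time in Shanghai

# Round to hundredths, the precision the times are reported in.
gap = round(shanghai_2011 - rome_2009, 2)
print(gap)
```

The gap works out to 2.04 seconds, right in line with the two seconds the suit was said to be worth.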

The Viz
This viz looks at the medal finishes in individual events (no relays) from the eight major international swimming competitions over the past twelve years. These meets include three Olympic games and five World Aquatics Championships.

The visualization shows the shape of the medal-winning times over the past 12 years. The lines follow the medal times - not a particular swimmer. The “hockey stick” shape of the winning times is obvious - almost every event achieved its fastest (lowest) time at the World Championships in 2009 and then took a left turn into slower territory in 2011. The viz allows you to view multiple events at once (side-by-side) or compare events by gender (stacked). The Y-axis height is determined by the historical time gaps (in seconds) for each event. The viz is pre-filtered to show the 200m freestyle for both men and women, but can be filtered to show any event or stroke. Mouse-over tooltips provide specifics on each medalist (name, nationality, time, records set).



Friday, October 8, 2010

Austin City Limits Popularity Contest


The Austin City Limits Music Festival is taking place this weekend. Browsing the list of performers, a few questions came to mind. First, what were The Eagles doing there? And second, why was Richard Thompson so far down the list? How is the billing order established when there are 125 bands to sort? As it turns out, Richard Thompson probably got better than he deserved.

As is the case with all mega-festivals and Hollywood blockbusters, the billing order is determined by the perceived "StarPower" of the artist. This ranking may also be influenced by the demands of the artists, unless you're Spinal Tap, in which case your billing will forever follow the "Puppet Show".

A nice feature for the ACL Festival attendees is the ability to create custom online schedules, filling in the bands they plan to see. The number of fans who are planning to see each artist is visible on the line-up page when you mouse-over the artist's name. So, with a little elfin magic, we can see how the billing order established by the festival organizers matches up with an artist's actual popularity, as voted on by the 40,000 plus attendees (or wannabe attendees) who have taken the time to fill out a custom schedule.

This viz plots the ACL organizers' rank along the X-axis and the Fan rank along the Y-Axis. If the ACL rank and the Fan rank are in agreement, the artist's name should fall on the superimposed trend line. If the artist is more popular than ACL figured, they will be to the left of the line. If they don't have quite the draw that the artist's agent had hoped for, then they will be to the right of the line. The color coding follows the text sizing on the line-up page where Blue are the big names, Orange are the mid tiers, and Grays are everybody else.
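The comparison behind the plot boils down to one subtraction per artist. Below is a rough Python sketch using only the three rank pairs that are quoted elsewhere in this post; the `billing_gap` helper and the sign convention (positive means the fans rank the artist higher than the organizers did) are my own choices for illustration.

```python
# (artist, ACL billing rank, fan rank) as quoted in this post.
artists = [
    ("The Black Keys", 18, 1),
    ("Richard Thompson", 45, 90),
    ("The Verve Pipe", 118, 36),
]


def billing_gap(billed_rank, fan_rank):
    """Positive: fans rank the artist higher than the billing does.
    Negative: the billing was more generous than the fans."""
    return billed_rank - fan_rank


gaps = {name: billing_gap(billed, fan) for name, billed, fan in artists}

# Artists the fans rate above their billing, biggest gap first.
underbilled = sorted((n for n in gaps if gaps[n] > 0), key=lambda n: -gaps[n])

for name, gap in gaps.items():
    print(f"{name}: {gap:+d}")
```

An artist sitting exactly on the trend line would have a gap of zero; the bigger the absolute value, the further from the line they land.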




For the most part, the organizers got it right. You can see how the tiers of artists are grouped together: blue on top, orange in the middle and gray at the bottom. When you see the colors overlapping (e.g. When gray is above orange), this is where the billing doesn't match the fan votes.

The most popular artist based on schedule adds is the one closest to the top, which turns out to be the Black Keys. Despite their billing of 18th (following top/bottom, left/right ordering), 53% of the attendees are planning on seeing them. Unfortunately, Richard Thompson is pushing left. ACL have him 45th - the fans have him at 90.

The biggest outlier is The Verve Pipe, who were billed 118th out of 125, but ranked 36th by the fans with 6,120 schedule adds. You may not remember The Verve Pipe's hit from 1996, "The Freshmen", but clearly 15% of the ACL attendees remember it (or at least recognize the name). Or maybe half of those people are getting the Verve Pipe mixed up with The Verve. I know I always did.

So with the added benefit of double name recognition/confusion, why are the Verve Pipe so far down the bill? Well, it turns out they've just recently re-formed and are now playing children's music. They are actually performing at the Festival-within-a-Festival: Austin Kiddie Limits. So if the 6,000 fans who are planning on seeing them are disappointed, or if all of the little tiny chairs are taken by the time they arrive, you can't blame the Pipe: they "Won't be held responsible".

And going forward, if you get the two bands confused, you can keep them straight by remembering that the Verve PIPE play children's music, and the Verve were sued by the Rolling Stones.

~kb

Wednesday, June 30, 2010

The NBA Pantheon - Has Kobe Cracked the Top 12 Yet?


In Bill Simmons' "The Book of Basketball", he undertakes the formidable task of identifying the top 96 professional basketball players of all time and ranking them in order of greatness. He brings an incredible breadth of knowledge to the task, and makes his case for each player based on stats, achievements, interviews, and opinion.

Simmons is first and foremost a basketball fan. He grew up going to Celtics games with his Dad in the 70's and 80's. As such, he's not an entirely impartial judge of greatness, and cops to a few things right out of the gate:

1) He's an unabashed Celtics fan
2) He believes greatness is measured more by team success (exemplified by Bill Russell) than individual stats (exemplified by Wilt Chamberlain)
3) He can't stand Kobe Bryant

So we know going in that Simmons will base his rankings more on rings than scoring titles. And we can assume Larry Bird will be deified while Bryant's achievements will be discounted.

What are the measures of Greatness?

To help make his case, Simmons includes all of the players' career stats and accomplishments. It's an exhaustive, season-by-season list. In order to assist in an objective assessment of greatness, I have taken some of these individual achievements and assigned a point score, weighting each achievement based on relevance to greatness. My point system is a total swag and obviously subjective, but it can be applied objectively to all players. The achievements ranked in order by award amount are:



The achievements with an "*" are long-standing awards which came into existence before 1960 and can be used to compare all players across the last 50+ years. The NBA Finals MVP was introduced in 1969, so it can't be used as an objective measure across time. I'm giving an NBA Championship the most points because all of the greats (with the exception of perhaps Wilt Chamberlain) concur that rings matter most. So a Championship should count more than an NBA MVP. "AS_MVP" is All-Star MVP. The "Seasons Played" marker ("+") does not have any points associated with it, but provides a way to monitor the seasons where a player was active.
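To make the mechanics of the point system concrete, here is a minimal Python sketch of a weighted, running achievement score. The point values below are placeholders I invented for this example (they are not the actual weights used in the viz), and the three-season career is made up; the only things carried over from the description above are that a Championship is worth the most and that Seasons Played carries no points.

```python
from itertools import accumulate

# HYPOTHETICAL weights, for illustration only.
POINTS = {
    "CHAMPIONSHIP": 10,   # worth the most, per the reasoning above
    "NBA_MVP": 8,
    "FINALS_MVP": 6,
    "AS_MVP": 2,          # All-Star MVP
    "SEASON_PLAYED": 0,   # activity marker only, no points
}


def cumulative_scores(seasons):
    """Return the running achievement total after each season.

    seasons is a list with one entry per season, each entry being the list
    of achievements earned that year.
    """
    per_season = [sum(POINTS[a] for a in achievements) for achievements in seasons]
    return list(accumulate(per_season))


# A made-up three-season career.
career = [
    ["SEASON_PLAYED"],
    ["SEASON_PLAYED", "NBA_MVP"],
    ["SEASON_PLAYED", "CHAMPIONSHIP", "FINALS_MVP"],
]
running_total = cumulative_scores(career)
print(running_total)
```

Keeping the running total per season, rather than a single career number, is what makes it possible to compare players at the same point in their careers.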

Why not include other stats like total points, rebounds, assists, etc. in the measure? Because each player contributes differently and fills up the stat sheet differently. The weighting of different stats (rebounds, points, assists, steals) is hard to normalize. Moses Malone was a rebounding machine, Jordan was a scoring machine, Magic was a playmaker. Ultimately, how a player's unique talents contribute to the success of their team is what should matter most, and that success can be objectively assessed based on achievement.

So Who Does Simmons Think is the Greatest?

Simmons' Pyramid consists of five levels. The top level is referred to as "The Pantheon" and consists of the twelve greatest basketball players of all time. Those players (as of 2008 when Simmons wrote the book) were:

1) Michael Jordan
2) Bill Russell
3) Kareem Abdul-Jabbar
4) Magic Johnson
5) Larry Bird
6) Wilt Chamberlain
7) Tim Duncan
8) Jerry West
9) Oscar Robertson
10) Hakeem Olajuwon
11) Shaquille O'Neal
12) Moses Malone

Kobe Bryant was ranked 15th.

Because The Pantheon is based on achievement to date, it will never be a static assemblage of players. New players will always be knocking on the door. Kobe Bryant has been knocking for a while, but is he in yet? And, more importantly, has he passed Bird yet?

Has Kobe passed Larry yet?

When Simmons wrote the book, Bryant and Bird each had 3 NBA Titles to their names. Bird also had 3 League MVPs and 2 Finals MVPs. Bryant had no Finals MVPs (Shaquille O'Neal was the Alpha Dog on the three Championship teams) and one league MVP. Looking at both Bird's and Kobe's careers over their first 14 seasons, we can see that Bird's placement over Bryant was accurate after 12 Seasons. Bird had clearly achieved more than Bryant up to that point.

Bryant and Bird


The Lakers won the title in Kobe's 13th season and Kobe was the Finals MVP. The Lakers won another championship in Kobe's 14th season (2009-10), giving Kobe five titles to his name and a second Finals MVP award. While Larry Bird was an early achiever, plateauing after 9 years, Kobe got off to a slow start but has continued to climb every year since his 4th Season. After 13 Seasons, Kobe was dead even with Bird, and passed him after 14 Seasons.

It would appear that Kobe is now in the Pantheon


How do you account for different career Lengths?

Longevity plays a huge role in a player's achievement score. While Bird's career was cut short by injuries, Kareem Abdul-Jabbar played for 20 seasons, won a championship in his nineteenth season, and Finals MVP awards in both his second season (1971) and his sixteenth (1985). Kareem kept going and kept achieving at a high level. Most players are like Bird: they peak and then level off.

Bill Walton (Simmons rank: 27) is perhaps the best example of a player whose potential was cut short by injury. He was healthy through 4 seasons. In that time he had racked up a championship, a Finals MVP, and an NBA MVP. His achievements through 4 years surpassed those of both Larry Bird and Michael Jordan, and had him fifth all time behind Magic Johnson at that point in their careers.


Some players, like Walton, burn brightly for short periods then quickly fade. At Walton's peak you could make an argument for him being in the Pantheon. Hakeem the Dream had a transcendent season in 93-94 where he won the NBA MVP, Defensive Player of the Year, a championship, and the Finals MVP. No other player accomplished all of that in one season. How do you account for that kind of single-season dominance in the all-time measure? Simmons accounted for it by placing Olajuwon in the Pantheon, when his total career stats don't quite afford him that high of a rank.

Whose Place in the Pantheon Did Kobe Take?

Kobe's clearly in the Pantheon now. But how do you compare someone who's still adding to their achievements to those whose careers are complete?

Most players have reached their peak by their 14th season and then plateau. The 97-98 Season was the 14th Season after Jordan entered the league (but only his 12th complete season as a result of the baseball sabbatical). In the 97-98 Season, Jordan won his 5th NBA MVP, the Bulls defeated the Utah Jazz for Jordan's 6th title and he also won his 6th Finals MVP award. Jordan retired (for the second time) after hitting the championship-clinching shot in the final seconds of game six. Magic, Bird and Russell were effectively done after 13 Seasons (Magic briefly attempted a comeback years later). Chamberlain and West retired after 14. Kareem is the one player in Simmons' Pantheon who was still going strong after 14 seasons - until Kobe. Assessing all of the Pantheon players at the 14 year mark is one way to see where Kobe currently stands.

The attached viz has pre-selected Simmons' Pantheon of 12 plus Kobe. It is pre-filtered to only show the first 14 seasons of each player and only include the achievements that existed prior to 1960. The viz is interactive and can be filtered on player, number of seasons, and achievements. ALL of the awards, seasons, and 96 Players in the Pyramid are available for review.
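The two filters described above (first N seasons, and only the long-standing awards) can be sketched in a few lines of Python. Everything here is illustrative: the weights are invented, "Player X" and "Player Y" are made-up careers, and the `allowed` set is my assumption about which awards count as pre-1960, with Finals MVP left out because it was introduced in 1969.

```python
# HYPOTHETICAL weights and careers, for illustration only.
WEIGHTS = {"CHAMPIONSHIP": 10, "NBA_MVP": 8, "FINALS_MVP": 6, "AS_MVP": 2}
PRE_1960 = {"CHAMPIONSHIP", "NBA_MVP", "AS_MVP"}  # assumed pre-1960 awards


def rank_players(careers, weights, max_seasons, allowed):
    """Score each player on allowed awards in their first max_seasons seasons,
    then sort from highest total to lowest (ties broken by name)."""
    totals = {}
    for player, seasons in careers.items():
        totals[player] = sum(
            weights[award]
            for season in seasons[:max_seasons]
            for award in season
            if award in allowed
        )
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


careers = {
    "Player X": [["CHAMPIONSHIP"], ["CHAMPIONSHIP", "FINALS_MVP"], ["NBA_MVP"]],
    "Player Y": [["NBA_MVP", "AS_MVP"], ["NBA_MVP"], ["CHAMPIONSHIP"]],
}

# Compare both players through their first two seasons only.
standings = rank_players(careers, WEIGHTS, max_seasons=2, allowed=PRE_1960)
print(standings)
```

Changing `max_seasons` or swapping in the full award set is the code equivalent of moving the filters in the viz, and it can reorder the list just as dramatically.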


Both the player color legend and the Achievements by Year chart are sorted in order. In addition, I'm displaying Simmons' rank next to each player's name so you can see where he put them.

Through 14 seasons, Russell is first, Jordan is second and Bryant is in seventh position, just behind Jerry West and just in front of Larry Bird. The player whom Bryant displaced in the Pantheon was Hakeem Olajuwon.

If you were to include ALL of the players in Simmons' Pyramid using these criteria, 3 other Celtics who were on the championship teams of the 60's jump into the Pantheon: Cousy, Havlicek, and Sam Jones, with Cousy (a Simmons Level 4 player) jumping from 21st to fifth position all-time. Shaquille O'Neal, Oscar Robertson, and Moses Malone are the Pantheon Players who are displaced (along with Hakeem).

So Who's in The All Time Pantheon now?

If you filter the viz to include ALL of the achievement scores (pre and post 1960), ALL of the Seasons, and ALL 96 Players, Simmons' top 4 are the same players but in a different order. Kobe is firmly in the middle and still climbing - looking down at Bird. . . But he's still looking up at Jordan.

1) Jordan (1)
2) Jabbar (3)
3) Russell (2) * No Finals MVP
4) Johnson (4)
5) Duncan (7)
6) Bryant (15)
7) O'Neal (11)
8) Chamberlain (6)
9) Havlicek (13)
10) West (8)
11) Bird (5)
12) Cousy (21) *No Finals MVP

