Song Repetition Visualised

Repetition pattern of lyrics in a Beatles song. [OC] from r/dataisbeautiful

I decided to recreate an interesting visual for song repetition by Colin Morris. The visual can be seen in this Vox video, and at his site: SongSim.

How to read the graph

The y-axis shows the lyrics in the order they occur, with the first lyric at the top. The same lyrics run along the x-axis in the same order, but are not labelled. For any word on the y-axis, you can look across its row to see every occurrence of that word in the song.

The diagonal appears because both axes show the words in the same order, so every word matches itself.

Method of creation

The code for this visual can be found here.

The lyrics were sourced from Kaggle, and have been word-tokenized.

Then I iterated over the lyrics in two nested loops, comparing word_a to word_b. Whenever the two words were equal, I took the indexes of both words and used them as the coordinates of a point in a scatterplot.
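The nested-loop comparison described above can be sketched roughly as follows. This is a minimal illustration rather than the original code: the names repetition_points and plot_repetition are made up for this example, the lyrics are assumed to be already tokenized into a list of words, and the marker settings are arbitrary choices.

```python
def repetition_points(words):
    """Return (xs, ys): index pairs where the word at x equals the word at y."""
    xs, ys = [], []
    for i, word_a in enumerate(words):        # position on the y-axis
        for j, word_b in enumerate(words):    # position on the x-axis
            if word_a == word_b:
                xs.append(j)
                ys.append(i)
    return xs, ys


def plot_repetition(words):
    """Scatter the matching pairs (hypothetical helper; needs matplotlib)."""
    import matplotlib.pyplot as plt

    xs, ys = repetition_points(words)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys, marker="s", s=4)
    ax.invert_yaxis()  # first lyric at the top
    return fig, ax
```

For example, repetition_points("love love me do".split()) yields six points: the four on the diagonal, plus the two where the repeated "love" matches its other occurrence. The double loop does n² comparisons, which is fine for a single song.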

Alternate visuals

After a comment suggesting that the y-axis be limited to unique words only, in the order they first occur, I made the following visualisation for First Blood by Kavinsky.
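One way to read that suggestion is to give each distinct word a single row, numbered by the order in which it first appears, and to plot every position in the song against its word's row. The sketch below follows that interpretation. The function name unique_word_points is hypothetical, and this may not match how the alternate visual was actually built.

```python
def unique_word_points(words):
    """Map each song position (x) to the row of its word (y).

    Rows are numbered in order of first appearance, so the first
    distinct word gets row 0, the second distinct word row 1, etc.
    """
    first_seen = {}  # word -> row index, in insertion order
    xs, ys = [], []
    for position, word in enumerate(words):
        row = first_seen.setdefault(word, len(first_seen))
        xs.append(position)
        ys.append(row)
    labels = list(first_seen)  # y-axis tick labels, top to bottom
    return xs, ys, labels
```

The returned labels can be used as y-axis tick labels, so the axis has one entry per unique word instead of one per lyric.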

What I learned / Feedback

I made some updates based on Reddit feedback:

  • changed the colourmap to be diagonal instead of horizontal.

The way I flipped the y-axis was sub-optimal. Instead of reversing the data, I could have called:

ax.invert_yaxis()

I could have explored using plt.imshow instead of plt.scatter, but the scatterplot allows for changing the shape of the points.
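As a rough idea of what the plt.imshow route might look like, the matches can be stored in an n-by-n grid of 0s and 1s and drawn as an image. The names similarity_matrix and show_matrix are invented for this sketch, and the colourmap is an arbitrary choice.

```python
def similarity_matrix(words):
    """Build an n x n grid: 1 where the row word equals the column word."""
    return [[1 if a == b else 0 for b in words] for a in words]


def show_matrix(words):
    """Draw the grid as an image (hypothetical helper; needs matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # imshow puts row 0 at the top by default, so no axis flip is needed.
    ax.imshow(similarity_matrix(words), cmap="binary", interpolation="nearest")
    return fig, ax
```

The trade-off is the one noted above: every match becomes a square cell, so there is no control over marker shape, but the first lyric ends up at the top without any extra work.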

I could mask the top half of the graph, as it duplicates information: it is a reflection of the bottom half across the y = n - x diagonal.
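A simple way to do that masking is to skip the duplicate pairs when collecting points, keeping only those where the x index is not greater than the y index. The sketch below assumes index-based coordinates with the y-axis inverted (first lyric at the top), so the kept points fall on and below the diagonal. The function name lower_triangle_points is hypothetical.

```python
def lower_triangle_points(words):
    """Return matching index pairs (x, y) with x <= y only."""
    xs, ys = [], []
    for i, word_a in enumerate(words):   # y position
        for j in range(i + 1):           # x positions up to the diagonal
            if words[j] == word_a:
                xs.append(j)
                ys.append(i)
    return xs, ys
```

This also roughly halves the number of comparisons, since each pair of positions is checked once instead of twice.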

Written on January 25, 2019