A while ago, a friend who had just started using RStudio and was a bit overexcited about working with data told me that getting all your Spotify data is totally doable: all you have to do is email Spotify. Follow the link here for more details:

Spotify emails your data as a zip archive of JSON files. I use the jsonlite package in R to read the data in.

urlspot = ""

# read each streaming-history file into a data frame
spot0 = jsonlite::read_json(paste0(urlspot, "StreamingHistory0.json"), simplifyVector = TRUE)

spot1 = jsonlite::read_json(paste0(urlspot, "StreamingHistory1.json"), simplifyVector = TRUE)

# stack the two files into one data frame
spot = rbind(spot0, spot1)
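
If your export has a different number of files, the same idea can be written as a loop. This is only a rough sketch: `read_streaming_history` is a hypothetical helper name of mine, and it assumes the files sit in one folder and follow the `StreamingHistory<number>.json` naming seen above.

```r
library(jsonlite)

# Hypothetical helper (not part of jsonlite): read every
# StreamingHistory<number>.json file in `dir` and stack the results.
read_streaming_history = function(dir) {
  files = list.files(dir, pattern = "^StreamingHistory[0-9]+\\.json$", full.names = TRUE)
  # simplifyVector = TRUE turns each JSON array of records into a data frame
  parts = lapply(files, jsonlite::read_json, simplifyVector = TRUE)
  do.call(rbind, parts)
}
```

Calling `read_streaming_history(urlspot)` would then stand in for the two separate `read_json` calls, provided `urlspot` points at the unzipped folder.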

The data is pretty straightforward: the time the track ended streaming, artist and track name, and the milliseconds it was listened to. I’ll use shiny to visualise my streaming trends.

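I use lubridate to parse the end times (readr and dplyr are needed further down):

library(lubridate)  # date(), month(), hms(), period_to_seconds()
library(readr)      # parse_time()
library(dplyr)      # %>%, group_by(), summarise(), arrange(), top_n()

spot$end_time = as.POSIXct(strptime(spot$endTime, "%Y-%m-%d %H:%M"))
spot$date = date(spot$end_time)
spot$month = month(spot$date, label = TRUE)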

# split an endTime string on the space and return the time part
customm = function(date){
  temp = strsplit(date, ' ') %>% unlist
  temp2 = temp[2]
  temp2
}
spot$only_time = parse_time(sapply(spot$endTime, customm))
my_seconds <- period_to_seconds(hms(spot$only_time))
myIntervals <- c("0 AM - 6 AM", "6 AM - 12 PM", "12 PM - 6 PM", "6 PM - 0 AM")
spot$interval <- myIntervals[findInterval(my_seconds, c(0, 6, 12, 18, 24) * 3600)]
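
To see how the `findInterval` lookup assigns labels, here is a tiny base R example. The timestamps in `secs` are made up for illustration; the breaks and labels are the same as above.

```r
# Made-up times, in seconds since midnight: 00:00, 05:59, 06:00, 13:00, 23:59
secs = c(0, 5 * 3600 + 59 * 60, 6 * 3600, 13 * 3600, 23 * 3600 + 59 * 60)

labels = c("0 AM - 6 AM", "6 AM - 12 PM", "12 PM - 6 PM", "6 PM - 0 AM")
breaks = c(0, 6, 12, 18, 24) * 3600

# findInterval() returns, for each value, the index of the last break
# that is <= the value, which doubles as an index into `labels`
buckets = labels[findInterval(secs, breaks)]
buckets
# "0 AM - 6 AM" "0 AM - 6 AM" "6 AM - 12 PM" "12 PM - 6 PM" "6 PM - 0 AM"
```

Each break is the inclusive start of its bucket, so 06:00 exactly lands in "6 AM - 12 PM" rather than "0 AM - 6 AM".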

## I want to group by interval and trackName, sum up the milliseconds, and keep the 20 tracks with the highest totals in each interval

interval_artist = spot %>% group_by(interval, trackName) %>% summarise(s = sum(msPlayed)) %>% arrange(-s) %>% top_n(20, s)

For Shiny documents or chunks, make sure cache = FALSE. R Markdown can't cache Shiny output, since the reactive functions already take care of that.

Shiny can be used to create some pretty interactive visualisations. I wanted to see what kind of music I listen to each month, and at what times of day. A simple if-else clause around your ggplot code lets the plot change according to what the user selects.
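
Here is a minimal sketch of that if-else idea, not the exact app behind this post. The `toy` data frame, the `make_plot` helper, and the input and output ids (`view`, `trend`) are all placeholders I made up for illustration.

```r
library(shiny)
library(ggplot2)

# Toy data standing in for the summarised streaming history (values are made up)
toy = data.frame(
  month    = c("Jan", "Jan", "Feb", "Feb"),
  interval = c("0 AM - 6 AM", "6 PM - 0 AM", "0 AM - 6 AM", "6 PM - 0 AM"),
  minutes  = c(30, 120, 10, 200)
)

# Hypothetical helper: a plain if-else picks which variable goes on the x-axis
make_plot = function(data, view) {
  if (view == "By month") {
    ggplot(data, aes(x = month, y = minutes)) + geom_col()
  } else {
    ggplot(data, aes(x = interval, y = minutes)) + geom_col()
  }
}

ui = fluidPage(
  selectInput("view", "Show listening time:", choices = c("By month", "By time of day")),
  plotOutput("trend")
)

server = function(input, output) {
  # renderPlot() re-runs whenever input$view changes
  output$trend = renderPlot(make_plot(toy, input$view))
}

# Assigning the app object does not launch it; call runApp(app) to start it
app = shinyApp(ui = ui, server = server)
```

Keeping the if-else in a plain function means the plotting logic can be tried out in the console without starting the app.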