Summary
- Anna’s Archive acquired metadata for nearly 256 million tracks.
- The organization also archived nearly 86 million audio files, representing over 99% of listens.
- The metadata is available now, and the audio files are coming soon. The data is being released as a bulk torrent totaling nearly 300 TB.
The group behind the media archive site Anna’s Archive announced that it has successfully scraped almost the entire Spotify library. The organization outlined the endeavor and its results, as well as what it plans to do with the data, in a blog post.
The data includes metadata for nearly 256 million tracks. That’s roughly 99.9% of the streaming service’s library, making this by far the largest publicly available database of music metadata. Anna’s Archive also successfully archived approximately 86 million music files, representing 99.6% of listens on Spotify. The archive is a whopping 300 TB.
Anna’s Archive is releasing the data in stages. Currently, the metadata is available, with the songs themselves coming next. The group says the current plan is to distribute the whole thing as a bulk torrent “aimed at preservation,” but that it will consider releasing individual files if there is enough interest.
The motive
This was probably inevitable
Spotify’s library is enormous, but it’s also constantly changing, so preserving a snapshot of that library is a big deal. Like the next Spotify price hike, this was bound to happen eventually. If it wasn’t Anna’s Archive, it would have been another group.
For its part, Anna’s Archive is framing this as an attempt to preserve something culturally significant. The organization’s mission is to “preserve humanity’s knowledge and culture,” and archiving Spotify’s catalog fits that mission. Other libraries of music and metadata exist, but Anna’s Archive points out that they are incomplete and tend to focus too heavily on the most popular songs and artists.
Spotify’s reaction
The streaming giant is not happy
Spotify views the incident as piracy and is responding accordingly. In a statement, the company said it has “identified and disabled” accounts related to the scraping and that it is implementing safeguards to prevent similar actions in the future (via Billboard).
The company ended the statement by saying, “Since day one, we have stood with the artist community against piracy, and we are actively working with our industry partners to protect creators and defend their rights.” This response isn’t unexpected, though it’s a bit ironic, considering the company has been accused of doing exactly the opposite by significantly underpaying artists.
It would not be surprising for Spotify to take some sort of legal action, although it doesn’t appear to have initiated anything at this time. Whether that legal action would be effective is a different story — it may be too late to put this cat back in the bag. Frankly, we wish the company would spend less time fighting this and more time sorting out its terrible library management, but that’s another conversation.













