Tag Archives: directory tree

Processing music tags with Python

I was looking for the optimal way to traverse the directory tree with Python and found out that os.walk is the answer. To put this into practice, I wrote the following script which traverses the music library and finds out which MP3s haven’t been properly tagged. In other words, it will print the full paths of all MP3s that have any of the artist, album or title tracks missing. In my case, 502 of the 10466 songs had bad tags.

[sourcecode language=”python”]
#!/usr/bin/python

import os
import id3reader

rootdir = "/home/thameera/Music"
processed = 0
bad = 0

for thisDir, subdirList, fileList in os.walk(rootdir):
for fname in fileList:
if fname[-3:] == 'mp3':
processed += 1
fullPath = os.path.join(thisDir, fname)

try:
id3r = id3reader.Reader(fullPath)
except:
bad += 1
print "Error extracting id3 info from %s" % fullPath
continue

if None in (id3r.getValue('performer'), id3r.getValue('album'), id3r.getValue('title')):
bad += 1
print fullPath

print "%d MP3s processed. Found bad tags in %d of them" % (processed, bad)
[/sourcecode]

It uses the id3reader library to parse the tags. I don’t think it’ll work if you have files with other formats, like wma or flac.

This script can be further expanded to ignore certain directories, output only one entry per each directory that has bad tags, check for other tags as well, check if the artist tag does not match the name of the parent directory, etc, etc.