Tuesday, December 2, 2014

Python hachoir Library for Media Metadata Extraction

If you want to extract metadata from a media file, Python's hachoir library is a handy tool for the job. You can find introductory material here



You need to download the following three packages:

  • hachoir_core
  • hachoir_parser
  • hachoir_metadata


Running "python setup.py install" for each package on Windows automatically places the libraries in Python's Lib\site-packages folder, so they become available immediately. You can confirm this by running "from hachoir_core.error import HachoirError" in the Python interpreter.
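To check all three packages at once rather than one import at a time, a small helper like the one below may be handy. It is only a sketch: the function name missing_packages is made up for this post and is not part of hachoir, and it relies on nothing but the built-in __import__.

```python
def missing_packages(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            # The package is not on the interpreter's import path
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The three packages listed above
    required = ["hachoir_core", "hachoir_parser", "hachoir_metadata"]
    missing = missing_packages(required)
    if missing:
        print("Not importable: " + ", ".join(missing))
    else:
        print("All three hachoir packages are importable")
```

If any name is reported as not importable, re-running the install step for that package is the first thing to try.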



Download and install everything from here:



Copy the following code into a Python file and run it, passing the path of a media file as the only argument, to see the library in action:

# Python 2 script: the hachoir_core / hachoir_parser / hachoir_metadata
# packages used here target Python 2.
from hachoir_core.error import HachoirError
from hachoir_core.cmd_line import unicodeFilename
from hachoir_parser import createParser
from hachoir_core.tools import makePrintable
from hachoir_metadata import extractMetadata
from hachoir_core.i18n import getTerminalCharset
from sys import argv, stderr, exit

# Expect exactly one argument: the file to inspect
if len(argv) != 2:
    print >>stderr, "usage: %s filename" % argv[0]
    exit(1)
filename = argv[1]
# Keep both the unicode name (for display) and the raw name (for opening)
filename, realname = unicodeFilename(filename), filename

# Pick a parser that matches the file format
parser = createParser(filename, realname)
if not parser:
    print >>stderr, "Unable to parse file"
    exit(1)

# Extract the metadata, reporting any hachoir error
try:
    metadata = extractMetadata(parser)
except HachoirError as err:
    print "Metadata extraction error: %s" % unicode(err)
    metadata = None
if not metadata:
    print "Unable to extract metadata"
    exit(1)

# Print the metadata line by line in the terminal's charset
text = metadata.exportPlaintext()
charset = getTerminalCharset()
for line in text:
    print makePrintable(line, charset)
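If you would rather work with the fields than just print them, the lines from exportPlaintext() can be collected into a dictionary. The sketch below assumes each field line has the form "- Key: value" and that header lines (such as "Metadata:") do not start with "- "; that layout is an assumption to check against your own output. The helper name plaintext_to_dict and the sample lines are invented for illustration and are not real output from any file.

```python
def plaintext_to_dict(lines):
    """Collect "- Key: value" lines into a dict of lists (keys may repeat)."""
    fields = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("- ") or ": " not in line:
            continue  # skip headers such as "Metadata:" and blank lines
        # Split on the first ": " only, so values may contain colons
        key, value = line[2:].split(": ", 1)
        fields.setdefault(key, []).append(value)
    return fields


# Hypothetical sample lines, written by hand for this example
sample = [
    "Metadata:",
    "- Duration: 3 min 12 sec",
    "- Comment: first",
    "- Comment: second",
]
fields = plaintext_to_dict(sample)
```

Values are kept in lists because the same key may appear more than once, as the two "Comment" lines in the sample show.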


Result:

