I have a very simple task: I have a list of image and video files and I'd like to tabulate the creation date for each using the available EXIF data. I'm using pyexiftool for the actual data extraction.
I can pull the data out without a problem, but the resulting JSON output has a very strange shape. Each record has one field, but that field may contain two, three, or more pieces of information.
For example, some image files contain XMP:CreateDate and EXIF:CreateDate, whereas MOV files contain QuickTime:CreateDate (I don't know what the fields would be for other file formats).
[{'SourceFile': '/Users/Documents/Projects/ExifData/temp/IMG_20200422_085514.JPG', 'EXIF:CreateDate': '2020:04:22 08:55:14', 'XMP:CreateDate': '2020:04:22 08:55:14'}, {'SourceFile': '/Users/Documents/Projects/ExifData/temp/IMG_20200423_091856.JPG', 'EXIF:CreateDate': '2020:04:23 09:18:57'}, {'SourceFile': '/Users/Documents/Projects/ExifData/temp/IMG_20200423_091859.JPG', 'EXIF:CreateDate': '2020:04:23 09:19:00', 'XMP:CreateDate': '2020:04:23 09:19:00'}, {'SourceFile': '/Users/Documents/Projects/ExifData/temp/MOV_0004.mp4', 'QuickTime:CreateDate': '2017:03:11 13:05:59'}, {'SourceFile': '/Users/Documents/Projects/ExifData/temp/MOV_0005.mp4', 'QuickTime:CreateDate': '2017:03:11 13:08:26'}, {'SourceFile': '/Users/Documents/Projects/ExifData/temp/MOV_0006.mp4', 'QuickTime:CreateDate': '2017:03:11 13:09:17'}, {'SourceFile': '/Users/Documents/Projects/ExifData/temp/MOV_0035.mp4', 'QuickTime:CreateDate': '2017:03:12 14:08:55'}]
I am quite lost on how to parse this file, and I can't loop through it as I would a regular JSON file. I only want to extract a filename and a creation datetime for each file. I'd appreciate any advice.
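To make the goal concrete, here is a minimal sketch of the kind of extraction described above. It assumes the records are ordinary Python dicts shaped like the sample, that any key ending in ":CreateDate" is acceptable regardless of its group prefix, and that the dates always use the "YYYY:MM:DD HH:MM:SS" layout shown. The function name extract_create_dates is hypothetical and not part of pyexiftool.

```python
import os
from datetime import datetime


def extract_create_dates(records):
    """Return (file name, creation datetime) pairs from a list of tag dicts."""
    rows = []
    for record in records:
        # Take the first key of the form "<Group>:CreateDate", whatever the group is.
        raw = next(
            (value for key, value in record.items() if key.endswith(":CreateDate")),
            None,
        )
        # Assumed date layout, taken from the sample output; None if no date was found.
        created = datetime.strptime(raw, "%Y:%m:%d %H:%M:%S") if raw else None
        rows.append((os.path.basename(record["SourceFile"]), created))
    return rows


# Two records copied from the sample output.
sample = [
    {"SourceFile": "/Users/Documents/Projects/ExifData/temp/IMG_20200422_085514.JPG",
     "EXIF:CreateDate": "2020:04:22 08:55:14",
     "XMP:CreateDate": "2020:04:22 08:55:14"},
    {"SourceFile": "/Users/Documents/Projects/ExifData/temp/MOV_0004.mp4",
     "QuickTime:CreateDate": "2017:03:11 13:05:59"},
]

for name, created in extract_create_dates(sample):
    print(name, created)
```

Files with no CreateDate tag at all come back with None rather than raising an error, which seemed the safer choice for a mixed folder of images and videos.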
Thanks.
EDIT: The code that produces that 'JSON' output is this:
import json
import exiftool

def old_main():
    dir_name = '/Users/Documents/Projects/ExifData/temp/'
    tags = ["File Name", "CreateDate"]
    log_file = 'py_log.txt'
    # getListOfFiles is a helper defined elsewhere in my script (not shown)
    file_names = getListOfFiles(dir_name)
    with exiftool.ExifTool() as e:
        metadata = e.get_tags_batch(tags, file_names)
    # Write the extracted metadata to the log file
    with open(log_file, "w") as outfile:
        json.dump(metadata, outfile)
So what I've pasted is the direct output of the json.dump method. The get_tags_batch method is documented here.
Unless I've misunderstood the documentation for this package, it looks like the output is not JSON at all but rather just a string?
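One way to check this is a small round trip through the standard json module: json.dump writes JSON text to the file, and json.load reads that text back into Python objects. The sketch below uses a single hand-copied record in place of the real get_tags_batch result and a temporary directory in place of the real log location; both are assumptions made only so the example runs on its own.

```python
import json
import os
import tempfile

# Stand-in for the value returned by get_tags_batch (one record from the sample).
metadata = [
    {"SourceFile": "/Users/Documents/Projects/ExifData/temp/MOV_0004.mp4",
     "QuickTime:CreateDate": "2017:03:11 13:05:59"},
]

# Write the data out the same way old_main does, but into a temporary directory.
log_file = os.path.join(tempfile.mkdtemp(), "py_log.txt")
with open(log_file, "w") as outfile:
    json.dump(metadata, outfile)

# Read it back: json.load parses the text into a list of dicts again.
with open(log_file) as infile:
    loaded = json.load(infile)

print(type(loaded).__name__, type(loaded[0]).__name__)
print(loaded == metadata)
```

If the loaded value is a list of dicts, it can be looped over like any other list, with each record's keys inspected individually.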
Appreciate the pointers and comments.