I struggled with this yesterday afternoon and think I have come up with a clever solution, but I am looking for feedback on how to improve it.

The scenario: I run ffprobe on media files, take the JSON dictionary it returns, and store it in a MongoDB collection linked to the Mongo document for the file.

The problem: some media file types give back key names in the JSON that are incompatible with BSON documents in Mongo. For example, the following perfectly valid JSON cannot be stored in Mongo as is, because the keys in the tags dictionary contain dots:

"format": {
    "filename": "ToS-4k-1920_CMO_freezeframe6308.mov",
    "nb_streams": 2,
    "nb_programs": 0,
    "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
    "format_long_name": "QuickTime / MOV",
    "start_time": "0.000000",
    "duration": "738.941667",
    "size": "14542021084",
    "bit_rate": "157436200",
    "probe_score": 100,
    "tags": {
        "major_brand": "qt  ",
        "minor_version": "537199360",
        "compatible_brands": "qt  ",
        "creation_time": "2018-01-15T18:07:07.000000Z",
        "com.apple.quicktime.player.movie.audio.gain": "1.000000",
        "com.apple.quicktime.player.movie.audio.treble": "0.000000",
        "com.apple.quicktime.player.movie.audio.bass": "0.000000",
        "com.apple.quicktime.player.movie.audio.balance": "0.000000",
        "com.apple.quicktime.player.movie.audio.pitchshift": "0.000000",
        "com.apple.quicktime.player.movie.audio.mute": "",
        "com.apple.quicktime.player.movie.visual.brightness": "0.000000",
        "com.apple.quicktime.player.movie.visual.color": "1.000000",
        "com.apple.quicktime.player.movie.visual.tint": "0.000000",
        "com.apple.quicktime.player.movie.visual.contrast": "1.000000",
        "com.apple.quicktime.player.version": "7.6.6 (7.6.6)",
        "com.apple.quicktime.version": "7.7.3 (2943.14) 0x7738000 (Mac OS X, 10.11.6, 15G18013)"
    }
}
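
To see which keys would be affected before attempting the insert, a small recursive scan is enough. This is only a sketch: it assumes the dot is the only offending character, and `find_keys_containing` is a hypothetical helper name, not part of ffprobe or any Mongo driver.

```python
def find_keys_containing(obj, needle, path=()):
    """Return the path (a tuple of keys and list indexes) of every dict key containing `needle`."""
    hits = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            if needle in key:
                hits.append(path + (key,))
            # Keep walking: a nested value may hold more offending keys.
            hits.extend(find_keys_containing(value, needle, path + (key,)))
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            hits.extend(find_keys_containing(item, needle, path + (index,)))
    return hits


# Trimmed-down stand-in for the ffprobe output shown above.
sample = {
    "format": {
        "nb_streams": 2,
        "tags": {
            "major_brand": "qt  ",
            "com.apple.quicktime.version": "7.7.3",
        },
    }
}
print(find_keys_containing(sample, "."))
# prints [('format', 'tags', 'com.apple.quicktime.version')]
```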

The solution? I wrote a recursive function that walks the dictionary and renames the keys. Because it is bad mojo to modify a dictionary while iterating over it, I tricked the system by taking a list of all the keys first and iterating through that list, so the keys can be updated without disturbing the iteration. Here is my function and how I call it. Feedback?

def key_string_replace(dictionary, findit, replaceit):
    # Snapshot the keys so the dict can be modified inside the loop.
    for k in list(dictionary.keys()):
        if findit in k:
            newkey = k.replace(findit, replaceit)
            dictionary[newkey] = dictionary.pop(k)
            k = newkey
        # Recurse into nested dicts, and into dicts held directly in lists.
        if isinstance(dictionary[k], dict):
            key_string_replace(dictionary[k], findit, replaceit)
        elif isinstance(dictionary[k], list):
            for item in dictionary[k]:
                if isinstance(item, dict):
                    key_string_replace(item, findit, replaceit)

import json
import shlex
from subprocess import Popen, PIPE

cmd = "ffprobe -v quiet -print_format json -show_streams -show_format"
args = shlex.split(cmd)
# pathToInputVideo is the path to the media file, defined elsewhere
args.append(pathToInputVideo)
# run the ffprobe process, decode stdout as UTF-8 and parse the JSON
p = Popen(args, stdout=PIPE, stderr=PIPE)
output, error = p.communicate()
if p.returncode == 0:
    ffprobeOutput = output.decode('utf-8')
    ffprobeOutput = json.loads(ffprobeOutput)
    # fix any bad keys in ffprobe json
    key_string_replace(ffprobeOutput, '.', '_')

Paul Jacobs
    Hi Paul, I don't post enough in CR to recommend migration, but usually StackOverflow is for *bugs* or *problems* in code. Check out CodeReview if your code is working and you just want feedback. If this is about a bug or problem, then it's not particularly clear to me what that is, as written – en_Knight Aug 09 '18 at 17:09

1 Answer

I would iterate over the dictionary and build a copy with the corrected keys: if a key in the original dict is valid, copy it as is; otherwise copy it with the key transformed. If you don't want to copy and insist on doing it in place, I would add a new element to the dictionary under the valid key, delete the invalid element, and move on. Adding to the dictionary won't matter as long as the new keys pass the validity test and, in Python, you iterate over a snapshot of the keys such as list(d) rather than over the dict itself. Also, to en_Knight's point, check out CR for review :)
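
The copying approach described above could be sketched roughly like this. It is not the asker's function: `with_replaced_keys` is a hypothetical name, and the sketch assumes the input is plain parsed JSON (dicts, lists, and scalars) with string keys.

```python
def with_replaced_keys(obj, find, replace):
    """Return a copy of `obj` in which every dict key has `find` replaced by `replace`."""
    if isinstance(obj, dict):
        return {
            key.replace(find, replace): with_replaced_keys(value, find, replace)
            for key, value in obj.items()
        }
    if isinstance(obj, list):
        # Handles lists nested inside lists as well as lists of dicts.
        return [with_replaced_keys(item, find, replace) for item in obj]
    return obj  # scalars (str, int, float, bool, None) are returned unchanged


original = {
    "tags": {"com.apple.quicktime.version": "7.7.3"},
    "streams": [[{"a.b": 1}]],
}
cleaned = with_replaced_keys(original, ".", "_")
print(cleaned)
# prints {'tags': {'com_apple_quicktime_version': '7.7.3'}, 'streams': [[{'a_b': 1}]]}
```

Since nothing is mutated, there is no iteration hazard and the original ffprobe output stays available. One caveat: if two keys become identical after replacement (say "a.b" and "a_b"), the later one silently overwrites the earlier one, so a collision check may be worth adding.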

Theodore Howell