I have one big JSON file that looks like this:

data = {
    "Module1": {
        "Description": "",
        "Layer": "1",
        "SourceDir": "pathModule1",
        "Attributes": {
            "some",
        },
        "Vendor": "comp",
        "components":{
            "Component1": {
               "path": "something",
               "includes": [
                   "include1",
                   "include2",
                   "include3",
                   "include4",
                   "include5"
               ],
               "generated": "txt",
               "memory": "txt",
               etc
            },
            "Component2":{
               "path": "something",
               "includes": [
                   "include1",
                   "include2",
                   "include3",
                   "include4",
                   "include5"
               ],
               "generated": "txt",
               "memory": "txt",
               etc
            }
        }
    },
    "Module2": {
        "Description": "",
        "Layer": "2",
        "SourceDir": "pathModule2",
        "Attributes": {
            "some",
        },
        "Vendor": "comp",
        "components":{
            "Component1": {
               "path": "something",
               "includes": [
                   "include1",
                   "include2",
                   "include3",
                   "include4",
                   "include5"
               ],
               "generated": "txt",
               "memory": "txt",
               etc
            },
            "Component2":{
               "path": "something",
               "includes": [
                   "include1",
                   "include2",
                   "include3",
                   "include4",
                   "include5"
               ],
               "generated": "txt",
               "memory": "txt",
               etc
            }
        }
    },
    "Module3": {
        "Description": "",
        "Layer": "3",
        "SourceDir": "path",
        "Attributes": {
            "some",
        },
        "Vendor": "",
    },
    "Module4": {
        "Description": "",
        "Layer": "4",
        "SourceDir": "path",
        "Attributes": {
            "some",
        }
    }
}

I have to go through it and extract some of the data, following this rule:

Whenever the Vendor field is equal to "comp", take that module into consideration: take its SourceDir field and all of its components, with their path and includes.

So the output would be:

Module1, "pathModule1", components: [Component1, path, includes: [include1, include2, include3, include4, include5]], [Component2, path, includes: [include1, include2, include3, include4, include5]]

Module2, "pathModule2", components: [Component1, path, includes: [include1, include2, include3, include4, include5]], [Component2, path, includes: [include1, include2, include3, include4, include5]]

I am really struggling with accessing all the fields that I need.

My current code is this:

import json

list_components = []
sourceDirList = []
list_sw_objects = []

with open("DB.json", 'r') as f:
    modules = json.load(f)

for k in modules.keys():
    try:
        if modules[k]["Vendor"] == "comp":
            list_components.append(k)
            sourceDirList.append(modules[k]['SourceDir'])
            for i in modules[k]['components']:
                list_sw_objects.append(modules[k]['components'])
    except KeyError:
        continue

I am managing to get only Module1 and its SourceDir, but not Component1, Component2 and their attributes. How can I achieve this?

Thanks!

John

1 Answer

I would start by filtering out the items you're not interested in, by doing something like:

data2 = {k: v for k, v in data.items() if v.get("Vendor") == "comp"}

This drops all the modules you don't want. It's a bit inefficient, because you iterate over the dictionary a second time to get the data into the format you want, but it's easier to reason about as a first step, which is helpful!
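
As a small illustration of why `v.get("Vendor")` is used here rather than `v["Vendor"]`, the sketch below runs the same filter on a trimmed-down stand-in for the question's data (the sample values and the name `filtered` are only for illustration). A module with no Vendor key at all is simply skipped, so no try/except KeyError is needed:

```python
# Trimmed-down stand-in for the question's data (illustrative values only).
data = {
    "Module1": {"Vendor": "comp", "SourceDir": "pathModule1"},
    "Module3": {"Vendor": "", "SourceDir": "path"},
    "Module4": {"SourceDir": "path"},  # no "Vendor" key at all
}

# dict.get returns None for a missing key, so Module4 is skipped
# instead of raising KeyError.
filtered = {k: v for k, v in data.items() if v.get("Vendor") == "comp"}

print(list(filtered))
```

Only Module1 survives the filter; Module3 has an empty Vendor and Module4 has none.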

At this point you could iterate over the dictionary again if needed - you would have something like:

{'Module1': {'Attributes': {'some'},
             'Description': '',
             'Layer': '1',
             'SourceDir': 'pathModule1',
             'Vendor': 'comp',
             'components': {'Component1': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'},
                            'Component2': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'}}},
 'Module2': {'Attributes': {'some'},
             'Description': '',
             'Layer': '2',
             'SourceDir': 'pathModule2',
             'Vendor': 'comp',
             'components': {'Component1': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'},
                            'Component2': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'}}}}
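
The nested layout above looks like what the standard library's pprint module produces; a plain print would put the whole dictionary on one line. A minimal sketch, assuming a small hypothetical sample rather than the full data:

```python
from pprint import pformat

# Hypothetical, shortened sample in the same shape as the filtered data.
sample = {
    "Module1": {
        "SourceDir": "pathModule1",
        "Vendor": "comp",
        "components": {
            "Component1": {"path": "something", "includes": ["include1", "include2"]},
        },
    },
}

# pformat returns the pretty-printed text; pprint.pprint would print it directly.
text = pformat(sample)
print(text)
```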

To print only the source directories and the components, you could do:

for k,v in data2.items():
    print(k, v["SourceDir"], v["components"])

which would give you:

Module1 pathModule1 {'Component1': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}, 'Component2': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}}
Module2 pathModule2 {'Component1': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}, 'Component2': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}}

Edit: To refine the output further, you can change the above loop to be:

for k,v in data2.items():
    components = [(comp_name, comp_data["path"], comp_data["includes"]) for comp_name, comp_data in v["components"].items()]
    print(k, v["SourceDir"], components)

which will give you:

Module1 pathModule1 [('Component1', 'something', ['include1', 'include2', 'include3', 'include4', 'include5']), ('Component2', 'something', ['include1', 'include2', 'include3', 'include4', 'include5'])]
Module2 pathModule2 [('Component1', 'something', ['include1', 'include2', 'include3', 'include4', 'include5']), ('Component2', 'something', ['include1', 'include2', 'include3', 'include4', 'include5'])]
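
If the second pass over the dictionary matters, or if some matching modules might lack a components entry, the filter and the extraction can be combined into one loop. The following is only a sketch: the function name `extract_vendor_modules`, the shape of the returned dictionary, and the shortened sample data are assumptions made for illustration.

```python
def extract_vendor_modules(modules, vendor="comp"):
    """Collect SourceDir plus (name, path, includes) per component for one vendor."""
    result = {}
    for name, module in modules.items():
        # Skip modules with a different or missing Vendor.
        if module.get("Vendor") != vendor:
            continue
        # Keep only name, path and includes; other component attributes are ignored.
        components = [
            (comp_name, comp.get("path"), comp.get("includes", []))
            for comp_name, comp in module.get("components", {}).items()
        ]
        result[name] = {"SourceDir": module.get("SourceDir"), "components": components}
    return result


# Shortened, hypothetical sample in the same shape as the question's data.
sample = {
    "Module1": {
        "Vendor": "comp",
        "SourceDir": "pathModule1",
        "components": {
            "Component1": {
                "path": "something",
                "includes": ["include1", "include2"],
                "generated": "txt",
            },
        },
    },
    "Module3": {"Vendor": "", "SourceDir": "path"},
}

extracted = extract_vendor_modules(sample)
for name, info in extracted.items():
    print(name, info["SourceDir"], info["components"])
```

Using `.get` with defaults means a module without components yields an empty list rather than a KeyError.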
PirateNinjas
  • This is a great solution man, it gets the job done in just 3 lines. Just one more thing: I actually have more attributes inside Component1, Component2 etc., but I need only the name, path and includes just like they are printed now, excluding all other attributes. I will update my question now, sorry for this – John Oct 22 '21 at 09:29
  • Edited to add some more targeted filtering - hopefully that is what you are after? – PirateNinjas Oct 22 '21 at 10:39
  • That is it! Thank you man! Amazing.. I have 50 lines in my try at the moment.. you did it with 3.. – John Oct 22 '21 at 14:20
  • I am now having big problems trying to create some .txt files to store all of these. May I update the question or should I post a new one? – John Oct 25 '21 at 11:06
  • You should open a new question for that bit - edits should only really be made to clarify the question, not to extend it! – PirateNinjas Oct 25 '21 at 11:15
  • That's what I thought, sure – John Oct 25 '21 at 12:07
  • https://stackoverflow.com/questions/69708095/python-parse-nested-json-file-take-out-specific-attributes-and-create-txt-file – John Oct 25 '21 at 12:30
  • I think this should not be a big effort to achieve, I just keep running into a lot of problems – John Oct 25 '21 at 12:31