
I have a list of paths to JSON files.

files = ['/Users/sbm/Downloads/ds214mb/sub-EESS001/sub-EESS001_task-Cyberball_bold.json',
 '/Users/sbm/Downloads/ds214mb/sub-EESS002/func/sub-EESS002_task-Cyberball_bold.json',
 '/Users/sbm/Downloads/ds214mb/sub-EESS003/sub-EESS003_task-Cyberball_bold.json',
 '/Users/sbm/Downloads/ds214mb/sub-EESS004/func/sub-EESS004_task-Cyberball_bold.json',
 '/Users/sbm/Downloads/ds214mb/sub-EESS005/sub-EESS005_task-Cyberball_bold.json',
 '/Users/sbm/Downloads/ds214mb/sub-EESS006/sub-EESS006_task-Cyberball_bold.json',
 '/Users/sbm/Downloads/ds214mb/sub-EESS007/func/sub-EESS007_task-Cyberball_bold.json',
 '/Users/sbm/Downloads/ds214mb/sub-EESS008/func/sub-EESS008_task-Cyberball_bold.json']

Now I intend to read all these files into dictionaries, named after the filename or something different, and then iterate through those dictionaries to find the common key: value pairs.

I did the following to read all the JSON files into different dictionaries. Now, what would be an efficient way to compare all these dictionaries to find the common key: value pairs?

import json

# Read each file into a separate dictionary named json0, json1, ...
for i, file in enumerate(files):
    with open(file) as fp:
        globals()['json%s' % i] = json.load(fp)

A sample JSON file looks like:

{
  "Manufacturer": "Siemens",
  "ManufacturerModelName": "Magnetom Verio",
  "RepetitionTime": 1.56,
  "SliceTiming": [0.0,
    0.78,
    0.06,
    0.84,
    0.12],
  "TaskName": "Cyberball"
}
  • If you could organize the dicts into a list, check this other answer out: http://stackoverflow.com/questions/9906944/python-find-only-common-key-value-pairs-of-several-dicts-dict-intersection – stackunderflow Mar 02 '17 at 20:58
  • Look here, maybe: http://stackoverflow.com/questions/25851183/how-to-compare-two-json-objects-with-the-same-elements-in-a-different-order-equa – oshaiken Mar 02 '17 at 21:31
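
For reference, here is a minimal sketch of the intersection approach from the first linked question, applied to the files list above. The repr() call is my own workaround, not part of the linked answer: list values such as SliceTiming are not hashable, so they cannot go into a set directly.

import json
from functools import reduce

# Load every file into a list of dicts instead of numbered globals
dicts = []
for path in files:
    with open(path) as fp:
        dicts.append(json.load(fp))

# Build a set of (key, repr(value)) pairs per dict and intersect them;
# repr() makes unhashable values such as the SliceTiming list comparable
item_sets = [{(k, repr(v)) for k, v in d.items()} for d in dicts]
common = reduce(set.intersection, item_sets)
print(common)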

2 Answers


Interesting question...

I start by piping a list of JSON files...

find <dir> | grep json$ 

That pipe gets sent to a Python program...

So this now looks like:

find <dir> | grep json$ | python t.py

The Python code does the following:

  1. Opens the file
  2. Reads the file
  3. Parses the JSON into a Python dictionary
  4. Prints each key: value pair

So this looks like this (Python 3 code):

import json, sys

for file in sys.stdin:
    file = file.strip('\n')
    with open(file, "rt") as ifp:
        b = ifp.read()
    # Normalize newlines and Python-style single quotes, in case the
    # files are not strictly valid JSON
    b = b.replace('\n', '').replace("'", '"')
    c = json.loads(b)
    for k, v in c.items():
        print('{}:{}'.format(k, v))

We now sort and count the output using bash, which generically looks like this:

sort | uniq -c | sort -n  

So, putting it all together, we get the following (I am assuming all the JSON files are in the same directory I am in at the moment):

ls *.json | python t.py | sort | uniq -c | sort -n

If you want the top 5, reverse the numeric sort so the highest counts come first, and it becomes

ls *.json | python t.py | sort | uniq -c | sort -rn | head -n 5
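
Since a pair common to all files appears once per file, its count equals the number of files. As a sketch, here is a small Python filter (a hypothetical common.py, not part of the pipeline above) that could replace the sort/uniq stage and print only those pairs:

import sys
from collections import Counter

# Count each "key:value" line that t.py prints
counts = Counter(line.rstrip('\n') for line in sys.stdin)

# A pair common to every file appears once per file; the question
# has 8 files, so keep pairs seen 8 times
N_FILES = 8
for pair, count in counts.items():
    if count == N_FILES:
        print(pair)

You would run it as `ls *.json | python t.py | python common.py`.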
Tim Seed

Only in Python, no Linux:

import json

files = ['data1.json', 'data2.json', 'data3.json']
master_key_plus_value = {}

for file in files:
    with open(file, "rt") as ifp:
        b = ifp.read()
    # Normalize newlines and Python-style single quotes, in case the
    # files are not strictly valid JSON
    b = b.replace('\n', '').replace("'", '"')
    c = json.loads(b)
    for k, v in c.items():
        # Combine key and value into a single string and count it
        pair = str(k) + ': ' + str(v)
        if pair in master_key_plus_value:
            master_key_plus_value[pair] += 1
        else:
            master_key_plus_value[pair] = 1

# Now we have read all the key + value pairs into a single dictionary.
# Sort by the value (occurrence count):

sorted_dictionary = sorted(master_key_plus_value.items(), key=lambda x: -x[1])

print("Most Common Key-Value is  {} Occurance {} ".format(sorted_dictionary[0][0],sorted_dictionary[0][1]))

The same principles apply for each file: read the JSON file as text, reformat it, and make a JSON object, which gives a Python dictionary. Combine each key + value and compare it to a master dictionary: add 1 to its count if it is there, else store it with a count of 1. Finally, sort on the count, descending, and print the top element ([0]); since it is a tuple, the key is [0][0] and the count is [0][1].
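
If the goal is specifically the pairs shared by every file, rather than the single most common one, the same dictionary can be filtered on the file count. A minimal sketch, which works because JSON keys are unique within a file, so no pair is counted twice for one file:

# Pairs common to all files appear exactly len(files) times
common_pairs = [pair for pair, count in master_key_plus_value.items()
                if count == len(files)]
print(common_pairs)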

Tim Seed