
I have a nested JSON record and a lookup dictionary (params). I would like to replace every value in the record that contains a placeholder of the form "<name>" with the corresponding value from the lookup.

lookup dictionary: params_lookup = {"population":47, "year":2022, "valid":True, "end":"colls", "num":23, "items":"more"}

record:

record = {"country":"zzzx", "metrics":{"population":"<population> million", "currency":"Euro", "year":"<year>"}, "valid":"<valid>",
          "others":["<num>", 2, 3], "name":"<items>"}

Here is my attempt, but it fails when the lookup value is an integer or a boolean.

I understand that replace() expects a string, so it fails with TypeError: replace() argument 2 must be str, not int. Could you please advise on how to approach the problem?

import re

START_DELIMITER = '<'

END_DELIMITER = '>'

def format_params(d, params_lookup):

    if isinstance(d, list):
        return [format_params(item, params_lookup) for item in d]

    if isinstance(d, dict):
        return {key: format_params(value, params_lookup)
                for key, value in d.items()}

    if isinstance(d, str) and bool(re.search(START_DELIMITER + "\w*" + END_DELIMITER, d)):
        params = re.findall(START_DELIMITER + "(\w*)" + END_DELIMITER, d)

        for param in params:
            if not params_lookup.get(param):
                raise Exception(
                    f" Parameter {START_DELIMITER + param + END_DELIMITER} is not found in the mapping")   
            else:
                d = d.replace(START_DELIMITER + param + END_DELIMITER, params_lookup.get(param))

        return d

    else:
        return d

format_params(record, params_lookup)


desired result:

record = {"country":"zzzx", "metrics":{"population":"47 million", "currency":"Euro", "year": 2022}, "valid":True,
          "others":[23, 2, 3], "name":"more"}
kites

1 Answer


That is because some of the values in the lookup dict are ints or booleans, while str.replace() only accepts strings. You can cast the value to a string:

d = d.replace(START_DELIMITER + param + END_DELIMITER, str(params_lookup.get(param)))
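To see the error and the cast in isolation, here is a minimal sketch. The names `template` and `value` are made up for illustration and do not appear in the question.

```python
template = "<population> million"
value = 47  # an int, as stored in the lookup dictionary

try:
    # str.replace() only accepts a str as the replacement argument.
    template.replace("<population>", value)
except TypeError as exc:
    error = str(exc)

# Casting the lookup value to str avoids the TypeError.
fixed = template.replace("<population>", str(value))
print(fixed)  # 47 million
```

Note that the cast always produces a string, so a value such as "<year>" would become "2022" rather than the integer 2022.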

There are a few more issues with the code. The line

        params = re.findall(START_DELIMITER + "(\w*)" + END_DELIMITER, d)

is not needed, because the function below loops over the lookup keys directly.

The entire function can be simplified to:

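def format_params(d, params):
    # Recurse into containers so nested values are handled too.
    if isinstance(d, dict):
        return {k: format_params(v, params) for k, v in d.items()}
    if isinstance(d, list):
        return [format_params(item, params) for item in d]
    if isinstance(d, str):
        for pk, pv in params.items():
            if d == f"<{pk}>":
                # The whole string is one placeholder: keep the lookup value's type.
                return pv
            # Placeholder embedded in a longer string: substitute its string form.
            d = d.replace(f"<{pk}>", str(pv))
    return d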
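If you also want to keep the "parameter not found" check from the question, a regex-based variant might look like the sketch below. It is only one possible approach, not the code from the question: the `PLACEHOLDER` pattern and the `lookup` helper are names introduced here for illustration, and it assumes placeholder names consist of word characters only. It tests membership with `in` rather than `.get()`, so lookup values such as 0 or False are not mistaken for missing keys.

```python
import re

# Hypothetical helper pattern: a placeholder is "<" + word characters + ">".
PLACEHOLDER = re.compile(r"<(\w+)>")


def format_params(d, params_lookup):
    # Recurse into containers so nested values are handled too.
    if isinstance(d, dict):
        return {k: format_params(v, params_lookup) for k, v in d.items()}
    if isinstance(d, list):
        return [format_params(item, params_lookup) for item in d]
    if not isinstance(d, str):
        return d

    def lookup(name):
        # Membership test, so falsy values (0, False, "") still count as present.
        if name not in params_lookup:
            raise KeyError(f"Parameter <{name}> is not found in the mapping")
        return params_lookup[name]

    # The whole string is a single placeholder: return the value with its type.
    whole = PLACEHOLDER.fullmatch(d)
    if whole:
        return lookup(whole.group(1))

    # Otherwise substitute each embedded placeholder with its string form.
    return PLACEHOLDER.sub(lambda m: str(lookup(m.group(1))), d)


params_lookup = {"population": 47, "year": 2022, "valid": True,
                 "end": "colls", "num": 23, "items": "more"}
record = {"country": "zzzx",
          "metrics": {"population": "<population> million",
                      "currency": "Euro", "year": "<year>"},
          "valid": "<valid>", "others": ["<num>", 2, 3], "name": "<items>"}

result = format_params(record, params_lookup)
print(result)
```

With the data from the question, `result` should match the desired record: "year" stays an int, "valid" stays a bool, and "<population> million" becomes "47 million".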
Shanavas M
  • Thanks for your post, but your code's output is not the same as the desired result. In the desired result, integer and boolean lookups keep their types instead of being converted to strings. – kites Jan 30 '22 at 14:07
  • That's a bit tricky. Added a work around for that. – Shanavas M Jan 30 '22 at 14:12