I am trying to write something like an explode function for a JSON file. The loop should read the file line by line. Each line holds multiple values that I want to extract and combine with the main part of the line (like LATERAL VIEW or the explode function in SQL).

The data looks like this:

{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717,"wl_key3":589101,"wl_key4":23095,"wl_key5":200527,"wl_key6":60319}

What I want is to explode it, as in SQL, into this:

{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717}
{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key3":589101}
{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key4":23095}
{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key5":200527}


import io
import sys
import re

i = 0
with io.open('lateral_result.json', 'w', encoding="utf-8") as f, io.open('lat.json', encoding="utf-8") as g:
    for line in g:
        x = re.search('(.*wl_timestamp":"[^"]+",)', line)
        y = re.search('("wl_key[^,]+),', line)
        for y in line:
            i = i + 1
            print (x.group(0), y.group(i),'}', file=f)

I keep getting an error saying that I can't call group on a str. When I move the regex into the inner for loop, it either returns only the first match and does nothing else, or it writes the same result once for every character in the line.

Vedad
  • Why do you use regex to parse JSON? Use json.load() and inspect the resulting data structures. [What is a XY-Problem?](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) – Patrick Artner Jan 20 '19 at 10:34
  • The tags _explode_ and _lateral_ are misleading - explode is PHP not python and lateral is only watched by 3 ppl - better tag [tag:python] in addition to python-3.x. By tagging _explode_ you target PHP devs that can't really help with python. – Patrick Artner Jan 20 '19 at 10:35

2 Answers

Don't use regex on JSON. Parse it with the json module and work on the resulting data structure:

import json

data_str = """{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717,"wl_key3":589101,"wl_key4":23095,"wl_key5":200527,"wl_key6":60319}"""

data = json.loads(data_str)  # you can use json.load( file_handle )

print(data)  # prints the parsed dict (not shown in the output below)

for k in (x for x in data.keys() if x.startswith("wl_key")):
    print(data["wl_timestamp"],k,data[k])

Output:

2013-01-27 16:07:02 wl_key2 103717
2013-01-27 16:07:02 wl_key3 589101
2013-01-27 16:07:02 wl_key4 23095
2013-01-27 16:07:02 wl_key5 200527
2013-01-27 16:07:02 wl_key6 60319
Patrick Artner

Here is the code that solved my case:

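import json
import io
import re

with io.open('lateral_result.json', 'w', encoding="utf-8") as f, io.open('lat.json', encoding="utf-8") as g:
    for line in g:
        # Each line of the input file holds one JSON object
        data = json.loads(line)
        # Everything up to the timestamp field, plus the opening quote of the next key
        x = re.search('(.*wl_timestamp":"[^"]+",")', line)
        for k in (key for key in data.keys() if key.startswith("wl_key")):
            # Shared prefix followed by a single wl_key* pair and the closing brace
            print(x.group(0) + str(k) + '":' + str(data[k]) + '}', file=f)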
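
The same file-to-file loop can also be written without the regex, by rebuilding each output line from the parsed dict. This is a sketch under a few assumptions: the function name `explode_file` and its parameters are invented for illustration, the input is assumed to hold one JSON object per line, and blank lines are skipped because `json.loads` would reject them.

```python
import io
import json

def explode_file(src_path, dst_path, prefix="wl_key"):
    """Read JSON lines from src_path and write one line per `prefix` key to dst_path.

    Hypothetical helper; returns the number of lines written.
    """
    written = 0
    with io.open(src_path, encoding="utf-8") as src, \
         io.open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines, which json.loads cannot parse
            record = json.loads(line)
            # Fields repeated on every output line (everything except wl_key*).
            shared = {k: v for k, v in record.items() if not k.startswith(prefix)}
            for key, value in record.items():
                if key.startswith(prefix):
                    row = {**shared, key: value}
                    # Compact separators match the input; ensure_ascii=False keeps UTF-8 text as-is.
                    dst.write(json.dumps(row, separators=(",", ":"), ensure_ascii=False) + "\n")
                    written += 1
    return written
```

With the file names from the question, the call would be `explode_file('lat.json', 'lateral_result.json')`.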
Vedad