I'm currently working on a script that rearranges JSON data into a simpler format so I can feed it to another YOLO box plot script. So far the script prints the data in exactly the format I want. However, I would like to save the output to a text file so I don't have to copy and paste it every time. This turned out to be more difficult than I first anticipated.

So here is the code that currently "works":

import sys

data    = open(sys.argv[1], 'r')

with data as file:
    for line in file:
        split = line.split()
        if split[0] == '"x":':
            print("0", split[1][0:8], end = ' ')
        if split[0] == '"y":':
            print(split[1][0:8], end = ' ')
        if split[0] == '"w":':
            print(split[1][0:8], end = ' ')
        if split[0] == '"h":':
            print(split[1][0:8]) 

And here is an example of the dataset that will be run through this script:

{
    "car": {
        "count": 7,
        "instances": [
            {
                "bbox": {
                    "x": 0.03839285671710968,
                    "y": 0.8041666746139526,
                    "w": 0.07678571343421936,
                    "h": 0.16388888657093048
                },
                "confidence": 0.41205787658691406
            },
            {
                "bbox": {
                    "x": 0.9330357313156128,
                    "y": 0.8805555701255798,
                    "w": 0.1339285671710968,
                    "h": 0.2222222238779068
                },
                "confidence": 0.8200334906578064
            },
            {
                "bbox": {
                    "x": 0.15803571045398712,
                    "y": 0.8111110925674438,
                    "w": 0.22678571939468384,
                    "h": 0.21111111342906952
                },
                "confidence": 0.8632314801216125
            },
            {
                "bbox": {
                    "x": 0.762499988079071,
                    "y": 0.8916666507720947,
                    "w": 0.1428571492433548,
                    "h": 0.20555555820465088
                },
                "confidence": 0.8819259405136108
            },
            {
                "bbox": {
                    "x": 0.4178571403026581,
                    "y": 0.8902778029441833,
                    "w": 0.17499999701976776,
                    "h": 0.17499999701976776
                },
                "confidence": 0.8824222087860107
            },
            {
                "bbox": {
                    "x": 0.5919643044471741,
                    "y": 0.8722222447395325,
                    "w": 0.16607142984867096,
                    "h": 0.25
                },
                "confidence": 0.8865317106246948
            },
            {
                "bbox": {
                    "x": 0.27767857909202576,
                    "y": 0.8541666865348816,
                    "w": 0.2053571492433548,
                    "h": 0.1805555522441864
                },
                "confidence": 0.8922017216682434
            }
        ]
    }
}

The output looks like this:

0 0.038392 0.804166 0.076785 0.163888
0 0.933035 0.880555 0.133928 0.222222
0 0.158035 0.811111 0.226785 0.211111
0 0.762499 0.891666 0.142857 0.205555
0 0.417857 0.890277 0.174999 0.174999
0 0.591964 0.872222 0.166071 0.25
0 0.277678 0.854166 0.205357 0.180555

Instead of printing these lines, I've tried writing them to a new text file, but I keep getting the error "ValueError: I/O operation on closed file." My guess is that this happens because I already have one file open, and opening a second one closes the first. Is there an easy way to work around this? Or is it more hassle than it's worth, so that copy/pasting the printed result is the "easiest" way?

Baowz
  • I'd avoid carrying out operations while reading: try storing everything into a string and then work with the variable – lemon Apr 20 '22 at 20:28
  • You can add `, file=out` to every `print` function to have it redirect to a file. Or, you can build up the result as a list of strings, and write the strings when you're done. – Tim Roberts Apr 20 '22 at 20:29
  • You need to post the code that's getting the error. The error message implies that you're calling `outfile.close()` inside the loop. Don't do that. – Barmar Apr 20 '22 at 20:31
  • Why do you manually parse your json file when the `json` module exists to do it for you? – Pranav Hosangadi Apr 20 '22 at 21:16
  • There *is* a limit to how many files you can have open at once - but it's much, much greater than one. – jasonharper Apr 20 '22 at 21:18
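As a minimal sketch of the `file=out` suggestion from the comments, the original line-based loop can write to a file instead of the console. The function name `convert` and the output path are assumptions for illustration, not from the question. Both files are opened in a single `with` statement, so neither is closed before the loop finishes; opening a second file does not close the first.

```python
def convert(in_path, out_path):
    """Copy the x/y/w/h values from in_path to out_path, one box per line."""
    # Both files stay open until the end of this with block.
    with open(in_path, "r") as src, open(out_path, "w") as out:
        for line in src:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip lines such as "{" or "}," that carry no value
            key, value = parts[0], parts[1][0:8]  # same slicing as the question
            if key == '"x":':
                print("0", value, end=" ", file=out)
            elif key in ('"y":', '"w":'):
                print(value, end=" ", file=out)
            elif key == '"h":':
                print(value, file=out)  # ends the line for this box
```

It could be called as `convert(sys.argv[1], "output.txt")`. Like the original, this depends on the JSON being pretty-printed with one key per line.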

2 Answers

Why not use the json and csv modules?

import csv
import json

# import sys
# file = sys.argv[1]

file = "input.json"
output_file = "output.csv"

with open(file, "r") as data_file:

    data = json.load(data_file)

    # newline="" stops csv.writer from emitting blank lines between rows on some platforms
    with open(output_file, "w", newline="") as csv_file:
        
        writer = csv.writer(csv_file, delimiter=' ')

        for value in data.values():

            instances = value.get("instances")
            bboxes = [instance.get("bbox") for instance in instances]

            for bbox in bboxes:
                
                writer.writerow([
                    0, 
                    f"{bbox['x']:.6f}",
                    f"{bbox['y']:.6f}",
                    f"{bbox['w']:.6f}",
                    f"{bbox['h']:.6f}",
                ])

Output:

0 0.038393 0.804167 0.076786 0.163889
0 0.933036 0.880556 0.133929 0.222222
0 0.158036 0.811111 0.226786 0.211111
0 0.762500 0.891667 0.142857 0.205556
0 0.417857 0.890278 0.175000 0.175000
0 0.591964 0.872222 0.166071 0.250000
0 0.277679 0.854167 0.205357 0.180556

Notes:

  • It's important to understand the input file format you are working with, which here is JSON.
  • Both examples round the values to 6 decimal places. I'm not sure what your requirements are, so adjust f"{bbox['x']:.6f}" and the three lines after it to fit your use case.
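The rounding in the f-strings is not the same as the slicing in the question, which truncates: the first x value comes out as 0.038393 here but 0.038392 in the question's output. If truncation is what the downstream script expects, one possible sketch uses `decimal` with `ROUND_DOWN`. The helper name `truncate6` is hypothetical.

```python
from decimal import Decimal, ROUND_DOWN


def truncate6(value):
    """Format value with 6 decimal places, cutting off extra digits instead of rounding."""
    # str(value) gives the shortest repr of the float, which Decimal parses exactly.
    return str(Decimal(str(value)).quantize(Decimal("0.000001"), rounding=ROUND_DOWN))


x = 0.03839285671710968
rounded = f"{x:.6f}"      # rounds the 6th decimal up
truncated = truncate6(x)  # keeps the 6th decimal as-is
```

Unlike the string slice, this always yields six decimal places, so 0.25 becomes 0.250000 rather than 0.25.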

Or, if you want to use jmespath along with csv and json:

import csv
import json
import jmespath  # pip install jmespath

# import sys
# file = sys.argv[1]

file = "input.json"
output_file = "output.csv"

with open(file, "r") as data_file:

    data = json.load(data_file)
    
    bboxes = jmespath.search("*.instances[*].bbox", data)

    # newline="" stops csv.writer from emitting blank lines between rows on some platforms
    with open(output_file, "w", newline="") as csv_file:

        writer = csv.writer(csv_file, delimiter=' ')

        for bbox in bboxes[0]:
            
            writer.writerow([
                0, 
                f"{bbox['x']:.6f}",
                f"{bbox['y']:.6f}",
                f"{bbox['w']:.6f}",
                f"{bbox['h']:.6f}",
            ])
Thomas
  • This is exactly what I wanted, thank you so much for your help! I would like to note one thing: When using "csv.writer" it creates an empty line between every entry. This would not work for my purpose, but that can easily be fixed by adding " newline='' " in the open command. – Baowz Apr 25 '22 at 14:10

I suggest parsing the file as JSON rather than as raw text. If the file is JSON, treat it as JSON. Otherwise you risk the unfortunate case in which the file is valid but minified JSON, and the lack of line breaks turns string processing into a nightmare of fragile regexes. Or, possibly worse, the file is invalid JSON.

import json
import sys
with open(sys.argv[1], 'r') as f:
    raw = f.read()
obj = json.loads(raw)
print("\n".join(
    f"0 {i['bbox']['x']:.6f} {i['bbox']['y']:.6f} {i['bbox']['w']:.6f} {i['bbox']['h']:.6f}"
         for i in obj["car"]["instances"])
)
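Since the question asks for a file rather than console output, the same lines could be written out instead of printed. This is a sketch under assumptions: the function names `format_instances` and `save_instances` and the `label` and `class_id` parameters are made up for illustration. It also shows that minified input, with no line breaks at all, is handled the same way.

```python
import json


def format_instances(obj, label="car", class_id=0):
    """Return one 'class x y w h' line per bounding box under obj[label]."""
    return [
        f"{class_id} {b['x']:.6f} {b['y']:.6f} {b['w']:.6f} {b['h']:.6f}"
        for b in (inst["bbox"] for inst in obj[label]["instances"])
    ]


def save_instances(in_path, out_path, label="car"):
    """Parse in_path as JSON and write the formatted lines to out_path."""
    with open(in_path, "r") as src:
        obj = json.load(src)  # raises json.JSONDecodeError if the file is not valid JSON
    with open(out_path, "w") as out:
        out.write("\n".join(format_instances(obj, label)) + "\n")


# One instance from the question's data, minified onto a single line.
minified = (
    '{"car":{"count":1,"instances":[{"bbox":{"x":0.5919643044471741,'
    '"y":0.8722222447395325,"w":0.16607142984867096,"h":0.25},'
    '"confidence":0.8865317106246948}]}}'
)
lines = format_instances(json.loads(minified))
```

It could be called as `save_instances(sys.argv[1], "output.txt")`.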
Michael Ruth