0

I am trying out the following function to generate below csv file, however, I am not sure how to index the dynamic[] to get values matching to specific elements in json input file.

public static void Json_to_Csv(string jsonInputFile, string csvFile)
{
    using (var p = new ChoJSONReader(jsonInputFile).WithJSONPath("$..readResults")) // "readResults": [
    {
        using (var w = new ChoCSVWriter(csvFile).WithFirstLineHeader())
        {
            w.Write(p
                .SelectMany(r1 => ((dynamic[])r1.lines)
                .Select(r2 => new
                {
                    FileName = inputFile,
                    Page = r1.page,
                    PracticeName = r2.text,
                    //OwnerFullName = r2.text.'Owner Full Name',
                    //OwnerEmail = r2.text.'Owner Email'
                })));
        }
    }
}

I expect the following:

File Name,Page,Practice Name,Owner Full Name,Owner Email
file1.json,1,Some Practice Name,Bob Lee,Bob@gmail.com

but i am currently getting the following:

FileName,Page,PracticeName
file1.json,1,Account Information
file1.json,1,Practice Name
file1.json,1,Some Practice Name
file1.json,1,Owner Full Name
file1.json,1,Bob Lee
file1.json,1,Owner Email
file1.json,1,Bob@gmail.com

I tried

PracticeName = r2.'Practice Name',
OwnerFullName = r2.text.'Owner Full Name',
OwnerEmail = r2.text.'Owner Email'

But these result in error:

Invalid anonymous member type declarator
The name text does not appear in this context

file1.json sample:

{
  "status": "succeeded",
  "createdDateTime": "2020-10-22T19:35:35Z",
  "lastUpdatedDateTime": "2020-10-22T19:35:36Z",
  "analyzeResult": {
    "version": "3.0.0",
    "readResults": [
      {
        "page": 1,
        "angle": 0,
        "width": 8.5,
        "height": 11,
        "unit": "inch",
        "lines": [          
          {
            "boundingBox": [
              0.5016,
              1.9141,
              2.5726,
              1.9141,
              2.5726,
              2.0741,
              0.5016,
              2.0741
            ],           
           "text": "Account Information",
            "words": [
              {
                "boundingBox": [
                  0.5016,
                  1.9345,
                  1.3399,
                  1.9345,
                  1.3399,
                  2.0741,
                  0.5016,
                  2.0741
                ],
                "text": "Account",
                "confidence": 1
              },
              {
                "boundingBox": [
                  1.3974,
                  1.9141,
                  2.5726,
                  1.9141,
                  2.5726,
                  2.0741,
                  1.3974,
                  2.0741
                ],
                "text": "Information",
                "confidence": 1
              }
            ]
          },
          {
            "boundingBox": [
              1.7716,
              2.4855,
              2.8793,
              2.4855,
              2.8793,
              2.6051,
              1.7716,
              2.6051
            ],
            "text": "Practice Name",
            "words": [
              {
                "boundingBox": [
                  1.7716,
                  2.4855,
                  2.3803,
                  2.4855,
                  2.3803,
                  2.6051,
                  1.7716,
                  2.6051
                ],
                "text": "Practice",
                "confidence": 1
              },
              {
                "boundingBox": [
                  2.4362,
                  2.4948,
                  2.8793,
                  2.4948,
                  2.8793,
                  2.6051,
                  2.4362,
                  2.6051
                ],
                "text": "Name",
                "confidence": 1
              }
            ]
          },
          {
            "boundingBox": [
              2.9993,
              2.5257,
              4.7148,
              2.5257,
              4.7148,
              2.714,
              2.9993,
              2.714
            ],
            "text": "Some Practice Name",
            "words": [
              {
                "boundingBox": [
                  3.0072,
                  2.5385,
                  3.6546,
                  2.5284,
                  3.6516,
                  2.7131,
                  3.0105,
                  2.712
                ],
                "text": "Some",
                "confidence": 0.984
              },
              {
                "boundingBox": [
                  3.6887,
                  2.5281,
                  4.2112,
                  2.5262,
                  4.2028,
                  2.7159,
                  3.6854,
                  2.7132
                ],
                "text": "Parctice",
                "confidence": 0.986
              },
              {
                "boundingBox": [
                  4.2453,
                  2.5263,
                  4.7223,
                  2.5297,
                  4.7091,
                  2.72,
                  4.2366,
                  2.7161
                ],
                "text": "Name",
                "confidence": 0.986
              }
            ]
          },
          {
            "boundingBox": [
              1.6116,
              2.9999,
              2.8816,
              2.9999,
              2.8816,
              3.1158,
              1.6116,
              3.1158
            ],
            "text": "Owner Full Name",
            "words": [
              {
                "boundingBox": [
                  1.6116,
                  3.0039,
                  2.1026,
                  3.0039,
                  2.1026,
                  3.1157,
                  1.6116,
                  3.1157
                ],
                "text": "Owner",
                "confidence": 1
              },
              {
                "boundingBox": [
                  2.1541,
                  2.9999,
                  2.3784,
                  2.9999,
                  2.3784,
                  3.1158,
                  2.1541,
                  3.1158
                ],
                "text": "Full",
                "confidence": 1
              },
              {
                "boundingBox": [
                  2.4384,
                  3.0052,
                  2.8816,
                  3.0052,
                  2.8816,
                  3.1155,
                  2.4384,
                  3.1155
                ],
                "text": "Name",
                "confidence": 1
              }
            ]
          },
          {
            "boundingBox": [
              2.9993,
              3.0242,
              3.6966,
              3.0242,
              3.6966,
              3.2125,
              2.9993,
              3.2014
            ],
            "text": "Bob Lee",
            "words": [
              {
                "boundingBox": [
                  3.0063,
                  3.0303,
                  3.3439,
                  3.0349,
                  3.3461,
                  3.2125,
                  3.007,
                  3.2081
                ],
                "text": "Bob",
                "confidence": 0.987
              },
              {
                "boundingBox": [
                  3.3788,
                  3.0349,
                  3.6931,
                  3.0326,
                  3.697,
                  3.2121,
                  3.3813,
                  3.2125
                ],
                "text": "Lee",
                "confidence": 0.983
              }
            ]
          },
          {
            "boundingBox": [
              1.945,
              3.5063,
              2.8748,
              3.5063,
              2.8748,
              3.6261,
              1.945,
              3.6261
            ],
            "text": "Owner Email",
            "words": [
              {
                "boundingBox": [
                  1.945,
                  3.5143,
                  2.4359,
                  3.5143,
                  2.4359,
                  3.6261,
                  1.945,
                  3.6261
                ],
                "text": "Owner",
                "confidence": 1
              },
              {
                "boundingBox": [
                  2.4874,
                  3.5063,
                  2.8748,
                  3.5063,
                  2.8748,
                  3.6259,
                  2.4874,
                  3.6259
                ],
                "text": "Email",
                "confidence": 1
              }
            ]
          },
          {
            "boundingBox": [
              3.0104,
              3.5005,
              4.6042,
              3.5005,
              4.6042,
              3.6888,
              3.0104,
              3.6777
            ],
            "text": "bob@gmail.com",
            "words": [
              {
                "boundingBox": [
                  3.0212,
                  3.5047,
                  4.5837,
                  3.5039,
                  4.5769,
                  3.6886,
                  3.0129,
                  3.6787
                ],
                "text": "bob@gmail.com",
                "confidence": 0.951
              }
            ]
          }
        ]
      }
    ]
  }
}
Cinchoo
  • 6,088
  • 2
  • 19
  • 34
Cataster
  • 3,081
  • 5
  • 32
  • 79
  • What more do I need to clarify? I've laid out exactly what I'm looking for, what I've tried and my code. – Cataster Oct 23 '20 at 03:48

1 Answers1

1

Based on JSON structure, and your expected CSV output, here is how you can do it

StringBuilder csv = new StringBuilder();
using (var p = new ChoJSONReader("*** YOUR JSON FILE ***")
    .WithJSONPath("$..readResults")
    )
{
    using (var w = new ChoCSVWriter(csv)
        .WithFirstLineHeader()
        )
    {
        w.Write(p
            .Select(r1 =>
            {
                var lines = (dynamic[])r1.lines;
                return new
                {
                    FileName = "file1.json",
                    Page = r1.page,
                    PracticeName = lines[2].text,
                    OwnerFullName = lines[4].text,
                    OwnerEmail = lines[6].text,
                };
            }
    ));
    }
}

Console.WriteLine(csv.ToString());
Cinchoo
  • 6,088
  • 2
  • 19
  • 34
  • This works great :) but i was hoping theres a way to index them by element name. the reason for this is lets suppose i have another json file, where the ordering of the elements is a little different. then what i would get from say `PracticeName = lines[2].text,` is another text/element different from what im actually looking for, the Practice Name. I was hoping to be able to do something like this: `PracticeName = lines['Practice Name'].text,`, but i get "Too many characters in character literal" error – Cataster Oct 23 '20 at 15:32
  • I can't seem to find a way pull them by other means from JSON. There is no other contextual information on each item to reference them other than by index. – Cinchoo Oct 23 '20 at 15:40
  • np. this is good for now but if there is a better way to make it indexable by element name that would be preferable. which is why i was trying it this way `PracticeName = r2.'Practice Name', OwnerFullName = r2.text.'Owner Full Name', OwnerEmail = r2.text.'Owner Email'` but of course that doesnt work. – Cataster Oct 23 '20 at 15:50
  • hey I thought about it, do you know how we can limit the page number? so another reason why it wouldve been nice to index by element name is that the JSON may have multiple pages, but the elements in the generated csv, e.g. `Practice Name, Owner Full Name` only conform to values matching these elements which are just in the first page. However, I am currently getting page2 for a 2nd row in the csv (as expected), with values that dont match with the element names in the csv because those elements arent in page2 of the json. if the reading results can be limited to page1, that would be helpful. – Cataster Oct 23 '20 at 19:51
  • just add `where` clause to limit the result to page 1. `w.Write(p.Where(r1 => r1.page == 1).Select(r1 =>` – Cinchoo Oct 23 '20 at 20:08
  • perfect! i added some extra fields, such as `Phone Number`. however, i noticed if the Phone Number is purely numerical, such as `6013456941` in the JSON file, it shows up as a scientific number in the csv unless i expand the cell field manually. if the Number is like this however: `(601) 345-6941` thats how i see it in the csv file. is there a way not to make the phone number turn scientific if its purely numerical in the JSON? would it be like this `PhoneNumber = (int)lines[12].text,`? – Cataster Oct 23 '20 at 20:18
  • So over the last 2 days I've been looking into dynamic deserialization. However, in this json, "Owner Full Name" and "Bob Lee" for example are not the relationship between attributes and values, but the values ​​of Text attributes in two unrelated objects. We can't establish a connection between them, except to specify manually as in your code. So I guess choJSON is truly the only way, and I truly thank you for your help :) – Cataster Oct 26 '20 at 03:10