1

I have the following code that loops through json files in a directory and creates a csv file with following records:

results.csv

File Name    Page  Practice Name
fileXYZ.json 1     XYZ & Co
fileAB2.json 1     ABC & Co
file1.json   1     Associates & Co

However, if i stop execution and rerun the program again, what happens is that the same records get inserted in the csv file again, resulting in:

File Name    Page  Practice Name
fileXYZ.json 1     XYZ & Co
fileAB2.json 1     ABC & Co
file1.json   1     Associates & Co
fileXYZ.json 1     XYZ & Co
fileAB2.json 1     ABC & Co
file1.json   1     Associates & Co 

How do i check if the record already exists (i.e. each field is the same of the fields being inserted) and replace it (or basically dont append it again?) for example, if i were to run the program again because there was a change in file1.json and also because there was a new file added to the directory, the new csv should look like this:

results.csv:

File Name    Page  Practice Name
fileXYZ.json 1     XYZ & Co
fileAB2.json 1     ABC & Co
file1.json   1     Corpum & Co
file32.json  1     FirmA

code:

using ChoETL;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static void Main(string[] args)
{
    //Output to CSV
    foreach (var jsonFile in Directory.GetFiles(jsonFilesPath))
    {
        JsonToCsv(jsonFile, csvFilePath);
    }
}
public static string fieldValue(IEnumerable<dynamic> lines, string nameOfField, bool throwException = false)
{
    var skipped = lines.SkipWhile(l => l.text != nameOfField);

    switch (throwException)
    {
        case true:
            var enumerator = lines.GetEnumerator();
            
            while (enumerator.MoveNext())
            {
                if (skipped.Count() == 0)
                    return skipped.Skip(1).First().text;
                else
                    throw new InvalidDataException("Odd number of items found in IEnumerable<>");
            }

            break;
        case false:
            // Skip(#) to skip over the unnecessary Lines, 
            // such as "Account Information", preceding "Practice Name".
            return skipped.Skip(1).First().text;

            break;
        default:
            Console.WriteLine("Default case");
            break;
    }
    // Returning null isn't recommended, but it does fix the error "not all code paths return a value"
    return null;
}
public static void JsonToCsv(string jsonInputFile, string csvFile)
{
    using (var p = new ChoJSONReader(jsonInputFile).WithJSONPath("$..readResults"))
    {
        using (var fs = new FileStream(csvFile, FileMode.Append, FileAccess.Write))
        {
             using (var writer = new ChoCSVWriter(fs))
             {
                writer.WithField("FileName", fieldName: "File Name")
                    .WithField("Page")
                    .WithField("PracticeName", fieldName: "Practice Name");

                if (fs.Position == 0) // we don't need header if file already existed before
                {
                    writer.WithFirstLineHeader();
                }
                
                // Limit the result to page 1 since the fields below only exist on the 1st page
                writer.Write(p
                    .Where(r1 => r1.page == 1)
                    .Select(r1 =>
                    {
                        var lines = (dynamic[])r1.lines;
                        return new
                        {
                            FileName = jsonInputFile,
                            Page = r1.page,
                            PracticeName = //lines[6].text,
                                fieldValue(lines, "Practice Name"),
                        };
                    }
                ));

                }

                fs.Write(Environment.NewLine); // append new line carrier so we don't write to the same line when file reopened for writing
        }
    }
}

Sample JSON file

{
  "status": "succeeded",
  "createdDateTime": "2020-10-30T15:56:11Z",
  "lastUpdatedDateTime": "2020-10-30T15:56:12Z",
  "analyzeResult": {
    "version": "3.0.0",
    "readResults": [
      {
        "page": 1,
        "angle": 0.086,
        "width": 684,
        "height": 272,
        "unit": "pixel",
        "lines": [
          {
            "boundingBox": [
              7,
              6,
              196,
              5,
              196,
              24,
              7,
              25
            ],
            "text": "Account Information",
            "words": [
              {
                "boundingBox": [
                  10,
                  7,
                  83,
                  7,
                  81,
                  24,
                  7,
                  26
                ],
                "text": "Account",
                "confidence": 0.981
              },
              {
                "boundingBox": [
                  87,
                  7,
                  196,
                  6,
                  196,
                  24,
                  85,
                  24
                ],
                "text": "Information",
                "confidence": 0.939
              }
            ]
          },
          {
            "boundingBox": [
              120,
              56,
              223,
              57,
              223,
              70,
              120,
              70
            ],
            "text": "Practice Name",
            "words": [
              {
                "boundingBox": [
                  120,
                  57,
                  176,
                  57,
                  176,
                  70,
                  120,
                  71
                ],
                "text": "Practice",
                "confidence": 0.982
              },
              {
                "boundingBox": [
                  179,
                  57,
                  222,
                  57,
                  222,
                  71,
                  179,
                  70
                ],
                "text": "Name",
                "confidence": 0.985
              }
            ]
          },
          {
            "boundingBox": [
              236,
              62,
              390,
              62,
              390,
              77,
              236,
              77
            ],
            "text": "Some Practice Name",
            "words": [
              {
                "boundingBox": [
                  236,
                  62,
                  277,
                  62,
                  277,
                  78,
                  236,
                  78
                ],
                "text": "Some",
                "confidence": 0.987
              },
              {
                "boundingBox": [
                  280,
                  62,
                  340,
                  62,
                  341,
                  78,
                  280,
                  77
                ],
                "text": "Practice",
                "confidence": 0.984
              },
              {
                "boundingBox": [
                  343,
                  62,
                  390,
                  62,
                  390,
                  78,
                  344,
                  78
                ],
                "text": "Name",
                "confidence": 0.987
              }
            ]
          },
          {
            "boundingBox": [
              107,
              102,
              223,
              102,
              223,
              115,
              107,
              115
            ],
            "text": "Owner Full Name",
            "words": [
              {
                "boundingBox": [
                  108,
                  103,
                  151,
                  102,
                  151,
                  116,
                  107,
                  116
                ],
                "text": "Owner",
                "confidence": 0.985
              },
              {
                "boundingBox": [
                  154,
                  102,
                  177,
                  102,
                  176,
                  116,
                  153,
                  116
                ],
                "text": "Full",
                "confidence": 0.954
              },
              {
                "boundingBox": [
                  180,
                  102,
                  224,
                  103,
                  223,
                  116,
                  179,
                  116
                ],
                "text": "Name",
                "confidence": 0.987
              }
            ]
          },
          {
            "boundingBox": [
              237,
              104,
              298,
              104,
              298,
              119,
              237,
              119
            ],
            "text": "Bob Lee",
            "words": [
              {
                "boundingBox": [
                  238,
                  104,
                  266,
                  104,
                  266,
                  119,
                  238,
                  120
                ],
                "text": "Bob",
                "confidence": 0.987
              },
              {
                "boundingBox": [
                  269,
                  104,
                  298,
                  105,
                  298,
                  120,
                  269,
                  119
                ],
                "text": "Lee",
                "confidence": 0.987
              }
            ]
          },
          {
            "boundingBox": [
              136,
              147,
              223,
              147,
              223,
              160,
              137,
              161
            ],
            "text": "Owner Email",
            "words": [
              {
                "boundingBox": [
                  137,
                  148,
                  181,
                  147,
                  181,
                  161,
                  137,
                  162
                ],
                "text": "Owner",
                "confidence": 0.985
              },
              {
                "boundingBox": [
                  184,
                  147,
                  224,
                  147,
                  224,
                  161,
                  184,
                  161
                ],
                "text": "Email",
                "confidence": 0.985
              }
            ]
          },
          {
            "boundingBox": [
              239,
              144,
              361,
              144,
              361,
              162,
              239,
              162
            ],
            "text": "bob@gmail.com",
            "words": [
              {
                "boundingBox": [
                  240,
                  145,
                  362,
                  146,
                  361,
                  163,
                  240,
                  163
                ],
                "text": "bob@gmail.com",
                "confidence": 0.974
              }
            ]
          },
          {
            "boundingBox": [
              137,
              193,
              224,
              193,
              224,
              208,
              137,
              208
            ],
            "text": "Server Setup",
            "words": [
              {
                "boundingBox": [
                  137,
                  194,
                  179,
                  194,
                  179,
                  208,
                  137,
                  208
                ],
                "text": "Server",
                "confidence": 0.985
              },
              {
                "boundingBox": [
                  182,
                  194,
                  224,
                  194,
                  224,
                  209,
                  182,
                  208
                ],
                "text": "Setup",
                "confidence": 0.985
              }
            ]
          },
          {
            "boundingBox": [
              276,
              188,
              340,
              192,
              339,
              211,
              275,
              209
            ],
            "text": "cloud",
            "words": [
              {
                "boundingBox": [
                  297,
                  192,
                  339,
                  194,
                  339,
                  211,
                  297,
                  211
                ],
                "text": "cloud",
                "confidence": 0.933
              }
            ]
          },
          {
            "boundingBox": [
              376,
              187,
              461,
              191,
              460,
              212,
              376,
              211
            ],
            "text": "Location",
            "words": [
              {
                "boundingBox": [
                  394,
                  191,
                  460,
                  196,
                  459,
                  211,
                  394,
                  211
                ],
                "text": "Location",
                "confidence": 0.844
              }
            ]
          },
          {
            "boundingBox": [
              500,
              189,
              666,
              192,
              665,
              212,
              499,
              211
            ],
            "text": "LIcentral (multi-location)",
            "words": [
              {
                "boundingBox": [
                  501,
                  190,
                  567,
                  195,
                  567,
                  212,
                  500,
                  212
                ],
                "text": "LIcentral",
                "confidence": 0.665
              },
              {
                "boundingBox": [
                  572,
                  195,
                  665,
                  195,
                  665,
                  212,
                  571,
                  212
                ],
                "text": "(multi-location)",
                "confidence": 0.899
              }
            ]
          },
          {
            "boundingBox": [
              21,
              238,
              224,
              238,
              223,
              255,
              21,
              253
            ],
            "text": "Number of Locations Enrolling",
            "words": [
              {
                "boundingBox": [
                  21,
                  239,
                  76,
                  239,
                  76,
                  253,
                  21,
                  253
                ],
                "text": "Number",
                "confidence": 0.985
              },
              {
                "boundingBox": [
                  79,
                  239,
                  92,
                  239,
                  92,
                  253,
                  79,
                  253
                ],
                "text": "of",
                "confidence": 0.983
              },
              {
                "boundingBox": [
                  95,
                  239,
                  161,
                  239,
                  161,
                  254,
                  95,
                  253
                ],
                "text": "Locations",
                "confidence": 0.981
              },
              {
                "boundingBox": [
                  164,
                  239,
                  224,
                  239,
                  223,
                  256,
                  163,
                  254
                ],
                "text": "Enrolling",
                "confidence": 0.983
              }
            ]
          },
          {
            "boundingBox": [
              273,
              237,
              289,
              239,
              288,
              257,
              272,
              255
            ],
            "text": "1",
            "words": [
              {
                "boundingBox": [
                  278,
                  237,
                  290,
                  239,
                  287,
                  257,
                  276,
                  255
                ],
                "text": "1",
                "confidence": 0.981
              }
            ]
          },
          {
            "boundingBox": [
              337,
              239,
              670,
              239,
              670,
              253,
              337,
              252
            ],
            "text": "*If more than 1 location, add info on the locations form",
            "words": [
              {
                "boundingBox": [
                  338,
                  239,
                  347,
                  239,
                  347,
                  252,
                  338,
                  252
                ],
                "text": "*If",
                "confidence": 0.874
              },
              {
                "boundingBox": [
                  350,
                  239,
                  384,
                  239,
                  384,
                  253,
                  350,
                  252
                ],
                "text": "more",
                "confidence": 0.983
              },
              {
                "boundingBox": [
                  386,
                  239,
                  416,
                  239,
                  416,
                  253,
                  386,
                  253
                ],
                "text": "than",
                "confidence": 0.986
              },
              {
                "boundingBox": [
                  419,
                  239,
                  422,
                  239,
                  422,
                  253,
                  419,
                  253
                ],
                "text": "1",
                "confidence": 0.635
              },
              {
                "boundingBox": [
                  425,
                  239,
                  478,
                  239,
                  478,
                  253,
                  425,
                  253
                ],
                "text": "location,",
                "confidence": 0.955
              },
              {
                "boundingBox": [
                  481,
                  239,
                  506,
                  239,
                  506,
                  253,
                  481,
                  253
                ],
                "text": "add",
                "confidence": 0.986
              },
              {
                "boundingBox": [
                  509,
                  239,
                  533,
                  239,
                  533,
                  253,
                  509,
                  253
                ],
                "text": "info",
                "confidence": 0.981
              },
              {
                "boundingBox": [
                  535,
                  239,
                  551,
                  239,
                  552,
                  253,
                  535,
                  253
                ],
                "text": "on",
                "confidence": 0.988
              },
              {
                "boundingBox": [
                  554,
                  239,
                  574,
                  239,
                  575,
                  253,
                  554,
                  253
                ],
                "text": "the",
                "confidence": 0.987
              },
              {
                "boundingBox": [
                  577,
                  239,
                  634,
                  239,
                  634,
                  253,
                  577,
                  253
                ],
                "text": "locations",
                "confidence": 0.973
              },
              {
                "boundingBox": [
                  636,
                  239,
                  666,
                  240,
                  666,
                  253,
                  637,
                  253
                ],
                "text": "form",
                "confidence": 0.986
              }
            ]
          }
        ]
      }
    ]
  }
}

screenshot of csv after adding spproach 2 of Supun De Silva answer: enter image description here

Cinchoo
  • 6,088
  • 2
  • 19
  • 34
Cataster
  • 3,081
  • 5
  • 32
  • 79
  • make sure you use latest code from my previous answer, otherwise calling Dispose twice will result in writing to a file twice – Yehor Androsov Nov 12 '20 at 07:29
  • 2
    A CSV file is a text file, not a database. You can't insert/modify rows as if it were a database, you can only overwrite the file or append text to it. The code needs a bit of cleaning too - anything that needs disposing should be declared in a `using` block, or in C# 8, with `using var` – Panagiotis Kanavos Nov 12 '20 at 07:30
  • @PanagiotisKanavos ohhh...i probably will end up inserting into a database at some point...but thats good to know – Cataster Nov 12 '20 at 07:31
  • @PanagiotisKanavos it is just we need conditional building of writer, `using` block won't allow to do that. should we declare another local variable inside of `using` instead? – Yehor Androsov Nov 12 '20 at 07:37
  • The `using` block does allow this and your code doesn't build the writer based on any condition anyway. It uses a fluent API. Nothing prevents you from writing `using(var writer=new ...). On the other hand, if the fluent API throws before the final `With` the generated writer won't be disposed because it was never assigned to the variable. – Panagiotis Kanavos Nov 12 '20 at 07:42
  • @PanagiotisKanavos thanks, I confused Fluent API with building – Yehor Androsov Nov 12 '20 at 07:47
  • 4
    why don't you simple delete the old file before starting your program (or adding text to the csv)? – SaschaP Nov 12 '20 at 09:08
  • Tending towards answer by @SaschaP. **How heavy is it to re-create the csv file ?** (you can create a backup of the old if necessary) – Supun De Silva Nov 12 '20 at 09:11
  • @SaschaP thats definitely an option I considered, and this is the starting point I have: Switch between create and append options `bool someCondition = true; FileMode fileMode = someCondition ? FileMode.Append : FileMode.Create; using (var fs = new FileStream(@"D:\file.csv", fileMode, FileAccess.Write))` however idk how to make the condition switch dynamically based on session. I've done some research but all I'm getting is cookie/HTTP sessions..I want something like as you say, everytime the program starts the condition sets the mode to create the file then immediately switch it to append. – Cataster Nov 12 '20 at 15:30
  • @SupunDeSilva please see my comment above – Cataster Nov 12 '20 at 15:31
  • @Cataster just quick question, do you have a json sample that I can feed in and are you certain about `inputFile` and `fieldValue(lines, "Practice Name")` (**does not compile**) Add the missing bits to the question. I will have a look at it. – Supun De Silva Nov 13 '20 at 11:25
  • @SupunDeSilva sure, please see my editted post. i added the fieldValue function as well as a sample json file – Cataster Nov 13 '20 at 15:01
  • @Cataster I don't get it, why you need the switch between create and append. If you use `FileMode.Create` the file is created if it does not exist, or it is truncated if it already exists. In either way you are writing to an empty text-file – SaschaP Nov 13 '20 at 16:19
  • @SaschaP because `FileMode.Create` creates a new csv file every iteration (foreach in main()). so this means the csv file will only have 1 record for the last file in the last iteration. if FileMode.Create worked in a way that it ONLY creates a new file everytime the program runs (new console), then that wouldve been perfect, but it doesnt so thats why i have to write in some sort of condition if its a new run (new console/new session) to switch it back and forth between create and append – Cataster Nov 13 '20 at 16:24
  • @Cataster I see. So then you could for example create an empty csv file before running your `foreach` in `Main` and then simply append text to it. How about this? – SaschaP Nov 13 '20 at 16:30
  • @SaschaP thats a good idea, but feels like i'd be replicating the File creation code. I was hoping to leverage the advantage that FileStream has that it automatically creates the csv file if it doesnt exist. but i can define another Filestream in main() i guess – Cataster Nov 13 '20 at 16:35
  • @Cataster there are many options you have and I'm sure every reader of this thread would write your code different to get the same results :-) If you don't want to duplicate the FileStream-thing, than set a flag in your `Main` and use it to determine if the file has to be created or all text has to be appended. – SaschaP Nov 13 '20 at 16:39
  • @SaschaP so just a quick question regarding session. This may be something i need to know about later on anyways. what is the way to manipulate/track sessions in c#? basically anytime I hit run, and a new console pops up, is that referred to as session? because when i researched a bit the only c# related sessions im getting is asp.net cookie/http sessions, which is not what im looking for – Cataster Nov 13 '20 at 16:43
  • @Cataster do you mean session in term of the execution of your program, from start (console pops up) till end (console window closes)? If so, I don't know if there is a decent way to track how often your program has been executed. In my world I would use a log-file to track whatever I want (# of executions, run-time, etc...) – SaschaP Nov 13 '20 at 16:47
  • @SaschaP yep basically anytime the program executes, the execution :) i refer to it as session because in powershell (c#'s .NET sibling) you can initiate a PSSession, so i thought an execution is referred to as session nd started calling it session from then on – Cataster Nov 13 '20 at 16:49

1 Answers1

2

FYI. Sample file you provided does not work as it fails at var lines = (dynamic[])r1.lines;

Approach 1 - Rename Old File and Create a new one for data appending

1. Introduce new Function

 private static void RenameIfExist(string csvFilePath)
 {
    if (File.Exists(csvFilePath))
    {
        System.IO.File.Move(csvFilePath, $"{csvFilePath}_{DateTime.Now.ToString("backup_yyyyMMdd_HHmmss")}");
    }
}

2. Call the Mover Function and use Create mode in new file

public static void JsonToCsv(string jsonInputFile, string csvFile)
{
    using (var p = new ChoJSONReader(jsonInputFile).WithJSONPath("$..readResults"))
    {
        Program.RenameIfExist(csvFile);

        using (var fs = new FileStream(csvFile, FileMode.Create, FileAccess.Write))
        {
            try
            {
                using (ChoCSVWriter<dynamic>  writer = new ChoCSVWriter(fs)
                    .WithField("FileName", fieldName: "File Name")
                    .WithField("Page")
                    .WithField("PracticeName", fieldName: "Practice Name")
                    .WithFirstLineHeader())
                {
                    // Limit the result to page 1 since the fields below only exist on the 1st page
                    writer.Write(p
                        .Where(r1 => r1.page == 1)
                        .Select(r1 =>
                        {
                            var lines = (dynamic[])r1.lines;
                            return new
                            {
                                FileName = jsonInputFile,
                                Page = r1.page,
                                PracticeName = fieldValue(lines, "Practice Name"),
                            };
                        }
                    ));
                }

            }
            catch(Exception e)
            {
                throw e;
            }
        }
    }
}

Approach 2 - Open Existing File and create a lookup structure with the data

  • You may need to tweak this a bit

1. Declare new struct to store a key

private static Dictionary<string, bool> processedfileStates = new Dictionary<string, bool>();

2. Pre-Loader Function

private static void LoadOldStatsIfExist(string csvFilePath)
{
    if (File.Exists(csvFilePath))
    {
        using (var fs = new FileStream(csvFilePath, FileMode.Open, FileAccess.Read))
        {
            using (ChoCSVReader<dynamic> reader = new ChoCSVReader(fs).WithFirstLineHeader())
            {
                using (var dataReader = reader.AsDataReader())
                {
                    while (dataReader.Read())
                    {
                        Program.processedfileStates.Add($"{dataReader[0].ToString()}_{dataReader[1].ToString()}_{dataReader[2].ToString()}", true);
                    }
                }
            }
        }
            
    }
}

3. Json to CSV fcn

public static void JsonToCsv(string jsonInputFile, string csvFile)
    {
        using (var p = new ChoJSONReader(jsonInputFile).WithJSONPath("$..readResults"))
        {
            Program.LoadOldStatsIfExist(csvFile);

            using (var fs = new FileStream(csvFile, Program.processedfileStates.Count == 0 ? FileMode.Create : FileMode.Append, FileAccess.Write))
            {
                if (Program.processedfileStates.Count != 0)
                {
                    fs.Write(Environment.NewLine);
                }
                
                try
                {
                    ChoCSVWriter<dynamic> writer = new ChoCSVWriter(fs);
                    if (Program.processedfileStates.Count == 0)
                    {
                        writer.WithFirstLineHeader();
                    }

                    using (writer
                        .WithField("FileName", fieldName: "File Name")
                        .WithField("Page")
                        .WithField("PracticeName", fieldName: "Practice Name")
                        )
                    {

                        if (Program.processedfileStates.Count == 0)
                        {
                            writer = writer.WithFirstLineHeader();
                        }

                        // Limit the result to page 1 since the fields below only exist on the 1st page
                        var data = p
                            .Where(r1 => r1.page == 1)

                            .Select(r1 =>
                            {
                                var lines = (dynamic[])r1.lines;
                                return new
                                {
                                    FileName = jsonInputFile,
                                    Page = r1.page,
                                    PracticeName = fieldValue(lines, "Practice Name"),
                                };
                            }
                        ).Where(de => !processedfileStates.ContainsKey($"{de.FileName.ToString()}_{de.Page.ToString()}_{de.PracticeName.ToString()}"));

                        writer.Write(data);
                    }

                }
                catch (Exception e)
                {
                    throw e;
                }
            }
        }
    }

Approach 2 - Refactored

 public class OPModel
{
    public string FileName { get; set; }
    public long Page { get; set; }
    public string PracticeName { get; set; }
}

public class Program
{
    const string jsonFilesPath = "D:\\DevWork\\C#\\TempProject1\\ConsoleApp1\\data";
    const string csvFilePath = "D:\\DevWork\\C#\\TempProject1\\ConsoleApp1\\output\\op.csv";

    private static Dictionary<string, bool> processedfileStates = new Dictionary<string, bool>();
    private static bool fileExisted = false;
    private static void RenameIfExist(string csvFilePath)
    {
        if (File.Exists(csvFilePath))
        {
            System.IO.File.Move(csvFilePath, $"{csvFilePath}_{DateTime.Now.ToString("backup_yyyyMMdd_HHmmss")}");
        }
    }
    private static void LoadOldStatsIfExist(string csvFilePath)
    {
        if (File.Exists(csvFilePath))
        {
            using (var fs = new FileStream(csvFilePath, FileMode.Open, FileAccess.Read))
            {
                using (ChoCSVReader<dynamic> reader = new ChoCSVReader(fs).WithFirstLineHeader())
                {
                    using (var dTable = reader.AsDataTable())
                    {
                        foreach (DataRow row in dTable.Rows)
                        {
                            Program.processedfileStates.Add($"{row["File Name"].ToString()}_{row["Page"].ToString()}_{row["Practice Name"].ToString()}", true);
                        }
                    }
                }
            }
        }
    }
    public static void Main(string[] args)
    {
        try
        {
            Program.fileExisted = File.Exists(csvFilePath);
            Program.LoadOldStatsIfExist(csvFilePath);

            List<OPModel> dataToWrite = new List<OPModel>();
            // Persist each file to
            foreach (var jsonFile in Directory.GetFiles(jsonFilesPath))
            {
                dataToWrite.AddRange(JsonToCsv(jsonFile));
            }

            if (dataToWrite.Count != 0)
            {
                using (var fs = new FileStream(csvFilePath, !Program.fileExisted ? FileMode.Create : FileMode.Append, FileAccess.Write))
                {
                    try
                    {
                        ChoCSVWriter<OPModel> writer = new ChoCSVWriter<OPModel>(fs);
                        using (writer.WithField("FileName", fieldName: "File Name").WithField("Page").WithField("PracticeName", fieldName: "Practice Name"))
                        {
                            if (!Program.fileExisted)
                            {
                                writer = writer.WithFirstLineHeader();
                            }

                            writer.Write(dataToWrite);
                        }

                        fs.Write(Environment.NewLine);

                    }
                    catch (Exception e)
                    {
                        throw e;
                    }
                }
            }
            

            //Output to CSV
            
            Console.ReadKey();
        }
        catch (Exception ex)
        {

        }
    }
    public static string fieldValue(IEnumerable<dynamic> lines, string nameOfField, bool throwException = false)
    {
        var skipped = lines.SkipWhile(l => l.text != nameOfField);

        switch (throwException)
        {
            case true:
                var enumerator = lines.GetEnumerator();

                while (enumerator.MoveNext())
                {
                    if (enumerator.MoveNext())
                        return skipped.Skip(1).First().text;
                    else
                        throw new InvalidDataException("Odd number of items found in IEnumerable<>");
                }

                break;
            case false:
                // Skip(#) to skip over the unnecessary Lines, 
                // such as "Account Information", preceding "Practice Name".
                return skipped.Skip(1).First().text;
            default:
                Console.WriteLine("Default case");
                break;
        }
        // Returning null isn't recommended, but it does fix the error "not all code paths return a value"
        return null;
    }


    public static List<OPModel> JsonToCsv(string jsonInputFile)
    {
        using (var reader = new ChoJSONReader(jsonInputFile).WithJSONPath("$..readResults"))
        {
            var data = reader.Where(r1 => r1.page == 1)
                            .Select(r1 =>
                            {
                                var lines = (dynamic[])r1.lines;
                                return new OPModel
                                {
                                    FileName = jsonInputFile,
                                    Page = r1.page,
                                    PracticeName = Program.fieldValue(lines, "Practice Name")
                                };
                            }
                        ).Where(de => !processedfileStates.ContainsKey($"{de.FileName.ToString()}_{de.Page.ToString()}_{de.PracticeName.ToString()}")).ToList();

            return data;
        }
    }
}
Supun De Silva
  • 1,437
  • 9
  • 15
  • Hmm, let me see why it's not working. I made sure to run it through prior to putting it here. Maybe I posted a different file? Approach 1 is good for backups for sure, but if I'm running this job ebery5 mins or so, it can create a ton of unnecessary files. approach2 sounds interesting. I'm definitely interested about it considering panagiotis implied its something done on databases only not files. – Cataster Nov 14 '20 at 22:03
  • I'll get you a snippet in a few hours time. – Supun De Silva Nov 14 '20 at 22:12
  • sounds good. btw, I updated the code a bit (though i dont think it makes a difference) and i created a new .NET project to test what i posted here and it worked just fine. are you importing namespaces required, like choTL, Linq, etc? i also included them in the code – Cataster Nov 15 '20 at 00:05
  • yes. I have installed nugets `ChoETL` and `ChoETL.JSON` and imports are as follows `using ChoETL; using System; using System.Collections.Generic; using System.IO; using System.Linq;` – Supun De Silva Nov 15 '20 at 02:00
  • Added the second method as well. See if that works for you. Keep an eye at the `catch (Exception e) { throw e; }` – Supun De Silva Nov 15 '20 at 02:24
  • just tested this out. I deleted the existing csv file to test this out fresh. Here's whats happening: including just one file in the directory to process, e.g. Sample.json, i ran the program and as expected it created the csv file, with the same output like this: Headers: `File Name Page Practice Name` Data: `Sample.json 1 Some Practice Name` i ran the program again and nothing was changed, nor was appended again! so that test case (no appending if same file with no change) is success. – Cataster Nov 15 '20 at 04:13
  • continued...Now i changed `Some Practice Name` in the JSON sample to `Some Pra Name` and the headers remained the same as expected, however, this is what the 1st Data row showed: `Sample.json 1 Some Practice NameFile Name Page Practice Name` and the 2nd data row showed: `Sample.json 1 Some Pra Name`. Ive also pasted a screenshot in my post of the csv – Cataster Nov 15 '20 at 04:13
  • What shouldve happened is the same data row shouldve just modified the ”Some Practice Name to ”Some Pra Name” – Cataster Nov 15 '20 at 04:29
  • Did a little tweak, have a look. I am failing to execute the code as the sample json is causing a crash. I missed the `newline char` and `do not append headers` in the append mode. – Supun De Silva Nov 15 '20 at 04:30
  • Note that if you spend a bit of time this can be done nicer I presume. – Supun De Silva Nov 15 '20 at 04:31
  • ok so i updated the method with your changes (please see updated screenshot in my post of current csv state) and now its not appending the header to the data row which is good, however, for some reason, if i run the program again, an empty row is appended, as you can see from my updated screenshot. and if i make changes, it doesnt update the same json file data row (Sample.json), it just appends a new row with the change. – Cataster Nov 15 '20 at 04:48
  • try commenting `if (Program.processedfileStates.Count != 0) { fs.Write(Environment.NewLine); }` – Supun De Silva Nov 15 '20 at 05:15
  • does the json file in example not create an exception for you ? If you have another example json, it would be handy – Supun De Silva Nov 15 '20 at 05:30
  • nope it doesnt. wait what version of choETL did you install? ChoETL.JSON -Version 1.2.1.2? because Version 1.2.1.2 required for .WithField()....not sure if thats why you may be getting the issues. I ran into this kinda problem before so what i did is i uninstalled choETL Version 1.2.1.1 and reinstalled like so: `Install-Package ChoETL.JSON -Version 1.2.1.2` through nuget package manager – Cataster Nov 15 '20 at 05:35
  • I refactored the code a bit. Check in 2 mins. **Approach 2 - Refactored** (The problem was the package version) – Supun De Silva Nov 15 '20 at 08:13
  • This is the output im now getting using the refactored code (deleted existing csv to test fresh): Headers: `File Name Page Practice Name`. Data: `Sample.json 1 Sample.json`. also i noticed when i run it the first time it just keeps on running for a while. before the refactored code the runtime finishes in less than 3 seconds (with just sample.json file in directory). If i change practice name in the json file, it runs quick afterwards as expected, but still output in csv doesnt change (practice name value showing sample.json file name instead). glad you were able to get the exception fixed :) – Cataster Nov 15 '20 at 16:33
  • It keeps running because there is a `Console.ReadKey()` at the end to keep the console open. Remove that. This does not do `Replace` it simply does not write if the record exists. record is identified by `{File Name}_{Page}_{Practice Name}'. say you had file 1 with record before as "data1.json_1_pracOne` if you change that file's `Practice Name` to `PraticeOne`, it will create a new record at the bottom. – Supun De Silva Nov 15 '20 at 20:33
  • hey sry my power was cut off from the wind. so i just tested again with rafactored code today and for practice name its printing out the file name instead...`Headers: File Name Page Practice Name`. Data: `Sample.json 1 Sample.json`. The Data should be like: `Sample.json 1 Some Pra Name`. the nonrefactored code printed out the practice name just fine – Cataster Nov 17 '20 at 05:40
  • Try using that entire code block in a new project. Seems to write fine in my case. – Supun De Silva Nov 17 '20 at 09:24
  • created a new project and also copied the same JSON file i have in the post. still outputting `Sample.json` for the `Practice Name` field...i tried the non-refactored code as well and its outputting the practice name as expected. could it be the DataRow? i had to import `using System.Data;` is that correct? p.s. did you know that if you add a `@` literal to the path, you dont have to escape the path? so instead of `D:\\...\\...\\...` you can say `@"D:\..\...\"`. thought that was a neat trick in c# :) – Cataster Nov 17 '20 at 15:04