
I am writing to a .csv file from multiple dictionaries. I want to limit the output file to a maximum of 10,000 rows and create new files for the remaining lines. The following code writes to the file OutputFile. If, say, the output has 60,000 rows, then instead of printing all 60,000 rows to a single file, I want 10,000 rows per file and the rest of the lines in other files. So I would end up with 6 output files with 10K rows in each file.

using (StreamWriter writer = new StreamWriter(OutputPath + "OutputFile-" + DateTime.Now.ToString("MM.dd.yy_hh.mm.ss.tt") + ".csv"))
{
    foreach (var program in _Code1)     
    {
        string Id = program.ToString();
        foreach (var entry in _List1)
        {
            if (!string.IsNullOrEmpty(entry.Value.ToString().Trim()))
            {
                string LineValue = entry.Value.ToString();
                string[] SplitedLine = LineValue.Split(',');

                string query = "\"Insert into table(Id,UserID) values('" + Id + "','" + SplitedLine[1] + "')\"";
                writer.WriteLine(query);
            }
        }
    }

    foreach (var program in _Code2)   
    {
        string Id = program.ToString();
        foreach (var entry in _List2)
        {
            if (!string.IsNullOrEmpty(entry.Value.ToString().Trim()))
            {
                string LineValue = entry.Value.ToString();
                string[] SplitedLine = LineValue.Split(',');

                string query = "\"Insert into table(Id,UserID) values('" + Id + "','" + SplitedLine[2] + "')\"";
                writer.WriteLine(query);
            }
        }
    }

    foreach (var program in _Code1)  
    {
        string Id = program.ToString();
        foreach (var entry in _List3)
        {
            if (!string.IsNullOrEmpty(entry.Value.ToString().Trim()))
            {
                string LineValue = entry.Value.ToString();
                string[] SplitedLine = LineValue.Split(',');

                string query = "Delete From table Where ID='" + Id + "' And UserID='" + SplitedLine[1] + "'";
                writer.WriteLine(query);
            }
        }
    }
}

I have looked into Split large file into smaller files by number of lines in C#?, but the solution there reads from a single file, so it can count the number of rows. In my case I am writing from multiple dictionaries, so how will I keep track of the maximum size of the output file?

  • You can count the lines and write to another file once you reach the limit – Panagiotis Kanavos Feb 24 '23 at 11:31
  • What are you trying to do though? This isn't a CSV file, it's a SQL script. Why are you writing `Insert into table(Id,UserID) values(...)` to files? If you want to load data into a database, such a script is the slowest possible way to do it. All databases have a way to bulk import data from flat files or CSV files – Panagiotis Kanavos Feb 24 '23 at 11:32
  • 60K rows is almost no data. If you have size or performance issues, it's because of such INSERTs. Right now this code is wasting a lot of bytes repeating the same string, `Insert into table(Id,UserID) values(`. Importing is slow because this script has to execute 60K INSERTs one by one instead of using any of the fast import methods – Panagiotis Kanavos Feb 24 '23 at 11:36
  • Also, if this was SQL, which it actually looks like, a single INSERT can list any number of records, not just one. – 500 - Internal Server Error Feb 24 '23 at 12:03
  • @PanagiotisKanavos This is a requirement: to print these queries in .csv, and in this way only. This is not something that I can change. And 10k lines per file is also a requirement, not something that I can change. – Anonymous Feb 24 '23 at 13:58
  • @500-InternalServerError This is a requirement: to print these inserts in exactly this way in .csv files. This is not something that I can change. – Anonymous Feb 24 '23 at 13:59
  • It's a bit difficult for us to interpret what your requirements are since a CSV is not the same as a SQL script. – 500 - Internal Server Error Feb 24 '23 at 14:18
  • @500-InternalServerError All I want is to split the output file that I am writing in the above code into multiple .csv files if the number of rows exceeds 10k. For example, for 60k output lines, I should have 6 output files with 10k rows each. Now, my issue is that I have multiple dictionaries from which I am reading data, so I am confused about how I should track the number of rows and create new files for new lines. – Anonymous Feb 24 '23 at 14:30
  • This isn't a CSV, it's a SQL script. Please don't keep calling it a CSV and don't repeat the same wrong question. You're trying to create a SQL script. You can count the lines you generated in that loop and create a new file when you reach a specific limit. Have you tried that? – Panagiotis Kanavos Feb 24 '23 at 14:35
  • CSV means Comma Separated Values. It's a text file containing Values Separated by Commas, eg `Banana,123,2023-02-24`. Not INSERT commands. – Panagiotis Kanavos Feb 24 '23 at 14:37
  • @PanagiotisKanavos I know what a .csv file is. I am printing these queries in the same comma-separated format. Why are you focusing on the data? That's not in my hands. – Anonymous Feb 24 '23 at 14:52
  • @500-InternalServerError Please check my answer below and let me know if there are better ways to solve this issue. – Anonymous Feb 24 '23 at 15:09

1 Answer


This is how I fixed my issue. I counted every line written in the code above. Then, if the count exceeded 10k, I opened the file that I had just written, read it back, split its rows into multiple files, and deleted the original file. If there is any better way to achieve this, please let me know.

if (ProgramMemberFileCount > 10000)
{
    string path = OutputPath + filename + ".csv";

    // Re-read the file that was just written and copy it out in 10,000-line chunks.
    using (System.IO.StreamReader sr = new System.IO.StreamReader(path))
    {
        int fileNumber = 0;

        while (!sr.EndOfStream)
        {
            int count = 0;
            fileNumber++;

            // Each pass writes at most 10,000 lines into a new, numbered file.
            using (System.IO.StreamWriter sw = new System.IO.StreamWriter(OutputPath + fileNumber + filename + ".csv"))
            {
                sw.AutoFlush = true;

                while (!sr.EndOfStream && ++count <= 10000)
                {
                    sw.WriteLine(sr.ReadLine());
                }
            }
        }
    }

    // Remove the original oversized file once the split files have been written.
    File.Delete(path);
}
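
For comparison, here is a minimal sketch of the approach suggested in the comments: count the lines as they are written and switch to a new file once the limit is reached, so the oversized file never has to be re-read and split afterwards. The class and member names (`RollingCsvWriter`, `_maxLines`, the numbered file-name pattern) are illustrative, not part of the original code.

using System;
using System.IO;

// Sketch only: writes lines and rolls over to a new numbered .csv file
// once the per-file line limit is reached. Names are hypothetical.
public sealed class RollingCsvWriter : IDisposable
{
    private readonly string _outputPath;   // directory for the output files
    private readonly string _baseName;     // e.g. "OutputFile-02.24.23_11.31.00.AM"
    private readonly int _maxLines;        // 10,000 per the stated requirement
    private StreamWriter _writer;
    private int _linesInCurrentFile;
    private int _fileNumber;

    public RollingCsvWriter(string outputPath, string baseName, int maxLines = 10000)
    {
        _outputPath = outputPath;
        _baseName = baseName;
        _maxLines = maxLines;
    }

    public void WriteLine(string line)
    {
        // Open the next file when there is no writer yet or the current one is full.
        if (_writer == null || _linesInCurrentFile >= _maxLines)
        {
            _writer?.Dispose();
            _fileNumber++;
            string path = Path.Combine(_outputPath, $"{_baseName}-{_fileNumber}.csv");
            _writer = new StreamWriter(path);
            _linesInCurrentFile = 0;
        }

        _writer.WriteLine(line);
        _linesInCurrentFile++;
    }

    public void Dispose() => _writer?.Dispose();
}

With something like this, every `writer.WriteLine(query)` call in the loops from the question could go through a single shared instance, and the 10,000-line limit would be enforced across all of the dictionaries without counting rows up front or making a second pass over the file.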
