I'm having a problem creating the last file.

I have a tab-delimited text file that looks like this:

KABEL   Provkanna for Windchill_NWF-TSNM    =2212.U001+++-X2    PXC.2400016             =2271.U004+++-X1    Test_Created_in_WT              =2212-W123  RXF 4x25    0000000440  Cable RXF 4x25
PART        01      1   1       
PART        02      2   2       
PART        03      3   3       
PART        04      4   4       
PART        SH      GND GND     
KABEL   Provkanna for Windchill_NWF-TSNM    =2212.U001+++-X2    PXC.2400016             =2271.U004+++-X1    Test_Created_in_WT              =2212-W124  RXF 4x35    0000000456  Cable RXF 4x35
PART        01  1   5   5       
PART        02  1   6   6       
PART        03  1   7   7       
PART        04  1   8   8       
PART        SH  1   GND GND     
KABEL   Provkanna for Windchill_NWF-TSNM    =2212.U001+++-X2    PXC.2400016             =2271.U004+++-X1    Test_Created_in_WT              =2212-W125  RXF 4x35    0000000456  Cable RXF 4x35
PART        01  1   9   9       
PART        02  1   10  10      
PART        03  1   11  11      
PART        04  1   12  12      
PART        SH  1   GND GND     

Basically, each record is a row starting with the word KABEL, followed by a number of tab-delimited columns. This row is followed by some lines starting with the word PART. The number of PART lines can differ.

Now I want to break this file down into several files.

Every output file shall have a name containing information from a certain column of the KABEL line. Every following line that starts with PART shall be added to that file.

Then, when a line starting with KABEL shows up again, a new file shall be created and the following PART lines added to that file, and so on.

I have gone back and forth a lot and finally found a way to create the first two files correctly, but the last file won't be created.

My script reads, finds and displays the correct column value, which is supposed to be the unique part of the last output file's name, but I don't see any file being written.

Any takers? I would very much appreciate your help since I'm stuck.

{
    string line ="";
    string ColumnValue ="";
    string Starttext = "PART";
    string Kabeltext = "KABEL";
    int column = 16;     
    string FilenameWithoutCabelNumber = @"C:\Users\tsnm2171\Desktop\processed\LABB\OUTPUT - Provkanna for Windchill_NWF-TSNM_2212_CABLE_CONNECTION";
    string ExportfileIncCablenumber ="";
    string filecontent ="";

    using (System.IO.StreamReader reader = new System.IO.StreamReader(@"C:\Users\tsnm2171\Desktop\processed\LABB\Provkanna for Windchill_NWF-TSNM_2212_CABLE_CONNECTION.txt"))          
    {       
        line = reader.ReadLine();

        // Set the column content as the file name (string ColumnValue)
        string [] words = line.Split();
        ColumnValue = words[column];

        MessageBox.Show (ColumnValue);

        while (line != null)                        
        {   
            line = reader.ReadLine();

            if (line.StartsWith(Kabeltext)) // if line starts with KABEL 
            {   
                ExportfileIncCablenumber =  (FilenameWithoutCabelNumber + "-" + ColumnValue + ".txt");
                System.IO.File.WriteAllText(ExportfileIncCablenumber, filecontent);

                filecontent = string.Empty;
                string [] words2 = line.Split();
                ColumnValue = words2[column];

                MessageBox.Show("Ny fil " + ColumnValue);
            }
            else if (line.StartsWith(Starttext)) // if line starts with PART
            {
                filecontent += ((line)+"\n");           //writes the active line                                
            }                   
        }
        ExportfileIncCablenumber =  (FilenameWithoutCabelNumber + "-" + ColumnValue + ".txt");
        System.IO.File.WriteAllText(ExportfileIncCablenumber, filecontent);
        filecontent = "";
    }
}

Thanks in advance

Tomas

maccettura
  • That's *not* a tab-delimited file. That's a file containing complex records. You need to write a parser that understands when each record starts and how to handle each line. You can't do that in a single loop. You should write functions/classes that can recognize each type of line, eg Header if it starts with KABEL, PART if it starts with PART. It's a lot easier for each function to recognize its own fields after that, for example PART only has to check 3 fields – Panagiotis Kanavos Sep 26 '17 at 16:45
  • BTW there are tools that allow you to create parsers like ANTLR or FParsec. Instead of writing a "recognizer" for each type of record, you use syntax rules. – Panagiotis Kanavos Sep 26 '17 at 16:46

1 Answer

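First of all, you should read the lines and do the null check in the loop condition, like this: while ((line = reader.ReadLine()) != null). That protects you from a null reference: in your loop, ReadLine() returns null at the end of the file, and line.StartsWith(...) then throws before the code that writes the last file is reached. My version, which seems to work: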

{
        const string StartText                  = "PART";
        const string KabelText                  = "KABEL";  
        const string FilenameWithoutCabelNumber = @"...\";

        string fileContent = "";
        int    fileNumber  = 0;

        using (StreamReader reader = File.OpenText(@"...\file.txt"))
        {       
            string line = reader.ReadLine();
            string columnValue = GetParticularColumnName(line);
            // Use the column content as the file name (string columnValue)
            MessageBox.Show(columnValue);

            var ExportfileIncCablenumber ="";
            while ((line = reader.ReadLine()) != null)         
            {   
                if (line.StartsWith(KabelText)) // if line starts with KABEL 
                {   
                    ExportfileIncCablenumber =  $"{FilenameWithoutCabelNumber}-{columnValue}({fileNumber}).txt";

                    File.WriteAllText(ExportfileIncCablenumber, fileContent);

                    fileContent = string.Empty;
                    columnValue = GetParticularColumnName(line);
                    fileNumber++;
                }
                else if (line.StartsWith(StartText)) // if line starts with PART
                {
                    fileContent += ((line)+Environment.NewLine);    //writes the active line                                
                }                   
            }

            // Name the last file with the same pattern as the files written inside the loop
            ExportfileIncCablenumber =  $"{FilenameWithoutCabelNumber}-{columnValue}({fileNumber}).txt";
            File.WriteAllText(ExportfileIncCablenumber, fileContent);
        }
    }

    // Last() is a LINQ extension method, so this needs "using System.Linq;"
    private static string GetParticularColumnName(string line)
    {
        return line.Split(' ').Last();
    }

The problem you encountered with saving files was caused by a misunderstanding of how String.Split() works. See the docs for details, but in short:

If the separator argument is null or contains no characters, the method treats white-space characters as the delimiters.

That's why you got an array containing both words and empty strings. column was selecting an empty string, and that's why one file was overwriting another. (The column value of 16 was also wrong; there were actually 15 words.)

All your lines appeared concatenated because '\n' alone is not the Windows line ending ("\r\n"), and some Windows tools, such as older versions of Notepad, do not show it as a line break. That's why I'm using Environment.NewLine.

Last but not least, there is your code style. You really should adhere to the common coding conventions for .NET, because that would make your code more coherent and readable.
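
To make the Split() behaviour described above concrete, here is a minimal sketch comparing Split() with no arguments against Split('\t'). The class name SplitDemo and its two methods are hypothetical helpers, not part of the original code, and the sample line in the note below is made up for illustration.

```csharp
using System;

public static class SplitDemo
{
    // Split() with no arguments uses every white-space character as a
    // delimiter, so spaces inside a column also break it apart, and two
    // adjacent delimiters produce an empty string between them.
    public static string[] SplitOnWhitespace(string line)
    {
        return line.Split();
    }

    // Split('\t') uses only the tab as a delimiter, so a column that
    // contains spaces stays in one piece; empty columns are still kept
    // as empty strings.
    public static string[] SplitOnTabs(string line)
    {
        return line.Split('\t');
    }
}
```

For the made-up line "KABEL\tProvkanna for X\t\t=2212-W123", SplitOnWhitespace returns 6 entries (one of them empty), while SplitOnTabs returns 4 entries with "Provkanna for X" kept together. Which index holds the cable number therefore depends on which of the two you call.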

Brat Wiekszy
  • Coding conventions won't help the OP write a parser. – Panagiotis Kanavos Sep 26 '17 at 16:48
  • I think it's reasonable to point it out while the OP is still learning, before he actually learns bad habits and propagates them in his mature code. If you see any flaw in my explanation, please share it. I think it's clear that the OP should look at what the code does and check the documentation if he's not sure about the outcome. – Brat Wiekszy Sep 26 '17 at 16:59
  • Then I suggest you check up on parsers while still learning. You'll realize that *this* answer isn't an answer. Experienced devs don't read lines one at a time either, they use `ReadLines` that returns an `IEnumerable`. *This* answer focuses on trivial mistakes instead of the actual, far more challenging problem – Panagiotis Kanavos Sep 27 '17 at 07:23
  • PS people seldom use `Environment.NewLine` either, no matter what MSDN docs said 15 years ago. Who said *your* environment's newline is acceptable for the *consumer* of the file? Besides, IO classes and methods handle a single `\n` just fine. Experienced devs *don't* append or split strings either. This generates unnecessary temporary strings. They use StreamWriter or StringBuilder and write *lines* if they have to. They use regular expressions to parse lines and *avoid* wasting CPU and RAM with multiple splitting – Panagiotis Kanavos Sep 27 '17 at 07:27
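
The suggestions in the comments above (enumerate lines with File.ReadLines and write each output with a StreamWriter instead of appending to a string) could look roughly like the sketch below. It is only a sketch under stated assumptions: the class CableFileSplitter, the method SplitByCable and its parameters are hypothetical names, the columns are assumed to be separated by tabs, and nameColumn is assumed to be a valid index into the KABEL line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CableFileSplitter
{
    // Hypothetical helper: for every KABEL line, opens a new file named
    // "<outputPrefix>-<value of column nameColumn>.txt" and writes the
    // PART lines that follow into it. Returns the paths it created.
    public static List<string> SplitByCable(string inputPath, string outputPrefix, int nameColumn)
    {
        var written = new List<string>();
        StreamWriter writer = null;
        try
        {
            // File.ReadLines streams the file one line at a time and
            // simply ends at end of file, so no null check is needed.
            foreach (string line in File.ReadLines(inputPath))
            {
                if (line.StartsWith("KABEL", StringComparison.Ordinal))
                {
                    // Close the previous cable's file before starting a new one.
                    writer?.Dispose();
                    string[] fields = line.Split('\t'); // assumption: tab-separated
                    string path = outputPrefix + "-" + fields[nameColumn] + ".txt";
                    writer = new StreamWriter(path);
                    written.Add(path);
                }
                else if (line.StartsWith("PART", StringComparison.Ordinal) && writer != null)
                {
                    writer.WriteLine(line);
                }
            }
        }
        finally
        {
            // Closes the last file as well, even if an exception occurred.
            writer?.Dispose();
        }
        return written;
    }
}
```

Because each file is opened when its KABEL line is seen and closed in the finally block, the last record needs no special handling after the loop. If two KABEL lines yield the same column value, the later file overwrites the earlier one, so the chosen column has to be unique.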