
I have implemented the solution which Cuong suggested here: C# Processing Fixed Width Files

I have also made it go through a folder and apply that to all the .txt files in that folder.

All that works fine, but for some of the .txt files it fails on the var csvLines statement with the following error:

{"Index and length must refer to a location within the string.\r\nParameter name: length"}

A first chance exception of type 'System.ArgumentOutOfRangeException' occurred in mscorlib.dll
System.ArgumentOutOfRangeException: Index and length must refer to a location within the string.
Parameter name: length
   at System.String.InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy)
   at System.String.Substring(Int32 startIndex, Int32 length)
   at FixedWidthFiles.Main.<>c__DisplayClass11.<>c__DisplayClass13.<buttonProcessAllFiles_Click>b__d(KeyValuePair`2 pair) in \\GBMACCMPFS11\Shhk$\Visual Studio 2010\Projects\FixedWidthFiles\FixedWidthFiles\Main.cs:line 138
   at System.Linq.Enumerable.WhereSelectListIterator`2.MoveNext()
   at System.String.Join(String separator, IEnumerable`1 values)
   at FixedWidthFiles.Main.<>c__DisplayClass11.<buttonProcessAllFiles_Click>b__c(String line) in \\GBMACCMPFS11\Shhk$\Visual Studio 2010\Projects\FixedWidthFiles\FixedWidthFiles\Main.cs:line 137
   at System.Linq.Enumerable.WhereSelectArrayIterator`2.MoveNext()
   at System.IO.File.InternalWriteAllLines(TextWriter writer, IEnumerable`1 contents)
   at System.IO.File.WriteAllLines(String path, IEnumerable`1 contents)
   at FixedWidthFiles.Main.buttonProcessAllFiles_Click(Object sender, EventArgs e) in \\GBMACCMPFS11\Shhk$\Visual Studio 2010\Projects\FixedWidthFiles\FixedWidthFiles\Main.cs:line 140

Any idea what is wrong? It might be the file, but I am hoping that something can be corrected/improved in the code :)


Code is this:

private void buttonProcessAllFiles_Click(object sender, EventArgs e)
{
    if (fileFolderPath == "")
    {
        MessageBox.Show("Load Folder First", "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);

    }
    else
    {
        int count = 0;
        //foreach (var file in Directory.GetFiles(fileFolderPath, "*.txt", SearchOption.AllDirectories))
        foreach (var file in Directory.GetFiles(fileFolderPath, "*.txt"))
        {
            count++;
            System.Diagnostics.Debug.WriteLine(count);
            fileFolderFull = Path.GetFullPath(file);
            System.Diagnostics.Debug.WriteLine(fileFolderFull);
            fileFolderName = Path.GetFileNameWithoutExtension(file);
            System.Diagnostics.Debug.WriteLine(fileFolderName);

            //MessageBox.Show("Full Folder: " + fileFolderFull);
            //MessageBox.Show("File Name: " + fileFolderName);

            var lines = File.ReadAllLines(fileFolderFull);

            var widthList = lines.First().GroupBy(c => c)
                                         .Select(g => g.Count())
                                         .ToList();

            var list = new List<KeyValuePair<int, int>>();

            int startIndex = 0;

            for (int i = 0; i < widthList.Count(); i++)
            {
                var pair = new KeyValuePair<int, int>(startIndex, widthList[i]);
                list.Add(pair);

                startIndex += widthList[i];
            }

            try
            {
                var csvLines = lines.Select(line => string.Join(",",
                                    list.Select(pair => line.Substring(pair.Key, pair.Value))));

                File.WriteAllLines(fileFolderPath + "\\" + fileFolderName + ".csv", csvLines);
            }
            catch (Exception ex)
            {
                System.Diagnostics.Debug.WriteLine(ex);
            } 
        }

        MessageBox.Show("File Saved", "Completed", MessageBoxButtons.OK, MessageBoxIcon.Information);
    }
}

The line where the error occurs is this:

var csvLines = lines.Select(line => string.Join(",",
                                        list.Select(pair => line.Substring(pair.Key, pair.Value))));
hshah

2 Answers

3

Read through the stack trace. The first interesting location is:

Visual Studio 2010\Projects\FixedWidthFiles\FixedWidthFiles\Main.cs:line 138

This is your file, with your code. The exception says that the start index and length you passed to Substring reach past the end of the string. All of this points to a bug in your code. Go to that file and analyze that line. If, after thinking it over, you still have no idea why the index could be out of range at that point, copy that code along with some of the surrounding code and post it here. It is hard to tell anything more from the stack trace alone.

edit: You've added the code, cool!

As it definitely fails in Substring, look at what Substring takes as its arguments: the values from pair. They come from incrementally summing the column widths, so most probably one of your input files simply has a line that is too short. Check your files for spurious empty lines at the beginning or end!
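One way to check a file for such lines is sketched below. It is only an illustration: the class and method names (ShortLineFinder, FindShortLines) and the sample data are made up for this example. It assumes, as in the question's code, that the column widths add up to the length of the first line, so any line shorter than that would make Substring throw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ShortLineFinder
{
    // Returns the 1-based numbers of all lines shorter than requiredWidth.
    public static List<int> FindShortLines(IEnumerable<string> lines, int requiredWidth)
    {
        return lines
            .Select((line, index) => new { line.Length, Number = index + 1 })
            .Where(x => x.Length < requiredWidth)
            .Select(x => x.Number)
            .ToList();
    }

    static void Main()
    {
        // Hypothetical sample data; in the question this would come from File.ReadAllLines.
        var lines = new[] { "AAAABBBCC", "JohnDoe42", "", "Jane" };

        // In the question's code the widths sum to the length of the first line.
        int requiredWidth = lines.First().Length;

        foreach (int number in FindShortLines(lines, requiredWidth))
        {
            Console.WriteLine("Line " + number + " is shorter than " + requiredWidth + " characters");
        }
    }
}
```

With the sample data above this reports lines 3 and 4, which are exactly the lines the original Substring call would fail on.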

You have to either strip those lines from the files or guard your code against them: instead of calling Substring blindly, guard it with an if or Math.Min:

str.Substring(
    Math.Min(str.Length, pair.Key), // a start index equal to str.Length is valid and yields an empty string
    Math.Min(Math.Max(0, str.Length - pair.Key), pair.Value) // never take more characters than are left
)

This way, the field-cutter returns a truncated or empty string for any line that is too short. Note that it's good to guard both parameters, and it's also worth checking Length-StartIndex against negative values, as the StartIndex could be greater than the Length of an empty line :)

Btw, by StartIndex I of course mean pair.Key.
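Wrapped into a helper and plugged into the projection from the question, the clamping idea might look like the sketch below. The names FixedWidth, SafeSubstring and ToCsvLine, and the sample column layout, are hypothetical and used only for illustration. The helper assumes non-negative start and length values, which is what the question's column list produces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FixedWidth
{
    // Clamps both arguments so Substring cannot run past the end of the string.
    // Assumes start and length are non-negative.
    public static string SafeSubstring(string s, int start, int length)
    {
        int safeStart = Math.Min(s.Length, start);
        int safeLength = Math.Min(Math.Max(0, s.Length - start), length);
        return s.Substring(safeStart, safeLength);
    }

    // Cuts one line into fields (Key = start index, Value = width) and joins them with commas.
    public static string ToCsvLine(string line, IEnumerable<KeyValuePair<int, int>> columns)
    {
        return string.Join(",", columns.Select(c => SafeSubstring(line, c.Key, c.Value)));
    }

    static void Main()
    {
        // Hypothetical layout: three columns of widths 4, 3 and 2.
        var columns = new List<KeyValuePair<int, int>>
        {
            new KeyValuePair<int, int>(0, 4),
            new KeyValuePair<int, int>(4, 3),
            new KeyValuePair<int, int>(7, 2)
        };

        Console.WriteLine(ToCsvLine("JohnDoe42", columns)); // John,Doe,42
        Console.WriteLine(ToCsvLine("Jane", columns));      // Jane,,
        Console.WriteLine(ToCsvLine("", columns));          // ,,
    }
}
```

Because short lines yield empty fields instead of being skipped, every output row keeps the same number of columns.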

quetzalcoatl
  • Most of what you said made no sense to me, but I will try and implement what you suggested in the morning :) – hshah Oct 02 '12 at 23:16
  • I'll explain the code a bit. He is clamping both arguments so that Substring can no longer throw this kind of error. Most people would use an if statement to skip the Substring call, but his approach lets the code run normally. The first argument says "start at pair.Key, unless that is past the end of the string, in which case use the string's length". The second argument says "take pair.Value characters or, if the string is not long enough, just take whatever is left". You can check this yourself by working through it on paper. – Benjamin Danger Johnson Oct 02 '12 at 23:46

I think it's failing on the Substring method. Can you add a check that line.Length >= (pair.Key + pair.Value)?

d89761
  • Any chance you can show me how to do this please? I'm a little clueless with this :( – hshah Oct 02 '12 at 22:37
  • Maybe: var csvLines = lines.Select(line => string.Join(",", list.Where(pair => line.Length >= (pair.Key + pair.Value)).Select(pair => line.Substring(pair.Key, pair.Value)))); – d89761 Oct 02 '12 at 22:46
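To see what a length check of this kind does to a short line, here is a small sketch. FilterDemo and ToCsvLineFiltered are made-up names and the column layout is hypothetical. The sketch compares with >= so that a field ending exactly at the end of the line is kept; any field that does not fit completely is dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FilterDemo
{
    // Keeps only the fields that fit completely inside the line, then joins them with commas.
    public static string ToCsvLineFiltered(string line, IEnumerable<KeyValuePair<int, int>> columns)
    {
        return string.Join(",", columns
            .Where(c => line.Length >= c.Key + c.Value)
            .Select(c => line.Substring(c.Key, c.Value)));
    }

    static void Main()
    {
        // Hypothetical layout: three columns of widths 4, 3 and 2.
        var columns = new List<KeyValuePair<int, int>>
        {
            new KeyValuePair<int, int>(0, 4),
            new KeyValuePair<int, int>(4, 3),
            new KeyValuePair<int, int>(7, 2)
        };

        Console.WriteLine(ToCsvLineFiltered("JohnDoe42", columns)); // John,Doe,42
        Console.WriteLine(ToCsvLineFiltered("JohnDo", columns));    // John
    }
}
```

The exception goes away, but a short line produces a row with fewer fields than the others, so the columns of the resulting CSV may no longer line up. Whether that is acceptable depends on how the CSV is consumed.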