I have a collection of fixed width files with a varying number of columns and field sizes.

The top of the file starts with a line like:

AAAAABBCCCCCCCCCCDDD and so on

The change in character denotes the end of one field and the start of another. I am guessing this can somehow be used in code to work out the field sizes, which can then be applied to the actual data lines below.

I then want to output all the data read into an XLS file or even a DataGrid, but my issue is that I have no idea how to code this.

Any help would be greatly appreciated :)


Edit:

I implemented Cuong's solution, and although it worked fine in testing on my home PC, I had to compile it with C# v4 because our work PCs run Windows XP.

On the work PC, reading the input file fails with the following error:

************** Exception Text **************
System.ObjectDisposedException: Cannot read from a closed TextReader.
   at System.IO.__Error.ReaderClosed()
   at System.IO.StreamReader.ReadLine()
   at System.IO.File.<InternalReadLines>d__0.MoveNext()
   at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
   at System.IO.File.InternalWriteAllLines(TextWriter writer, IEnumerable`1 contents)
   at System.IO.File.WriteAllLines(String path, IEnumerable`1 contents)
   at FixedWidthFiles.Main.buttonProcessFile_Click(Object sender, EventArgs e)
   at System.Windows.Forms.Control.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
   at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
   at System.Windows.Forms.Control.WndProc(Message& m)
   at System.Windows.Forms.ButtonBase.WndProc(Message& m)
   at System.Windows.Forms.Button.WndProc(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)


************** Loaded Assemblies **************
mscorlib
    Assembly Version: 4.0.0.0
    Win32 Version: 4.0.30319.1 (RTMRel.030319-0100)
    CodeBase: file:///C:/WINDOWS/Microsoft.NET/Framework/v4.0.30319/mscorlib.dll
----------------------------------------
FixedWidthFiles
    Assembly Version: 1.0.0.0
    Win32 Version: 1.0.0.0
    CodeBase: file:///C:/TEMP/FixedWidthFiles.exe
----------------------------------------
System.Windows.Forms
    Assembly Version: 4.0.0.0
    Win32 Version: 4.0.30319.1 built by: RTMRel
    CodeBase: file:///C:/WINDOWS/Microsoft.Net/assembly/GAC_MSIL/System.Windows.Forms/v4.0_4.0.0.0__b77a5c561934e089/System.Windows.Forms.dll
----------------------------------------
System.Drawing
    Assembly Version: 4.0.0.0
    Win32 Version: 4.0.30319.1 built by: RTMRel
    CodeBase: file:///C:/WINDOWS/Microsoft.Net/assembly/GAC_MSIL/System.Drawing/v4.0_4.0.0.0__b03f5f7f11d50a3a/System.Drawing.dll
----------------------------------------
System
    Assembly Version: 4.0.0.0
    Win32 Version: 4.0.30319.1 built by: RTMRel
    CodeBase: file:///C:/WINDOWS/Microsoft.Net/assembly/GAC_MSIL/System/v4.0_4.0.0.0__b77a5c561934e089/System.dll
----------------------------------------
System.Core
    Assembly Version: 4.0.0.0
    Win32 Version: 4.0.30319.1 built by: RTMRel
    CodeBase: file:///C:/WINDOWS/Microsoft.Net/assembly/GAC_MSIL/System.Core/v4.0_4.0.0.0__b77a5c561934e089/System.Core.dll
----------------------------------------
hshah

3 Answers

Use the TextFieldParser class, as described in this how-to for Visual Basic.

You'll need to tell it the widths of the fields. Given your first line, that's easy enough.
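
TextFieldParser can also be called from C#. The following is only a minimal sketch, assuming the project references the Microsoft.VisualBasic assembly and that the widths have already been worked out from the header line; the class and method names (`TextFieldParserSketch`, `ReadFixedWidth`) are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class TextFieldParserSketch
{
    // Reads every remaining line from dataLines and splits it at the given widths.
    // Assumption: the header line has already been consumed by the caller.
    public static List<string[]> ReadFixedWidth(TextReader dataLines, params int[] widths)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(dataLines))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(widths);
            parser.TrimWhiteSpace = false; // keep padding; the default trims each field

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

For example, `TextFieldParserSketch.ReadFixedWidth(new StringReader("abcdeX\nfghijY\n"), 3, 2, 1)` should give two rows of three fields each. As far as I know, a data line shorter than the total width makes ReadFields throw a MalformedLineException, so ragged files may need a try/catch or a different approach.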

Colonel Panic

You can use the TextReader.Read method. For example:

    // Test string: a header line followed by one data line
    string input = "AAABB\nabcde\n";

    // Replace new StringReader with a StreamReader to read a file
    using (TextReader textReader = new StringReader(input))
    {
        // Read first line to get structure. ToList() evaluates the grouping
        // once. Note: GroupBy assumes each character marks only one field.
        var groupings = textReader.ReadLine().GroupBy(x => x).ToList();

        while (textReader.Peek() != -1)
        {
            // Convert to a string for easier handling than char[]
            List<string> fields = new List<string>();

            // Get the fields on each line
            foreach (IGrouping<char, char> grouping in groupings)
            {
                char[] field = new char[grouping.Count()];
                // Assumes the line is long enough to hold every field
                textReader.Read(field, 0, field.Length);
                fields.Add(new string(field));
            }

            // Do something with "fields". The name of each field is
            // in groupings[i].Key at the same index.

            // Move to next line
            textReader.ReadLine();
        }
    }
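
One caveat with grouping by character: GroupBy collects every occurrence of a character, so a header such as AABAA would yield two groups rather than three fields. The following is a rough sketch, not part of the original answer, that counts consecutive runs instead and loads the rows into a DataTable. The names `FixedWidthTableSketch`, `RunsFromHeader` and `ReadTable` are hypothetical, and the column naming scheme (header character plus position) is just an assumption to keep names unique.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

public static class FixedWidthTableSketch
{
    // One field per run of consecutive identical header characters.
    // Returns (start, length) pairs; "AABAA" gives (0,2), (2,1), (3,2).
    public static List<KeyValuePair<int, int>> RunsFromHeader(string header)
    {
        var runs = new List<KeyValuePair<int, int>>();
        int start = 0;
        for (int i = 1; i <= header.Length; i++)
        {
            if (i == header.Length || header[i] != header[start])
            {
                runs.Add(new KeyValuePair<int, int>(start, i - start));
                start = i;
            }
        }
        return runs;
    }

    // Reads the header line, then one table row per remaining line.
    public static DataTable ReadTable(TextReader reader)
    {
        var table = new DataTable();
        string header = reader.ReadLine();
        if (header == null) return table; // empty input

        var runs = RunsFromHeader(header);
        for (int c = 0; c < runs.Count; c++)
        {
            // Column name = header character plus position, so repeats stay unique.
            table.Columns.Add(header[runs[c].Key].ToString() + (c + 1));
        }

        string line;
        while ((line = reader.ReadLine()) != null)
        {
            var values = new object[runs.Count];
            for (int c = 0; c < runs.Count; c++)
            {
                // Clamp to the line length so short lines do not throw.
                int start = Math.Min(runs[c].Key, line.Length);
                int length = Math.Min(runs[c].Value, line.Length - start);
                values[c] = line.Substring(start, length);
            }
            table.Rows.Add(values);
        }
        return table;
    }
}
```

In a Windows Forms application the result can be shown by assigning it to a grid, for example `dataGridView1.DataSource = FixedWidthTableSketch.ReadTable(new StreamReader(path));`, where `dataGridView1` and `path` stand in for your own control and file name.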
akton
  • What if I don't know the field lengths? How can the first line in my fixed width files be used to determine this? – hshah Sep 23 '12 at 06:29
  • @hshah I have updated the answer to read the schema from the first line as outlined in the question. The answer also converts them into strings for easier handling. – akton Sep 23 '12 at 07:26

The example below writes the data to a CSV file, which Excel can open. The main point is to read the first line and calculate the field widths from it:

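    // Requires: using System.Collections.Generic; using System.IO; using System.Linq;

    // Read the whole file once. On .NET 4.0, enumerating the result of
    // File.ReadLines a second time can throw ObjectDisposedException
    // ("Cannot read from a closed TextReader"), so ReadAllLines is safer here.
    var lines = File.ReadAllLines("C:\\input.txt");

    // Width of each field = number of times its character occurs in the
    // first line (assumes each character marks only one field).
    var widthList = lines.First().GroupBy(c => c)
                                 .Select(g => g.Count())
                                 .ToList();

    // Key = start index of the field, Value = its width
    var list = new List<KeyValuePair<int, int>>();

    int startIndex = 0;

    for (int i = 0; i < widthList.Count; i++)
    {
        var pair = new KeyValuePair<int, int>(startIndex, widthList[i]);
        list.Add(pair);

        startIndex += widthList[i];
    }

    // Substring throws ArgumentOutOfRangeException if a line is shorter
    // than the header, so every line must have the full width.
    var csvLines = lines.Select(line => string.Join(",",
                        list.Select(pair => line.Substring(pair.Key, pair.Value))));

    File.WriteAllLines("C:\\test.csv", csvLines);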
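
The follow-up comments ask about converting every .txt file in a folder and about lines that are shorter than the header. The following is one possible sketch, not the answerer's code: it finds the runs in the header with a regular expression, clamps each Substring call to the line length, quotes values that contain commas, and skips the header row in the output. The names `FixedWidthToCsvSketch`, `CsvEscape`, `ToCsvLines` and `ConvertFolder` are invented for illustration, and trimming the field padding is an assumption about what the output should look like.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class FixedWidthToCsvSketch
{
    // Quote a value if it contains a comma, double quote or line break.
    public static string CsvEscape(string value)
    {
        if (value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return value;
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Convert lines (header first) to CSV lines; the header row is not emitted.
    public static List<string> ToCsvLines(IList<string> lines)
    {
        var result = new List<string>();
        if (lines.Count == 0) return result;

        // Each match is one run of a repeated character, e.g. "AAA" or "1111".
        var runs = Regex.Matches(lines[0], @"(.)\1*").Cast<Match>().ToList();

        foreach (string line in lines.Skip(1))
        {
            var fields = runs.Select(run =>
            {
                // Clamp so short lines give empty fields rather than throwing.
                int start = Math.Min(run.Index, line.Length);
                int length = Math.Min(run.Length, line.Length - start);
                return CsvEscape(line.Substring(start, length).Trim());
            });
            result.Add(string.Join(",", fields));
        }
        return result;
    }

    // Convert every .txt file in a folder to a .csv with the same base name.
    public static void ConvertFolder(string folder)
    {
        foreach (string path in Directory.GetFiles(folder, "*.txt"))
        {
            // ReadAllLines loads the file once, so it can be indexed and re-read safely.
            string[] lines = File.ReadAllLines(path);
            File.WriteAllLines(Path.ChangeExtension(path, ".csv"), ToCsvLines(lines));
        }
    }
}
```

Whether short lines should produce empty fields or be reported as errors depends on the data; clamping is simply the more forgiving choice.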
cuongle
  • @hshah: it works on your home computer but does not work on your work computer with XP? – cuongle Sep 24 '12 at 09:04
  • That is right. Home PC is running Windows 7 and VS2012 and work PC has Windows XP with all the .Net Frameworks. The test I did at home was slightly different because the input file is not the same... I can't take work data home, but I did replicate the format exactly so I can see no reason why it doesn't work. – hshah Sep 24 '12 at 09:08
  • The program was compiled using C# v5.0 but that doesn't work on Windows XP machines, so I had to change it to v4.0. Does that make a difference? – hshah Sep 24 '12 at 09:09
  • Well, this code surely works on .NET 4.5; I actually don't know what is different. Could you please open a new question? Maybe other people know. – cuongle Sep 24 '12 at 09:29
  • Question created: http://stackoverflow.com/questions/12562414/system-objectdisposedexception-error-on-c-sharp-v4-0 – hshah Sep 24 '12 at 09:40
  • Sorry to bug you again, but if I wanted to adapt this to process all .txt files in one folder and save a csv of the same name, how would I do that? – hshah Oct 02 '12 at 10:05
  • @hshah: Would you mind posting another question? :) Maybe others will answer it for you. – cuongle Oct 02 '12 at 10:10
  • I managed to get the loop working, but during the loop it fails on processing some files with this error on the var csvLines: A first chance exception of type 'System.ArgumentOutOfRangeException' occurred in mscorlib.dll – hshah Oct 02 '12 at 22:07
  • I have posted this: http://stackoverflow.com/questions/12699298/system-argumentoutofrangeexception-in-c-sharp-application – hshah Oct 02 '12 at 22:26
  • Hi, just noticed that this does not work when the top line also includes numbers. So it could have AAABBCCCCCC1111DDDDDD. – hshah Oct 08 '12 at 08:11
  • @hshah: weird, but you can post another question to ask – cuongle Oct 08 '12 at 08:17
  • Posted this: http://stackoverflow.com/questions/12778173/c-sharp-processing-fixed-width-files-solution-not-working – hshah Oct 08 '12 at 08:47