0

I have a hundreds of space delimited text files that I would like to convert in VB.NET to CSV-files according to RFC 4180. The files have variying length but can contain upp to 200000 rows and have varying ammount of columns (up to 30) separated by a number of "spaces" that also varies between one and 20. The files contains some information that I would like to remove, for example the first column and I would also like to change the content of the second column into a valid Javascript time stamp. This is an exemple of the first rows of one of the files (note that most of the spaces are not shown).

# time order boil_q_1 boil_q_2 chil_q1 chil_q2 loccool locheat qdomwat
0.000000000 1.0000 4.18700E-09 0.0000 4.18700E-11 1.31529E-03 132.39 9799.3 0.0000
8.0000000000E-02 1.0000 4.18700E-09 0.0000 4.18700E-11 1.31528E-03 132.11 9917.1 0.0000
0.1600000000 1.0000 4.18700E-09 0.0000 4.18700E-11 1.31527E-03 131.98 10047. 0.0000
0.2705515735 1.0000 4.18700E-09 0.0000 4.18700E-11 1.31527E-03 131.97 10152. 0.0000
0.2705515763 1.0000 4.18700E-09 0.0000 4.18700E-11 1.31526E-03 131.97 10152. 0.0000
0.3345515763 1.0000 4.18700E-09 0.0000 4.18700E-11 1.31525E-03 131.97 10184. 0.0000
0.3985515763 1.0000 4.18700E-09 0.0000 4.18700E-11 1.31524E-03 131.98 10192. 0.0000
0.5265515763 1.0000 4.18700E-09 0.0000 4.18700E-11 1.31524E-03 131.98 10178. 0.0000
0.7825515763 1.0000 4.18700E-09 0.0000 4.18700E-11 1.31523E-03 131.99 10164. 0.0000
0.7825515791 1.0000 4.18700E-09 0.0000 4.18700E-11 1.31522E-03 131.99 10164. 0.0000

Any suggenstions would be wellcome.

Sincerely Max

user2404987
  • 11
  • 1
  • 4
  • Welcome to Stack Overflow. Can you show us your current attempt at handling this so we can know what to help with? – xxbbcc Jun 10 '13 at 14:53
  • I thought of reading line by line and splitting the columns by using Dim split As String() = Text.Split(" ") Dim s As String For Each s In split If s.Trim() <> "" Then Console.WriteLine(s) End If Next s but there must be way better methods. – user2404987 Jun 10 '13 at 15:14
  • That's probably the easiest way to do it. You'll have to output commas and wrap field values in double quotes as you write the CSV. – xxbbcc Jun 10 '13 at 15:19
  • Since you don't know how many spaces there would be, I'd say your best bet would be using `Regex.Replace` using the regex of `\s*` with your new delimitter - `,` for csv. – John Bustos Jun 10 '13 at 16:01
  • Sounds great, never heard of Regex.Replace – user2404987 Jun 10 '13 at 16:03
  • I ended up using the following code and adding the information to a datagrid – user2404987 Jun 14 '13 at 06:38

2 Answers2

0

Assuming the varying number of spaces is not because the columns are empty...

I'd just keep it simple and use the string tools I'm used to, crude but quick for me:

    Dim s As String = TextBox1.Text
    s = s.Replace(vbNewLine, "|").Replace(vbCr, "").Replace(vbLf, "")
    s = s.Replace("| ", "|") ' trim leading space
    Dim iLen As Integer = -1
    Do
        iLen = s.Length
        s = s.Replace("  ", " ")
    Loop Until iLen = s.Length
    Dim aLines() As String = s.Split("|")
    Dim aLine() As String
    Dim aHeader() As String = aLines(0).Split(" ")
    ' process header
    For i As Long = 1 To aLines.GetUpperBound(0)
        aLine = aLines(i).Split(" ")
        Stop
        ' process line
    Next

TextBox1 contains your sample with the stray? leading #. If the varying spaces are due to empty columns the massaging will be more complex and maybe better accomplished in the inside loop.

Above I just massage the entire string then place the lines into an array and then loop over the lines.

This is the only the first step in the OP.

I've found writing CSV is problematic especially as the consumers sometimes have different ideas on what the spec says. If possible test with the ultimate consuming program.

rheitzman
  • 2,247
  • 3
  • 20
  • 36
0

I ended up using the following code and adding the information to a datagrid

Dim rowvalue As String
Dim streamReader As IO.StreamReader = New IO.StreamReader(FileName)
    While streamReader.Peek() <> -1
    rowvalue = streamReader.ReadLine()
    DataGridView1.Rows.Add(rowvalue.Split(New String() {" "}, StringSplitOptions.RemoveEmptyEntries))
End While

The problem is that it is rather slow and I still need to remove one of the coumns and adding all the other files. I guess it would be easier to store the information some other way? Any ideas?

Taryn
  • 242,637
  • 56
  • 362
  • 405
user2404987
  • 11
  • 1
  • 4