2

I have a requirement to collate over 6000 csv files into a single csv document. The current VBA process is: 1. Open individual CSV data file 2. Load contents of file to array based on number of rows 3. Close individual CSV file 4. Process array

In order to improve efficiency of the code and processing, I was hoping there may be a method to load the data from the individual CSV files into an array without opening and closing every single file.

I am using Excel 2011 for Mac.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
Benny Muller
  • 21
  • 1
  • 1
  • 2

3 Answers3

3

Ok I am assuming that All 6000 files have the same format.

My Test Conditions

  1. I have a folder called C:\Temp\ which has 6000 CSV Files
  2. All csv files have 40 Rows and 16 Columns
  3. Tested it in Excel 2010. Don't have access to 2011. Will test it in 2011 in approx 30 mins.

I ran the below code and the code took just 4 seconds.

Option Explicit

Sub Sample()
    Dim strFolder As String, strFile As String
    Dim MyData As String, strData() As String
    Dim FinalArray() As String
    Dim StartTime As String, endTime As String
    Dim n As Long, j As Long, i As Long

    strFolder = "C:\Temp\"

    strFile = Dir(strFolder & "*.csv")

    n = 0

    StartTime = Now

    Do While strFile <> ""
        Open strFolder & strFile For Binary As #1
        MyData = Space$(LOF(1))
        Get #1, , MyData
        Close #1

        strData() = Split(MyData, vbCrLf)
        ReDim Preserve FinalArray(j + UBound(strData) + 1)
        j = UBound(FinalArray)

        For i = LBound(strData) To UBound(strData)
            FinalArray(n) = strData(i)
            n = n + 1
        Next i

        strFile = Dir
    Loop

    endTime = Now

    Debug.Print "Process started at : " & StartTime
    Debug.Print "Process ended at : " & endTime
    Debug.Print UBound(FinalArray)
End Sub

Screenshot of the folder

enter image description here

Screenshot of the Code Output

enter image description here


UPDATE

Ok I tested it in MAC

My Test Conditions

  1. I have a folder called Sample on the desktop which has 1024 CSV Files
  2. All csv files have 40 Rows and 16 Columns
  3. Tested it in Excel 2011.

I ran the below code and the code took LESS THAN 1 second (since there were only 1024 files). So I am expecting it to again run for 4 secs in case there were 6k files

Sub Sample()
    Dim strFile As String
    Dim MyData As String, strData() As String
    Dim FinalArray() As String
    Dim StartTime As String, endTime As String
    Dim n As Long, j As Long, i As Long

    StartTime = Now

    MyDir = ActiveWorkbook.Path
    strPath = MyDir & ":"

    strFile = Dir(strPath, MacID("TEXT"))

    'Loop through each file in the folder
    Do While Len(strFile) > 0
        If Right(strFile, 3) = "csv" Then
            Open strFile For Binary As #1
            MyData = Space$(LOF(1))
            Get #1, , MyData
            Close #1

            strData() = Split(MyData, vbCrLf)
            ReDim Preserve FinalArray(j + UBound(strData) + 1)
            j = UBound(FinalArray)

            For i = LBound(strData) To UBound(strData)
                FinalArray(n) = strData(i)
                n = n + 1
            Next i

            strFile = Dir
        End If
        strFile = Dir
    Loop

    endTime = Now

    Debug.Print "Process started at : " & StartTime
    Debug.Print "Process ended at : " & endTime
    Debug.Print UBound(FinalArray)
End Sub

Screenshot of the folder

enter image description here

Screenshot of the Code Output

enter image description here

Siddharth Rout
  • 147,039
  • 17
  • 206
  • 250
  • Thank you Siddharth, this looks to be a very comprehensive answer, I will test and revert. Appreciated! – Benny Muller Feb 18 '13 at 14:50
  • Hi Siddharth, I have tested the suggested code and the loop is never entered as strFile has value "". The workbook is saved in the folder of the relevant CSV files. – Benny Muller Feb 20 '13 at 23:04
2

You don't need to use Excel to do this, you can merge using windows copy from the command prompt by entering:

copy *.csv mergedfilename.csv
-1

There's not an Excel answer for your problem, in my opinion - certainly not within the normal definition thereof, anyhow.

The correct method to solve it is to use a programming language that is appropriate for the task; perl, for example, or even command shell, to combine the files. Excel isn't made for constant file i/o, but perl is quite good at handling lots of files. I performed a project similar to this (combining millions of files) in a few minutes on a relatively small unix server.

You can also use command shell to cat the files together (cat=concatenate), as nneonneo suggests in comments; I couldn't say which is faster. Perl certainly would take longer to code, especially if you have to learn perl first (though there are lots of examples on the 'net).

Joe
  • 62,789
  • 6
  • 49
  • 67