0

I have about 5000 .csv files and I want to search for one row in each file and extract it. I have pasted the key part of code below, which works, but as I have to open and close each .csv file, the process is slow for 5000 files. Is there any way to read a csv file without opening it? I had considered writing a small script to convert each csv file to Excel first? Thx.

SP_File_Name = Dir(DN_Path & "*.*")
Count = 1

Set START_CELL_RANGE = TARGET_SP_SHEET.Range("B3")
Set TICKER_CODE_RANGE = TARGET_SP_SHEET.Range("B1")


   While (SP_File_Name <> "")
    SP_Full_Path = DN_Path & SP_File_Name
    Workbooks.OpenText Filename:=SP_Full_Path, DataType:=xlDelimited, comma:=True, Local:=True

    Set INPUT_WORKBOOK = ActiveWorkbook
    Set INPUT_SHEET = INPUT_WORKBOOK.Worksheets(1)
    INPUT_SHEET.Range("$A$1").Select
    Set INPUT_RANGE = ActiveCell.CurrentRegion

    Set INPUT_FIRST_MATCH_RANGE = INPUT_RANGE.Find(TICKER_CODE_RANGE)

    If INPUT_FIRST_MATCH_RANGE Is Nothing Then
    GoTo NOT_FOUND
    End If


    START_CELL = START_CELL_RANGE.Address

    TARGET_SP_SHEET.Range(START_CELL_RANGE.Address, START_CELL_RANGE.Offset(0, 6).Address).Value = INPUT_SHEET.Range(INPUT_FIRST_MATCH_RANGE.Address, INPUT_FIRST_MATCH_RANGE.Offset(0, 7).Address).Value

    ' write diagnostics
    Sheet5.Range("K" & Count + 4).Value = START_CELL
    Sheet5.Range("L" & Count + 4).Value = "$A$1"
    Sheet5.Range("M" & Count + 4).Value = INPUT_FIRST_MATCH_RANGE.Address
    Sheet5.Range("N" & Count + 4).Value = INPUT_FIRST_MATCH_RANGE.Offset(0, 7).Address

NOT_FOUND:
    Set START_CELL_RANGE = START_CELL_RANGE.Offset(1, 0)

    Workbooks(SP_File_Name).Close SaveChanges:=False
    SP_File_Name = Dir
    Count = Count + 1

  Wend
Community
  • 1
  • 1
RockDr
  • 3
  • 3
  • 3
    This might help. Seems like a similar issue. http://stackoverflow.com/questions/14907952/load-contents-of-csv-file-to-array-without-opening-file – Captain Grumpy Sep 16 '16 at 00:41
  • 2
    How would you expect to read the contents of a file without opening it? Are you under the impression that VBA has psychic abilities? Can you read the words in a book without opening the cover? Quickly - give me the 20th word on page 100 of "I, Robot". Converting the CSV files to Excel format first won't help; you *still* have to open each file to read the contents. – Ken White Sep 16 '16 at 01:27
  • I would probably use the Windows Cmd Shell command `findstr`. You could output all the lines to a file and then read that file into Excel or, with more involved programming, output directly into Excel. And you can run the script from VBA. You could probably also use PowerShell, but I'm not as familiar with that. – Ron Rosenfeld Sep 16 '16 at 01:38
  • Ken - sounds like your having a bad day, not sure why you decided to vent on me! And while I am no expert, Excel doesn't need to actually open each linked file to update links to various sheets, or to extract data from other data sources. The process of actually opening and closing a sheet adds significant time to code. – RockDr Sep 16 '16 at 08:07
  • Ron - thanks much more useful! – RockDr Sep 16 '16 at 08:09
  • Captain Grumpy - also very helpful thx. – RockDr Sep 16 '16 at 08:11
  • @RonRosenfeld The findstr command works beautifully when I run it from a batch file, completes in seconds rather than hrs using the above code. How do you call this from VBA, I have tried: Call Shell("findstr/c:" & TICKER_CODE_RANGE.Value & SP_Full_Path & " >" & TICKER_CODE_RANGE & ".csv", vbNormalFocus) (where SP_FULL_Path is the full file name and TICKER_CODE is the seacrh string but I keep getting a file not found error? Thx – RockDr Sep 17 '16 at 00:42

2 Answers2

0

To call a cmd command from VBA, I have used WshShell. For early binding I set a reference to the Windows Script Host Object Model

One problem with the Shell function is that it runs asynchronously. By using the WshShell Run method, you can have it wait until finished before executing subsequent commands.

Sample code might look as follows:

Option Explicit
Sub foo()
 Dim WSH As WshShell
 Dim lErrCode As Long

Set WSH = New WshShell
lErrCode = WSH.Run("cmd /c findstr /C:""Power"" ""C:\Users\Ron\filelist.txt"" > ""C:\Users\Ron\Results2.txt""", 1, True)
    If lErrCode <> 0 Then
        MsgBox "Error Code: " & lErrCode
        Stop
    End If
Set WSH = Nothing

Call Shell
End Sub

With regard to your command that you showed in your comment, I would ensure that VBA is interpreting the string correctly for the cmd prompt. Looking at your code line, I would wonder whether you are missing a space between the search string and the file path.

Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
  • Thanks @RonRosenfeld I can get this to work as long as I use hard coded paths and file names. Is there trick to passing path names and search strings via string variables? I must admit I struggled with the use of double quotes and reverted to single quotes and &. Cheers – RockDr Sep 17 '16 at 09:24
  • @RockDr No trick so far as I know. I usually do NOT use hard coded file paths. Construct your path into a variable, and it should work just fine. In the command above, I don't really need the doubled quotes, since there are no spaces in the path -- just showing them to demonstrate their use. Ensure that the line you construct from your variables runs as expected, by copy/pasting it into the cmd shell window. That should show you where the problem is. – Ron Rosenfeld Sep 17 '16 at 10:29
  • thx seems to work now, must have been a typo previously. I still cant get it to use full path names, mainly because of the spaces, possibly other characters also. This is the path: C:\Users\craig_000\OneDrive - Moulton Metals Pty Ltd\Moulton Metals\MSc Mineral Economics\Capstone\TXT_Data I have tried doing a replace searching for " " and replacing with %20 but that doesn't seem to help? Of course if I use double quotes it interprets my stribng variable as literal text.. thx heaps for you help :) – RockDr Sep 19 '16 at 02:26
  • @RockDr `%20` has no special meaning in VBA, so I would not expect it to help. You just need to ensure, that when you construct your command string, that the entire file name is surrounded by quotes, so it comes out looking as if your full path is quoted. If you have it broken up into several variables, just make sure the first one starts with a quoted quote, and the last one ends with a quoted quote. I usually embed those in the command line, although you could include them in your variables. – Ron Rosenfeld Sep 19 '16 at 02:55
0

I don't think you can read the contents of a file without opening it. Why not just merge all 5000 files into 1 single file and read that into Excel. Certainly that will be much faster. Use the Command Window, point it to the folder that contains all 5000 files, and enter this:

copy *.csv merge.csv

See the link below for an example.

http://analystcave.com/merge-csv-files-or-txt-files-in-a-folder/