
I am not sure the heading is correct - feel free to edit it into something proper.

Problem - using VB.NET code, when I read an Excel file of 100,000 records via a connection string and SQL query, it takes 3 minutes to complete (too long for me - I would like a solution, please).

But when I submit another Excel file of 300,000 records (my requirement is to read 50 million records), it ran for more than 30 minutes - I could not tolerate it and killed the program.

Please help me understand this disparity and why it takes so long to read.

(I did not include code samples because thousands of examples are available on the net showing how to establish a connection to an Excel (Office 2010) file and how to run a query to read a record.)

Thanks in advance for your help and time. As a workaround, I thought of chopping the 300,000-record file into files of 10,000 records each - but how do I do that without wasting time opening and reading the file?
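Physically splitting the workbook may not be necessary: the ACE OLEDB provider accepts a cell range in the FROM clause, so the file can be read in batches over a single open connection. A minimal sketch, assuming the ACE provider is installed, a sheet named `Sheet1` with headers in row 1, data in columns A through Z, and a hypothetical path `C:\data\records.xlsx`:

```vbnet
Imports System.Data.OleDb

Module ChunkedRead
    Sub Main()
        ' Assumed path, sheet name, and column span (A..Z); adjust to the real workbook.
        Dim connStr As String =
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\data\records.xlsx;" &
            "Extended Properties=""Excel 12.0 Xml;HDR=NO"""
        Const chunkSize As Integer = 10000
        Using conn As New OleDbConnection(connStr)
            conn.Open()                      ' open once, reuse for every chunk
            Dim firstRow As Integer = 2      ' row 1 holds the headers
            While True
                Dim lastRow As Integer = firstRow + chunkSize - 1
                ' Address the batch directly in the query instead of splitting the file.
                Dim sql As String =
                    String.Format("SELECT * FROM [Sheet1$A{0}:Z{1}]", firstRow, lastRow)
                Dim rowsRead As Integer = 0
                Using cmd As New OleDbCommand(sql, conn)
                    Using reader As OleDbDataReader = cmd.ExecuteReader()
                        While reader.Read()
                            rowsRead += 1
                            ' Process the current row here.
                        End While
                    End Using
                End Using
                If rowsRead < chunkSize Then Exit While ' short batch = past the last row
                firstRow = lastRow + 1
            End While
        End Using
    End Sub
End Module
```

Because the connection stays open across batches, the per-file open/close cost is paid only once.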

Sabya

P.S. - using a Core 2 Duo with 8 GB RAM, on Windows Server 2008 and Windows 7

  • Do you have to use only VB.NET to read the Excel files? – Ishikawa Yoshi Aug 14 '12 at 13:41
  • What is the query that you are running? (The query could be slow, rather than the raw open speed.) – soandos Aug 14 '12 at 18:41
  • I agree with @soandos, but it would be good to post the code that you're using to open and query the Excel files. Depending on the issue here, it might be better suited for Stack Overflow. – James Mertz Aug 14 '12 at 18:43
  • Also, to be honest, if you're working with something that large, you need to consider migrating to a real database system. Excel isn't a database. – James Mertz Aug 14 '12 at 18:48

4 Answers




I don't work with VB.NET, but if you are familiar with Java I can recommend the Apache POI library. POI loads all the data into memory, and for my use cases it works perfectly; after that you can store it in MySQL or anywhere else. I have read hundreds of files with POI and it has served me well. Here I found a question which looks similar to yours.
And here you can find a POI performance discussion.

Another solution would be to export the Excel file to CSV and read that instead; I think it would also be fast.
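For the CSV route, VB.NET already ships a streaming parser in `Microsoft.VisualBasic.FileIO`, so the whole file never has to sit in memory at once. A sketch, assuming a hypothetical export at `C:\data\records.csv`:

```vbnet
Imports Microsoft.VisualBasic.FileIO

Module ReadCsv
    Sub Main()
        ' Assumed path; TextFieldParser streams the file line by line.
        Using parser As New TextFieldParser("C:\data\records.csv")
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                ' Process one record's fields here.
            End While
        End Using
    End Sub
End Module
```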

Ishikawa Yoshi
  • You could temporarily disable macros that run as soon as Excel loads.
  • Memory limitation is another possible reason, as Excel can use a very large amount of memory. I would max out the memory banks to 16 GB if I were running a spreadsheet this large (100K+ rows).
  • Make sure the Excel file and the hard drive are defragmented (you can see a real impact).
  • If you never shut down the PC, try a shutdown and restart. This lets processes unload unused DLLs.
  • Increase the pagefile.sys size to at least 2.5 times RAM so that paging occurs smoothly.
C2940680
  1. Ishikawa asked whether VB.NET is essential - my answer is yes, because this is part of an application written in VB.NET (Framework 4.0). He also suggested exporting the Excel file to CSV - but I am afraid that if opening and reading already takes so long (it took 9 hours!), converting will not help. The user will kill the process, I am sure.

  2. Soandos asked for the query - it is "Select top 1 * from excel-file" - I am reading records one by one. I don't think the query itself is the problem, because this same query handles 100,000 records quite well.

  3. KronoS seconded Soandos, and I have answered that above. To his/her second point: I have to use Excel because that is what the user provides. I cannot change it.

  4. I do not see who suggested it, but disabling macros is a very good point. Should I not disable all macros and all filters and unhide everything, to read all the data in the simplest way? I will try that.

The total size of the 300,000-record Excel file is 61 MB - is that really large enough to cause a memory problem?

I found that the speed of simply reading records from Excel is not linear: it reads 10,000 records in 4 seconds, but 50,000 in 27 seconds and 100,000 in 60 seconds, etc. I wish someone could tell me how to index an Excel file for reading large files. I do not know how big the problem will become when I get an Excel file of 50 million rows.
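The nonlinear timings are what you would expect if a `SELECT TOP 1 *` query is issued once per record, since each query rescans the sheet from the top: the total work grows roughly with the square of the row count. Reading every row in one forward-only pass should scale linearly. A minimal sketch, assuming the ACE OLEDB provider, a sheet named `Sheet1` with headers, and a hypothetical path `C:\data\records.xlsx`:

```vbnet
Imports System.Data.OleDb

Module SinglePassRead
    Sub Main()
        ' Assumed path and sheet name; adjust to the real workbook.
        Dim connStr As String =
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\data\records.xlsx;" &
            "Extended Properties=""Excel 12.0 Xml;HDR=YES"""
        Using conn As New OleDbConnection(connStr)
            conn.Open()
            ' One query, one forward-only pass over every row - no per-record rescans.
            Using cmd As New OleDbCommand("SELECT * FROM [Sheet1$]", conn)
                Using reader As OleDbDataReader = cmd.ExecuteReader()
                    While reader.Read()
                        ' Process the current row, e.g. reader.GetValue(0).
                    End While
                End Using
            End Using
        End Using
    End Sub
End Module
```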


I had similar problems updating a large Excel file. My solution: update part of it, save, close, kill the Excel process, reopen, and update again.

        oexcel.DisplayAlerts = False
        obook.SaveAs(fnExcNew, 1)       ' save the progress so far
        obook.Close()
        obook = Nothing
        KillExcel()                     ' make sure no EXCEL.EXE lingers
        oexcel = CreateObject("Excel.Application")
        oexcel.Workbooks.Open(fnExcNew)
        obook = oexcel.ActiveWorkbook
        osheet = oexcel.Worksheets(1)

Private Sub KillExcel()
    ' Kill all running Excel processes
    For Each pExcelProcess As Process In Process.GetProcesses()
        If pExcelProcess.ProcessName.ToUpper() = "EXCEL" Then
            pExcelProcess.Kill()
        End If
    Next
End Sub
Brad Larson