0

I have a vb.net application which calls a large quantity of data (about 65,000 rows) from two tables in a SQL Server CE database.

The code is pretty straightforward:

cmd.CommandText = "SELECT [Table1Col1], [Table1Col2], ... [Table1Col8], " & _
                  "[Table2Col1], [Table2Col2] FROM [Table1] LEFT JOIN [Table2] " & _
                  "ON [Table1].[LinkCol] = [Table2].[LinkCol] WHERE [Table1Col8] = 0 " & _
                  "ORDER BY [Table1].[LinkCol]"

reader = cmd.ExecuteReader()

Do While reader.read
     [read data, format it, etc]
Loop

The reader.read statement by far takes the most time to execute. It takes about 2 hours to read and process the 65,000 rows, and I estimate that about 60-70% of that is from the lag in the reader.read statement.

Is there a way to speed this up?

EDIT:

I did some time measurements, and the reader.read statement takes 150 times longer (on average) than the block of code where I read, format, etc.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
DrCocoa
  • 37
  • 1
  • 10
  • This seems too much. Are you sure that your _[read data, format it, etc]_ has nothing to do with this slowness? As usual with performance issues there is only one thing to do: measure. – Steve Nov 11 '15 at 20:02
  • I just went back and measured it, and the reader.read statement takes (on average) 150 times longer than the _[read data, format it, etc]_ block. – DrCocoa Nov 11 '15 at 20:18
  • Do you have indexes on the two LinkCol columns? – ErikEJ Nov 11 '15 at 20:23
  • No. Would indexing those two columns help much when I'm essentially returning the whole of table1 and table 2? I'm afraid I only know a bit about indexing. – DrCocoa Nov 11 '15 at 20:44
  • Try to get rid of the sort – ErikEJ Nov 11 '15 at 20:49
  • I removed the `ORDER BY [Table1].[LinkCol]` statement and the performance was exactly the same. – DrCocoa Nov 11 '15 at 20:54

1 Answers1

0

The issue appears to be because of the LEFT JOIN in the sql statement. I gained massive speed up by executing two readers, one for each table, and syncing them in the code. Something like the following:

cmd1.CommandText = "SELECT [LinkCol], [Table1Col1], [Table1Col2], ... [Table1Col8] " & _
                  "FROM [Table1] WHERE [Table1Col8] = 0 ORDER BY [Table1].[LinkCol]"

cmd2.CommandText = "SELECT [LinkCol], [Table2Col1], [Table2Col2] FROM [Table2] " & _
                  "ORDER BY [LinkCol]"

reader1 = cmd1.ExecuteReader() 
reader2 = cmd2.ExecuteReader()

Do While reader1.read
     tbl1ID = cint(reader1("LinkCol"))
     do while tbl1ID <> tbl2ID
         reader2.read()
         tbl2ID = cint(reader2("LinkCol"))
     loop
     [read data, format it, etc]
Loop

This is a simplified version without all the checks to avoid errors and ensure the readers stay in-sync, but it may be helpful to someone who encounters a similar issue where using one reader on two joined tables results in performance issues.

DrCocoa
  • 37
  • 1
  • 10