0

In an ssis Dataflow there is a lookup component ,which lookup on a table with 18 million records.I have configured the lookup with full cache.

   Default buffer size :20485760 
Default Buffer Max rows: 100000 
The lookup join is based on an ID column of varchar(13)type

It gives an error as shown below.What lookup configuration is suitable to cache these many records

Error: The buffer manager cannot write 8 bytes to file "C:\Users\usrname\AppData\Local\Temp\16\DTS{B98CD347-1EF1-4BC1-9DD9-C1B3AB2B8D73}.tmp". There was insufficient disk space or quota.

what would be the difference in performance if i use a lookup with no cache? I did understand that in full cache mode ,the data is cached before pre execute stage and do not have to go back to database.This full cache memory takes large amount of memory and add aditional startup time for the data flow.My question is what configuration do i have to setup in order to handle large amount of data in full cache mode

Whats the solution if the lookup table has million records (and they dont fit in a full cache)

user1254579
  • 3,901
  • 21
  • 65
  • 104

2 Answers2

0

Use a Merge Join component instead. Sort both inputs on the join key, specify inner/left/full joins based on your specification. Use the different outputs to get functionality like the lookup component.

Merge Join usually performs better on larger datasets.

cairnz
  • 3,917
  • 1
  • 18
  • 21
  • But..it would decrease the performance rt?..Sorting is an expensive operation ,also in this case i need to use 2 sort components ! – user1254579 Mar 12 '14 at 15:28
  • 1
    Sorting in your source systems is much faster than using both the lookup and the sort component in ssis. Sort in the source (use SQL statements, ORDER BY) and use advanced editor to specify that the output is sorted, and specify the sort keys. The Merge Join will pick up this metadata and apply the keys for joining automatically. – cairnz Mar 13 '14 at 13:15
  • 1
    See here for details: http://www.ssistalk.com/2009/09/17/ssis-avoiding-the-sort-components/ (was too late to edit comment) – cairnz Mar 13 '14 at 13:21
0

You can set Buffertempstoragepath property in SSIS to some of the fastdrives as Blobtempstoragepath and buffertempstoragepath will be using temp and tmp system variables. So if the tmp variable cannot hold the large dataset in you case, its using lookup transformation. So the largedataset will be using the drive space and will perform the job for you.

Pavan
  • 1