0

I’m hoping that someone may be able to help me.

My question relates to SISS, specifically the Lookup Data Flow Item and how it handles NULL values depending on the selected Cache Mode.

I have a very large dataset (72 columns, 37,000,000 records) which uses a Type 2 update methodology.

I use a lookup in the data flow to identify updates to existing record, I match on all of the relevant fields and if all the fields match then obviously the incoming record matches the existing record in the table and it is therefore discarded. If there isn’t a match then a type 2 update is performed.

Due to the large dataset and limited server resources if the Cache Mode of the Lookup is set to Full Cache, it causes the process to fail due to insufficient memory; I have therefore had to switch the Cache Mode to Partial Cache. This resolves the memory issue, but causes another issue. For some reason in Partial Cache mode a NULL value from the table does not match a NULL value in the incoming records, while if the Cache Mode is set to Full Cache then it does.

This behaviour seams quite odd and I am unable to find it documented anywhere. One way round it could be to coalesce the NULL values, but this is something I would like to avoid.

Any help would be much appreciated.

Cheers Ben

Ben
  • 1
  • 1
  • http://stackoverflow.com/questions/7233766/ssis-lookup-is-not-dealing-with-nulls-like-the-docs-say-it-should/7236036#7236036 Documentation does exist ;) – billinkc Jul 18 '14 at 14:55
  • Do you really need to pull back all 72 columns in your lookup? I, sadly, had to regularly pull back comparable data volumes for a lookup but I was able to distill it down to 2 hashes - my match key and a change detection key – billinkc Jul 18 '14 at 14:57
  • I guess I was looking for an explanation as to why NULLs match in Full Cache Mode, but they do not match in Partial Cache Mode. I don't suppose you have any idea why this is the case? – Ben Jul 18 '14 at 15:02
  • Why? Because the [msdn](http://msdn.microsoft.com/en-us/library/ms141821(v=sql.105).aspx) docs state that they do in full cache mode and don't in partial/none. Why they made that design decision is beyond my paygrade – billinkc Jul 18 '14 at 15:12

1 Answers1

1

No Cache and Partial Cache Modes use the database engine to match. In most database engines (SQL Server included) NULL does not equal NULL. NULL means an unknown value so you will never get a match. Do an isnull on all your nullable col