-2

Need to assign a unique group Id based on the combination of multiple columns values from same table. Values are sometimes unique.

Any form of SQL, SSIS, DQS will work for below request. Tried DQS but does not reach to destination. Table has about 2+ millions rows.

For example first_name, last_name, and Email_address values match but PhoneNumber is different. Everything match but customerId might be different.

Below are sample data. Final output in last column called finalgroupid.

customerid  EmailAddress    First_Name  Middle_Name LastName    Address                                     Phone       Final Group ID 
100         j@A.cooom           JNE                  PIN    1 Wills PL Australia NJ 90210                   1111111111      A123
206                             JNE                  PIN    1 Wills PL Australia NJ 90210                                   A123
206         j@A.cooom           JNE                  PIN    1 Wills PL Australia NJ 90210                   1111111111      A123
206         j@A.cooom           JNE                  PIN    1 Wills PLACE Australia NJ 90210                1111111111      A123
206         j@A.cooom           JNE                  PIN    22 OX ST EW #A HURAT MA 30000                   1111111111      A123
206         j@A.cooom           JNE                  PIN    22 OX ST EW UNIT A HURAT MA 30000               1111111111      A123
206         j@A.cooom           JNE                  PIN    22 OX ST EW UNIT A HURAT MA 30000               2222222222      A123
206         j@A.cooom           JNE                  PIN    C/O I AM THE BEST 35 FABULOUS OUTBACK CT 12345  1111111111      A123
206         j@A.cooom           JNE                  PIN    C/O I AM THE BEST 35 FABULOUS OUTBACK CT 12345  2222222222      A123
712         j@A.cooom           And                  PIN    1 MOUNT Everest Frace GA 54358                  2222222222      A123
712         j@A.cooom           JNE                  PIN    1 Wills PL Australia NJ 90210                   1111111111      A123
791         MA@nose.cooom       M H                  JSN    63 AL RD VALLEY NM 10001                        8888888888      A124
791         MA@nose.cooom       MAR                  JSN    21 SUPERHIT CIR BOMBAY HI 11231                 8888888888      A124
791         MA@nose.cooom       MAR     H            JSN    63 AL RD VALLEY NM 10001                        8888888888      A124
799         MA@hair.cooom       M H                  JSN    63 AL RD VALLEY NM 10001                        8888888888      A124
799         MA@hair.cooom       MAR     H            JSN    63 AL RD VALLEY NM 10001                        8888888888      A124
799         MA@hair.cooom       SEA                  JSN    21 SUPERHIT CIR BOMBAY HI 11231                 8888888888      A124
805         MA@nose.cooom       M H                  JSN    27 W WOOD Mtn VALLEY NM 10000                   8888888888      A124
805         MA@nose.cooom       MAR     H            JSN    27 W WOOD Mtn VALLEY NM 10000                   8888888888      A124
805         MA@nose.cooom       V P                  C H    21 SUPERHIT CIR BOMBAY HI 11231                 8888888888      A124
805         MA@nose.cooom       VICTOR               HKS    21 SUPERHIT CIR BOMBAY HI 11231                 8888888888      A124
805         MA@nose.cooom       VICTOR  P            CAS    21 SUPERHIT CIR BOMBAY HI 11231                 8888888888      A124
809         MA@nose.cooom       M H                  JSN    27 W WOOD Mtn VALLEY NM 10000                   8888888888      A124
809         MA@nose.cooom       MAR                  STK    21 SUPERHIT CIR BOMBAY HI 11231                 8888888888      A124
809         MA@nose.cooom       MAR     H            JSN    27 W WOOD Mtn VALLEY NM 10000                   8888888888      A124
815         k@Y.cooom           M H                  JSN    27 W WOOD Mtn VALLEY NM 10000                   8888888888      A124
815         k@Y.cooom           MAR     H            JSN    27 W WOOD Mtn VALLEY NM 10000                   8888888888      A124
815         k@Y.cooom           MAV                  JSN    21 SUPERHIT CIR BOMBAY HI 11231                 8888888888      A124
815         k@Y.cooom           MAV                  JSN    27 W WOOD Mtn VALLEY NM 10000                   8888888888      A124

Please help. Thanks in advance.

Felix Pamittan
  • 31,544
  • 7
  • 41
  • 67
  • Your match logic isnt clear. Please explain how you group the data. – Juan Carlos Oropeza Sep 15 '15 at 04:27
  • First try to match LastName & FirstName then match phone then match email then address & then customerid. – user749525 Sep 15 '15 at 14:06
  • Again what is the logic? If matach LastName and FirstName have a single group ID? – Juan Carlos Oropeza Sep 15 '15 at 15:36
  • LastName & FirstName Match give one interim GroupId 1 Then Match Phone# and give interim GroupId 2 Then Match Email and give Interim GroupId 3 Then match Address and give Interim GroupId4 Then Match CustomerId and give Interim GroupId5 Based on all these GroupdIDs decide that this is a single customer and give that customer Unique group ID. – user749525 Sep 15 '15 at 16:41
  • You still havent provide a logic. Please read [**How to ask**](http://stackoverflow.com/help/how-to-ask) And [**How to create a Minimal, Complete, and Verifiable example.**](http://stackoverflow.com/help/mcve) – Juan Carlos Oropeza Sep 15 '15 at 16:43
  • LastName & FirstName Match give one interim GroupId 1 Then Match Phone# and give interim GroupId 2 Then Match Email and give Interim GroupId 3 Then match Address and give Interim GroupId4 Then Match CustomerId and give Interim GroupId5 Based on all these GroupdIDs decide that this is a single customer and give that customer Unique group ID. – user749525 Sep 15 '15 at 16:54

1 Answers1

0

You can use DENSE_RANK() SQL Server function for your requirement

Here is an example

select 
    groupno = DENSE_RANK() over (order by col1, col2), *
from myTable
Eralper
  • 6,461
  • 2
  • 21
  • 27