-2

I have several tables in a SQL Server database, two of them (Table1 and Table2) I would like to select a specific subset from, to fill in a third table (Table3).

In Table1 there are 25 columns, from which I am only interested in three, let's call them Col1, Col2 and Col3. All three are not unique in this table, but I would like to extract the unique pairs, as follows:

  • Col1 + Col2 = Unique Key for Table3.
  • Col3 + Col2 = Optional, foreign key into Table2.

To extract the unique keys for Table3 from Table1 the following SQL works fine:

SELECT Col1, Col2 
FROM Table1 
GROUP BY Col1, Col2

However this is missing Col3. The first problem is that Col3 can't simply be added as part of the GROUP BY as there can be different values for it, which causes duplicate combinations of Col1 + Col2 to be returned.

This is where Table2 comes into play; Col3 + Col2 form a unique key into Table2, but not every combination is present (which is helpful), as a JOIN can be used to filter away the invalid combinations:

SELECT a.Col1, a.Col2, a.Col3 
FROM Table1 a
JOIN Table2 b ON b.Col3 = a.Col3 AND b.Col2 = a.Col2
GROUP BY a.Col1, a.Col2, a.Col3

Now my final problem, unfortunately there are some (very few) combinations that do result in duplicate Col1 + Col2 keys for Table3.

If we assume it is okay to lose some Col3 values, how can I write a SELECT to extract the three columns, ensuring that the combination Col1 + Col2 is unique? And if possible keeping a Col3 value that provides a valid key combination in Table2.

I've messed about with adding TOP 1 but I've failed in getting anything to work to my liking...

EDIT: Example data as requested.

Table1
| Col1 | Col2 | Col3 |
|  100 |   00 |  010 |
|  100 |   10 |  020 |
|  200 |   00 |  030 |
|  300 |   00 |  040 |
|  300 |   00 |  040 |
|  400 |   10 |  050 |
|  400 |   10 |  060 |
|  400 |   10 |  070 |

Table2
| Colx | Col2 | Col3 |
|  car |   00 |  010 |
|  cat |   10 |  030 |
|  dog |   00 |  040 |
|  bee |   10 |  040 |
|  eye |   10 |  060 |
|  bit |   10 |  070 |

Table3
| Col1 | Col2 | Col3 |
|  100 |   00 |  010 |
|  100 |   10 |  020 |
|  200 |   00 |  030 |
|  300 |   00 |  040 |
|  400 |   00 |  060 |

The third table shows the result I am looking for - the table only contains unique combinations of Col1 + Col2 and also contains a Col3 values, preferably one that provides a value combination with Col2 in the second table (ie. the last entry, 400, 00, 060).

I hope this provides a little more clarity.

Eric
  • 1,321
  • 2
  • 15
  • 30

2 Answers2

1

Maybe this way?

SELECT a.Col1, a.Col2, Max(a.Col3)
FROM Table1 a
LEFT JOIN Table2 b ON b.Col3 = a.Col3 AND b.Col2 = a.Col2
GROUP BY a.Col1, a.Col2
MundoPeter
  • 704
  • 6
  • 12
0

"The first problem is that Col3 can't simply be added as part of the GROUP BY as there can be different values for it, which causes duplicate combinations of Col1 + Col2 to be returned."

You can put the queries for different values into a select union subquery as a derived table, and group by on the derived table.