4

Objective: I want to be able to remove the common/ duplicate columns, from the output of a JOIN between 2 tables.

Problem: Most of the other questions that I have checked here on this subject, talk about explicitly mentioning the column names after select

For Instance: SQL Server : removing duplicate column while joining tables

The tables that I am dealing with usually have 20+ columns each, and it is not optimal to manually write each column name separately.

So if TableA has A columns and TableB has B columns, with one common column between them, then the final number of non-suplicate columns from a JOIN would be A+B-1, where A, B > 20

Existing Query:

-- Following query includes the duplicate column in the result
SELECT *
FROM TABLEA A
INNER JOIN TABLEB B ON (A.ID = B.ID)
WHERE <SOME_CONDITION>

Other solutions that I have come across:

-- Following query omits the duplicate column in the result
SELECT A.ID, A.A1, A.A2, A.A3, A.A4, A.A5, A.A6, A.A7, A.A8, A.A9, A.A10,
    B.A1, B.B2, B.B3, B.B4, B.B5, B.B6, B.B7, B.B8, B.B9, B.B10
FROM TABLEA A
INNER JOIN TABLEB B ON (A.ID = B.ID)
WHERE <SOME_CONDITION>
Arsenic_33
  • 43
  • 1
  • 3
  • Your best option here is to just list out the column names explictly. You would need dynamic SQL to implement verbatim what you have in mind. – Tim Biegeleisen Nov 28 '19 at 13:15
  • This kind of problem is highly symptomatic of poor design. – Strawberry Nov 28 '19 at 13:20
  • @Strawberry: I felt the same prima facie. The DB architect that I spoke with says that these tables carry multiple columns to avoid more number of joins which in turn would increase the (query execution + data fetching) time for real time dashboards. – Arsenic_33 Nov 29 '19 at 08:45
  • I ain't buying it – Strawberry Nov 29 '19 at 08:50

2 Answers2

3

If the only duplicate columns are the JOIN columns, you can fix this with USING:

SELECT *
FROM TABLEA A INNER JOIN
     TABLEB B 
     USING (id)
WHERE <SOME_CONDITION>;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

If you don't want to write down all column names then you need to use dynamic SQL to select distinct column names from table schema and then execute your query string.

Example:

SELECT GROUP_CONCAT(DISTINCT(COLUMN_NAME) SEPARATOR ', ')
INTO   @ditinct_columns
FROM   information_schema.columns
WHERE  table_name IN ('A', 'B');

SET @q = CONCAT("SELECT ", @ditinct_columns, " FROM A AS a INNER JOIN B AS b ON <SOME_CONDITION>");

SELECT @q;

/* execute*/
PREPARE stmt FROM @q;
EXECUTE stmt;
DEALLOCATE PREPARE stmt
blackbishop
  • 30,945
  • 11
  • 55
  • 76
  • Able to understand this logically. GROUP_CONCAT is having trouble with Vertica (via DBeaver out of box). For instance, I've replaced information_schema.columns with v_catalog.columns – Arsenic_33 Nov 29 '19 at 08:34