3

Could you please advise what would be the best way to create a union column for 12 separate columns (located in 12 different Excel sheets within a workbook) with or without VBA?

There are good manuals how to do it for two columns without VBA (using MATCH function) however I am not sure how to approach the case with multiple columns.

pnuts
  • 58,317
  • 11
  • 87
  • 139
Stan
  • 41
  • 1
  • 1
  • 2
  • Can you give an example of the source data and what you want the result to be? – nekomatic Jul 29 '15 at 09:36
  • I have 12 sheets, each contains number IDs of projects in a column. I would like to have a separate sheet which would create a column with the union of all the individual project ID columns from the separate sheets (possibly in ascending order). Thank you! – Stan Jul 29 '15 at 09:43
  • The [Union method](https://msdn.microsoft.com/en-us/library/office/aa213609(v=office.11).aspx) isn't going to aggregate different ranges from different worksheets. You need to loop through the worksheets and collect the column data to a centralized location then de-dupe and sort. –  Jul 29 '15 at 10:06
  • I think your options are either a) to write a VBA macro that copies and pastes the appropriate number of cells from each source sheet to a single column on the destination sheet, then sorts and removes duplicates from that column, or b) to define each source range as an Excel table, then use MS Query via the Get External Data command and hand-write an SQL query using `SELECT DISTINCT`, `UNION` and `ORDER BY` to return the data you want. – nekomatic Jul 29 '15 at 10:51

3 Answers3

1

I think can be achieved with multiple consolidation ranges for a PivotTable. Would need labels for the columns and more than one column per sheet (could clone the existing ones). Should sort and remove duplicates from the list automatically (if cloned).

Community
  • 1
  • 1
pnuts
  • 58,317
  • 11
  • 87
  • 139
1

EDIT:

I'll assume your IDs are all numeric (otherwise, sorting would be very tricky if not impossible without VBA). You could modify the following array formula to meet your needs (select an area with enough rows to hold the full stack of IDs, enter the formula, then commit the formula with ctrl+shift+enter):

=SMALL(IFERROR(CHOOSE(COLUMN(INDIRECT("C1:C12",FALSE)),Sheet1!A1:A73,Sheet2!A1:A70,Sheet3!A1:A79,Sheet4!A1:A58,Sheet5!A1:A51,Sheet6!A1:A94,Sheet7!A1:A50,Sheet8!A1:A89,Sheet9!A1:A75,Sheet10!A1:A89,Sheet11!A1:A70,Sheet12!A1:A94),FALSE),ROW(INDIRECT("1:"&COUNT(Sheet1!A1:A73,Sheet2!A1:A70,Sheet3!A1:A79,Sheet4!A1:A58,Sheet5!A1:A51,Sheet6!A1:A94,Sheet7!A1:A50,Sheet8!A1:A89,Sheet9!A1:A75,Sheet10!A1:A89,Sheet11!A1:A70,Sheet12!A1:A94))))

I'll use a smaller version (2 columns) to explain how it works:

=SMALL(IFERROR(CHOOSE(COLUMN(A1:B1),A1:A73,C1:C70),FALSE),ROW(1:143))

First, COLUMN(A1:B1) returns a horizontal array of integers between 1 and 2. Passing this to the CHOOSE function with the two single-column ranges creates a single 73 x 2 array from both A1:A73 and C1:C70 (instead of creating a jagged array, the last three values of the second column will be filled in with #NA).

Wrap the result with IFERROR to convert the three #NA values to FALSE (otherwise, SMALL will return an error).

Next, ROW(1:143) returns a vertical array of integers between 1 and 143. Passing the 73 x 2 array and the array of integers between 1 and 143 to SMALL will return a single 143 x 1 array (vertical) of the sorted values (the three FALSE values are ignored).

Note on INDIRECT: Using INDIRECT in this way makes the formula stable even if rows/columns are deleted; however, it also makes the formula volatile, which will cause it to be recalculated every time there is a change in the workbook, which could slow things down considerably. Another option is INDEX (e.g., ROW(A1:INDEX(A:A,COUNT(...))), which can be affected by row/column deletions, but isn't volatile.

jblood94
  • 10,340
  • 1
  • 10
  • 15
  • This part doesn't seem to work: "Passing this to the CHOOSE function with the two single-column ranges creates a single 73 x 2 array". It doesn't seem to be the case and It is creating a horizontal array of two elements, which corresponds to only the first row of the would be 73x2 array. – Maziar Rezaei Aug 09 '21 at 11:38
  • @MaziarRezaei You are correct. There were a couple errors in the answer. I fixed the errors and simplified the formulas a bit. – jblood94 Aug 09 '21 at 12:30
  • Thanks a lot! I did fix the issue with Column(xxx) call, but was using google sheets and I just tried in Excel. It seems to be a bug in Google Sheets. in Excel, it worked as expected. – Maziar Rezaei Aug 09 '21 at 21:18
  • In Google Sheets, the equivalent of the formula to create the 73x2 array is to use curly brace syntax. Here it would be: ={A1:A73,C1:C70} – Maziar Rezaei Aug 09 '21 at 21:42
0

if you don't mind a bit of manual effort, this works for numeric and non numeric IDs:

  1. Stack columns on top of each other manually using Ctrl-C + Ctrl-V
  2. Go to Data tab --> Filter --> Advanced Filter --> tick unique records only --> choose your copy to location

This simple two step process would then give you unique union of two columns. Obviously the higher the number of columns, the more the utility of a VBA approach.

Oink
  • 1
  • Gotta watch out for those 7 year old questions. :) Hopefully Stan has sorted the problem by now - hasn't been seen on the site for the last six years. – Darren Bartrup-Cook Jan 18 '22 at 14:22