Merging two tables while deleting duplicates in the first table and

Question

This is my first post here and your help would be greatly appreciated! I've read a lot of other posts on this site, however I cannot find the answer to my specific question. I tried using VLOOKUP, INDEX, MATCH, Pivottables, etc. However it doesn't work out the way I want.

Background information: for my thesis I'm studying the difference in cost of capital among single segment and multi segment firms. My dataset contains two tables, one with the SIC codes of the firm's segments and one with the corresponding sales of that segment. The issue with these tables is that there are duplicate SIC codes in the first table. I want to remove the duplicates from the first table and simultaneously calculate the sum of the sales of these duplicate SIC codes in the second table.

My data looks as follows:

The SIC codes per segment and the sales per segment: Input of SIC codes and sales per segment (one company for 20 years)

What I want to do is eliminate the duplicate SIC codes. If I change the table with SIC codes I also need the table with sales to change accordingly. However, the sales of duplicate segments should not be deleted but added to the first duplicate segment. I can computed this manually for one company, however for 1800 companies would this would be very time consuming. The manually computed output for the SIC codes and for Sales looks like this (so I don't need to merge the table, the output is still in two different tables): Required output for the SIC codes table and Sales table (one company for 20 years

Thanks a lot!

Perhaps my question is easier to understand if you can look at the data file. You can find it here: http://www.filedropper.com/examplemergingdatatables (scanned for viruses) — laudo, Feb 17 '16 at 11:44
I'm struggling to understand what you need to do with the SIC code table - what specifically are you doing with the SIC table input data to get to the output data? Your output SIC table has duplicate values in it, so you've not removed the duplicates as your question states you require... — Simon Wray, Feb 17 '16 at 12:53
You may want to structure your data as proper Excel Tables (http://www.contextures.com/xlExcelTable01.html), which often helps with forumlae. And you may want to look at using the COUNTIFS & SUMIFS functions: http://stackoverflow.com/questions/1880168/excel-structured-reference-table-syntax/11065490#11065490 — Simon Wray, Feb 17 '16 at 12:59

M.L · Answer 1 · 2016-02-17T13:59:02.277

This is the way to go about it. In 2 parts: get a list of unique SIC codes in each row, and, sum up the unique corresponding values to them.

Part I: General logic to this part: nested INDEX functions.

Let's assume your output table (range including headers A25:J45) is exactly below your main/input table (range including headers A1:J21). Assuming only 20 rows of data, but you can drag the formulae to as many rows.

The first column should always be picking up values from the corresponding first column of input table. A26 =A2 and so on.

For B26, use this formula =INDEX($A2:$J2,MATCH(0,INDEX(COUNTIF($A26:A26,$A2:$J2),0,0),0))

You can drag this formula across to J45/end of output table. (You can use "Evaluate Formula" in excel to understand the workings of the logic) This should populate your output table with unique SIC codes for each row.

Part II: General logic to this part: SUM and Arrays

Let's assume your output table (range including headers L25:U45) is exactly below your main/input table (range including headers L1:U21). Assuming only 20 rows of data, but you can drag the formulae to as many rows.

In cell L26, which is the first row, left most element in the output table, you will need an array formula (Ctrl+Shift+Enter). If you don't know what array formula are read here.

Formula for L26 {=SUM(IFERROR($A2:$J2=A26,0)*IFERROR($L2:$U2,0))}.

You will need to enter only this in L26 =SUM(IFERROR($A2:$J2=A26,0)*IFERROR($L2:$U2,0)), and hit Ctrl+Shift+Enter. Excel will put the curly brackets around it on its own self. Copy paste the formula in the rest of the output table.

Screenshot for reference based on your Excel file.

score 0 · Answer 2 · answered Feb 17 '16 at 17:22

0

There is a remove duplicates function in excel 2010 on the Data Tab Screenshot

answered Feb 17 '16 at 17:22

BuddingProgrammer

51
6

Merging two tables while deleting duplicates in the first table and

2 Answers2