1

Okay I need some help before I pull my hair out :P

What I ultimately want to do is find the second vlookup value. The basic design is 1. Find value row number with a MATCH formula 2. Then use INDIRECT formula to shift the table array down one from the original VLOOKUP

Sounds simple. And I've gotten it to work perfectly fine in a single workbook, but my VLOOKUP needs to find values in a second workbook.

In my head the formula below should work but obviously it's not.

=VLOOKUP(A1,'[InsightsDataExport.xlsx]Lead Profile'!(INDIRECT("AP"&(((MATCH(A1,'[InsightsDataExport.xlsx]Lead Profile'!$AP:$AP,0))+1))):AR10000,2,FALSE)

So my though was

  1. Looks at Workbook 1 A1 cell
  2. Table array in Workbook 2 (aka InsightsDataExport)
  3. INDIRECT pairs column "AP" with the row number. The row number is found using a MATCH formula to find the first occurrence of A1 value form workbook 1 in workbook 2 AP columns then adds 1 (hence the table array begins AFTER the first occurrence)

I'm just going around and around in circles here. If someone could point me in the right direct that would be greatly appreciated.

Jerremy Leedham
  • 97
  • 4
  • 13

1 Answers1

2

Rather than trying to fix your formula, I'd rather show you a cleaner approach.

INDIRECT is a powerful but Volatile function that will always recalculate with every change in a workbook. If you're using it in an entire column, it can slow your workbook down immensely. You're also stuck searching an arbitrarily large area which has it's own risks and performance issues.

You can find the row number of the 2nd instance of a value using an array-entered function. Double click in a cell so you're in edit mode, paste in this formula and then confirm it using CRTL-SHIFT-ENTER:

=SMALL(IF('[InsightsDataExport.xlsx]Lead Profile'!$AP1:$AP10000=A1,ROW($A$1:$A$10000),FALSE),2)

The inside parts of this formula do the following:

'[InsightsDataExport.xlsx]Lead Profile'!$AP1:$AP10000=A1

creates an array of TRUE and FALSE values depending on whether each cell in AP1:AP10000 matches the value in the local A1

ROW($A$1:$A$10000)

creates an array of numbers from 1 to 10000 that are used if a cell matches A1, otherwise FALSE is used. So, if the value you're searching for is found in cells 2 and 5, the inside part of the formula has an array like:

{FALSE, 2, FALSE, FALSE, 5, FALSE....}

Then, SMALL(array,2) means to take the 2nd smallest numerical value, which will be the row number of the 2nd match, which in this case in 5.

To see it in slow motion, test it on a small range, then use the Evaluation Function tool from the Formulas tab.

Now that you have the row number, you can use an INDEX on the column you're actually searching for:

=INDEX('[InsightsDataExport.xlsx]Lead Profile'!$AR1:$AR10000,rownumber)

However, this is still operating over an arbitrarily large range to ensure it encompasses all values. This is poor practice for several reasons. Mainly, it never guarantees you include all data; in particular, if rows are being deleted the range will keep shrinking.

It's much tidier and easier to read if you work with tables instead.

If you insert a table onto your data, you can refer to column names instead. The functions will then search through only the number of rows that there are in the table.

On your Data, use Insert > Table. Then when you create formulas, if you select all data rows for a column in the table, excel will automatically insert the column name. So, if column AP has a heading called "ID", the formula to identify the row with the 2nd matching ID becomes:

=SMALL(IF(InsightsDataExport.xlsx!Table1[Id]=A1,ROW(InsightsDataExport.xlsx!Table1[Id]),FALSE),2)

You can then also use INDEX on the column that has the data you want to return. Just note that if you select only the data rows excluding the column heading, then row 2 is actually the first row of the index being searched, so you'll need to subtract 1 from the row number. If the data is in a column with heading "Amount", you'll use a formula like:

=INDEX(InsightsDataExport.xlsx!Table1[Amount],rownumber - 1)
Michael
  • 4,563
  • 2
  • 11
  • 25
  • Can't thank you enough! Not just for a solution but for taking the time to actually explain it :D – Jerremy Leedham Feb 01 '17 at 23:08
  • @michael - Great explanation, thank you! `INDIRECT is a powerful but Volatile function that will always recalculate with every change in a workbook. If you're using it in an entire column, it can slow your workbook down immensely.` ...I did not realize that... it would be nice if Excel would provide a warning when filling 10,000 cells with formulas that behave this way! – ashleedawg Nov 30 '17 at 05:32