0

I am using a nested IF statement within a Quartile wrapper, and it only kind of works, for the most part because it's returning values that are slightly off from what I would have expected if I calculate the range of values manually.

I've looked around but most of the posts and research is about designing the fomrula, I haven't come across anything compelling in terms of this odd behaviour I'm observing.

My formula (ctrl+shift enter as it's an array): =QUARTILE(IF(((F2:$F$10=$W$4)($Q$2:$Q$10=$W$3))($E$2:$E$10=W$2),IF($O$2:$O$10<>"",$O$2:$O$10)),1)

The full dataset: 0.868997877* 0.99480118 0.867040346* 0.914032128* 0.988150438 0.981207615* 0.986629288 0.984750004* 0.988983643*

*The formula has 3 AND conditions that need to be met and should return range: 0.868997877 0.867040346 0.914032128 0.981207615 0.984750004 0.988983643

At which 25% is calculated based on the range.

If I take the output from the formula, 25%-ile (QUARTILE,1) is 0.8803, but if I calculate it manually based on the data points right above, it comes out to 0.8685 and I can't see why.

I feel it's because the IF statements identifies slight off range but the values that meet the IF statements are different rows or something.

Youl
  • 13
  • 3
  • Sorry, there was a typo in my question, I meant to say 25% not 75% - the question still stands though: the two different approaches are returning different values – Youl May 25 '18 at 09:07
  • I don't know why this has been downvoted - it's actually a rather interesting question. Boils down to different methods of calculating quartiles - I will try and post something. – Tom Sharpe May 25 '18 at 10:10

1 Answers1

0

If you look at the table here you can see that there is more than one way of estimating quartile (or other percentile) from a sample and Excel has two. The one you are doing by hand must be like Quartile.exc and the one you are using in the formula is like Quartile.inc

Basically both formulas work out the rank of the quartile value. If it isn't an integer it interpolates (e.g. if it was 1.5, that means the quartile lies half way between the first and second numbers in ascending order). You might think that there wouldn't be much difference, but for small samples there is a massive difference:

Quartile.exc Rank=(N+1)/4

Quartile.inc Rank=(N+3)/4

Here's how it would look with your data

enter image description here

Tom Sharpe
  • 30,727
  • 5
  • 24
  • 37
  • Thanks so much, Tom - that indeed looks like the root of the output behavior I see! Luckily for me I just need to ensure I am being consistent so I think in this case I will use Quartile.inc in both calcs. Thank so much for this education, extremely helpful – Youl May 25 '18 at 11:22
  • Happy to help. I can't advise you on which quartile to use, but sounds like it doesn't matter for you as long as it's consistent, and as far as I can see there is no 'wrong' way of doing it. – Tom Sharpe May 25 '18 at 13:36