
I have a dataset containing the admission rates of all the providers we work with. I need to divide the data into quartiles so that each provider can see where its rate falls relative to other providers. The rates range from 7% to 89%. Can anyone suggest how to do this? I am not sure this is the right place to ask, but I would really appreciate any help.

My other concern is small counts. If a provider's numbers are very small, e.g. 2/4 = 50%, the provider might fall into a worse quartile, but that does not mean its performance is bad, because a rate based on so few cases is unreliable. I hope this makes sense. Please let me know if I can clarify further.

Brian Tompsett - 汤莱恩
datacentric

2 Answers


First concern: for small n, do not use quartiles. What counts as "small" is a judgment call; there is no universal cut-off.

Josh Lee
Jefftopia

There are ways to obtain quantiles without doing a complete sort, but unless you have a huge amount of data there is no point in implementing those algorithms yourself if they are not already available to you. Assuming you have a sort() function, all you need to do is:

  1. Start with n data points.
  2. Sort the data points in ascending order.
  3. Take the values at positions n/4, n/2 and 3*n/4 of the sorted data; these are the cut points between your quartiles.
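The steps above can be sketched in Python roughly as follows. This is a minimal sketch that uses the simple index rule from step 3 with no interpolation; the function names `quartile_cut_points` and `quartile_of` and the sample rates are made up for illustration.

```python
def quartile_cut_points(rates):
    """Return the three cut points using the simple n/4, n/2, 3n/4 index rule."""
    ordered = sorted(rates)  # ascending sort
    n = len(ordered)
    if n < 4:
        raise ValueError("need at least 4 data points")
    return ordered[n // 4], ordered[n // 2], ordered[(3 * n) // 4]


def quartile_of(rate, cuts):
    """Return 1..4: the quartile a rate falls into, given the cut points."""
    q1, q2, q3 = cuts
    if rate < q1:
        return 1
    if rate < q2:
        return 2
    if rate < q3:
        return 3
    return 4


# Hypothetical admission rates (percent), one per provider.
rates = [7, 12, 18, 25, 31, 40, 52, 89]
cuts = quartile_cut_points(rates)
groups = [quartile_of(r, cuts) for r in rates]
```

Library routines such as `statistics.quantiles(rates, n=4)` interpolate between neighbouring points, so their cut points may differ slightly from this index rule.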

As you say, if n is below some threshold (which you will have to decide for yourself), you may want to report the quartile result as "not applicable" or similar.
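One possible way to apply such a threshold to the providers in the question is sketched below. It assumes each provider has a count of admissions and a total number of cases; the name `label_providers`, the `MIN_CASES` value of 20, and the sample counts are all hypothetical, and the right threshold depends on your data.

```python
from bisect import bisect_right

MIN_CASES = 20  # hypothetical threshold; pick your own


def label_providers(counts, min_cases=MIN_CASES):
    """Map each provider to a quartile (1-4) or "not applicable".

    counts: dict of provider name -> (admissions, total_cases).
    """
    # Compute rates only for providers with enough cases to be meaningful.
    rates = {name: adm / total
             for name, (adm, total) in counts.items()
             if total >= min_cases}
    ordered = sorted(rates.values())
    n = len(ordered)
    labels = {name: "not applicable" for name in counts}
    if n < 4:
        return labels  # too few eligible providers to form quartiles
    cuts = [ordered[n // 4], ordered[n // 2], ordered[(3 * n) // 4]]
    for name, rate in rates.items():
        # bisect_right counts how many cut points are <= rate.
        labels[name] = bisect_right(cuts, rate) + 1
    return labels


# Hypothetical data: provider "A" has only 4 cases (2/4 = 50%).
counts = {
    "A": (2, 4),
    "B": (10, 100),
    "C": (30, 100),
    "D": (45, 100),
    "E": (80, 100),
}
labels = label_providers(counts)
```

Here small providers are left out of the cut-point calculation entirely, so a rate like 2/4 neither receives a quartile nor shifts the quartiles of the other providers.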

Simon
  • Sort data *ascending*; otherwise the interpretation doesn't make sense. – Jefftopia Jul 16 '13 at 20:54
  • Thank you all for your responses. I will try out your suggestions. @Jefftopia - If a small n should not be placed into quartiles, then where would a small n go? – datacentric Jul 17 '13 at 14:26
  • I think the answer to your question depends on what you want to accomplish. In general, I would advise against using quartiles because they do not capture important information about the distribution of the numbers well. For small `n`, I suppose I would stop at step 2) in @Simon's post. – Jefftopia Jul 17 '13 at 14:35