
I have an array of N positive points. I would like to find the bin edges of a histogram with M bins such that all the bars have the same height. In other words, I want to find M+1 points such that the number of array points between any two consecutive bin edges is the same.

Example

>>> array = [0.3, 0.3, 0.3, 0.7, 0.8, 0.9]
>>> M = 2
>>> binPartition(array, M)
[0, 0.5, 1]

I would appreciate an answer in python and numpy but a link to a known algorithm will suffice! Thank you! :)

Ant
  • A code writing request from a user with almost 3k rep? – Eugene Sh. Aug 14 '17 at 16:55
  • @EugeneSh. A link to an algorithm would have been enough :) – Ant Aug 14 '17 at 16:57
  • You should know that that would be OT as well – Eugene Sh. Aug 14 '17 at 16:58
  • @EugeneSh. Why is that? I've seen thousands of questions like that. In this particular case I had a brain fart and did not realize I was essentially asking about the percentile function, but I did not know that when I asked the question :D – Ant Aug 14 '17 at 17:00
  • Number (4) here: https://stackoverflow.com/help/on-topic – Eugene Sh. Aug 14 '17 at 17:02
  • @EugeneSh. Exactly. I had a problem and asked for help. I did not seek a recommendation for a book or an article or whatever. What was I supposed to do instead? If I had realized I was asking for the percentile, I would have looked it up on Google and I wouldn't have asked the question. Also, there are thousands of questions like this. See here https://stackoverflow.com/questions/22354659/find-value-that-partitions-two-numpy-arrays-equally?rq=1 , here https://stackoverflow.com/questions/12863059/python-how-to-make-an-histogram-with-equally-sized-bins?rq=1 and basically everywhere else – Ant Aug 14 '17 at 17:06
  • I have pointed out that this particular question is not a good fit with the current SO rules (it is actually violating at least two of them). The reason for it, the number of similar questions and the fact that you have actually got an answer are really irrelevant. – Eugene Sh. Aug 14 '17 at 17:10
  • @EugeneSh. I disagree that it is violating any rule. The real problem here is the level of the question. If I had written the exact same thing, describing a harder problem, nobody would have had an issue with it. I've seen it happen lots of times (especially on math.se, where I am more active). In any case, you did the right thing to vote to close and downvote, if you disagree, but I honestly don't see the issue. – Ant Aug 14 '17 at 17:14

1 Answer


That can be done with np.percentile:

import numpy as np

def binPartition(array, M):
    return np.percentile(array, np.linspace(0, 100, M + 1))

>>> binPartition([0.3, 0.3, 0.3, 0.7, 0.8, 0.9], 2)
array([ 0.3,  0.5,  0.9])
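Note that the outer edges returned are the minimum and maximum of the data (0.3 and 0.9 here), not 0 and 1 as in the question's example. One way to check that the edges really split the data evenly is to pass them to np.histogram and look at the counts. The following is only a minimal sketch: the name bin_partition, the random test data and the seed are assumptions made for illustration, not part of the question.

```python
import numpy as np

def bin_partition(values, m):
    """Return m + 1 bin edges placed at evenly spaced percentiles of values."""
    return np.percentile(values, np.linspace(0, 100, m + 1))

# Hypothetical test data: 1000 uniform samples, fixed seed chosen arbitrarily.
# Continuous data makes repeated values very unlikely.
rng = np.random.default_rng(0)
samples = rng.random(1000)

edges = bin_partition(samples, 4)

# np.histogram accepts an explicit array of edges; the last bin is closed
# on the right, so the maximum value is counted too.
counts, _ = np.histogram(samples, bins=edges)
print(edges)
print(counts)
```

With distinct values the counts should come out equal, or differ by at most one when N is not a multiple of M. With heavily repeated values, such as the three 0.3 entries in the question, several percentiles can coincide and the counts will no longer be equal.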
jdehesa
  • Thank you. I must have had a brain fart, I was asking for the percentile function :D – Ant Aug 14 '17 at 17:01