-3

The question details are shown in the picture. Thanks for your help.

Write a function histogram(values, dividers) that takes as argument a sequence of values and a sequence of bin dividers, and returns the histogram as a sequence of a suitable type (say, an array) with the counts in each bin. The number of bins is the number of dividers + 1; the first bin has no lower limit and the last bin has no upper limit. As in (a), elements that are equal to one of the dividers are counted in the bin below.

For example, suppose the sequence of values is the numbers 1,..,10 and the bin dividers are array(2, 5, 7); the histogram should be array(2, 3, 2, 3).

Here is my code

def histogram(values, dividers):
    count=0
    for element in values:
          index=0
          i=0
          count[i]=0
          while index < len(dividers) - 2:
              if element <= dividers[index]:
                  i=dividers[index]
                  count[i] += 1
                  index=len(dividers)
              elif element > dividers[index] and element <= dividers[index+1]:
                  i=dividers[index]
                  count[i] += 1
                  index= len(dividers)
              index += 1
      return count[i]
PoByBolek
Jayyyyyy
  • 1
    What did you try? Please paste the question instead of sharing an image. – Shiva Sep 08 '19 at 08:18
  • 1
    Please do not use stackoverflow to solve your school/university tasks. Try to solve it, and if you have a more specific question, ask it here. – Snackoverflow Sep 08 '19 at 08:21
  • This is not a university task; it is a lab question. I just want to know how to solve it. My problem is that I do not know how to store the counts in a list, because the number of bins is not fixed; it depends on the input. – Jayyyyyy Sep 08 '19 at 08:33
  • not able to understand question :/ – Lalit Verma Sep 08 '19 at 08:41
  • Sorry, for example, the sequence of values is the numbers 1,..,10 and the bin dividers are array(2, 5, 7); the histogram should be array(2, 3, 2, 3). But the number of bins would change. If the (2,5,7) becomes (2,5,7,9) then we have 4 bins which are (0,2),(2,5),(5,7),(7,9) Then we get a output like [a,b,c,d]. And a b c d are the number of items that fall into these bins. To get this output we need to count the numbers of items that fall into these bins. – Jayyyyyy Sep 08 '19 at 08:50
  • But I do not know how to name the different bins. What I mean is: when a number falls in a bin, that bin's count should increase by 1, but how do I add 1 to different bins and return a list like [a,b,c,d]? – Jayyyyyy Sep 08 '19 at 08:50
  • how the output of given example is 2 3 2 3, bin ranges are (-ve,2) (2,5)(5,7)(7,+ve) – Lalit Verma Sep 08 '19 at 08:57

2 Answers

1
from bisect import bisect_left
# Use the standard-library bisect module to find where each value falls among
# the dividers (O(log n) per value; dividers must be sorted in ascending order)

def histogram(values, dividers):
  count = [0]*(1+len(dividers))
  for element in values:
    i = bisect_left(dividers, element)
    count[i] += 1
  return count

values = list(range(1, 11)) # list from 1 through 10
bins = [2, 5, 7]
c = histogram(values, bins) # Result [2, 3, 2, 3]

Explanation of histogram

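1. bisect_left finds the index at which the value would be inserted into dividers; that index is the bin number. A value equal to a divider lands in the bin below it.
2. We increment the count array at this index. The count array has size 1 + len(dividers), to allow for values > dividers[-1].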
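To make step 1 concrete, here is a small sketch of the indices bisect_left returns for the dividers from the question. The variable name bin_index is only illustrative, and the dividers are assumed to be sorted in ascending order.

```python
from bisect import bisect_left

dividers = [2, 5, 7]  # assumed sorted in ascending order

# Map a few sample values to the index of the bin they fall into.
# Values equal to a divider (2, 5, 7) get the index of the bin below.
bin_index = {v: bisect_left(dividers, v) for v in [1, 2, 3, 5, 6, 7, 8]}
print(bin_index)  # {1: 0, 2: 0, 3: 1, 5: 1, 6: 2, 7: 2, 8: 3}
```

Using bisect_right instead would place values equal to a divider in the bin above, which is not what the task asks for.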
DarrylG
  • and this would be one of the more performant data structures :o) +1 - but it does not give any explanation of how this would work ;) – Patrick Artner Sep 08 '19 at 09:09
0

A simple implementation would be to prepare a list of counters of size len(dividers)+1.

Go through all numbers provided:

  • if your current number is bigger than the largest bin divider, increment the last bin's counter
  • else go through the dividers until your number is no longer bigger than the current one, and increment that bin's counter by 1

This leads to:

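def histogram(values, dividers):
    # one counter per bin; assumes dividers is non-empty and sorted ascending
    bins = [0 for _ in range(len(dividers)+1)]
    for num in values:
        if num > dividers[-1]:
            # bigger than the largest divider: goes into the last bin
            bins[-1] += 1
        else:
            # advance to the first divider that num is not bigger than
            k = 0
            while num > dividers[k]:
                k+=1
            bins[k] += 1
    return bins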


print(histogram(range(20),[2,4,9]))

Output:

# counts
[3, 2, 5, 10]

Explanation

Dividers: [2,4,9]
Bins:     [ 2 and less | 3 to 4 | 5 to 9 | 10 and more ]
Numbers:  0..19

0, 1, 2 -> not bigger than 9, smaller than or equal to 2
3, 4 -> not bigger than 9, smaller than or equal to 4
5, 6, 7, 8, 9 -> not bigger than 9, smaller than or equal to 9
10, 11, 12, 13, 14, 15, 16, 17, 18, 19 -> bigger than 9

This is a naive implementation; more efficient algorithms are possible using better-suited, tree-like data structures. Consider dividers [5,6,7] and values [7,7,7,7,7,7]: the outer loop runs 6 times (once per 7), and each run tests 3 dividers (bigger than 5, bigger than 6, not bigger than 7), for 18 comparisons in the inner loop in total.
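One possible faster variant, sketched here under the assumption that the dividers are sorted in ascending order, replaces the linear scan with a hand-written binary search. The names find_bin and histogram_binary are hypothetical and not part of the answer above.

```python
def find_bin(dividers, x):
    """Return the index of the first divider >= x, or len(dividers) if none."""
    lo, hi = 0, len(dividers)
    while lo < hi:
        mid = (lo + hi) // 2
        if dividers[mid] < x:
            lo = mid + 1  # x is above this divider: search the right half
        else:
            hi = mid      # x is at or below this divider: search the left half
    return lo


def histogram_binary(values, dividers):
    # one counter per bin: len(dividers) + 1 bins in total
    counts = [0] * (len(dividers) + 1)
    for x in values:
        counts[find_bin(dividers, x)] += 1
    return counts


print(histogram_binary(range(20), [2, 4, 9]))  # [3, 2, 5, 10]
print(histogram_binary([7] * 6, [5, 6, 7]))    # [0, 0, 6, 0]
```

Each lookup halves the remaining range of dividers, so it needs roughly log2(len(dividers)) comparisons instead of up to len(dividers).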

Patrick Artner