How can I count the frequency of the numbers in a text file? The file looks like this:

     0
     2
     0
     1
     0
     1
     55
     100
     100

I want the output to look like this:

     0   3
     1   2
     2   1
     55  1
     100 2

I tried this without success:

     def histogram( A, flAsList=False ):
         """Return histogram of values in array A."""
         H = {}
         for val in A:
             H[val] = H.get(val,0) + 1
         if flAsList:
             return H.items()
         return H

Is there a better way? Thanks in advance!

user2176228
  • possible duplicate of [Counting (and writing) word frequencies for each line within text file](http://stackoverflow.com/questions/5595574/counting-and-writing-word-frequencies-for-each-line-within-text-file) – P̲̳x͓L̳ Sep 14 '13 at 17:23

4 Answers

Use Counter. It's the best fit for this type of problem:

from collections import Counter

with open('file.txt', 'r') as fd:
    lines = fd.read().split()

counter = Counter(lines)
# sort the items numerically by key
items = sorted(counter.items(), key=lambda x: int(x[0]))
# print the desired output
for k, repetitions in items:
    print(k, repetitions, sep='\t')

The output:

0   3
1   2
2   1
55  1
100 2
moliware

Use a Counter object for this:

from collections import Counter
c = Counter(A)

Now the c variable will hold a frequency map of each of the values. For instance:

Counter(['a', 'b', 'c', 'a', 'c', 'a'])
=> Counter({'a': 3, 'c': 2, 'b': 1})
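As a rough sketch of how this maps onto the question, the asker's histogram function could be rebuilt around Counter. The name histogram and the flAsList flag come from the question; converting the values to int and sorting the list form are assumptions made here so that the result comes out in the order the question asks for.

```python
from collections import Counter


def histogram(A, flAsList=False):
    """Return a histogram of the values in A (assumed to be integer-like)."""
    # Convert to int so that 100 sorts after 55 instead of before it,
    # which is what happens when the values are compared as strings.
    H = Counter(int(val) for val in A)
    if flAsList:
        # Sorted (value, count) pairs, smallest value first.
        return sorted(H.items())
    return H


# The numbers from the question, as fd.read().split() would return them.
values = "0 2 0 1 0 1 55 100 100".split()
for value, count in histogram(values, flAsList=True):
    print(value, count, sep="\t")
```

Without flAsList the function returns the Counter itself, which can be indexed like a dictionary and gives 0 for values that never occurred.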
Óscar López

Please consider using update:

def histogram(A, flAsList=False):
    """Return histogram of values in array A."""
    H = {}
    for val in A:
        # H[val] = H.get(val, 0) + 1
        if val in H:  # dict.has_key() was removed in Python 3
            H[val] = H[val] + 1
        else:
            H.update({val: 1})
    if flAsList:
        return list(H.items())
    return H
Eric Gopak

Simple approach using a dictionary:

histogram = {}

with open("file","r") as f:
    for line in f:
        try:
            histogram[line.strip()] +=1
        except KeyError:
            histogram[line.strip()] = 1

for key in sorted(histogram.keys(), key=int):
    print(key, histogram[key], sep="\t")

Output:

0       3
1       2
2       1
55      1
100     2

Edit:

To select a specific column, split the line using split(). For example, to take the sixth field when the fields are separated by a single space:

try:
    histogram[line.strip().split(' ')[5]] +=1
except KeyError:
    histogram[line.strip().split(' ')[5]] = 1
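The column selection above could also be combined with a Counter, which removes the need for the try/except. This is only a sketch: count_column and the sample rows are made up for illustration, and it assumes the fields are separated by arbitrary whitespace rather than exactly one space.

```python
from collections import Counter


def count_column(lines, column=5):
    """Count the values found in one whitespace-separated column.

    `column` is zero-based, so the default of 5 selects the sixth field.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()  # no argument: split on any run of whitespace
        if len(fields) > column:  # skip blank lines and lines that are too short
            counts[fields[column]] += 1
    return counts


# Hypothetical sample rows; only the sixth field matters here.
sample = [
    "a b c d e 0",
    "a b c d e 100",
    "a b c d e 0",
    "",
]
counts = count_column(sample)
for key in sorted(counts, key=int):
    print(key, counts[key], sep="\t")
```

An open file object can be passed in place of the list, since iterating over a file yields its lines.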
Chris Seymour
  • Thanks sudo_O, good one because results are in increasing order. What if I need this result for a specific column in a file? Suppose numbers given in column six? – user2176228 Sep 14 '13 at 17:40