5

I have the following code to count the instances of different strings in an array;

String words[] = {"the","cat","in","the","hat"};
HashMap<String,Integer> wordCounts = new HashMap<String,Integer>(50,10);
for(String w : words) {
    Integer i = wordCounts.get(w);
    if(i == null) wordCounts.put(w, 1);
    else wordCounts.put(w, i + 1);
}

Is this a correct way of doing it? It seems a bit long-winded for a simple task. The HashMap result is useful to me because I will be indexing it by the string.

I am worried that the line

else wordCounts.put(w, i + 1);

could be inserting a second key-value pair due to the fact that

new Integer(i).equals(new Integer(i + 1));

would be false, so two Integers would end up under the same String key bucket, right? Or have I just over-thought myself into a corner?

lynks
  • 5,599
  • 6
  • 23
  • 42

6 Answers6

8

Your code will work - but it would be simpler to use HashMultiset from Guava.

// Note: prefer the below over "String words[]"
String[] words = {"the","cat","in","the","hat"};
Multiset<String> set = HashMultiset.create(Arrays.asList(words));

// Write out the counts...
for (Multiset.Entry<String> entry : set.entrySet()) {
    System.out.println(entry.getElement() + ": " + entry.getCount());
}
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • Thanks Jon, I shall look into it. – lynks Oct 31 '12 at 17:36
  • that's great thanks. About the array syntax: I feel I'm fighting a losing battle standing by that one :P – lynks Oct 31 '12 at 17:41
  • @lynks: Why on earth would you *want* to split the type information into two parts? The type of the variable is "array of strings" - if this is an old C or C++ habit, it's worth throwing away conventions which don't make sense in a new language. – Jon Skeet Oct 31 '12 at 17:43
  • I completely understand, in fact deep down I know I'm wrong. I think I just don't like seeing the [] in one place, and then in another when indexing the array. And yeah it's an old C habit. – lynks Oct 31 '12 at 17:45
  • 1
    @lynks: Well they're doing very different things - one is indexing *into* an array, and one is saying that the type you're interested in *is* an array. Personally I really dislike the fact that Java even allows the "split" syntax. Euch. – Jon Skeet Oct 31 '12 at 17:46
6

Yes you are doing it correct way. HashMap replaces values if same key is provided.

From Java doc of HashMap#put

Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.

Amit Deshpande
  • 19,001
  • 4
  • 46
  • 72
  • Thanks, major brainfail moment. I have been working with various `Collections` all week. **Of course** `values.equals()` is irrelevant, its all about the keys! – lynks Oct 31 '12 at 17:34
2

Your code is perfectly fine. You map strings to integers. Nothing is duplicated.

Petar Minchev
  • 46,889
  • 11
  • 103
  • 119
2

HashMap don't allow duplicate keys, so there is no way to have more than one SAME key-value pairs in your map.

kosa
  • 65,990
  • 13
  • 130
  • 167
0

Here is a String-specific counter that should be genericized and have a sort by value option for toString(), but is an object-oriented wrapper to the problem, since I can't find anything similar:

package com.phogit.util;

import java.util.Map;
import java.util.HashMap;

import java.lang.StringBuilder;

public class HashCount {

    private final Map<String, Integer> map = new HashMap<>();

    public void add(String s) {
        if (s == null) {
            return;
        }
        Integer i = map.get(s);
        if (i == null) {
            map.put(s, 1);
        } else {
            map.put(s, i+1);
        }
    }

    public int getCount(String s) {
        if (s == null) {
            return -1;
        }
        Integer i = map.get(s);
        if (i == null) {
            return -1;
        }
        return i;
    }

    public String toString() {
        if (map.size() == 0) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        // sort by key for now
        Map<String, Integer> m = new TreeMap<String, Integer>(map);
        for (Map.Entry pair : m.entrySet()) {
            sb.append("\t")
              .append(pair.getKey())
              .append(": ")
              .append(pair.getValue())
              .append("\n");;
        }
        return sb.toString();
    }

    public void clear() {
        map.clear();
    }
}
Priot
  • 21
  • 6
0

Your code looks fine to me and there is no issue with it. Thanks to Java 8 features it can be simplified to:

String words[] = {"the","cat","in","the","hat"};
HashMap<String,Integer> wordCounts = new HashMap<String,Integer>(50,10);
for(String w : words) {
     wordCounts.merge(w, 1, (a, b) -> a + b);
}

the follwowing code

System.out.println("HASH MAP DUMP: " + wordCounts.toString());

would print out.

HASH MAP DUMP: {cat=1, hat=1, in=1, the=2}
Marc Schmid
  • 1,028
  • 7
  • 13