3

I have an array of Strings and want to count the occurrences of each individual String.

I have already sorted it. (It's a long array, and I wanted to avoid an O(n²) loop.)

Here is my code. It obviously ends in an ArrayIndexOutOfBoundsException. The reason is clear, but I don't know how to fix it:

for (int i = 0; i < patternsTest.length-1; i++) {
        int occ=1;
        String temp=patternsTest[i];
        while(temp.equals(patternsTest[i+1])){
            i++;
            occ++;
        }
    }
Cœur
Jan S
  • Why not use `Map`? – Franklin May 24 '13 at 23:24
  • I need the raw counts. I don't know if I would create a Map only for that... – Jan S May 24 '13 at 23:28
  • Why wouldn't you? It'd be faster, and easier to modify in the future. – Franklin May 24 '13 at 23:29
  • Do you not want to use a Map for efficiency reasons? By sorting you are already giving up a lot of efficiency, and with a map you don't need to presort. But if you really don't want to use a map, just explain the reasons, or I guess just say so :) – greedybuddha May 24 '13 at 23:30

6 Answers

11

This would be a good place for a HashMap: the key is the word and the value is the number of times it occurs. For a HashMap, the containsKey and get methods are constant-time lookups on average, which is very fast.

Map<String,Integer> map = new HashMap<String,Integer>();
for (int i = 0; i < patternsTest.length; i++) {
    String word=patternsTest[i];
    if (!map.containsKey(word)){
        map.put(word,1);
    } else {
        map.put(word, map.get(word) +1);
    }
}

As a side benefit you don't even need to sort beforehand!
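
On Java 8 or later, the containsKey/put branches in the loop can be collapsed into a single Map.merge call. This is only a sketch of the same counting idea; the class and method names (MergeCount, countOccurrences) are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class MergeCount {
    // Counts how often each string occurs; the input does not need to be sorted.
    static Map<String, Integer> countOccurrences(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // merge stores 1 for a new key, otherwise adds 1 to the existing value
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // sample data, chosen only for the demonstration
        String[] patternsTest = {"b", "a", "b", "c", "b"};
        System.out.println(countOccurrences(patternsTest));
    }
}
```

The work per element is still one hash lookup on average, so the whole pass stays O(n).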

greedybuddha
  • This is a good answer, however I would change the declaration to use the interface type: Map map = new HashMap(); http://www.amazon.com/Effective-Java-2nd-Joshua-Bloch/dp/0321356683 – spuder May 24 '13 at 23:39
  • But does .containsKey() iterate over the whole Map? And the existing entry is overwritten every time? That seems inefficient. Not saying that it is a bad approach. – Jan S May 25 '13 at 00:04
  • containsKey is an O(1) lookup, which means it does not iterate over the entire map. It's more akin to indexing into an array than to a full search. I'll update the answer with this as well. – greedybuddha May 25 '13 at 00:06
  • How efficient do you need it to be? You can make this slightly faster with an extra line of code, if you would like me to put that version up instead of this one. – greedybuddha May 25 '13 at 00:12
  • Nice question :P Everybody is looking for the fastest, right? ;) It's enough so far. I think a working version of my approach would have been slower, with all the sorting and so on. Thank you very much :) – Jan S May 25 '13 at 00:46
  • From your last answer (that you deleted so I can't comment on it): IMO the question can be open and answered but not by giving direct code or by straightly giving the solution, instead driving OP to understand how to solve this. Please refer to [How do I ask and answer homework questions?](http://meta.stackexchange.com/q/10811/182862). – Luiggi Mendoza May 31 '13 at 14:41
  • He had already shown some effort and code, which made it much less offensive in my eyes. Giving someone an algorithm is a fair hint IMO, but in this case the algorithm has so many solutions that it's trivial to find, so I just added it in. Ultimately I agree with you though, so I deleted the answer. It's just the morning here and sometimes I go on autopilot ;) Nice link though, I'll be using it in the future to comment on questions like this one – greedybuddha May 31 '13 at 14:47
4

You can use a Java HashMap:

Map<String, Integer> occurrenceOfStrings = new HashMap<String, Integer>();

for(String str: patternsTest)
{
    Integer currentValue = occurrenceOfStrings.get(str);
    if(currentValue == null)
        occurrenceOfStrings.put(str, 1);
    else
        occurrenceOfStrings.put(str, currentValue + 1);
}
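
For comparison, the same tally can be written with the Java 8 streams API. This is a sketch of an alternative, not part of the answer above; StreamCount and countOccurrences are illustrative names, and the values are Long because Collectors.counting() produces Long.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamCount {
    // Groups equal strings together and counts the size of each group.
    static Map<String, Long> countOccurrences(String[] words) {
        return Arrays.stream(words)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // sample data, chosen only for the demonstration
        String[] patternsTest = {"x", "y", "x"};
        System.out.println(countOccurrences(patternsTest));
    }
}
```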
taocp
0

With this data, the loop does not throw an index-out-of-bounds exception:

String[] patternsTest = {"a", "b"};
for (int i = 0; i < patternsTest.length-1; i++) {
    int occ=1;
    String temp=patternsTest[i];
    while(temp.equals(patternsTest[i+1])){
        i++;
        occ++;
    }
}

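You can trigger the exception by changing the data so that the array ends in a run of equal strings. The inner while loop then keeps incrementing i and reads patternsTest[i+1] past the last element: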

String[] patternsTest = {"a", "a"};
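
One way to keep the sorted-array approach and avoid the exception is to check the index before reading the element. The sketch below assumes the array is already sorted; SortedRunCount and runLengths are hypothetical names, and returning the counts as a list is just one way to expose them.

```java
import java.util.ArrayList;
import java.util.List;

class SortedRunCount {
    // Returns the length of each run of equal strings in an already sorted array.
    static List<Integer> runLengths(String[] sorted) {
        List<Integer> counts = new ArrayList<>();
        int i = 0;
        while (i < sorted.length) {
            String current = sorted[i];
            int occ = 0;
            // Test the bound first, so the last run cannot step past the end of the array.
            while (i < sorted.length && current.equals(sorted[i])) {
                i++;
                occ++;
            }
            counts.add(occ);
        }
        return counts;
    }

    public static void main(String[] args) {
        // sample data, chosen only for the demonstration
        String[] patternsTest = {"a", "a", "b", "c", "c", "c"};
        System.out.println(runLengths(patternsTest)); // prints [2, 1, 3]
    }
}
```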
ashatch
0

You could try a map and only one loop. This relies on the array being sorted:

Map<String, Integer> occurrences = new HashMap<String, Integer>();
String currentString = patternsTest[0]; // assumes the array is not empty
Integer count = 1;

for (int i = 1; i < patternsTest.length; i++) {
    if (currentString.equals(patternsTest[i])) {
        count++;
    } else {
        // the run of equal strings has ended, so store its count
        occurrences.put(currentString, count);
        currentString = patternsTest[i];
        count = 1;
    }
}
// store the count of the last run
occurrences.put(currentString, count);
Marco Forberg
0

Guava Multiset solution (two lines of code):

Multiset<String> multiset = HashMultiset.create();
multiset.addAll(Arrays.asList(patternsTest));

// Then you can do...
multiset.count("hello"); // returns the number of occurrences of "hello"

It works with both sorted and unsorted arrays, and the code is easy to maintain.

卢声远 Shengyuan Lu
0

My solution is:

public int cantOccurences(String pattern, String[] values){
  int count = 0;

  for (String s : values) {
    // count only the elements that are equal to the pattern
    if (s.equals(pattern)) {
      count++;
    }
  }
  return count;
}
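
If only the count of one particular string is needed, the standard library's Collections.frequency does the same linear scan. A minimal sketch, with SingleCount and countOf as illustrative names:

```java
import java.util.Arrays;
import java.util.Collections;

class SingleCount {
    // Counts the elements of values that are equal to pattern (one O(n) scan per call).
    static int countOf(String pattern, String[] values) {
        return Collections.frequency(Arrays.asList(values), pattern);
    }

    public static void main(String[] args) {
        // sample data, chosen only for the demonstration
        String[] patternsTest = {"hello", "world", "hello"};
        System.out.println(countOf("hello", patternsTest)); // prints 2
    }
}
```

Each call rescans the array, so a map is the better fit when counts for many different strings are required.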
Gere