1

I am trying to find the number of Strings that appear exactly once in an ArrayList.

How may I achieve this (preferably with the best possible time complexity)?

Below is my method:

  public static int countNonRepeats(WordStream words) {

    ArrayList<String> list = new ArrayList<String>();
    for (String i : words) {
      list.add(i);
    }

    Collections.sort(list);

    for (int i = 1; i < list.size(); i++) {
      if (list.get(i).equals(list.get(i - 1))) {
        list.remove(list.get(i));
        list.remove(list.get(i - 1));
      }
    }

    System.out.println(list);

    return list.size();
  }

Why doesn't it remove the String at list.get(i) and list.get(i-1)?

Maljam
Iona
  • @FedericoPeraltaSchaffner do you have any ideas? :) – Iona Mar 14 '16 at 19:14
  • you would get concurrent modification exception for the above method. – Rahul Sharma Mar 14 '16 at 19:19
  • @RahulSharma Not quite, concurrent modification exception is thrown from Iterator which is created mostly via for-each, not simple `for(i..)`. But we may risk getting error because of wrong indexes. – Pshemo Mar 14 '16 at 19:20
  • Sorry, It doesn't. But I tried to run the code and it works like a charm. @Iona do you get any errors? – Rahul Sharma Mar 14 '16 at 19:29
  • If you use `Collections.sort` the time complexity is `O(n log n)`, but you can easily find the number of strings that appear exactly once in `O(n)` time using `HashSet`s (but no sorting). – Paul Boddington Mar 14 '16 at 19:31

4 Answers

4

There is no need for sorting. A better approach is to use two HashSets: one for the repeating words and one for the non-repeating words. Since HashSet is backed by a HashMap, its add, contains, and remove operations ideally take O(1) time, so the overall complexity of this approach is O(n).

    public static int countNonRepeats(List<String> words) {

        // Words seen exactly once so far, and words seen more than once.
        Set<String> nonRepeating = new HashSet<String>();
        Set<String> repeating = new HashSet<String>();

        for (String i : words) {
            if (!repeating.contains(i)) {
                if (nonRepeating.contains(i)) {
                    // Second occurrence: move the word to the repeating set.
                    repeating.add(i);
                    nonRepeating.remove(i);
                } else {
                    nonRepeating.add(i);
                }
            }
        }

        System.out.println(nonRepeating.size());

        return nonRepeating.size();
    }
Sumit Rathi
2

Here's one simple suggestion:

  1. First, sort your list in alphanumerical order
  2. Iterate through it with a loop: if( !list.get(i).equals(list.get(i+1)) ) → unique (the last element has no successor, so it needs a bounds check)
  3. If you find duplicates, increment i until you reach a different string

This will have the complexity of the sorting algorithm, since steps 2 and 3 together are O(n).
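
The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the class name `SortedUniqueCounter` is made up, the method takes a `List<String>` rather than the question's `WordStream`, and the input is copied so that sorting does not change the caller's list. Instead of comparing `i` with `i+1` directly, it measures the length of each run of equal strings, which also takes care of the last element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, named here only for illustration.
class SortedUniqueCounter {

    // Counts the strings that occur exactly once: sort, then scan runs of equal strings.
    static int countNonRepeats(List<String> words) {
        List<String> sorted = new ArrayList<>(words); // copy, so the caller's list is untouched
        Collections.sort(sorted);                     // step 1: O(n log n)

        int unique = 0;
        int i = 0;
        while (i < sorted.size()) {
            // Steps 2 and 3: advance j to the end of the run of strings equal to sorted.get(i).
            int j = i + 1;
            while (j < sorted.size() && sorted.get(j).equals(sorted.get(i))) {
                j++;
            }
            if (j - i == 1) {
                unique++; // a run of length 1 means the string appears exactly once
            }
            i = j; // continue with the next different string
        }
        return unique;
    }

    public static void main(String[] args) {
        // The example from the comments: both words repeat, so this prints 0.
        System.out.println(countNonRepeats(Arrays.asList("apple", "apple", "banana", "banana")));
    }
}
```

Each element is visited once by the scan, so the sort dominates the running time.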

Maljam
  • that won't work because e.g. [apple, apple, banana, banana] will return 1 instead of 0 because apple != banana – Iona Mar 14 '16 at 19:25
  • No because that's what step 3 avoids! read more carefully. In this example, `"apple"` is a duplicate at `i = 0`, so you iterate until `!list.get(i).equals("apple")` → `i = 2` in the next cycle – Maljam Mar 14 '16 at 19:26
  • ok you are correct! thank you sir! :) I'll upvote in 10 seconds – Iona Mar 14 '16 at 19:28
  • @Iona *How many I achieve this (preferably with the best possible time complexity)* Actually you don't even have to iterate it again. If you are using a `HashSet` then you can do it while scanning the words itself. – user2004685 Mar 14 '16 at 19:39
  • No need for sorting; please see my answer for better time complexity. – Sumit Rathi Mar 14 '16 at 19:52
  • @SumitRathi you're not sorting, but you're calling `contains` (which iterates through a loop internally) in every cycle, which is exactly what you're avoiding by sorting. Your algorithm has `O(n*n)`... – Maljam Mar 14 '16 at 19:54
  • @Maljam, since it will be a set of Strings, the hash code of each String would ideally be different, which in turn would put each string at a different index of the array of linked lists (the internal data structure used by HashMap). So the time complexity would **not** be O(n*n) – Sumit Rathi Mar 14 '16 at 20:03
2

Is there any specific need to use an ArrayList? You can do it easily by using a HashSet.

Here is the code snippet:

public static void main (String[] args) {
    String[] words = {"foo","bar","foo","fo","of","bar","of","ba","of","ab"};
    Set<String> set = new HashSet<>();
    Set<String> common = new HashSet<>();
    for (String i : words) {
        if(!set.add(i)) {
            common.add(i);
        }
    }

    System.out.println(set.size() - common.size());
}

Output:

3

Here is the modified code:

public static int countNonRepeats(WordStream words) {
    Set<String> set = new HashSet<>();     // every distinct word
    Set<String> common = new HashSet<>();  // words seen more than once
    for (String i : words) {
        // add() returns false when the word is already in the set
        if(!set.add(i)) {
            common.add(i);
        }
    }

    return (set.size() - common.size());
}
user2004685
0

You can use a HashMap to achieve this. With this approach we count the occurrences of every word; if we are only interested in the unique words, we then look at the entries whose count is 1.
In HashMap<String,Integer>, the key is a String from the ArrayList and the Integer value is its number of occurrences.

        ArrayList<String> list = new ArrayList<String>(); // fill with the words to examine
        HashMap<String, Integer> hashMap = new HashMap<String, Integer>();

        // Count how many times each string occurs.
        for (int i = 0; i < list.size(); i++) {

            String key = list.get(i);
            Integer value = hashMap.get(key);

            if (value != null) {
                hashMap.put(key, value + 1);
            } else {
                hashMap.put(key, 1);
            }

        }

        // Count the keys whose number of occurrences is exactly 1.
        int uniqueCount = 0;
        for (Map.Entry<String, Integer> pair : hashMap.entrySet()) {
            if (pair.getValue() == 1)
                uniqueCount++;
        }
        System.out.println(uniqueCount);
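
For reference, the same counting idea can be written more compactly with `Map.merge`, assuming Java 8 or later. This is just a sketch: the class name `OccurrenceCounter` and the `List<String>` parameter are assumptions made for the example, not part of the question.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
class OccurrenceCounter {

    // Counts the strings that occur exactly once, using an occurrence map.
    static int countNonRepeats(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // Inserts 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(word, 1, Integer::sum);
        }

        int unique = 0;
        for (int count : counts.values()) {
            if (count == 1) {
                unique++;
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("foo", "bar", "foo", "fo", "of", "bar", "of", "ba", "of", "ab");
        System.out.println(countNonRepeats(words)); // "fo", "ba" and "ab" appear once
    }
}
```

Both passes are linear, so with well-distributed hash codes the whole method runs in O(n) expected time.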
Narendra Jaggi