I tried to use Java's BitSet to find the numbers common to two lists. It works well for finding repeated characters, but it fails on corner cases: Integer.MAX_VALUE does not show up in the result, and Integer.MIN_VALUE throws IndexOutOfBoundsException("bitIndex < 0: " + bitIndex). I thought a BitSet expanded automatically. Can anyone explain what is going on? Thanks. BitSet is so handy. :)

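import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Class name assumed; the original class header was not included in the post.
public class CommonNumbers {

    public static List<Integer> common(List<Integer> A, List<Integer> B) {
        List<Integer> res = new ArrayList<Integer>();
        BitSet bitSetA = new BitSet();
        BitSet bitSetB = new BitSet();
        // Mark each value of A as a set bit; this is where Integer.MIN_VALUE throws.
        for (Integer x : A) {
            bitSetA.set(x);
        }
        for (Integer x : B) {
            bitSetB.set(x);
        }
        // Keep only the bits set in both.
        bitSetA.and(bitSetB);
        // Collect the surviving indexes; Integer.MAX_VALUE never shows up here.
        for (int i = 0; i < bitSetA.size(); i++) {
            if (bitSetA.get(i)) {
                res.add(i);
            }
        }
        return res;
    }

    public static void main(String[] args) {
        List<Integer> A = new ArrayList<Integer>();
        A.add(1); A.add(2); A.add(Integer.MIN_VALUE);
        List<Integer> B = new ArrayList<Integer>();
        B.add(Integer.MIN_VALUE); B.add(4); B.add(4);
        List<Integer> res = common(A, B);
        System.out.println(res);
    }

}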

Clark

1 Answer

BitSet indexes must not be negative; see the third sentence of the javadoc. Integer.MIN_VALUE is negative and thus is not a valid index.
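To make the negative-index point concrete, here is a minimal sketch. The class name NegativeIndexDemo and the helper rejectsIndex are made up for illustration; only BitSet.set(int) and the IndexOutOfBoundsException it throws come from the standard library.

```java
import java.util.BitSet;

// Hypothetical demo class; not part of the question's code.
public class NegativeIndexDemo {

    // Returns true if BitSet.set refuses the given index.
    public static boolean rejectsIndex(int index) {
        BitSet bits = new BitSet();
        try {
            bits.set(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // Thrown for any negative bit index, including Integer.MIN_VALUE.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsIndex(-1));                // true
        System.out.println(rejectsIndex(Integer.MIN_VALUE)); // true
        System.out.println(rejectsIndex(5));                 // false
    }
}
```

So a BitSet grows automatically only in the non-negative direction; negative inputs need to be filtered out or mapped to non-negative indexes before calling set.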

Integer.MAX_VALUE works for me, as long as:

  • there is sufficient heap space available, which is not the case by default in a 32-bit JVM, at least not the Oracle 32-bit JVMs I have conveniently to hand. A little experiment finds -Xmx of about 400m per maximal BitSet is enough. (I'd bet the actual usage is 256m, but -Xmx is a crude tool that includes several heap spaces and some overhead.)

  • you don't use length() or size() (or toString()), which malfunction at the maximal size. If I loop naively over every index up to Integer.MAX_VALUE (as your code does), it works but takes about a minute; the nextSetBit approach shown in the javadoc is far faster.
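The nextSetBit approach could look roughly like the sketch below. It is a rewrite of the question's common method under the assumption that all inputs are non-negative; the class name CommonViaNextSetBit is made up for illustration. The loop follows the iteration idiom from the BitSet javadoc, including the guard that stops before i + 1 overflows at Integer.MAX_VALUE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical class name; a sketch, not the only way to do this.
public class CommonViaNextSetBit {

    // Assumes every element of a and b is non-negative;
    // a negative element makes BitSet.set throw IndexOutOfBoundsException.
    public static List<Integer> common(List<Integer> a, List<Integer> b) {
        BitSet bitsA = new BitSet();
        BitSet bitsB = new BitSet();
        for (int x : a) {
            bitsA.set(x);
        }
        for (int x : b) {
            bitsB.set(x);
        }
        bitsA.and(bitsB);

        List<Integer> res = new ArrayList<>();
        // Jump from one set bit to the next instead of testing every index,
        // and avoid size()/length() entirely.
        for (int i = bitsA.nextSetBit(0); i >= 0; i = bitsA.nextSetBit(i + 1)) {
            res.add(i);
            if (i == Integer.MAX_VALUE) {
                break; // i + 1 would overflow to a negative index
            }
        }
        return res;
    }

    public static void main(String[] args) {
        System.out.println(common(Arrays.asList(1, 2, 4), Arrays.asList(4, 4, 2))); // [2, 4]
    }
}
```

Passing Integer.MAX_VALUE in both lists should also work with this loop, provided the heap is large enough for two maximal BitSets as described above.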

dave_thompson_085