3

In Java, I have:

Set<Integer> set = new HashSet<Integer>();
callVoidMethod(set);
...
public static void callVoidMethod(Set<Integer> set) {

    Set<Integer> superset = new HashSet<Integer>(set);
    ...
    // This loop only illustrates that I may add quite a lot of elements.
    // It depends on conditions: sometimes nothing is added at all,
    // and I cannot predict in advance whether anything will be.
    for (int i = 0; i < 1000; i++) {
         ...
         if (conditionSatisfied) superset.add(someValue);
         ...
    }

}

The code above is simplified. The idea is to pass the set by reference into a void method and make a full copy of it there, so that we can add new elements to the copy (superset here) without touching the original set, which must still be unchanged when the method returns.

My code does a lot of data processing. If there is no faster way to make a copy, I would like to optimize the HashSet itself; for instance, I do not need Integers as keys and would prefer primitive ints. Would it be a good idea to back a custom MyHashSet with an int[] array of keys?

If that is possible, I would be interested in using the same idea to improve this:

Map<Integer, ArrayList<Item>> map = new HashMap<Integer, ArrayList<Item>>();

EDIT: I only need to optimize for speed. I do not care about beautiful, maintainable code or about memory.

Sophie Sperner

3 Answers

8

In general, if you're looking for high speed collections that allow for primitives, consider using Trove. I would say - don't optimize unless you've discovered that this is actually a bottleneck. You or someone else will need to maintain this code, and reading an optimized version is often harder.

Amir Afghani
  • In this specific case, [`TIntHashSet`](http://trove4j.sourceforge.net/javadocs/gnu/trove/set/hash/TIntHashSet.html) is probably the appropriate tool for the job. If you need it to be a `Set`, wrap that in a [`TIntSetDecorator`](http://trove4j.sourceforge.net/javadocs/gnu/trove/decorator/TIntSetDecorator.html). (But that said, I would under no circumstances whip up my own implementation here.) – Louis Wasserman Aug 06 '12 at 17:19
6

Have you tried first tweaking the HashSet's initial capacity and load factor?

HashSet

Here's a post that might help you.

HashMap initialization parameters

If you have that much data to process, it would probably pay off to analyze its distribution and adjust these settings first.

Once that is tuned, replacing Integers with ints might give a very slight performance gain, but the result may depend more on JVM implementation specifics and hardware configuration than on this change alone.
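
As a rough sketch of the capacity tuning suggested above: the `HashSet(Collection)` copy constructor sizes the new set for the source only, so adding many elements afterwards can trigger rehashing. The helper below presizes the copy instead. The class name `PresizedCopy`, the method `copyWithHeadroom`, and the `expectedExtra` bound are hypothetical names introduced for illustration; whether this helps at all should be measured on real data.

```java
import java.util.HashSet;
import java.util.Set;

class PresizedCopy {
    // Default load factor used by java.util.HashSet.
    static final float LOAD_FACTOR = 0.75f;

    /**
     * Copies source into a new HashSet whose capacity already accounts for
     * up to expectedExtra further insertions, so no rehash should be needed.
     * expectedExtra is an assumed upper bound supplied by the caller.
     */
    static Set<Integer> copyWithHeadroom(Set<Integer> source, int expectedExtra) {
        int expectedSize = source.size() + expectedExtra;
        int capacity = (int) (expectedSize / LOAD_FACTOR) + 1;
        Set<Integer> copy = new HashSet<Integer>(capacity, LOAD_FACTOR);
        copy.addAll(source);
        return copy;
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<Integer>();
        for (int i = 0; i < 100; i++) {
            set.add(i);
        }
        // Room for the original 100 elements plus up to 1000 additions.
        Set<Integer> superset = copyWithHeadroom(set, 1000);
        superset.add(5000);
        System.out.println(set.size() + " " + superset.size());
    }
}
```

The original `set` is never modified; only the presized copy grows.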

Community
Acapulco
5

What do you do with these objects later? If you're just doing lookups or something like that, it might be faster to keep them separate and check both, rather than making a full copy. So,

public static void callVoidMethod(Set<Integer> set) {

    Set<Integer> superset = new HashSet<Integer>();
    ...
    if (conditionSatisfied) superset.add(someValue);

    ...
    if(set.contains(value) || superset.contains(value))
        doSomething();

}
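
A self-contained version of the same idea might look like the sketch below. The class `OverlayLookup` and the helpers `containsEither` and `addToOverlay` are hypothetical names, not part of the code above; the point is only that the passed-in set is read and never copied or modified.

```java
import java.util.HashSet;
import java.util.Set;

class OverlayLookup {
    /** True if value is in the untouched base set or among the additions. */
    static boolean containsEither(Set<Integer> base, Set<Integer> added, int value) {
        return base.contains(value) || added.contains(value);
    }

    /**
     * Records value in the additions only if the base set does not already
     * hold it, so the two sets stay disjoint. Returns true if it was added.
     */
    static boolean addToOverlay(Set<Integer> base, Set<Integer> added, int value) {
        return !base.contains(value) && added.add(value);
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<Integer>();
        set.add(1);
        set.add(2);

        // Starts empty: no copy of the original set is made.
        Set<Integer> added = new HashSet<Integer>();
        addToOverlay(set, added, 2); // already in the base set, ignored
        addToOverlay(set, added, 3); // new element, stored in the overlay

        System.out.println(containsEither(set, added, 3));
        System.out.println(containsEither(set, added, 4));
        System.out.println(set.size());
    }
}
```

The trade-off is one extra hash lookup per query in exchange for skipping the full copy.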
Carl
  • Yeah, I was thinking about that, then I call this method recursively :) And thus I pass too many sets in this method, the idea was then to create an array of elements which are those separate HashSets, so for example: first time I call the method I pass an array with one set, then I create the second superset and add it to the array, then call again the method with two sets now... But I ended with the challenge of creating such an array of HashSets. – Sophie Sperner Aug 06 '12 at 17:24
  • Calling it recursively adds to the complexity for sure. Maybe this won't work for what you're ultimately trying to do, but you could make a defensive copy of the set before calling the method the first time, pass the copy and allow it to add to the set like you want. The original set would be unmodified, and all subsequent operations happen on the first copy? – Carl Aug 06 '12 at 17:39
  • I've finally implemented an array of `HashSet`s. So thank you for the suggestion. – Sophie Sperner Aug 06 '12 at 22:00
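
The actual implementation mentioned in the last comment is not shown in the thread. One possible shape for such a collection of `HashSet`s, assuming one layer per recursion level, is sketched below; `LayeredIntSet` and its methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LayeredIntSet {
    // layers.get(0) is the caller's original set and is only ever read.
    private final List<Set<Integer>> layers = new ArrayList<Set<Integer>>();

    LayeredIntSet(Set<Integer> base) {
        layers.add(base);
    }

    /** Call on entering a recursion level: additions go into a fresh set. */
    void pushLayer() {
        layers.add(new HashSet<Integer>());
    }

    /** Call on leaving a recursion level: discards that level's additions. */
    void popLayer() {
        if (layers.size() == 1) {
            throw new IllegalStateException("cannot pop the base layer");
        }
        layers.remove(layers.size() - 1);
    }

    /** Checks every layer, newest first. Cost grows with recursion depth. */
    boolean contains(int value) {
        for (int i = layers.size() - 1; i >= 0; i--) {
            if (layers.get(i).contains(value)) {
                return true;
            }
        }
        return false;
    }

    /** Adds to the newest layer unless some layer already holds the value. */
    boolean add(int value) {
        if (layers.size() == 1) {
            throw new IllegalStateException("push a layer before adding");
        }
        return !contains(value) && layers.get(layers.size() - 1).add(value);
    }

    public static void main(String[] args) {
        Set<Integer> base = new HashSet<Integer>();
        base.add(1);
        LayeredIntSet view = new LayeredIntSet(base);
        view.pushLayer();
        view.add(2);
        System.out.println(view.contains(2));
        view.popLayer();
        System.out.println(view.contains(2));
        System.out.println(base.size());
    }
}
```

Each lookup walks all layers, so this avoids copying at the price of slower `contains` calls in deep recursion; which side wins depends on set sizes and depth.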