1

I recognise that sounds a bit mad but to explain what I mean:

I have a Collection (e.g. HashSet) containing several objects that are quite slow to initialise, and I want to see if the Collection already contains a particular object. Let's use Vector3d as an example (I know that is not expensive to initialise).

So the Collection contains:

Vector3d(1,1,1)
Vector3d(2,1,1)
Vector3d(3,1,1)

And I want to ask the Collection the question "does the Collection contain a Vector3d with x=2, y=1 and z=1?" (i.e. I already know the data the .contains() method would hash against). I could create a new Vector3d(2,1,1) and then use .contains() on that, but as I said the object's initialisation is slow. Or I could run through the entire Collection manually checking (which is what I'm doing now), but that's (as I understand it) slower than .contains() since it doesn't use the hash. Is there a better way to do this?

The objects in question are mutable but the data that the equals method is based upon is not. (In my case they are blocks at x,y,z co-ordinates, the contents of the blocks may change but the x,y,z co-ordinates will not)

Richard Tingle
  • From some of the answers below: the objects will change but NOT the data that equals will check against, and I was just using ArrayList as an example of a collection; array order is not important – Richard Tingle Apr 21 '13 at 08:11

6 Answers

2

Using the .contains() method on an ArrayList will result in the equals method being invoked against each and every instance in the ArrayList.

While that will work for you, it may not prove beneficial for extremely large ArrayLists. If performance is a problem, you may wish to hold a HashSet containing references to the Vector3d objects. Invoking contains on a HashSet (or any Set) is drastically faster.
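A minimal sketch of the HashSet approach described above. `Vec3` is a hypothetical stand-in for Vector3d (not the real class), with equals and hashCode based only on x, y and z. Note that this still builds a probe object for each lookup, which is the cost the question is trying to avoid.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class HashSetContainsSketch {

    // Hypothetical stand-in for Vector3d; equality depends only on x, y, z.
    static final class Vec3 {
        final int x, y, z;

        Vec3(int x, int y, int z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Vec3)) return false;
            Vec3 other = (Vec3) o;
            return x == other.x && y == other.y && z == other.z;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y, z);
        }
    }

    // Builds the three-element set used as the example in the question.
    static Set<Vec3> sample() {
        Set<Vec3> set = new HashSet<>();
        set.add(new Vec3(1, 1, 1));
        set.add(new Vec3(2, 1, 1));
        set.add(new Vec3(3, 1, 1));
        return set;
    }

    public static void main(String[] args) {
        Set<Vec3> set = sample();
        // contains() hashes the probe and then calls equals() within one bucket.
        System.out.println(set.contains(new Vec3(2, 1, 1))); // true
        System.out.println(set.contains(new Vec3(9, 9, 9))); // false
    }
}
```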

Isaac
2

ArrayList is the correct data structure if you only need to iterate through all of your elements or access your elements by position. It is the wrong data structure for anything else.

What you are trying to do is answer the containment question quickly, which is what Sets and Maps are for. It would make much more sense to create a separate, cheaper Vector3dKey class with the simple hash function you want and insert your expensive objects into a Map< Vector3dKey, Vector3d > at the same time as, or instead of, an ArrayList< Vector3d >. Java obviously won't keep two copies of your expensive vectors, just copies of the references. Of course, this whole scheme breaks down if your Vectors are mutable.

Judge Mental
  • I really like this solution; it's clear what's going on and how everything works. For completeness I have included my own code illustrating how it could be used – Richard Tingle Apr 21 '13 at 14:22
1

If you REALLY have to use a list (and not a hash) you might as well iterate over the list, retrieve each object and check its attributes manually--I mean that will be pretty much as quick as "Contains".

If you were going to use a hash instead of a list then you should use a different object for comparison. For instance, if you use a HashMap with your above example your keys could be the following strings:

"1,1,1","2,1,1","3,1,1"

This would make a lookup instant and easy. If the list could contain other types of objects, maybe "Vector3d(1,1,1)" would be a better string. It's easy to re-create without being expensive or adding code complexity.

If you were using a list because you needed to retain order, look at LinkedHashMap.

Also I suggest you create a function to derive the string from the object (when inserting) or from the parameters (when searching) rather than distributing the functionality around your code; this is the kind of thing you are likely to need to change or expand on later.
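The string-key idea could be sketched roughly as follows. `Block`, `keyOf` and `index` are hypothetical names chosen for illustration, and the "x,y,z" key format is an assumption based on the example strings above. The point is that both insertion and lookup go through the same `keyOf` helper, so no expensive object is built just to ask the question.

```java
import java.util.HashMap;
import java.util.Map;

public class StringKeyLookup {

    // Hypothetical expensive object; only its co-ordinates matter for the key.
    static final class Block {
        final int x, y, z;

        Block(int x, int y, int z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    // Single place where the key format is defined (used when searching).
    static String keyOf(int x, int y, int z) {
        return x + "," + y + "," + z;
    }

    // Same key, derived from an existing object (used when inserting).
    static String keyOf(Block b) {
        return keyOf(b.x, b.y, b.z);
    }

    // Indexes the given blocks by their string key.
    static Map<String, Block> index(Block... blocks) {
        Map<String, Block> map = new HashMap<>();
        for (Block b : blocks) {
            map.put(keyOf(b), b);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Block> blocks =
                index(new Block(1, 1, 1), new Block(2, 1, 1), new Block(3, 1, 1));
        // The lookup only builds a short string, never a Block.
        System.out.println(blocks.containsKey(keyOf(2, 1, 1))); // true
        System.out.println(blocks.containsKey(keyOf(2, 1, 2))); // false
    }
}
```

Swapping `HashMap` for `LinkedHashMap` in `index` would keep insertion order, as suggested above.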

Bill K
1

Code based on Judge Mental's answer

package mygame;

import java.util.HashMap;
import java.util.Map;


public class Main{


    public Main(){
        Map<CheapKey,ExpensiveClass> map=new HashMap< CheapKey, ExpensiveClass>();

        for(int i=0;i<100;i++){
            ExpensiveClass newExpensiveClass;
            newExpensiveClass=new ExpensiveClass(i,0,0);
            map.put(newExpensiveClass.getKey(), newExpensiveClass);
        }

        CheapKey testKey1=new CheapKey(1,0,0);
        CheapKey testKey2=new CheapKey(1,0,1);

        System.out.println(map.containsKey(testKey1)); //there is an object under key1
        System.out.println(map.containsKey(testKey2)); //there isn't an object under key2
        ExpensiveClass  retrievedExpensiveClass=map.get(testKey1);
    }

    public static void main(String[] args) {
        Main main=new Main();
    }

    protected class ExpensiveClass{
        int x;
        int y;
        int z;
        public ExpensiveClass(int x, int y, int z){
            this.x=x;
            this.y=y;
            this.z=z;
            for(int i=0;i<10000;i++){
                //slow initialisation
            }

        }
        public CheapKey getKey(){
            return new CheapKey(x,y,z);
        }

    }

    protected class CheapKey{
        int x;
        int y;
        int z;
        public CheapKey(int x, int y, int z){
            this.x=x;
            this.y=y;
            this.z=z;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == null) {
                return false;
            }
            if (getClass() != obj.getClass()) {
                return false;
            }
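            final CheapKey other = (CheapKey) obj;
            //keys are equal only when all three co-ordinates match
            return this.x == other.x && this.y == other.y && this.z == other.z;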
        }

        @Override
        public int hashCode() {
            int hash = 7;
            hash = 79 * hash + this.x;
            hash = 79 * hash + this.y;
            hash = 79 * hash + this.z;
            return hash;
        }


    }



}
Richard Tingle
0

The contains method will invoke the .equals method of an object, so as long as the implementation of .equals for that class compares the values contained in the objects, not their pointers, using contains will work.

http://docs.oracle.com/javase/7/docs/api/java/util/Collection.html#contains(java.lang.Object)

Edit: I misread your question a bit. I think it comes down to how big the list is vs how long the initialisation takes. If the list is short, iterate through it and manually check. However, if the list is likely to be long, creating the objects and using .contains could well be more efficient.

eldris
  • So you're suggesting creating some fake object class with the same fields as real object class and the same hash function and an equals function that accepts either. Interesting, I shall investigate – Richard Tingle Apr 20 '13 at 22:10
  • No, I wasn't suggesting creating fake objects, I was suggesting creating real ones, but then I reread your question and realised that wasn't a sensible suggestion at all. You could create fake ones by extending the classes and overriding their constructors and hash/equals methods. But as Louis said below it's not great design. You could potentially use Mockito, but again that's bad design since it's intended to be used in tests. – eldris Apr 20 '13 at 22:26
0

ArrayList.contains doesn't use hashing; it's exactly the same speed as the manual check. It makes no difference either way.
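For comparison, the manual check might look like the sketch below. `Block` and `containsAt` are hypothetical names, not part of any library. Like ArrayList.contains, it is a linear scan, but it compares the fields directly and so needs no probe object.

```java
import java.util.ArrayList;
import java.util.List;

public class LinearScanSketch {

    // Hypothetical expensive object identified by its co-ordinates.
    static final class Block {
        final int x, y, z;

        Block(int x, int y, int z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    // O(n) scan comparing fields directly; no Block is created for the query.
    static boolean containsAt(List<Block> blocks, int x, int y, int z) {
        for (Block b : blocks) {
            if (b.x == x && b.y == y && b.z == z) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block(1, 1, 1));
        blocks.add(new Block(2, 1, 1));
        blocks.add(new Block(3, 1, 1));
        System.out.println(containsAt(blocks, 2, 1, 1)); // true
        System.out.println(containsAt(blocks, 2, 1, 2)); // false
    }
}
```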

Using a fake object class is doable, but almost certainly a code smell.

Louis Wasserman