5

I have a need for hashmap-like functionality in Matlab, where the hashmap maps vectors to other vectors, and the number of vectors (ranging in the hundreds of thousands) is not known beforehand.

I tried Matlab's built-in containers.Map, but that doesn't accept vectors as keys. Then I tried java.util.HashMap:

>> map = java.util.HashMap;
>> map.put(1:3,zeros(2,1));
>> map.get(1:3)

 ans =

 []

So for some reason that doesn't seem to work, even though Java's HashMap should be able to map arrays to arrays.

The other option would be to keep two separate matrices, one for the keys and one for the values, and grow them incrementally. But I don't really want to do that, because growing arrays incrementally in Matlab is painful (even with block-size increments etc., e.g. here).

Questions: 1. Why doesn't Java's HashMap work here? 2. Any other approaches?

Thanks.

learnvst
Matt
  • check out http://stackoverflow.com/questions/1352553/how-can-i-use-matlab-arrays-as-keys-to-the-hashmap-java-objects – Rasman Oct 09 '12 at 18:38
  • What's the range of values in the key vectors? If they're ints under 2^16, you could just convert them to `char` and use the resulting funny strings as keys. – Andrew Janke Oct 10 '12 at 07:10
  • Thanks all for your answers. The key vectors are indeed ints under 2^16 so I compared using containers.Map with char keys and java's HashMap with keys similar to the post Rasman linked to: ok it's too much code to post here so I'll post it as an answer. – Matt Oct 10 '12 at 11:25
  • On second thought, you could do this with any key vector values, not just ints < 2^16, by using `typecast` to stick arbitrary values' bit patterns inside chars, like `charKey = char(typecast(key, 'uint16'))`. – Andrew Janke Oct 11 '12 at 06:20
  • @Andrew: I missed your last comment. For some reason I don't get email notifications. Can't `typecast` result in clashes? I.e. two different numbers > 2^16 that give the same typecast? – Matt Oct 19 '12 at 12:00
  • @Matt: No clashes, not as long as all the original keys are of the same type. Typecast is different from normal Matlab type conversions. Instead of rounding or widening, `typecast` will exactly preserve the underlying bit patterns, just repackaging them in a new array type. For example, if you have a `double` x, doing `typecast(x, 'uint16')` will give you back a 4-long array of `uint16` whose values look nothing like the number that `x` represents. But it's a lossless transformation. And any two different numbers of a given input type will give different results after the typecast. – Andrew Janke Oct 19 '12 at 15:05
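
To illustrate why repackaging bit patterns cannot produce clashes, here is a rough Java analogue of the `char(typecast(key, 'uint16'))` idea. It is only a sketch and not what Matlab does internally: the class name `BitPatternKey` and its `encode`/`decode` helpers are made up for this example, and the order in which the 16-bit pieces are emitted is an arbitrary choice here. Each `double` becomes four 16-bit chars, and the mapping can be inverted exactly.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not from the post): turns a double[] into a String key
// by copying the raw bit pattern of each double into four 16-bit chars.
final class BitPatternKey {
    static String encode(double[] key) {
        StringBuilder sb = new StringBuilder(key.length * 4);
        for (double d : key) {
            long bits = Double.doubleToRawLongBits(d);
            for (int shift = 48; shift >= 0; shift -= 16) {
                // The cast to char keeps only the low 16 bits of the shifted value.
                sb.append((char) (bits >>> shift));
            }
        }
        return sb.toString();
    }

    // Inverse of encode: reassembles each group of four chars into one double.
    static double[] decode(String s) {
        double[] out = new double[s.length() / 4];
        for (int i = 0; i < out.length; i++) {
            long bits = 0L;
            for (int j = 0; j < 4; j++) {
                bits = (bits << 16) | s.charAt(4 * i + j);
            }
            out[i] = Double.longBitsToDouble(bits);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, double[]> map = new HashMap<>();
        map.put(encode(new double[] {1, 2, 3}), new double[] {0, 0});
        // An equal vector yields an equal String, so the lookup succeeds.
        System.out.println(map.containsKey(encode(new double[] {1, 2, 3})));     // true
        // A different vector, including values above 2^16, yields a different String.
        System.out.println(map.containsKey(encode(new double[] {1, 2, 70000}))); // false
    }
}
```

Because `decode(encode(v))` gives back the original values, two different vectors of the same type cannot share a key. One consequence of working on bit patterns is that `0.0` and `-0.0` produce different keys.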

3 Answers

4

Here is a kludge that does what you want . . .

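map = java.util.HashMap;
key = java.util.Vector;

% Copy the Matlab vector into the Java Vector one element at a time
matKey = 1:3;
for nn=1:numel(matKey)
    key.add(matKey(nn));
end

% Store and retrieve using the same key object
map.put(key,zeros(2,1));
map.get(key)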

...it is a starting point, anyway.

learnvst
  • This could work, and be better than a "kludge", but you have to be careful - it's cheating a bit by reusing the Java object instance in `key`, when I think OP wants to be able to pull entries out by value. OP's code doesn't work because the Matlab `1:3` gets converted to a Java primitive double array, which has equality-by-identity semantics. Your first example will work if you end up with a Vector of Doubles, which will have equality-by-value semantics. Not sure how the conversion will go; you might need to force it by doing `key.add(java.lang.Double(matKey(nn)));`. – Andrew Janke Oct 10 '12 at 07:05
  • Second example probably won't work - `key.add(1:3)` ends up with a one-long Vector of double[], which ends up with equality-by-identity. I don't think you will be able to pull the value back out using a different `1:3`; you'd need the original `key` object. E.g. if you do `key2 = java.util.Vector; key2.add(1:3); map.get(key2)`, does it retrieve the value? Because I think that's what OP will need for it to work. – Andrew Janke Oct 10 '12 at 07:08
  • @AndrewJanke you were correct. After testing, Kludge 2 didn't actually work. Removed – learnvst Oct 10 '12 at 18:26
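
The identity-versus-value distinction discussed in these comments can be reproduced in plain Java, outside Matlab. The sketch below is only illustrative (the class and method names are made up): a `double[]` inherits `equals` and `hashCode` from `Object`, so a second array with the same contents is treated as a different key, whereas a `List<Double>` compares element by element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class (not from the post).
final class KeySemanticsDemo {
    // Stores under one double[] and looks up with a second, equal-content array.
    static boolean arrayKeyFound() {
        Map<double[], double[]> map = new HashMap<>();
        map.put(new double[] {1, 2, 3}, new double[] {0, 0});
        // Arrays use Object.equals, i.e. identity, so this different instance is not found.
        return map.containsKey(new double[] {1, 2, 3});
    }

    // Same experiment with a List<Double> key, which has value-based equals/hashCode.
    static boolean listKeyFound() {
        Map<List<Double>, double[]> map = new HashMap<>();
        map.put(new ArrayList<>(Arrays.asList(1.0, 2.0, 3.0)), new double[] {0, 0});
        return map.containsKey(Arrays.asList(1.0, 2.0, 3.0));
    }

    public static void main(String[] args) {
        System.out.println(arrayKeyFound()); // false
        System.out.println(listKeyFound());  // true
    }
}
```

This matches the behaviour in the question if, as the first comment above says, each `1:3` passed from Matlab is converted to a fresh `double[]`: the `get` call then receives a different array object than the one stored by `put`.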
1

I compared containers.Map with char keys (thanks to Andrew Janke) to java.util.HashMap with a wrapper object as key (as in this post, also thanks to Andrew Janke, and thanks to Rasman for pointing it out):

numvec = 10^5;
S = round(rand(numvec,10)*40);

matmap = containers.Map();
%pick a random vector
idx = ceil(rand()*numvec);
s1 = S(idx,:);

%put it in the map
matmap(char(s1)) = zeros(1,4);
for i=1:5*10^5

  if i==10^3, tic; end %allow some time for getting up to speed before timing

  %pick a random vector and put it in the map
  idx = ceil(rand()*numvec);
  s2 = S(idx,:);
  matmap(char(s2)) = zeros(1,4);

  %retrieve value of previous vector
  v = matmap(char(s1));

  %modify it randomly and put it back
  v( ceil(rand()*4) ) = rand();
  matmap(char(s1)) = v;

  s1 = s2;
end
toc

javaaddpath('/Test/bin');
import test.ArrayKey;
javmap = java.util.HashMap;

idx = ceil(rand()*numvec);
s1 = S(idx,:);

%also convert value to ArrayKey so we can retrieve it by ref -- saves a put
%operation
javmap.put(ArrayKey(s1), ArrayKey(zeros(1,4)));
for i=1:5*10^5

  if i==10^3, tic; end

  idx = ceil(rand()*numvec);
  s2 = S(idx,:);
  javmap.put(ArrayKey(s2), ArrayKey(zeros(1,4)));
  v = javmap.get(ArrayKey(s1));
  v.x( ceil(rand()*4) ) = rand();
  s1 = s2;
end
toc

Result:

>> testmaps
Elapsed time is 58.600282 seconds.
Elapsed time is 97.617556 seconds.

containers.Map is the winner.


Edit: I reran the test with numvec = 10^6 and everything else the same. The containers.Map approach ran in 59 seconds. The HashMap approach had not finished after 5 minutes and caused Matlab to become unresponsive.


Edit2: I also tried pre-allocating two separate matrices and finding keys using ismember. Performance was worse than with HashMap.
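
The source of `test.ArrayKey` is not shown above, so here is a minimal sketch of what such a wrapper might look like. Everything in it is an assumption: the only thing taken from the benchmark is that the array appears to be exposed as a field named `x` (because of `v.x`). To be callable from Matlab, the real class would also have to be declared `public`, with a public constructor, in its own file inside the `test` package; this sketch is package-private only to keep it self-contained.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for test.ArrayKey; the actual class is not shown in the post.
final class ArrayKey {
    // Field name x is inferred from the benchmark's v.x; everything else is assumed.
    public final double[] x;

    ArrayKey(double[] x) {
        this.x = x.clone(); // defensive copy so the caller's array is not shared
    }

    // Value-based equality: two keys are equal when their contents are equal.
    @Override
    public boolean equals(Object o) {
        return o instanceof ArrayKey && Arrays.equals(x, ((ArrayKey) o).x);
    }

    // Must be consistent with equals so that HashMap finds equal keys.
    @Override
    public int hashCode() {
        return Arrays.hashCode(x);
    }

    public static void main(String[] args) {
        Map<ArrayKey, ArrayKey> map = new HashMap<>();
        map.put(new ArrayKey(new double[] {1, 2, 3}), new ArrayKey(new double[4]));
        // A new wrapper with the same contents retrieves the stored value.
        ArrayKey v = map.get(new ArrayKey(new double[] {1, 2, 3}));
        v.x[2] = 0.5; // modify the stored value in place, without another put
        System.out.println(map.get(new ArrayKey(new double[] {1, 2, 3})).x[2]); // 0.5
    }
}
```

Modifying `x` is only safe on objects used as values. Changing the contents of an object that is currently a key changes its hash code, and the entry can no longer be found.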

Community
Matt
  • Cool. Makes sense that containers.Map outperforms: there's overhead to each Java call from M-code, and the key conversion uses multiple calls, increasing with the key length. – Andrew Janke Oct 11 '12 at 06:17
0

I recently had to deal with a similar problem, not with vectors but with arrays.

Matlab has a function, mat2str, that converts a matrix to a string. If you don't need the vectors to grow dynamically in the HashMap, you can represent the vector as a string and use that as your key/value. In some situations this probably isn't very helpful, but it is a quick and dirty solution if things are static.

tmldwn