
I am struggling with accessing and modifying hashes of unknown (i.e. dynamic) depth.

Suppose I am reading in a table of measurements (Length, Width, Height) from a file, then calculating Area and Volume to create a hash like the following:

#                       #Length  Width  Height  Results
my %results = (     
                        '2' => {        
                                '3' => {        
                                        '7' => {
                                                'Area' => 6,
                                                'Volume' => 42,
                                                },
                                        },
                                },
                        '6' => {        
                                '4' => {        
                                        '2' => {
                                                'Area' => 24,
                                                'Volume' => 48,
                                                },
                                        },
                                },
                        );

I understand how to access a single item in the hash, e.g. $results{2}{3}{7}{'Area'} would give me 6, or I could check if that combination of measurements has been found in the input file with exists $results{2}{3}{7}{'Area'}. However that notation with the series of {} braces assumes I know when writing the code that there will be 4 layers of keys.

What if there are more or fewer levels and I only discover that at runtime? E.g. if there were only Length and Width in the file, how would you write code that would then access the hash like $results{2}{3}{'Area'}?

I.e. given a hash and dynamic-length list of nested keys that may or may not have a resultant entry in that hash, how do you access the hash for basic things like checking if that key combo has a value or modifying the value?

I almost want a notation like:

my @hashkeys = (2,3,7);

if ( exists $hash{ join("->", @hashkeys) } ) {
    print "Found it!\n";
}

I know you can access sub-hashes of a hash and get references to them, so in this last example I could iterate through @hashkeys, checking for each one whether the current hash has a sub-hash at that key and, if so, saving a reference to that sub-hash for the next iteration. However, that feels complex and I suspect there is already a much easier way to do this.

Hopefully this is enough to understand my question but I can try to work up a MWE if not.

Thanks.

SSilk
  • That doesn't seem like a great data structure. For one thing, you can't have two items with the same dimensions, since hash keys are unique. Using a flat structure with keys `length`, `width`, and `height` would make a lot more sense. – ThisSuitIsBlackNot Apr 20 '16 at 18:30
  • Hi, I'm not sure that's a limitation. This is just a simplified example of something much more complex I'm doing for work, but even here, when I encounter a given set of measurements, I can index into the hash directly with them to find out the area, etc. If I see the same measurements twice, I don't need to recalculate Area, etc. Which is why I used nested hashes with each level's keys being a separate measurement. – SSilk Apr 20 '16 at 18:45
  • Is this real data? Or have you given placeholder data, or is it homework? As [@ThisSuitIsBlackNot says above](http://stackoverflow.com/questions/36752179/perl-access-hash-of-dynamic-depth#comment61084783_36752179) I don't see why you've organized things like this in the first place. As you've seen, it's problematic. It sounds like you just have a list of *things* with a length, width, height, area and volume, some of which may be missing, and I would structure it exactly like that: as an array of five-item arrays. It depends on what your app does how you could access specific entries faster – Borodin Apr 20 '16 at 18:47
  • If you're certain that this is what you want to do (and given what you've told us, it shouldn't be), you could take a look at [Data::Diver](https://metacpan.org/pod/Data::Diver). But it looks like you're starting with a bad data design and trying to squeeze it into usability. The shape of your data is at your behest; there's no need to stick with something that doesn't do *exactly* what you need – Borodin Apr 20 '16 at 18:51
  • *"If I see the same measurements twice, I don't need to recalculate Area, etc."* Perhaps you should look at [Memoize](https://metacpan.org/pod/Memoize)? – Borodin Apr 20 '16 at 18:53
  • Hi folks, thanks for the feedback. I'm starting to rethink my data structure. If I use an array of arrays, is there a simple way to check if a given combination of parameters has been checked already? In the real code I'm working on, the calculation performed on each set of parameters takes about 90 seconds and I'm looking at potentially hundreds of these so I'm eager to avoid repeating calculations for a given set of parameters, which is why I was thinking hash for storage (quickly index into it with the current set of params to see if it's been checked already). Thanks. – SSilk Apr 20 '16 at 19:29
  • Wouldn't it be possible to remove the duplication from the data structure first? If you risk doing calculations twice, maybe you could go through your data to remove those duplicates first instead of trying to index the results. – LaintalAy Apr 20 '16 at 20:11
  • @eballes: It's hard to explain in the context of my question, but in my real application the "calculation" step is a black box to me, performed by another machine I'm connected to, and overall my goal is a minimization of its output. I can tweak several input parameters. So at each step I give it a few parameters, it spends up to several minutes calculating a result, and then based on the result I tweak one of those parameters up or down a notch. So there's inherently no duplication of full sets of parameters and I also do not have a "full set of parameters" as such at the outset. – SSilk Apr 20 '16 at 20:25
  • @SSilk I agree with Borodin's Memoize suggestion. `my $volume = calculate_volume({ length => 1, width => 2, height => 3 });` is much more readable than `my $volume = $dimensions{1}{2}{3}{volume};`, and Memoize will take care of storing the results for you so your long calculation is only done once for a given set of parameters. – ThisSuitIsBlackNot Apr 20 '16 at 21:17
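
For reference, the Memoize suggestion in the comments above could look roughly like the sketch below. It assumes the slow calculation can be wrapped in a Perl sub that takes its parameters as a plain list. `calculate_volume` is only a stand-in for the real 90-second job, and the `$calls` counter is there purely to show that the second call never reaches it.

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;    # counts how often the real calculation actually runs

# Hypothetical stand-in for the slow calculation.
sub calculate_volume {
    my ($length, $width, $height) = @_;
    $calls++;
    return $length * $width * $height;
}

# Replace calculate_volume with a caching wrapper keyed on its argument list.
memoize('calculate_volume');

my $first  = calculate_volume(2, 3, 7);    # runs the calculation
my $second = calculate_volume(2, 3, 7);    # answered from the cache

print "volume=$first, calculations run: $calls\n";
```

Because the default cache key is built from the argument list, this works for any number of parameters without nested hashes. Passing a hash reference instead of a flat list would probably need a custom `NORMALIZER`, since each new reference stringifies differently.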

2 Answers


So here's a recursive function which does more or less what you want:

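sub fetch {
    my $ref = shift;
    my $key = shift;
    my @remaining_path = @_;

    # Stop if this level is not a hash or the key has no defined value.
    return undef unless ref($ref) eq 'HASH';
    return undef unless defined $ref->{$key};
    # Last key on the path: this is the value that was asked for.
    return $ref->{$key} unless @remaining_path;
    # Otherwise descend one level and continue with the remaining keys.
    return fetch($ref->{$key}, @remaining_path);
}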

fetch(\%results, 2, 3, 7, 'Volume');  # 42
fetch(\%results, 2, 3);               # hashref
fetch(\%results, 2, 3, 7, 'Area', 8); # undef
fetch(\%results, 2, 3, 8, 'Area');    # undef

But please read the comments under the question about the data structure being a poor fit, because they are right. If you still think this is what you need, at least rewrite it as a for loop, since Perl does not optimize tail recursion.
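
A loop-based variant might look like the following sketch. The names `fetch_path` and `store_path` are made up for illustration. `store_path` is an extra that covers the "modifying the value" half of the question, and it assumes that replacing a non-hash value found partway down the path is acceptable.

```perl
use strict;
use warnings;

# Walk $ref along @path; return the value found, or undef if any step is missing.
sub fetch_path {
    my ($ref, @path) = @_;
    for my $key (@path) {
        return undef unless ref($ref) eq 'HASH' && exists $ref->{$key};
        $ref = $ref->{$key};
    }
    return $ref;
}

# Set the value at @path, creating intermediate hashes as needed.
# Assumption: a non-hash value found partway down the path is overwritten.
sub store_path {
    my ($ref, $value, @path) = @_;
    my $last = pop @path;
    for my $key (@path) {
        $ref->{$key} = {} unless ref($ref->{$key}) eq 'HASH';
        $ref = $ref->{$key};
    }
    $ref->{$last} = $value;
    return $value;
}

my %results = ( 2 => { 3 => { 7 => { Area => 6, Volume => 42 } } } );

my $volume  = fetch_path(\%results, 2, 3, 7, 'Volume');   # existing entry
my $missing = fetch_path(\%results, 2, 3, 8, 'Area');     # no such path
store_path(\%results, 24, 6, 4, 'Area');                   # creates 6 => { 4 => {...} }
my $stored  = fetch_path(\%results, 6, 4, 'Area');
```

Unlike a chain of `{}` subscripts, `fetch_path` does not autovivify intermediate hashes while it looks, so a failed lookup leaves `%results` unchanged.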

Artyom V. Kireev
  • I'm marking this as the answer since it did indeed answer my question, but for anyone else out there considering a similar approach, please see the comments below my question first. I ended up changing my approach since, as discussed above, it was poorly thought out from the beginning. Thanks. – SSilk Aug 31 '16 at 20:27

Take a look at $; in "man perlvar".

http://perldoc.perl.org/perlvar.html#%24%3b

You can use the same idea to convert a variable-length list of keys into a single hash key.

my %foo;
my (@KEYS)=(2,3,7);
$foo{ join( $; , @KEYS ) }{Area}=6;
$foo{ join( $; , @KEYS ) }{Volume}=42;
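
With the keys flattened like this, the existence check from the question becomes a single `exists`, however many keys there are. The sketch below is only an illustration: the names `%foo`, `@keys` and `$flat` are arbitrary, and it assumes no individual key ever contains the `$;` character itself.

```perl
use strict;
use warnings;

my %foo;
my @keys = (2, 3, 7);
my $flat = join($;, @keys);    # one flat key built from the whole list

$foo{$flat}{Area}   = 6;
$foo{$flat}{Volume} = 42;

# One exists test, regardless of how many keys are in the list.
print "Found it!\n" if exists $foo{ join($;, @keys) };

# A shorter key list is simply a different flat key, so this is false.
print "Not found\n" unless exists $foo{ join($;, 2, 3) };

# Per perlvar, a literal comma-separated subscript is joined with $; as well.
print "Same entry\n" if exists $foo{2,3,7};
```

The trade-off is that the flat form loses the ability to ask for "everything under Length 2" without scanning all the keys.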
AnFi
  • The point of `$;` is that it will magically turn `$foo{@keys}` into `$foo{join $;, @keys }`. If you're going to do that manually you might as well use a separator easier on the eye and more fitting the data like `,`. – Schwern Apr 20 '16 at 23:41