Including Hashes within Hashes in Perl

Question

G'Day,

I'm currently working on creating big hashes from a lot of smaller hashes. Let's say that these smaller hashes are defined in a file each, and then can be included by the bigger hash.

For example, let's look at some small hashes:

File personcontact.pl:

   return {
            \'firstname\' => {
                \'__type\' => \'String\'
            },
        \'lastname\' =>  {
            \'__type\' => \'String\'
            },
        %{include("/tmp/address.pl")}
    }

File address.pl:

return {
        \'address\' => {
        \'street\' => {
            \'__type\' => \'String\'
            },
        \'unit\' => {
            \'__type\' => \'String\',
            \'__validation_function\' => {
                \'is_a_number\' => \'\'
            },
            \'__schema_constraints\' => {
                \'is_not_null\' => \'\'
            }
        },
        \'suburb\' => {
            \'__type\' => \'String\'
        },
        \'__type\' => \'ARRAY\'
        }
    }

And I've got a considerable number of these...

The way I'm trying to re-create the hash is using the include subroutine, which looks like this:

 sub include {
my ($filename) = @_;
my $file; 
open(my $fh, "<", $filename) or die ("FILEOPEN: $!");
while(my $line = <$fh>) { $file .= $line; }
my $result = eval $file;
die("EVAL: $@") if $@;
close($fh) or die("FILECLOSE: $!");
return $result;
 }

I know I must be doing something wrong, but I'm not sure what. I keep on getting errors like `Useless use of a variable in void context at (eval 11) line 4, <SCHEMAFILE> line 6` or `Odd number of elements in anonymous hash at (eval 11) line 5, <SCHEMAFILE> line 6`. I'm not sure how to go about finding `(eval 11) line 4-3, line 6` though. Any suggestions on use of Perl debuggers or any pointers on where I might be going wrong will be much appreciated.

Thanks!

No need to read line by line. Use "slurp mode" by putting `local $/;` before reading the file, and change `while(my $line...)` to `my $line = <$fh>;`. — Mikel, Jan 24 '11 at 04:35
Thanks for the tip! As you might've guessed, I'm fairly new to Perl :) — Gaurav Dadhania, Jan 24 '11 at 04:37
Something like YAML may also be more appropriate. http://search.cpan.org/dist/YAML/lib/YAML.pm — Mikel, Jan 24 '11 at 04:39
What's the `$this` at line 2 of your `include` subroutine for? — Mikel, Jan 24 '11 at 04:40
You're basically rewriting Perl's `do`. Take a look at `perldoc -f do`. Better yet would be using a proper Perl module, but you can cross that bridge when you come to it. If you want to see how it's done in the meantime, though: `perldoc perlmod`. — Sdaz MacSkibbons, Jan 24 '11 at 04:40
YAML is an overkill for what I'm working on, I think. I'm just trying to make the already HUGE hash in the application, a bit more manageable. :) — Gaurav Dadhania, Jan 24 '11 at 04:42
@Sdaz Yup, `do` should be able to get the job done for the moment being. Thanks! However, I still get the same error `Useless use of variable in void context at (eval 11) line 3 , line 5.` How do you locate that line? Perldiag isn't very helpful in this case either. — Gaurav Dadhania, Jan 24 '11 at 04:51
@Mikel typo - It was originally supposed to a reference to self, but that was for testing only. Sorry. — Gaurav Dadhania, Jan 24 '11 at 04:55
I'd have to see updated code to give a definitive answer, but generally, somewhere, you're probably doing something like `$foo = ('junk','list','blah');` or along those lines (i.e. evaluating a list in scalar context). It can be hard to trace through eval, but check where you're calling `do`, and what you're setting it to. Or post updated code. — Sdaz MacSkibbons, Jan 24 '11 at 05:02
Furthermore, you seem to also be rewriting an ORM (object-relational mapper) from scratch, to abstract away SQL statements. There's other solutions for this, like `DBIx::Class`, `Class::DBI`, etc. — Sdaz MacSkibbons, Jan 24 '11 at 05:08
Are you sure those backslashes are beneficial? I would not expect them to be in the data files. — Jonathan Leffler, Jan 24 '11 at 05:30
@Jonathan It does work without the backslashes, I just wanted to be careful they were not somehow causing trouble :) — Gaurav Dadhania, Jan 24 '11 at 05:52
@Sdaz No, what I'm trying to do is much simpler than writing my own ORM, although I'm sure there's a module which can do this elegantly (but I'm trying to learn). Now that I'm using `do`, I'm not getting any errors, but I'm getting inconsistent answers. [Updating question] — Gaurav Dadhania, Jan 24 '11 at 05:54

score 11 · Accepted Answer · edited Jan 24 '11 at 09:00

Welcome to Perl. I hope you have a good time learning and using it.

On to business, where to start? I've got a lot to say here.

First off, it's unnecessarily risky to load data by evaluating files. If you just want to serialize data try JSON::XS or YAML, or even Storable. If you want a config file, there are many, many modules on CPAN that help with this task. Check out Config::Any.
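
As a rough illustration of the serializer route, here is a minimal sketch using Storable, which ships with Perl. The sample data and the temporary file are assumptions made up for the example; the point is only that the structure is written and read back as data instead of being evaluated as source text.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Sample data shaped like the question's schema (contents are illustrative).
my $schema = {
    firstname => { __type => 'String' },
    lastname  => { __type => 'String' },
};

# Write the structure to a temporary file in portable (network) byte order.
my ( $fh, $path ) = tempfile( UNLINK => 1 );
close $fh;
nstore( $schema, $path );

# Read it back as data rather than by evaluating source text.
my $copy = retrieve($path);

print "$_ => $copy->{$_}{__type}\n" for sort keys %$copy;
```

Storable files are binary, so they cannot be edited by hand, and they should still only be loaded from sources you trust. For hand-maintained schemata a text format such as JSON or YAML is probably the better fit.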

If you want to create data structures to load via eval (which is not really a good idea), Data::Dumper generates perl code needed to create any data structures you feed to it. The main reason I mention it is that it is far more useful as a debugging aid than a serializer.
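
For the debugging use, a small sketch along these lines is usually enough. The data is invented for the example; `Sortkeys` and `Indent` are standard Data::Dumper settings.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sorted keys and tighter indentation make dumps easier to compare by eye.
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent   = 1;

# Illustrative data only.
my $schema = {
    unit => {
        __type               => 'String',
        __schema_constraints => { is_not_null => '' },
    },
};

my $dump = Dumper($schema);
print $dump;
```

Dumping the result of each load is a quick way to see whether a key ended up at the nesting level you expected.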

Now that that is taken care of, if you want to load a file and evaluate it (again, not the best idea in nearly every case), you should be looking at do or require.

my $stuff = do './address.pl';  # the leading ./ matters on Perl 5.26+, where '.' is no longer in @INC

But don't do that. String eval is a tool that is generally best left unused. 99% of the time, if you are planning on using string eval, stop and think of another way to solve the problem. `do FILE` is an implicit eval, so it counts too.
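
If you go that route anyway, at least check for failure. The sketch below follows the pattern described in `perldoc -f do`; the temporary directory and the file contents are assumptions so that the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Create a throwaway data file so the sketch is self-contained.
my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'address.pl' );
open my $out, '>', $path or die "Cannot write '$path': $!\n";
print {$out} "return { street => { __type => 'String' } };\n";
close $out or die "Cannot close '$path': $!\n";

# do FILE returns the file's last value. On failure it returns undef and
# sets $@ (the code failed to compile or died) or $! (the file was unreadable).
my $stuff = do $path;
die "Could not parse '$path': $@" if $@;
die "Could not read '$path': $!"  unless defined $stuff;

print "street type: $stuff->{street}{__type}\n";
```

Because `$path` is absolute here, `do` does not search `@INC` for it.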

Perl gives you lots of tools to do risky and powerful magic. A large part of becoming a skilled Perl programmer lies in understanding what things are risky, why, and when it makes sense to use them. Don't expect Perl to baby you with fences and gates to keep you safe. Seriously consider picking up a copy of Effective Perl Programming or Perl Best Practices. As a newbie, much will go over your head as you read the first time, but either book can be a great reference as you grow and learn.

Next topic, what in the world are you trying to achieve with all those escaped quotes? It makes my head hurt to look at that stuff! Perl has some very, very nice quoting operators that you can use to avoid ever having to mess around with escaping quotes in your literal strings.

The => or fat comma automatically quotes its left-hand side (LHS) if it is a simple identifier (letters, digits and underscores only). But putting in all the quotes and escapes makes things really dodgy.

When you say \'address\' => {}, Perl sees the \ as the "get reference" operator applied to a string literal. In this case it is an unterminated string literal: inside single quotes, \' is an escaped quote, so you never offer an unescaped ' after the first.

If your aim is to use 'address', quotes and all, as your hash key, you can do this:

my %foo = ( "'address'" => 'blah' );

If you don't want the quotes, which seems a far more usual use case, simply do:

my %foo = ( address => 'blah' );
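
To make the quoting rules concrete, here is a small sketch comparing the three spellings. The hash names are made up for the example.

```perl
use strict;
use warnings;

# A bareword on the left of => is quoted automatically.
my %bare = ( address => 'blah' );

# Explicit quotes produce exactly the same key.
my %quoted = ( 'address' => 'blah' );

# q{} has single-quote semantics with a delimiter of your choice, which is
# handy when the key itself has to contain quote characters.
my %literal = ( q{'address'} => 'blah' );

my ($bare_key)    = keys %bare;
my ($quoted_key)  = keys %quoted;
my ($literal_key) = keys %literal;

print "bare:    [$bare_key]\n";       # [address]
print "quoted:  [$quoted_key]\n";     # [address]
print "literal: [$literal_key]\n";    # ['address']
```

None of these needs a backslash anywhere.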

On to the error messages you were getting! Perl has some pretty nice error messages once you learn what they all mean. Until then, it can be a bit tough to understand their significance. Fortunately Perl ships with a script called splain: a handy dandy tool that will explain error messages in much greater detail. You can also use the diagnostics module to get the same, expanded error messages automatically.
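
As for locating something like `(eval 11) line 4`: it means line 4 of the text passed to the eleventh string eval the program ran, and the `<SCHEMAFILE> line 6` part only names the filehandle that was read most recently. One way to get friendlier locations, sketched below, is to prefix the evaluated text with a `#line` directive so that messages name the file. The helper `eval_named` and the sample source are hypothetical, invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: evaluate $source, labelling it with $name so that
# errors and warnings mention that name instead of "(eval N)".
sub eval_named {
    my ( $source, $name ) = @_;
    my $result = eval( qq{#line 1 "$name"\n} . $source );
    return ( $result, $@ );
}

# Two-line snippet whose second line fails at run time.
my $source = "my \$x = 1;\ndie 'boom';\n";

my ( $result, $error ) = eval_named( $source, 'address.pl' );
print "Error: $error";    # Error: boom at address.pl line 2.
```

With the directive in place, the line number counts from the first line of the evaluated text, so it matches what you see when you open the file in an editor.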

Now, if I was writing this I'd do something along these lines:

gen_schema_files.pl - A file to write JSON schema files. You could hand edit your schemata if you want to. You may also want to configure the output to be prettier if you want to improve readability.

#!/usr/bin/perl
use strict;
use warnings;

use JSON::XS;
use File::Spec;

use constant BASEDIR => '.';

# Key is the file name, value is the data to put into the file.
my %schemata = (
    'address.json' => {
        address => {
            street => { __type => 'String' },
            unit => {
                __type => 'String',
                __validation_function => { is_a_number => '' },
                __schema_constraints  => { is_not_null => ''  }
            },
            suburb => { __type => 'String' },
            __type => 'ARRAY'
        },
    },

    'person_contact.json' => {
         firstname => { __type => 'String' },
         lastname =>  { __type => 'String' },

         # Use a special key to indicate that additional files should be 
         # loaded into this hash.
         INCLUDE  => [qw( address.json )], 
     },

     # And so forth
);

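for my $schema ( keys %schemata ) {
    my $path = File::Spec->catfile( BASEDIR, $schema );

    open my $fh, '>', $path
        or die "Error opening '$path' for writing - $!\n";

    print {$fh} encode_json( $schemata{$schema} );

    # Buffered write errors only surface on close, so check it.
    close $fh
        or die "Error closing '$path' - $!\n";
}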
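
On the "prettier output" point, a sketch of what that configuration could look like follows. It uses JSON::PP, which ships with Perl, as a stand-in for JSON::XS; that swap is an assumption made so the example runs without extra installs. The data is illustrative.

```perl
use strict;
use warnings;
use JSON::PP;    # core module, assumed here as a stand-in for JSON::XS

# Illustrative data only.
my $schema = {
    firstname => { __type => 'String' },
    INCLUDE   => ['address.json'],
};

# pretty adds indentation and newlines; canonical sorts the keys so the
# output is stable from run to run; utf8 matches what encode_json does.
my $encoder = JSON::PP->new->utf8->pretty->canonical;
my $json    = $encoder->encode($schema);

print $json;
```

Sorted, indented output also makes the generated files much easier to diff and to edit by hand.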

load_schemas.pl - this is the code that loads the schemata and does stuff. Mine only gets loaded. I have no idea what you are doing with the data...

#!/usr/bin/perl
use strict;
use warnings;

use Data::Dumper;

use JSON::XS;
use File::Spec;

use constant BASEDIR => '.';


my $schema = load_schema( 'person_contact.json' );

print Dumper $schema;


sub load_schema {
    my $file = shift;

    my $path = File::Spec->catfile( BASEDIR, $file );

    open my $fh, '<', $path
        or die "Error opening file '$path' - $!\n";

    my $json = join '', <$fh>; # reads a list of lines and cats them into one string.
                               # One way to slurp out of many.

    my $schema = decode_json( $json );

    # Handle the inclusion stuff:

    if( exists $schema->{INCLUDE} ) {
        # Copy the files to load into an array.
        my @loadme = @{$schema->{INCLUDE}};
        # delete the magic special include key.
        delete $schema->{INCLUDE};

        # Load each file and copy it into the schema hash.
        for my $load ( @loadme ) {
            my $loaded = load_schema( $load );

            # This is a bit of weird syntax.
            # We are using a hash slice assignment to copy the loaded data into the existing hash.
            # keys and values are guaranteed to come out in the same (random) order as each other,
            # as long as the hash is not modified in between.
            # @{$foo}{LIST} is how you take a slice of the hash that $foo refers to.
            @{$schema}{keys %$loaded} = values %$loaded;
        }
    }

    return $schema;
}
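
The hash slice assignment is easier to follow in isolation. In this sketch the two hashes are invented for the example; note what happens to a key that exists on both sides.

```perl
use strict;
use warnings;

my $schema = { firstname => 'kept', address => 'old' };
my $loaded = { address   => 'new',  suburb  => 'added' };

# Hash slice assignment: every key in %$loaded is copied into %$schema.
# A key that already exists is overwritten by the loaded value.
@{$schema}{ keys %$loaded } = values %$loaded;

print "$_ => $schema->{$_}\n" for sort keys %$schema;
# address => new
# firstname => kept
# suburb => added
```

So an included file silently wins over the including file when both define the same key, which may or may not be what you want.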

I've glossed over a few things, but I've tried to leave comments with enough of the terms (vocabulary or even jargon, if you like) to allow you to do profitable searches.

The above code has a couple flaws. It does not do any checking for circular inclusions (it will run for a long time and eventually fill up memory and crash--not so good). The choice of magic key may not be good. And there are probably more I haven't even thought of yet.
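
One possible way to guard against circular inclusions is to track which files are on the current include chain and die when one shows up twice. The sketch below is only that, a sketch: `load_schema_checked`, the two sample files and the temporary directory are hypothetical, and JSON::PP (core) is assumed as a stand-in for JSON::XS.

```perl
use strict;
use warnings;
use JSON::PP;    # core stand-in for JSON::XS (assumption)
use File::Temp qw(tempdir);
use File::Spec;

my $dir = tempdir( CLEANUP => 1 );

# Two hypothetical schema files that include each other.
my %files = (
    'a.json' => { a => { __type => 'String' }, INCLUDE => ['b.json'] },
    'b.json' => { b => { __type => 'String' }, INCLUDE => ['a.json'] },
);
for my $name ( keys %files ) {
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, '>', $path or die "Cannot write '$path': $!\n";
    print {$fh} encode_json( $files{$name} );
    close $fh or die "Cannot close '$path': $!\n";
}

# $in_progress is a hash ref holding the files on the current include chain.
sub load_schema_checked {
    my ( $file, $in_progress ) = @_;
    $in_progress ||= {};

    die "Circular include detected at '$file'\n" if $in_progress->{$file};
    local $in_progress->{$file} = 1;    # removed again when this call returns

    my $path = File::Spec->catfile( $dir, $file );
    open my $fh, '<', $path or die "Error opening file '$path' - $!\n";
    my $schema = decode_json( do { local $/; <$fh> } );
    close $fh;

    my $includes = delete $schema->{INCLUDE};
    for my $include ( @{ $includes || [] } ) {
        my $loaded = load_schema_checked( $include, $in_progress );
        @{$schema}{ keys %$loaded } = values %$loaded;
    }
    return $schema;
}

my $ok    = eval { load_schema_checked('a.json'); 1 };
my $error = $@;
print $ok ? "Loaded\n" : "Failed: $error";
```

Because the marker is set with `local`, a file may still be included from two different branches; only a file that includes itself, directly or indirectly, is rejected.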

Perldoc is an amazing resource, but there is so much there that it takes a while to learn to find things. Take a look at the Perl Data Structures Cookbook and the Arrays of Arrays tutorial. As a beginner, I found the Perl Functions by Category section of perlfunc to be incredibly helpful.

I think I'll stop, now that I've written more than enough text to blind the average person. I hope you find this dissertation helpful. Welcome, once again, and good evening (please adjust to whatever your local time is when you find this response).
