Okay, by all rights, this should suck AND not do what you want - But I spent the last hour trying to get it somewhat right, so I'll be damned. Each 'anything[]' is an array of two elements, each a hashref: One for elements that appear after a bare 'anything', and the second for elements appearing after a 'anything[]'. I probably should have used a closure instead of relying on that crappy $is_non_bracket variable -- I'll take another look in the morning when I'm less retarded and more ashamed of writing this.
I think that it's tail-call optimized (the goto &SUB part). It also makes (small) use of named captures.
use strict;
use warnings;
use 5.010;
use Data::Dumper;
sub construct {
my $node = shift;
return unless @_;
my $next = shift;
my $is_non_bracket = 1;
$next .= '[]' and $is_non_bracket-- if exists $node->{ $next . '[]' };
if ( $next =~ / (?<node>[^\[\]]+) \Q[]/x ) {
if ( exists $node->{ $+{node} } or not defined( $node->{$next} ) ) {
push @{ $node->{$next} }, (delete $node->{ $+{node} } // {}); #/
}
unshift @_, $node->{$next}->[$is_non_bracket] ||= {};
}
else {
$node->{$next} ||= @_ ? {} : $node->{$next};
unshift @_, $node->{$next} //= @_ ? {} : ''; #/
}
goto &construct;
}
my %hash;
while (<DATA>) {
chomp;
construct( \%hash, split m!/! );
}
say Dumper \%hash;
__DATA__
one/two/three
one/two[]/three
one/two[]/three/four
one/two[]/three/four/five[]
one/two[]/three/four/whatever
one/two/ELEVAN
one/three/sixteen
one/three[]/whygodwhy
one/three/mrtest/mruho
one/three/mrtest/mruho[]/GAHAHAH
EDIT: Regex had an extra space after the quotemeta that made it break down; My bad.
EDIT2: Okay, it's the morning, edited in a version that isn't so stupid. No need for the ref, as we always pass a hashref; The #/ are there to stop the //'s from borking the highlighting.
EDIT3: Just noticed you DON'T want those [] to show up in the data structure, so here's a version that doesn't show them:
sub construct {
my $node = shift;
return unless @_;
my $is_bracket = (my $next = shift) =~ s/\Q[]// || 0;
if (ref $node->{$next} eq 'ARRAY' or $is_bracket) {
if ( ref $node->{ $next } ne 'ARRAY' ) {
my $temp = delete $node->{ $next } || {};
push @{ $node->{$next} = [] }, $temp;
}
unshift @_, $node->{$next}->[$is_bracket] ||= {};
}
else {
$node->{$next} ||= @_ ? {} : $node->{$next};
unshift @_, $node->{$next} //= @_ ? {} : ''; #/
}
goto &construct;
}
EDITNaN:
Here's the gist of what it does:
If there are enough arguments, we shift for a second time and put the value in the $next, which is promptly pulled into a substitution, which takes away its [], should it have any: If it does, the substitution returns 1, otherwise, s/// returns undef (or the empty string, I forget), so we use the logical-or to set the return value to 0; Either way, we set $is_bracket to this.
Afterwards, if $node->{$next} is an arrayref or $next had brackets:
If $node->{$next} wasn't an arrayref (so we got here because $next had brackets, and it was the first time this has happened), it' either undef, the empty string, or a hashref; We delete whatever it is, and store it in $temp. We then set the now-empty $node->{$next} to an arrayref, and set(push) $temp as its first element - Meaning that, for instance, if 'two' had existed previous, and $next was originally 'two[]', then 'two' will now point to an arrayref, and its old value will be stored in [0].
Once $node->{$next} is an arrayref (or if it already was), we unshift the hashref in the index pointed by $is_backet - 0 if $next didn't have brackets, and 1 if it did - to @_. If the hashref doesn't exist (either because it's undef, for both, or possibly the empty string, for 0), we assign it an brand new hashref with the logical-or.
If it wasn't an arrayref, it's a hashref, so we do the same thing as before, and unshift the resulting value to @_.
We do all this unshifting because the magic goto passes our current @_ to the function that will replace us.