I can run the following on a CSV file to get the delimited text from the file.

#!/usr/bin/perl
use strict;
use warnings;
use Text::Balanced q/extract_delimited/;

my $filecontents = do { local $/; <> };

while (my $item = extract_delimited($filecontents, '"')) {
    print "Item: $item\n";
}

However, the results always include the quotes, which I do not want, so I tried the following to isolate the multi-line record completely:

#!/usr/bin/perl
use strict;
use warnings;
use Text::Balanced qw/gen_delimited_pat/;

my $filecontents = do { local $/; <> };
$patstring = gen_delimited_patq(\G(?:[^"]|""|""")* ]))

while (my $item = extract_delimited($filecontents, '"')) {
    print "Item: $item\n";
}

I tried this because I know that this regex

\G(?:[^"]|""|""")*

finds the complete multi-line record, which I would then like to process with Text::Markdown. However, I get these errors:

  • Use of ?PATTERN? without explicit operator is deprecated at line 10.
  • Global symbol "$patstring" requires explicit package name at line 10.
  • Search pattern not terminated at line 10.

I hope this makes sense. I am trying to get only the delimited text, excluding the opening and closing quotes, from a record that looks something like this:

"description" "Star-Lite 2-Person w/Fly Aluminum, Rust

Specifications:

  • Packed size: 13"" X 5""
  • 1 Door
  • Interior Area: 41.25 sq. ft.
  • Peak Height: 44""
  • Floor Material: 190T polyester, 2000mm P.U. coated
  • Mesh: No-see-um
  • Number of poles: 2 shock corded aluminum 8.5 mm.
  • Pole sections: 12"" lengths.
  • Rainfly Included.
  • 90"" X 66"" X 44"""

Excluding the first row, I only want:

Star-Lite 2-Person w/Fly Aluminum, Rust

Specifications:

  • Packed size: 13"" X 5""
  • 1 Door
  • Interior Area: 41.25 sq. ft.
  • Peak Height: 44""
  • Floor Material: 190T polyester, 2000mm P.U. coated
  • Mesh: No-see-um
  • Number of poles: 2 shock corded aluminum 8.5 mm.
  • Pole sections: 12"" lengths.
  • Rainfly Included.
  • 90"" X 66"" X 44""

What do I need to do to fix my pattern for this module?

EDIT: Pasted the wrong script that worked

capnhud

2 Answers

A bit inelegant, but this will do what I think you want to do:

#!/usr/bin/perl
use strict;
use warnings;
use Text::Balanced qw/extract_delimited extract_multiple/;

my $filecontents = do { local $/; <> };

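# Replace newlines with pipes so each record sits on one line
$filecontents =~ s/\n/|/g;
# Temporarily replace the doubled quotes ("") with a placeholder word
$filecontents =~ s/""/inches/g;
# Grab all the delimited substrings into an array
my @extracted = extract_multiple($filecontents,
                            [ sub { extract_delimited($_[0], q{"}) } ],
                            undef, 1);

foreach my $fragment (@extracted) {
    # Remove the surrounding quotes
    $fragment =~ s/"//g;
    # Restore the doubled quotes and the newlines
    $fragment =~ s/inches/""/g;
    $fragment =~ s/\|/\n/g;
    print "$fragment\n";
}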
wdd39
  • this gives errors - Global symbol "$filecontents" requires explicit package name at line 11. - Global symbol "$filecontents" requires explicit package name at line 13. - Execution of C:\wamp\bin\Perl\playground\trial5.pl aborted due to compilation errors. – capnhud Sep 23 '12 at 20:01
  • works for me... turn off strict if it's making that moan, but it shouldn't be: **my** $filecontents = do { local $/; <> }; – wdd39 Sep 23 '12 at 20:25
  • Since this file has 5000+ records in the description field that are variable in length perl keeps choking on trying to process this. – capnhud Sep 23 '12 at 20:51
Global symbol "$patstring" requires explicit package name at line 10.

You have strict on and forgot to declare the $patstring variable.

Use of ?PATTERN? without explicit operator is deprecated at line 10

gen_delimited_pat takes a string. You've passed it... well, you've passed it a syntax error. I guess it's supposed to be a regex? Perl has, in desperation, tried to parse it as a ?PATTERN? using the single question mark and then given up.
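
A minimal sketch of what a corrected call might look like, assuming the goal is only to get past the two errors above: the variable is declared with my, and gen_delimited_pat receives a plain string of delimiter characters. The helper name first_quoted and the sample text are made up for illustration. With the default backslash escape, this pattern does not treat a doubled quote ("") as an escaped quote, so on its own it is not a fix for the CSV record in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Balanced qw/gen_delimited_pat/;

# Declared with "my", so strict no longer complains about $patstring.
# The argument is an ordinary string holding the delimiter character.
my $patstring = gen_delimited_pat(q{"});

# Hypothetical helper: return the first double-quoted substring
# (quotes included), or undef if there is none.
sub first_quoted {
    my ($text) = @_;
    return $text =~ /($patstring)/ ? $1 : undef;
}

my $item = first_quoted('say "hello world" now');
print "Item: $item\n";
```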

Neither example you give should ever have worked. Both contain the same errors above. There is no Text::Balanced function called gen_delimited_patq (it is gen_delimited_pat), neither exports the correct functions from Text::Balanced and $patstring is never used.
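
For the output the question asks for, one possible approach is a plain capturing regex rather than Text::Balanced. This is only a sketch, and it assumes that every field is wrapped in double quotes and that "" is the only escape. The helper name quoted_fields and the shortened sample record are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Balanced): return the contents of
# every double-quoted field, without the surrounding quotes. A doubled
# quote ("") inside a field counts as an escaped quote and is left as-is.
sub quoted_fields {
    my ($text) = @_;
    my @fields;
    while ($text =~ /"((?:[^"]|"")*)"/g) {
        push @fields, $1;
    }
    return @fields;
}

# Made-up sample, shaped like the record in the question but shortened.
my $sample = qq{"description"\n"Star-Lite 2-Person\n\nPacked size: 13"" X 5"""\n};

my @fields = quoted_fields($sample);
shift @fields;    # drop the header field ("description")
print "Item: $_\n" for @fields;
```

Because the capture group sits inside the outer quotes, no second pass is needed to strip them, and the data never has to be rewritten with placeholders.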

Schwern
  • the first example was pasted wrong it was essentially what I was trying to do with the second one. – capnhud Sep 23 '12 at 19:01