0

I'm using Selenium at work and I have to extract some data from "//ul". Unfortunately this data contains newlines. I tried to use the chomp() function to remove them (because I need to write the data to a CSV file), but it's not working. The relevant portion of the code is:

open (INFO, '>>file.csv') or die "$!";  
print INFO ("codice\;descrizione\;prezzo\;URLFoto\n");
my $sel = Test::WWW::Selenium->new( host => "localhost", 
                                    port => 4444, 
                                    browser => "*chrome", 
                                    browser_url => "http://www.example.com/page.htm" );
$sel->open_ok("/page.htm");
$sel->click_ok("//table[2]/tbody/tr/td/a/img");
$sel->wait_for_page_to_load_ok("30000");
my $descrizione = $sel->get_text("//ul");
my $prezzo = $sel->get_text("//p/font");
my $codice = $sel->get_text("//p/font/b");
my $img = $sel->get_attribute ("//p/img/\@src");
chomp ($descrizione);
print INFO ("$codice\;$descrizione\;$prezzo\;$img\n");
$sel->go_back_ok();

# Close file
close (INFO);

but the output is:

Art. S500 Set Yoga "Siddhartha";Idea regalo ?SET YOGA Siddhartha? Elegante scatola in cartone lucido contenente:  

 2 mattoni in legno naturale mis. cm 20 x 12,5 x 7

 1 cinghia in cotone mis. cm 4 x 235  

 1 stuoia in cotone mis. cm 70 x 170    

 1 manuale di introduzione allo yoga stampato

Tutto rigorosamente realizzato con materiali natural;€ 82,50;../images/S500%20(Custom).jpg
fdicarlo
  • If I recall correctly, chomp assumes unix newlines. Perhaps your data has a DOS newline? – Alex Howansky Apr 06 '12 at 15:55
  • 1
    @AlexHowansky `chomp` tries to remove whatever is contained in `$/` from the end of its string argument(s). Nothing more, nothing less. – TLP Apr 06 '12 at 18:31
  • 1
    It's not that the definition of a newline differs, but that the definition of a _line ending_ differs. – brian d foy Apr 06 '12 at 21:01

3 Answers

1

chomp removes the current value of $/ (the input record separator, "\n" by default) from the end of a string or of each string in a list. It never touches anything in the middle of the string.

In your case, you seem to have a single string with embedded newlines and/or carriage returns. Hence, you probably want to replace any sequence of possible line endings with something else, let's say a single space character. In that case, you'd do:

$descrizione =~ s/[\r\n]+/ /g;
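To see the difference between the two approaches, here is a minimal sketch. The sample string is made up and only stands in for whatever get_text("//ul") returns; the final trim is an extra step that is not part of the substitution above.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the text returned by get_text("//ul").
my $descrizione = "Idea regalo\r\n\r\n2 mattoni in legno\n\n1 cinghia in cotone\n";

# chomp strips only one trailing "\n" (the default value of $/);
# the embedded line endings are left in place.
my $chomped = $descrizione;
chomp $chomped;

# The substitution replaces every run of CR/LF characters with one space.
(my $flat = $descrizione) =~ s/[\r\n]+/ /g;

# Extra step (an assumption, not in the answer): trim the space left
# where the trailing newline used to be.
$flat =~ s/\s+\z//;

print "$flat\n";   # Idea regalo 2 mattoni in legno 1 cinghia in cotone
```

After this, $chomped still contains line breaks, while $flat fits on a single CSV line.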
Sinan Ünür
0

If you want to replace all vertical whitespace, Perl has a special character class shortcut for that:

 use v5.10;
 $descrizione =~ s/\v+/ /g;
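As a rough illustration of what \v adds over [\r\n], the sketch below uses an invented string containing a vertical tab and a form feed in addition to a CRLF pair. The variable names are hypothetical.

```perl
use v5.10;
use strict;
use warnings;

# Invented sample: vertical tab (\x0B), form feed (\x0C) and CRLF as separators.
my $text = "riga 1\x0Briga 2\x0Criga 3\r\nriga 4";

# [\r\n] only knows about carriage return and line feed.
(my $crlf_only = $text) =~ s/[\r\n]+/ /g;

# \v matches any vertical whitespace, including vertical tab and form feed.
(my $all_vertical = $text) =~ s/\v+/ /g;

say $all_vertical;   # riga 1 riga 2 riga 3 riga 4
```

Here $crlf_only still contains the vertical tab and the form feed, so \v is the safer choice when the source of the text is not under your control.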
brian d foy
-1

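Use this to remove a trailing \r as well as \n. Note that it only strips line endings at the very end of the string, not the ones embedded in it.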

$descrizione =~ s#[\r\n]+\z##;

user1126070