I'm trying to request a URL that contains non-ASCII characters, for example http://perry.wikia.com/wiki/Página_principal, which contains an á.

I've tried LWP::UserAgent, but the request fails with a 404 Not Found error:

#!/usr/bin/perl

use utf8;
use LWP::UserAgent;
use Encode qw(decode encode);

my $br = LWP::UserAgent->new;
#~ my $url = 'http://perry.wikia.com/wiki/Página_principal'; # doesn't work either
my $url = encode('UTF-8','http://perry.wikia.com/wiki/Página_principal');
my $response = $br->get($url);
if ($response->is_success) {    # HTTP::Response is an object, not an HTTP::Tiny-style hash
    my $html = $response->decoded_content;
} else {
  die "Unexpected error requesting $url : " . $response->status_line;
}

I've tried HTTP::Tiny too, with the same result:

#!/usr/bin/perl

use utf8;
use HTTP::Tiny;
use Encode qw(decode encode);

my $url = 'http://perry.wikia.com/wiki/Página_principal';
#~ my $url = encode('UTF-8','http://perry.wikia.com/wiki/Página_principal'); # doesn't work either
my $response = HTTP::Tiny->new->get($url);
if ($response->{success}) {
    my $html = $response->{content};
} else {
  die "Unexpected error requesting $url : " . $response->{status};
}
oalders
Akronix
  • You've neglected to do the URI encoding: `http://perry.wikia.com/wiki/P%C3%A1gina_principal` is what you actually want to GET. – tjd Sep 29 '17 at 19:13
  • Check out [URI::Encode](https://metacpan.org/pod/URI::Encode) or [URI::Escape](https://metacpan.org/pod/URI::Escape) – tjd Sep 29 '17 at 19:23
  • I forgot to mention that I tried URI::Escape to escape the URI, but it doesn't work either; it returns the same 404 error. Actually, if you just request the already-encoded URI, http://perry.wikia.com/wiki/P%C3%A1gina_principal, it does not work either :/ – Akronix Sep 29 '17 at 20:28
  • Tested with URI::Encode: same URI output and same 404 result :( – Akronix Sep 29 '17 at 20:48
  • Are you sure that this URL isn't actually a 404? Chrome and curl are giving me 404s. – oalders Oct 02 '17 at 17:40
  • You're absolutely right, @oalders. I missed it because, in the browser, the website seemed to respond with an actual page instead of a 404 error. – Akronix Oct 04 '17 at 18:47
  • @Akronix perhaps best to close or remove the question then? – oalders Oct 04 '17 at 20:40
  • @oalders I voted to close it as "Unclear what you're asking" (I didn't find a closer reason), but it requires 3 more votes to be closed :/ On the other hand, I think it could be useful for somebody else, so I thought it better not to remove it. – Akronix Oct 05 '17 at 14:42
  • @Akronix ok, in that case I added an answer, to save anyone from having to read all the comments. :) – oalders Oct 05 '17 at 18:26
  • @oalders All right, there you go. – Akronix Oct 06 '17 at 16:33

1 Answer

This is not a bug in any of the Perl modules. This URL actually does return a 404.
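
For anyone who wants to rule out the encoding step before blaming the server, here is a minimal sketch using only core modules (Encode and HTTP::Tiny). The helper names `escape_segment` and `status_line_for` are hypothetical, made up for this illustration; they are not part of either module. The sketch assumes the path segment should be sent as percent-encoded UTF-8, which is what the comments on the question suggest.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                      # the source file itself contains an á
use Encode qw(encode);
use HTTP::Tiny;

# Hypothetical helper: percent-encode one path segment.
# The text is first encoded to UTF-8 bytes, then every byte outside
# the RFC 3986 "unreserved" set is replaced by %XX.
sub escape_segment {
    my ($segment) = @_;
    my $bytes = encode('UTF-8', $segment);
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

# Hypothetical helper: GET a URL and return "status reason" as reported
# by HTTP::Tiny (status 599 indicates an internal or network error).
sub status_line_for {
    my ($url) = @_;
    my $response = HTTP::Tiny->new->get($url);
    return "$response->{status} $response->{reason}";
}

my $url = 'http://perry.wikia.com/wiki/' . escape_segment('Página_principal');
print "$url\n";

# Uncomment to hit the network and see what the server answers:
# print status_line_for($url), "\n";
```

If the printed URL matches what curl or a browser's network tab requests and the status is still 404, the encoding is fine and the 404 comes from the server.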

oalders