I am using R 3.4.4 on Windows 7.

I want to use rvest to make a session and scrape some data. Here is my code:

  library(rvest)
  sess <- html_session("https://www.tomato.com.hr/telefonski-imenik", encoding = "UTF-8")
  test <- read_html(sess, encoding = "UTF-8")

If you print `test`, the output contains:

[2] <body>The requested URL was rejected. Please consult with your administrator.<br><br>Your support ID is: 9986573 ...

Does this mean that the site has some kind of protection, or is there some other problem?

GGamba
Mislav
  • I just tried the request on the same page in Python and it fetched the page fine. It might have been a temporary server-side issue. – Ontamu May 28 '18 at 08:46
  • Could you please send the Python code so I can try it? – Mislav May 28 '18 at 08:47
  • What exactly do you need to extract, so I can write it? I just tried `requests.get` and it worked. – Ontamu May 28 '18 at 08:50
  • Not sure why, but giving it a user agent makes it work: `html_session("https://www.tomato.com.hr/telefonski-imenik", encoding = "UTF-8", user_agent(''))` – GGamba May 28 '18 at 08:53
  • I also got a 200 status code after creating the session with the rvest package, but when I called `read_html(session)` it returned the text above. I have also tried the `POST` function in the httr package. Let's say you want to enter "Horvat" in the `lname` input and then submit. You can then try to extract the first output. – Mislav May 28 '18 at 08:53
  • GGamba, your solution works. This is the first time I have seen this... – Mislav May 28 '18 at 08:55
  • https://stackoverflow.com/questions/8892197/how-to-resolve-error-the-requested-url-is-rejected – Hassan Jamil May 27 '19 at 07:07
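
The user-agent workaround from the comments can be written out as a short script. This is only a minimal sketch: the helper name `fetch_page` and the browser-like User-Agent string are assumptions for illustration and do not come from the thread, where GGamba reported that even an empty string was enough. It also assumes an rvest version where `html_session()` is still available; newer releases rename it to `session()`.

```r
library(httr)   # provides user_agent()
library(rvest)  # provides html_session() and read_html()

url <- "https://www.tomato.com.hr/telefonski-imenik"

# Hypothetical browser-like User-Agent string (an assumption for illustration);
# in the thread, user_agent('') was reported to work as well.
ua <- user_agent("Mozilla/5.0 (Windows NT 6.1; Win64; x64)")

# Hypothetical helper: open a session that sends the User-Agent header,
# then parse the HTML of the page the session landed on.
fetch_page <- function(url, ua) {
  sess <- html_session(url, ua)        # extra arguments are passed on as httr config
  read_html(sess, encoding = "UTF-8")  # parse the session's response body
}
```

Calling `page <- fetch_page(url, ua)` and printing `page` should then show the real page body rather than the "The requested URL was rejected" message, provided the site still behaves as it did when the comments were written.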

0 Answers