I'm trying to scrape a site. I'm sending a bare HTTP request with nothing set except the User-Agent header.

Surprisingly, the request made with "requests" returns a 200 response, but the same request made with httpx returns 403. I tried the httpx request over both HTTP/1.1 and HTTP/2, and it made no difference.

I know that this site uses some protection against bots.

But why does the simpler "requests" library get through when httpx does not?

I noticed that httpx includes a 'Host' header in its default request headers, while 'requests' does not show one. But I don't know how to get rid of it.
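One way to check which headers actually go over the wire, rather than what each library reports, is a throwaway local server that records header names in the order they arrive. This is only a sketch using the standard library: `HeaderLogger`, `capture_header_names` and `send_with_http_client` are names made up for illustration, and the `Mozilla/5.0` User-Agent is a placeholder. Note that HTTP/1.1 requires every request to carry a Host header, so a compliant client will normally send one even if it does not list it among its own headers.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

received = []  # one list of header names per request, in arrival order


class HeaderLogger(BaseHTTPRequestHandler):
    """Hypothetical handler: records header names, replies with an empty 200."""

    def do_GET(self):
        # self.headers keeps the headers in the order the client sent them
        received.append([name for name, _ in self.headers.items()])
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request log line


def capture_header_names(send):
    """Start a local server, call send(url), return the header names it saw."""
    server = HTTPServer(("127.0.0.1", 0), HeaderLogger)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        send(f"http://127.0.0.1:{server.server_port}/")
    finally:
        server.shutdown()
        server.server_close()
    return received[-1]


def send_with_http_client(url):
    """Stand-in client that sets only a User-Agent (placeholder value)."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port)
    conn.request("GET", "/", headers={"User-Agent": "Mozilla/5.0"})
    conn.getresponse().read()
    conn.close()


names = capture_header_names(send_with_http_client)
print(names)
```

To compare the two libraries, the same helper can be called with `lambda url: requests.get(url, headers=...)` and `lambda url: httpx.get(url, headers=...)`. It shows only the plain HTTP/1.1 request, so anything negotiated at the TLS layer is outside its view.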

30 sec video: https://i.imgur.com/tWOe0sZ.mp4

Ivan
  • I have a similar problem. I read the server log, and the raw headers are exactly the same for both requests and httpx; only their order differs. Yet requests gets 200 and httpx gets 403. I have been looking for a solution for almost a year. – Jupri Jul 14 '23 at 09:15
  • Not sure if this will help the current question but I have struggled for hours to understand why a call made with `requests` worked fine while the same call made with `httpx` did not. Turns out, a redirect was involved in my request. Adding `follow_redirects=True` solved my issue. – Sorix Aug 17 '23 at 14:52
  • @Sorix Thanks for the feedback, but it's not about the redirect here. Redirects return a 3XX status code. – Ivan Aug 25 '23 at 14:40

0 Answers