This may seem like a stupid question, but is it possible to retrieve only part of a webpage (as in, have the server send only a particular <div>)? I know it's possible to only get the HEAD of a page via HTTP (at least in Python).

I think it's in direct violation of the way HTTP GET works but I decided to ask anyway.

I'm thinking about webscraping thousands of pages, and I noticed the data usage gets pretty high. I don't need all of the page, just the relevant part.

Abhishek Divekar

1 Answer

It depends what you mean by "specific part of the page".

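HTTP does let you ask for part of a response by starting position and size: see the Range header, as described in the other SO question "Retreive part of web page". The range is expressed in bytes, not in terms of the document's structure, and a server is free to ignore it and send the whole page.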
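As a rough illustration of such a byte-range request, here is a minimal sketch using only Python's standard library. The helper names `range_header` and `fetch_range` are made up for this example, and it assumes the server actually honours `Range`; many dynamically generated pages do not.

```python
import urllib.request


def range_header(start: int, length: int) -> str:
    """Build a Range header value for `length` bytes starting at byte `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # Byte ranges are inclusive on both ends, hence the -1.
    return f"bytes={start}-{start + length - 1}"


def fetch_range(url: str, start: int, length: int, timeout: float = 10.0):
    """Request part of a resource and return (status, body).

    A range the server cannot satisfy (status 416) surfaces as
    urllib.error.HTTPError rather than as a return value.
    """
    req = urllib.request.Request(
        url, headers={"Range": range_header(start, length)}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # 206 means the server honoured the range; 200 means it ignored
        # the header and is sending the whole body.
        status = resp.status
        # Read at most `length` bytes so an ignored Range header does not
        # make us read the entire page.
        body = resp.read(length)
    return status, body
```

Calling `fetch_range(url, 0, 512)` on a page of your choice and checking whether the status is 206 or 200 tells you whether that server supports ranges at all. Even when it does, you still have to know the byte offsets of the part you want, which is rarely practical for HTML.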

If you want to get something like "just the table on the page", you are out of luck, as there is no way to express this kind of request in HTTP.

Jan Vlcinsky