Is there a proper way, using Python urllib, to retrieve (without saving/downloading anything locally) all the files needed to properly display a given HTML page, together with their information (size, etc.)? This includes things such as inlined images, sounds, and referenced stylesheets.
I searched and found that wget can do this with the --page-requisites flag, but the performance is not what I need and I don't want to download anything locally. Furthermore, combining it with -O /dev/null does not give me what I want either.
My final goal is to hit the page (hosted locally), gather its info, and move on.
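Here is a rough sketch of the approach I have in mind, using only the standard library (urllib.request plus html.parser). It only looks at img/script/audio/source/link tags, uses HEAD requests to get sizes without transferring the bodies, and the localhost URL at the bottom is just a placeholder:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class RequisiteCollector(HTMLParser):
    """Collect URLs of resources referenced by the page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script", "audio", "source") and attrs.get("src"):
            self.resources.add(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            # stylesheets, icons, etc.
            self.resources.add(urljoin(self.base_url, attrs["href"]))


def page_requisites(url):
    # Fetch the page itself; everything stays in memory, nothing is written to disk
    with urlopen(url) as resp:
        html = resp.read()
    info = {url: len(html)}

    parser = RequisiteCollector(url)
    parser.feed(html.decode("utf-8", errors="replace"))

    # Ask for each resource's size via a HEAD request so the body is never
    # transferred; size is None if the server does not report Content-Length
    for res_url in parser.resources:
        with urlopen(Request(res_url, method="HEAD")) as resp:
            size = resp.headers.get("Content-Length")
        info[res_url] = int(size) if size is not None else None
    return info


if __name__ == "__main__":
    for resource, size in page_requisites("http://localhost:8000/index.html").items():
        print(size, resource)
```

This misses resources referenced from CSS (url(...)) or loaded by JavaScript, and some servers reject HEAD requests, so I'm not sure it's the right way to go.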
Any tips or reading references are appreciated.