I have an RPi 4 and I want to generate, from the terminal, a website.html file that contains the complete rendered HTML of a webpage.
I want this so that I can, for example, search the whole page for a string or pattern.
I can do this using something like wget or curl, for example:

    wget -O website.html https://www.example.com
The above is all I want, except that it doesn't execute JavaScript.
Some websites (like Google) build almost everything with JavaScript, so I cannot get the final HTML that way.
- I have been searching all day for a working solution, and I have found that I need something like a headless browser. I have tried things like PhantomJS, but they don't work and are no longer maintained.
- I have tried Puppeteer, but I was only able to grab a screenshot, not the HTML. I thought that page.content() had what I wanted, but I couldn't get it or write it to a file. When I console.log-ed it, I saw JavaScript there as well... If someone knows how to write a file with the final HTML using Puppeteer, please tell me.
Isn't there any 'easy' solution like wget that does JavaScript as well?
Isn't there a simple workflow or set of instructions to achieve something like this?
If you know some working commands for this, please tell me. I find some tools very complicated, and I am not familiar with enough programming languages to make them work.
Any help would be greatly appreciated.