I am trying to learn simple automation. I have set up an Ubuntu Server and I want to configure it to download the HTML source from a specific URL and append it to a file in a specified folder on the server every minute.

The page at that URL is just basic HTML with no CSS whatsoever.

I want to use Python, but admittedly I can use any language. What is a good, simple way to do this?

user8363

2 Answers

Just install the requests library with pip.

$ pip install requests

Then, it's super easy to get the HTML (put this in a file called get_html.py, or whatever name you like):

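import requests

# Example URL; replace it with the page you want to download.
req = requests.get('http://docs.python-requests.org/en/latest/user/quickstart/')

# req.text is the response body decoded to a string, i.e. the HTML source.
print(req.text)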

There are a variety of options for saving the HTML to a directory. For example, you could redirect the output of the above script to a file by calling it like this (use >> instead of > if you want to append to the file rather than overwrite it):

$ python get_html.py > file.html
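
Since the question asks for appending to a file in a particular folder, the write could also be done inside Python rather than through a shell redirect. A minimal sketch, assuming a made-up helper name (append_text) and placeholder folder and file names that are not from the question:

```python
from pathlib import Path


def append_text(text, folder, filename):
    """Append text to folder/filename, creating the folder if needed."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    # Mode 'a' adds to the end of the file; a shell '>' redirect would overwrite it.
    with target.open('a', encoding='utf-8') as fp:
        fp.write(text)
    return target
```

In get_html.py, the print(req.text) line could then become something like append_text(req.text, 'html_dumps', 'page.html'), where both names are placeholders.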

Hope this helps.

Jeff K
  • I would recommend using pip3 and python3. A word of caution: when you name a file, make sure that you don't give it the same name as an existing module. A simple typo could cause nasty errors, for example naming a file "random.py" or "requests.py". "request.py" works, but be careful. – rohithpr Jun 01 '15 at 17:31

Jeff's answer works for one-time use. To run it repeatedly, you could do this:

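import time
import requests

while True:
    # Fetch first, so the file is only opened once there is something to write.
    new_html = requests.get('url', timeout=10).text  # 'url' is a placeholder
    # Mode 'a' appends to the file instead of overwriting it.
    with open('filename.extension', 'a') as fp:
        fp.write(new_html)
    time.sleep(60)  # wait one minute before the next download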

You could run this as a background process for as long as you want.

$ python3 script_name.py &
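
One thing to watch with a bare while True loop: a single failed request (a timeout, a dropped connection) raises an exception and ends the script. Below is a rough sketch of a loop that logs the error and carries on, and that subtracts the time the job took so the runs stay roughly one interval apart. The names run_every and max_runs are made up for illustration, and job stands for whatever function does the download and append:

```python
import time


def run_every(interval, job, max_runs=None):
    """Call job() about every `interval` seconds; keep going if it raises."""
    runs = 0
    while max_runs is None or runs < max_runs:
        started = time.monotonic()
        try:
            job()
        except Exception as exc:  # broad on purpose: log the failure, keep looping
            print(f'job failed: {exc!r}')
        runs += 1
        # Sleep only for what is left of the interval after the job's own run time.
        elapsed = time.monotonic() - started
        time.sleep(max(0.0, interval - elapsed))
    return runs
```

Calling run_every(60, job) with max_runs left as None runs until the process is stopped; max_runs is only there to make the loop easy to try out. Ctrl-C still works, because KeyboardInterrupt is not a subclass of Exception.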
rohithpr