
I'm trying to log text from this page, http://radio.nolife-radio.com:8000/played.html, to a text file. I've decided to try Python's logging module, but so far I've got nothing. I have been reading the documentation at http://docs.python.org/dev/library/logging.html and I'm not sure whether I should use SocketHandler or HTTPHandler. I'm quite new to this and still working through the tutorials, so there might be an easier solution using urllib or something else I don't know about. The page belongs to a radio station and is updated after each track; I want each update to be logged. Here is my progress so far:

import logging, logging.handlers

logger = logging.getLogger('Radio Station')
logger.setLevel(logging.INFO)
fh = logging.FileHandler('thread.log')
fh.setLevel(logging.INFO)
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
fh.setFormatter(formatter)
logger.addHandler(fh)
host = 'localhost:8000'
url = 'www.radio.nolife-radio.com:8000/played.html'
http_handler = logging.handlers.HTTPHandler(host, url, method='GET')
logger.addHandler(http_handler)
logger.info("")

The code above doesn't work at the moment. If I remove the HTTP code, this is the outcome:

2013-11-11 00:22:19,640 - Radio Station - INFO -

Any help would be appreciated.

1 Answer

OK, here is a quick example using urllib that should work fine on Windows. You will still have to decide what to do with the HTML you get back (I recommend the Beautiful Soup module for parsing HTML).

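# Python 2 code: FancyURLopener comes from the Python 2 urllib module.
from urllib import FancyURLopener

page_url = "http://radio.nolife-radio.com:8000/played.html"

class myUrlOpener(FancyURLopener):
    # The opener sends this string as its User-Agent header.
    version = "Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11"

opener = myUrlOpener()

# Download the page and read the raw HTML into a string.
page_contents = opener.open(page_url).read()

print page_contents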

This may be a little more complicated than the basic examples you will find on the internet, because this site doesn't seem to accept requests that carry the default Python urllib User-Agent. Subclassing FancyURLopener and overriding its version attribute lets us send the User-Agent of Firefox on Windows instead.
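On Python 3, where FancyURLopener is deprecated, the same idea can be sketched with urllib.request.Request and an explicit User-Agent header. This is only a minimal sketch and has not been tested against the station's server; the helper names build_request and fetch_page, the timeout, and the UTF-8 decoding are assumptions made for illustration.

```python
from urllib.request import Request, urlopen

PAGE_URL = "http://radio.nolife-radio.com:8000/played.html"
# Same browser-like User-Agent string as in the example above.
USER_AGENT = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) "
              "Gecko/20071127 Firefox/2.0.0.11")


def build_request(url, user_agent=USER_AGENT):
    """Return a GET request that carries a custom User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_page(url, timeout=10):
    """Download the page and return its body as text (hypothetical helper)."""
    with urlopen(build_request(url), timeout=timeout) as response:
        # Assumption: the page decodes as UTF-8; undecodable bytes are replaced.
        return response.read().decode("utf-8", errors="replace")
```

Calling fetch_page(PAGE_URL) would then play the role of opener.open(page_url).read() in the Python 2 version.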

Check that site's (nolife-radio.com) policy regarding the scraping of content from their pages.
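To get from the downloaded HTML to the log file the question asks for, one possible approach is to pull the text out of the page and write a log line only when the current track changes. The sketch below sticks to the standard library (html.parser rather than Beautiful Soup) and assumes, without having inspected the real page, that the track information sits inside td cells. CellTextParser, extract_cells, make_logger, and log_if_changed are hypothetical names; the logger setup mirrors the one in the question.

```python
import logging
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the text found inside every <td> cell of a page."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Text outside <td> cells is ignored.
        if self._in_cell:
            self.cells[-1] += data


def extract_cells(html):
    """Return the non-empty, whitespace-stripped cell texts in page order."""
    parser = CellTextParser()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells if cell.strip()]


def make_logger(path="thread.log"):
    """Build a file logger with the same format as in the question."""
    logger = logging.getLogger("Radio Station")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s - %(name)s - %(levelname)s - %(message)s"))
        logger.addHandler(handler)
    return logger


def log_if_changed(logger, previous, current):
    """Log `current` only if it is non-empty and differs from `previous`.

    Returns the value to remember for the next comparison.
    """
    if current and current != previous:
        logger.info(current)
        return current
    return previous
```

A polling loop would download the page, choose whichever cell from extract_cells holds the current track (that index depends on the page layout and is left open here), pass it to log_if_changed, and then sleep for a minute or so before repeating.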

msturdy