I am using feedparser to create content from an RSS feed. It's something like this:

import feedparser

def parse_rss(rss_url):
    return feedparser.parse(rss_url)

def generate_content_from_feed(feed):
    parsed_feed = parse_rss(feed.rss_url)

    for item in parsed_feed['items']:
        if not Content.objects.filter(link=item['link']).exists():
            content = Content.objects.create(
                title=item['title'],
                link=item['link'],
                description=item['description'],
                pub_date=item['published'],
                category=item['category'],
                feed=feed,
            )
            if item['enclosure']:
                content.media_url = item['enclosure']['url']
                content.media_type = item['enclosure']['type']
            content.save()

Now I am not entirely sure if the above code is working or not, as I can't test it.

In my models.py, I have these two models:

class Feed(models.Model):
    rss_url = models.URLField()

    def save(self, *args, **kwargs):
        super(Feed, self).save(*args, **kwargs)
        generate_content_from_feed(self) # Generating the content

class Content(models.Model):
    title = models.CharField(max_length=500)
    link = models.URLField()
    description = models.TextField()
    pub_date = models.DateTimeField(default=None)
    category = models.CharField(max_length=500, blank=True)
    media_url = models.URLField(blank=True) # Attached media file url
    media_type = models.CharField(max_length=50, blank=True)
    feed = models.ForeignKey(Feed, related_name='content_feed')

In case you are wondering, when a feed is saved, the content from that feed is generated and saved as Content objects in my database. At least that's what I am trying to do. However, when I save a feed, I get an error like this:

ValidationError at /admin/myapp/feed/add/
[u"'Fri, 08 Apr 2016 14:51:02 +0000' value has an invalid format. It must be in YYYY-MM-DD HH:MM[:ss[.uuuuuu]][TZ] format."]

How do I fix this problem? Also, I am no expert, so could anybody tell me whether my generate_content_from_feed function has any issues? Thanks a lot.

darkhorse
  • Why are you not able to test it? I would recommend stepping through your code with a debugger to see exactly what your variables contain before saving. – Wtower Apr 10 '16 at 08:46

1 Answer

There may be a better way, but your code should look something like this:

import re

a = 'Fri, 08 Apr 2016 14:51:02 +0000'

# Capture weekday, day, month name, year, hour, minute, second and UTC offset
dates = re.search(r'(\w+), (\d+) (\w+) (\d{4}) (\d+):(\d+):(\d+) ([\w+]+)', a)
# Target format: YYYY-MM-DD HH:MM[:ss[.uuuuuu]][TZ]

day_str = dates.group(1)
day = dates.group(2)
month_str = dates.group(3)
year = dates.group(4)
hour = dates.group(5)
minute = dates.group(6)
second = dates.group(7)

# Reassemble the pieces in year-month-day order
new_date = "%s-%s-%s %s:%s:%s" % (year, month_str, day, hour, minute, second)
print(new_date)

>>> 2016-Apr-08 14:51:02

If you still get the error, it is probably because of the month name: try converting Apr to its month number (04), since the expected format is YYYY-MM-DD.
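As a possible alternative to splitting the string by hand, the standard library can parse this date format directly. The sketch below assumes Python 3 (the `u"..."` prefix in the traceback suggests the question was on Python 2, where `parsedate_to_datetime` is not available). The helper names `rfc822_to_datetime` and `struct_time_to_datetime` are made up for illustration. The second helper assumes you read feedparser's `published_parsed` value, which is a `time.struct_time` in UTC, instead of the raw `published` string.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def rfc822_to_datetime(value):
    """Parse an RFC 822 date string (the format RSS feeds use) into a datetime."""
    return parsedate_to_datetime(value)


def struct_time_to_datetime(parsed):
    """Convert a UTC time.struct_time into a timezone-aware datetime.

    Assumption: `parsed` is in UTC, as feedparser's `published_parsed` is.
    """
    return datetime(*parsed[:6], tzinfo=timezone.utc)


# The date string from the error message in the question
from_string = rfc822_to_datetime('Fri, 08 Apr 2016 14:51:02 +0000')

# A hand-built struct_time standing in for item['published_parsed']
from_struct = struct_time_to_datetime(
    time.struct_time((2016, 4, 8, 14, 51, 2, 4, 99, 0))
)

print(from_string.isoformat())
print(from_struct.isoformat())
```

Either result is a `datetime` object, which a Django `DateTimeField` accepts, so in the question's code something like `pub_date=rfc822_to_datetime(item['published'])` should avoid the format error without any manual month conversion.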

tushortz