
I am having issues with this setup. In summary, once the user submits a form, the data is handed to an RQ worker (backed by Redis) for processing.

The error from rqworker is

23:56:44 RQ worker u'rq:worker:HAFun.12371' started, version 0.5.6
23:56:44 
23:56:44 *** Listening on default...
23:56:57 default: min_content.process_feed.process_checks(u'http://www.feedurl.com/url.xml', u'PM', u'alphanumeric', u'domain@domain.com') (9e736730-e97f-4ee5-b48d-448d5493dd6c)
23:56:57 ImportError: No module named min_content.process_feed
Traceback (most recent call last):
  File "/var/www/min_content/min_content/venv/local/lib/python2.7/site-packages/rq/worker.py", line 568, in perform_job
    rv = job.perform()
  File "/var/www/min_content/min_content/venv/local/lib/python2.7/site-packages/rq/job.py", line 495, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/var/www/min_content/min_content/venv/local/lib/python2.7/site-packages/rq/job.py", line 206, in func
    return import_attribute(self.func_name)
  File "/var/www/min_content/min_content/venv/local/lib/python2.7/site-packages/rq/utils.py", line 150, in import_attribute
    module = importlib.import_module(module_name)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named min_content.process_feed
Traceback (most recent call last):
  File "/var/www/min_content/min_content/venv/local/lib/python2.7/site-packages/rq/worker.py", line 568, in perform_job
    rv = job.perform()
  File "/var/www/min_content/min_content/venv/local/lib/python2.7/site-packages/rq/job.py", line 495, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/var/www/min_content/min_content/venv/local/lib/python2.7/site-packages/rq/job.py", line 206, in func
    return import_attribute(self.func_name)
  File "/var/www/min_content/min_content/venv/local/lib/python2.7/site-packages/rq/utils.py", line 150, in import_attribute
    module = importlib.import_module(module_name)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named min_content.process_feed
23:56:57 Moving job to u'failed' queue

I have tried starting rqworker in a variety of ways

rqworker --url redis://localhost:6379
rqworker 
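Going by the traceback, the worker ends up in `importlib.import_module`, so whether the import succeeds depends on the `sys.path` of the process doing the import. Here is a minimal sketch of that behaviour, outside of RQ entirely. The package name `demo_pkg`, the module `tasks` and the temporary directory are hypothetical stand-ins, not part of the real project.

```python
from __future__ import print_function

import importlib
import os
import shutil
import sys
import tempfile

# Build a throwaway package on disk: <tmp>/demo_pkg/tasks.py (hypothetical names).
project_root = tempfile.mkdtemp()
package_dir = os.path.join(project_root, "demo_pkg")
os.mkdir(package_dir)
open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(package_dir, "tasks.py"), "w") as handle:
    handle.write("def process_checks():\n    return 'ok'\n")


def try_import(module_name):
    """Return True if module_name can be imported with the current sys.path."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


# The directory containing the package is not on sys.path yet, so the lookup
# fails in the same way as the worker log ("No module named ...").
found_before = try_import("demo_pkg.tasks")

# Put the directory that *contains* the package on sys.path and try again.
sys.path.insert(0, project_root)
if hasattr(importlib, "invalidate_caches"):  # only exists on Python 3
    importlib.invalidate_caches()
found_after = try_import("demo_pkg.tasks")

print("before:", found_before, "after:", found_after)

# Clean up the temporary package and the sys.path entry.
sys.path.remove(project_root)
shutil.rmtree(project_root)
```

If the same holds for the worker, the directory rqworker is started from may matter, since the WSGI file adds /var/www/min_content to `sys.path` only for the web process. That is an assumption, not something confirmed here.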

views.py

from min_content import app
from flask import render_template
from .forms import SubmissionForm
from flask import request
from .process_feed import process_checks #this is the function that does the checks
import redis  # process() below calls redis.StrictRedis, so import the module itself
from rq import Queue



def process():
    feedUrl = request.form['feedUrl']
    source = request.form['pmsc']
    ourAssignedId = request.form['assignedId']
    email_address = request.form['email_address']

    conn = redis.StrictRedis('localhost', 6379, 0)
    q = Queue(connection=conn)

    result = q.enqueue(process_checks, feedUrl, source, ourAssignedId, email_address)
    return 'It\'s running and we\'ll send you an email when it\'s done<br /><br /><a href="/">Do another one</a>'

process_feed has a function called process_checks, which works as expected.
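The log line `min_content.process_feed.process_checks(...)` and the traceback suggest that the job carries the function as a dotted string, which the worker then re-imports by name. A rough sketch of that round trip, under that assumption, is below. The helpers `dotted_name` and `import_attribute` are illustrative stand-ins written for this post, not RQ's actual code, and `json.dumps` is just a convenient function to test with.

```python
from __future__ import print_function

import importlib
import json


def dotted_name(func):
    """Build the 'package.module.function' string that identifies func."""
    return "{0}.{1}".format(func.__module__, func.__name__)


def import_attribute(name):
    """Resolve a dotted name back to the object it refers to."""
    module_name, attribute = name.rsplit(".", 1)
    # This is the step that raises ImportError when module_name cannot be
    # found on the importing process's sys.path.
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Enqueue side: only the name travels, not the function object itself.
name = dotted_name(json.dumps)

# Worker side: the name is turned back into a callable by importing it.
resolved = import_attribute(name)

print(name, resolved is json.dumps)
```

The point of the sketch is that the enqueuing process and the worker both need to be able to import the module under the same dotted name.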

I know it works because calling it directly, instead of going through RQ, runs fine:

do_it = process_checks(feedUrl,source,ourAssignedId)

The strange thing is that this all worked perfectly well before I closed my SSH connection to the VPS.

Running ps -aux returns the following, which indicates that Redis is running:

root     11894  0.1  0.4  38096  2348 ?        Ssl  Oct25   0:01 /usr/local/bin/redis-server *:6379 

Restarting Redis does not help, nor does restarting apache2:

sudo service redis_6379 start
sudo service redis_6379 stop
sudo service apache2 restart

I followed this guide exactly and, like I said, it worked until I terminated the SSH connection to my VPS.

I'm running in a virtual environment, if that makes any difference. I activate it within my WSGI file:

min_content.wsgi

#!/usr/bin/python
activate_this = '/var/www/min_content/min_content/venv/bin/activate_this.py'
execfile(activate_this, dict(__file__=activate_this))
import sys
import logging
logging.basicConfig(stream=sys.stderr)
sys.path.insert(0,"/var/www/min_content")

from min_content import app as application
application.secret_key = 'blah blah blah'
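One thing worth keeping in mind about the WSGI file above: a `sys.path.insert(...)` only affects the process that runs it. A separately started process, such as a worker launched from a shell, starts with its own `sys.path`. A minimal sketch of that, using a made-up marker path (`/tmp/hypothetical_project_root` is hypothetical and does not need to exist):

```python
from __future__ import print_function

import subprocess
import sys

# Hypothetical path, used only as a marker; it does not need to exist.
marker = "/tmp/hypothetical_project_root"

# Modify sys.path in this process, the way the WSGI file does.
sys.path.insert(0, marker)
in_this_process = marker in sys.path

# Start a fresh interpreter and ask it whether it sees the same entry.
child_code = "import sys; print({0!r} in sys.path)".format(marker)
output = subprocess.check_output([sys.executable, "-c", child_code])
in_child_process = output.decode().strip() == "True"

print("this process:", in_this_process, "child process:", in_child_process)

# Undo the change so the sketch leaves sys.path as it found it.
sys.path.remove(marker)
```

So the path set up for Apache in the WSGI file would not automatically carry over to rqworker.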

I have confirmed that the Redis server is running by adding this to the script

r = redis.StrictRedis('localhost', 6379, 0)
r.set(name='teststring', value='this is a test')
test_string = r.get(name='teststring')
print test_string

Running redis-cli returns 127.0.0.1:6379>

process_feed.py

import requests
import xml.etree.ElementTree as ET
import csv

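def process_checks(feedUrl, source, ourAssignedId, email_address=None):
    # email_address is optional so that both the 3-argument direct call and the
    # 4-argument q.enqueue(...) call in views.py match this signature.
    feed_url = feedUrl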

    all_the_data = []   

    #grab xml from URL
    try:
        r = requests.get(feed_url)
    except Exception as e:
        print "Failed to grab from " + feed_url
        return "Failed to grab from " + feed_url


    root = ET.fromstring(r.text)

    for advertiser in root.iter('advertiser'):
        assignedId = advertiser.find('assignedId').text
        if assignedId==ourAssignedId:
            #only process for PMs using our assignedId
            for listings in advertiser.iter('listingContentIndexEntry'):

                listingUrl = listings.find('listingUrl').text
                print "Processing " + listingUrl

                #now grab from URL
                listing_request = requests.get(listingUrl)

                #parse XML from URL
                #listing_root = ET.xpath(listing_request.text)

                # Parse once; ET.fromstring raises ParseError on malformed XML.
                try:
                    listing_root = ET.fromstring(listing_request.text.encode('utf8'))
                except ET.ParseError:
                    print "Failed to load XML for " + listingUrl
                    continue


                #'Stayz Property ID','External Reference','User Account External Reference','Provider','Address Line1','Active','Headline','Listing URL'
                stayzPropertyId = '' #the property manager enters this into the spreadsheet

                # findtext returns None when the element is missing, so a
                # listing without an externalId no longer raises AttributeError
                externalIdText = listing_root.findtext('.//externalId')
                if not externalIdText:
                    print 'No external Id in ' + listingUrl
                    listingExternalId = 'None'
                else:
                    listingExternalId = '"' + externalIdText + '"'


                userAccountExternalReference = assignedId
                print userAccountExternalReference
                provider = source
                addressLine1 = listing_root.find('.//addressLine1').text
                active = listing_root.find('active').text

                # findtext avoids an AttributeError when the element is missing
                headline = listing_root.findtext('.//headline/texts/text/textValue')
                if not headline:
                    print 'No headline in ' + listingExternalId
                    headline = 'None'
                else:
                    headline = headline.encode('utf-8')

                description = listing_root.findtext('.//description/texts/text/textValue')
                if not description:
                    print 'No description in ' + listingExternalId
                    description = 'None'


                #now check the min content
                #headline length
                headline_length_check = 'FAIL' if len(headline) < 20 else 'TRUE'

                #description length
                description_length_check = 'FAIL' if len(description) < 400 else 'TRUE'

                #number of images (passes with 6 or more)
                num_images = sum(1 for _ in listing_root.iter('image'))
                num_images_check = 'FAIL' if num_images < 6 else 'TRUE'

                #at least one rate
                num_rates = sum(1 for _ in listing_root.iter('rate'))
                num_rates_check = 'FAIL' if num_rates < 1 else 'TRUE'

                #at least one bedroom

                #at least one bathroom

                #a longitude and latitude

                #now add to our list of lists
                data = {'stayzPropertyId':'','listingExternalId':listingExternalId,'userAccountExternalReference':userAccountExternalReference,'provider':provider,'addressLine1':addressLine1,'active':active,'headline':headline,'listingUrl':listingUrl,'Headline > 20 characters?':headline_length_check,'Description > 400 characters?':description_length_check,'Number of Images > 6?':num_images_check,'At least one rate?':num_rates_check}
                #data_dict = ['',listingExternalId,userAccountExternalReference,provider,addressLine1,active,headline,listingUrl]

                all_the_data.append(data)

    files_location = './files/' + source + '__' + ourAssignedId + '_export.csv'
    with open(files_location,'w') as csvFile:
    #with open('./files/' + source + '_export.csv','a') as csvFile:
        fieldnames = ['stayzPropertyId','listingExternalId','userAccountExternalReference','provider','addressLine1','active','headline','listingUrl','Headline > 20 characters?','Description > 400 characters?','Number of Images > 6?','At least one rate?']
        writer = csv.DictWriter(csvFile,fieldnames=fieldnames)
        writer.writeheader()
        for row in all_the_data:
            try:
                writer.writerow(row)
            except Exception:
                # skip rows that cannot be written and carry on with the rest
                print "Failed to write row " + str(row)


    #send email via Mailgun
    return requests.post(
        "https://api.mailgun.net/v3/sandboxablahblablbah1.mailgun.org/messages",
        auth=("api", "key-blahblahblah"),
        #files=("attachment", open(files_location)),
        data={"from": "Mailgun Sandbox <postmaster@.mailgun.org>",
              "to": "Me <me@me.com>",
              "subject": "Feed Processed for " + ourAssignedId,
              "text": "Done",
              "html":"<b>Process the file</b>"})
Franco
  • Can you show the code for process_checks and how it's namespaced? RQ has a bunch of namespacing issues because it tries to pickle and unpickle functions/methods passed to it. – Eli Oct 27 '15 at 01:12
  • Yeah, this looks like a namespacing thing. Can you try importing min_content.process_feed wherever you're having RQ run? The error is pretty clear that the issue is surrounding that: `ImportError: No module named min_content.process_feed`. We can test this hypothesis by temporarily making the `process_checks()` function do something very simple, like `return True`, and calling it in the same way to see if that works. – Eli Oct 27 '15 at 06:21
  • Yes, when I changed the `process_checks()` function to only `return 'hello'`, it worked. Is there anything I can read to understand how to avoid the namespacing issues? Appreciate your help! – Franco Oct 27 '15 at 22:24
  • You could try opening a new issue in rq's issues page. I'm thinking posts like this are related: https://github.com/nvie/rq/issues/432 – Eli Oct 27 '15 at 23:53
