This is the part of my code where the traceback reports the error:

def categorize(title):
   with conn:
      cur= conn.cursor()
      title_str= str(title)
      title_words= re.split('; |, |\*|\n',title_str)
      key_list= list(dictionary.keys())
      flag2= 1
      for word in title_words:
         title_letters= list(word)
         flag1= 1
         for key in key_list:
           if key==title_letters[0]:
              flag1= 0
              break

      if flag1== 0:

        start=dictionary.get(title_letters[0])
        end= next_val(title_letters[0])

        for i in xrange (start,end):
           if word==transfer_bag_of_words[i]:
              flag2= 0
              break

      if flag2== 0:
         cur.execute("select Id from articles where title=title")
         row_id= cur.fetchone()
         value= (row_id,'1')
         s= str(value)
         f.write(s)
         f.write("\n")
         break

   return


def next_val(text):
   for i,v in enumerate(keyList):
      if text=='t':
         return len(transfer_bag_of_words)
      elif v==text:
         return dictionary[keyList[i+1]]

This is the traceback:

Traceback (most recent call last):
File "categorize_words.py", line 93, in <module>
 query_database()
File "categorize_words.py", line 45, in query_database
 categorize(row)
File "categorize_words.py", line 67, in categorize
 for i in xrange (start,end):
TypeError: an integer is required

I have not given the whole code here, but I will explain what I am trying to do. I am reading a particular field from a SQLite database and checking whether any single word of that field matches a bag of words I already have in my program. I have sorted the bag of words alphabetically and used a Python dictionary to map each starting letter to the index of the first word that begins with it. I did this so that every time I check whether a word from the field is in the bag of words, I do not have to loop through the entire bag. Instead, I can start looping from the index for the word's first letter.

I have checked that the return type of get() on the dictionary is int, and the function next_val should also return an int, since both len() and dictionary[keyList[i+1]] are ints.

Please help.

EDIT

This is my entire code:

import sqlite3 as sql
import re

conn= sql.connect('football_corpus/corpus2.db')

transfer_bag_of_words=['transfer','Transfer','transfers','Transfers','deal','signs','contract','rejects','bid','rumours','swap','moves',
                   'negotiation','negotiations','fee','subject','signings','agreement','personal','terms','pens','agent','in','for',
                   'joins','sell','buy','confirms','confirm','confirmed','signing','renew','joined','hunt','excited','move','sign',
                   'loan','loaned','loans','switch','complete','offer','offered','interest','price','tag','miss','signed','sniffing',
                   'remain','plug','pull','race','targeting','targets','target','eye','sale','clause','rejected',
                   'interested']

dictionary={}
dictionary['a']=0;
keyList=[]
f= open('/home/surya/Twitter/corpus-builder/transfer.txt','w')

def map_letter_to_pos():
   pos=0
   transfer_bag_of_words.sort()
   for word in transfer_bag_of_words:
      flag=1
      letters= list(word)
      key_list= list(dictionary.keys())
      for key in key_list:
         if key==letters[0]:
            flag=0
            break

      if flag==1:
        dictionary[letters[0]]=pos
        pos+=1
      else:
        pos+=1

   keyList= sorted(dictionary.keys())

def query_database():
   with conn:
      cur= conn.cursor()
      cur.execute("select title from articles")
      row_titles= cur.fetchall()

      for row in row_titles:
         categorize(row)

def categorize(title):
   with conn:
      cur= conn.cursor()
      title_str= str(title)
      title_words= re.split('; |, |\*|\n',title_str)
      key_list= list(dictionary.keys())
      flag2= 1
      for word in title_words:
         title_letters= list(word)
         flag1= 1
         for key in key_list:
            if key==title_letters[0]:
               flag1= 0
               break

      if flag1== 0:

        start=dictionary.get(title_letters[0])
        end= next_val(title_letters[0])

        for i in xrange (start,end):
           if word==transfer_bag_of_words[i]:
              flag2= 0
              break

      if flag2== 0:
         cur.execute("select Id from articles where title=title")
         row_id= cur.fetchone()
         value= (row_id,'1')
         s= str(value)
         f.write(s)
         f.write("\n")
         break

   return


def next_val(text):
   for i,v in enumerate(keyList):
      if text=='t':
         return len(transfer_bag_of_words)
      elif v==text:
         return dictionary[keyList[i+1]]

if __name__=='__main__':
   map_letter_to_pos()
   query_database()

And this is the download link for the database file: http://wikisend.com/download/702374/corpus2.db

Code Bunny
  • This needs a [mcve]. If the type of those two variables truly were int, this wouldn't be an issue. – Morgan Thrapp Jul 14 '16 at 17:32
  • @joelgoldstick I do not understand. I have provided the part of the code which is giving the error – Code Bunny Jul 14 '16 at 17:35
  • `start` and/or `end` aren't integers. I know you said you checked that they are, but you're wrong. – John Gordon Jul 14 '16 at 17:37
  • What @MorganThrapp means is that we can't run that code on our machines to test it and get the same error that you do. – Skam Jul 14 '16 at 17:38
  • Where is `bag_of_words` defined? You use it twice, but I can't see where you pass it into either function? – joel goldstick Jul 14 '16 at 17:40
  • also, print the values of start and end – joel goldstick Jul 14 '16 at 17:42
  • You do not show what `dictionary` is, and that is where you get your `start`. Do you really intend this main function as a closure? If not, pass `dictionary` as a parameter (and it's a bad name anyway: it should be named after its role or purpose, since the fact that it is a `dict` should be obvious). `end` comes from a `next_val` call on what seems to be a character. Without seeing the `next_val` definition, we can't know what it outputs. – LetMeSOThat4U Jul 14 '16 at 17:43
  • It looks like you're trying to slice a dict by using keys as if they were array indices. If so, that's a fundamental misunderstanding of the data types. Perhaps this will point you in the direction of where I think you might be wanting to go: http://stackoverflow.com/questions/4558983/slicing-a-dictionary-by-keys-that-start-with-a-certain-string – Kenny Ostrom Jul 14 '16 at 18:17
  • I'd suggest for a minimal example, you eliminate the database completely, and have query_database return a list of 10 hardcoded titles in the source file. – Kenny Ostrom Jul 14 '16 at 18:23

1 Answer

map_letter_to_pos assigns to keyList without declaring it global, so the assignment creates a new local variable named keyList and discards it when the function returns; the module-level keyList stays an empty list. next_val therefore has nothing to iterate over, never reaches the if/elif, and implicitly returns None.

end = None
range(start,end) # None is not an int
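
A minimal sketch of the failure and one possible fix, assuming a cut-down word list. The names `words`, `first_index`, `key_list`, `build_index_broken`, and `build_index_fixed` are hypothetical stand-ins, not the question's code, and the sketch is written for Python 3 (so `range` would replace `xrange`).

```python
# Hypothetical, cut-down stand-ins for the question's globals.
words = ["bid", "buy", "deal", "fee"]   # already sorted
first_index = {}   # first letter -> index of the first word starting with it
key_list = []      # module-level list the helper is supposed to fill


def build_index_broken():
    for pos, word in enumerate(words):
        first_index.setdefault(word[0], pos)   # mutating a global dict works
    # Plain assignment binds a *local* name; the module-level key_list
    # is left untouched and is still [] after this function returns.
    key_list = sorted(first_index)


def build_index_fixed():
    global key_list   # rebind the module-level name instead
    for pos, word in enumerate(words):
        first_index.setdefault(word[0], pos)
    key_list = sorted(first_index)


def next_val(letter):
    """Return the index where the words for the next letter begin."""
    for i, key in enumerate(key_list):
        if key == letter:
            if i + 1 < len(key_list):
                return first_index[key_list[i + 1]]
            return len(words)   # last letter: run to the end of the list
    # Reaching this point means the loop found nothing, so the
    # function implicitly returns None.


build_index_broken()
broken_end = next_val("b")   # key_list is still empty here

build_index_fixed()
fixed_end = next_val("b")    # key_list is now ['b', 'd', 'f']

print(broken_end, fixed_end)
```

Declaring `global` is the smallest change; returning the sorted keys from the function and assigning the result at the call site would avoid the global rebinding altogether.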
Kenny Ostrom
  • also - busted. You were so sure start and end are int when you never looked, and the error message clearly told you they weren't. :) – Kenny Ostrom Jul 14 '16 at 18:48