
I am using SQLObject with an SQLite database to store document records. The documents have been scanned and the document information stored in text files. I need to read all these text files (about 800) and load document records from them (an average of 40 documents per file). Currently my code is as follows (Document is the table):

    for doc in cc.documents:
      print doc['id']
      doc_recs = Document.selectBy(costcode = x, doc_id = int(doc['id']))
      dcmnt = doc_recs.getOne(None)
      if not dcmnt:
        dcmnt = Document(costcode = x, doc_id = int(doc['id']))

      dcmnt.title = doc['title']
      dcmnt.author = doc['author']
      dcmnt.recipient = doc['recipient']
      dcmnt.date = doc['date']
      dcmnt.firstpage = doc['firstpage']
      dcmnt.lastpage = doc['lastpage']
      dcmnt.type = doc['type']
      dcmnt.allpages = doc['allpages']

But this is taking a long time and getting longer with each record.

Presumably SQLObject is committing the record each time I change an attribute, which is wasteful. If I were just adding records I could set the attributes at the same time as the record is created, which should be faster, but SQLObject still seems slow at adding records.

jimscafe

1 Answer


First, don't update columns one by one. Instead of

    dcmnt.title = doc['title']
    dcmnt.author = doc['author']

do

    dcmnt.set(
        title = doc['title'],
        author = doc['author'],
        recipient = doc['recipient'],
        date = doc['date'],
        firstpage = doc['firstpage'],
        lastpage = doc['lastpage'],
        type = doc['type'],
        allpages = doc['allpages'],
    )

This sends a single UPDATE instead of one per column.

Second, for mass insertion use SQLBuilder. It is less convenient but much faster.

phd