
I need to create a lookup table to store tabular data and retrieve the records based on multiple field values.

I found an example in post #15418386 that does almost what I need; however, it always returns the same record regardless of the argument passed. I have listed the code at the bottom of this post in case the link does not work.

I have verified with the debugger in my IDE (I'm using PyCharm) that the file is read correctly and that the data table is populated properly.

The test data included in the code is:

name,age,weight,height
Bob Barker,25,175,6ft 2in
Ted Kingston,28,163,5ft 10in
Mary Manson,27,140,5ft 6in
Sue Sommers,27,132,5ft 8in
Alice Toklas,24,124,5ft 6in

The function always returns the last record. I believe the problem is in these lines of code, but I don't understand how they work.

matches = [self.records[index]
            for index in self.lookup_tables[field].get(value, []) ]
return matches if matches else None

I would like to understand how the code is supposed to work so I can edit it to be able to search on multiple parameters.

original code:

from collections import defaultdict, namedtuple
import csv
class DataBase(object):
    def __init__(self, csv_filename, recordname):
        # read data from csv format file into a list of named tuples
        with open(csv_filename, 'rb') as inputfile:
            csv_reader = csv.reader(inputfile, delimiter=',')
            self.fields = csv_reader.next() # read header row
            self.Record = namedtuple(recordname, self.fields)
            self.records = [self.Record(*row) for row in csv_reader]
            self.valid_fieldnames = set(self.fields)
        # create an empty table of lookup tables for each field name that maps
        # each unique field value to a list of record-list indices of the ones
        # that contain it.
        self.lookup_tables = defaultdict(lambda: defaultdict(list))

    def retrieve(self, **kwargs):
        """Fetch a list of records with a field name with the value supplied
           as a keyword arg ( or return None if there aren't any)."""

        if len(kwargs) != 1:
            raise ValueError(
            'Exactly one fieldname/keyword argument required for function '
            '(%s specified)' % ', '.join([repr(k) for k in kwargs.keys()])
            )

        field, value = kwargs.items()[0]        # get only keyword arg and value
        if field not in self.valid_fieldnames:
            raise ValueError('keyword arg "%s" isn\'t a valid field name' % field)
        if field not in self.lookup_tables:     # must create field look up table
            for index, record in enumerate(self.records):
                value = getattr(record, field)
                self.lookup_tables[field][value].append(index)

        matches = [self.records[index]
                   for index in self.lookup_tables[field].get(value, []) ]
        return matches if matches else None


if __name__ == '__main__':
    empdb = DataBase('employee.csv', 'Person')
    print "retrieve(name='Ted Kingston'):", empdb.retrieve(name='Ted Kingston')
    print "retrieve(age='27'):", empdb.retrieve(age='27')
    print "retrieve(weight='150'):", empdb.retrieve(weight='150')
salatwork

1 Answer


The variable value is overwritten inside the following if/for block:

field, value = kwargs.items()[0]   # <--- `value` defined

...

if field not in self.lookup_tables:
    for index, record in enumerate(self.records):
        value = getattr(record, field)  # <--- `value` overwritten
        self.lookup_tables[field][value].append(index)

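So by the time the lookup runs, value refers to the field value of the last record rather than the one you passed in, which is why the last record is always returned. Use a different name inside the loop to prevent the overwrite: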

if field not in self.lookup_tables:
    for index, record in enumerate(self.records):
        v = getattr(record, field)
        self.lookup_tables[field][v].append(index)
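
Since the question also asks about searching on several fields at once, here is a minimal sketch of one way that could be done; it is not the original post's method. It assumes Python 3 (hence next(reader) and print()), reads the question's sample data from an in-memory string instead of employee.csv, and adds a hypothetical helper, _indices, that returns the set of matching record indices for one field. retrieve then intersects those sets, so a record must match every keyword argument.

```python
import csv
import io
from collections import defaultdict, namedtuple

# Sample data from the question, inlined so the sketch runs without employee.csv.
CSV_TEXT = """name,age,weight,height
Bob Barker,25,175,6ft 2in
Ted Kingston,28,163,5ft 10in
Mary Manson,27,140,5ft 6in
Sue Sommers,27,132,5ft 8in
Alice Toklas,24,124,5ft 6in
"""


class DataBase(object):
    def __init__(self, csv_file, recordname):
        # csv_file is any open text file (or file-like object) in csv format
        reader = csv.reader(csv_file, delimiter=',')
        self.fields = next(reader)  # read header row
        self.Record = namedtuple(recordname, self.fields)
        self.records = [self.Record(*row) for row in reader]
        self.valid_fieldnames = set(self.fields)
        # field name -> {field value -> [indices into self.records]}
        self.lookup_tables = defaultdict(lambda: defaultdict(list))

    def _indices(self, field, value):
        """Hypothetical helper: set of record indices where field == value."""
        if field not in self.valid_fieldnames:
            raise ValueError('keyword arg "%s" isn\'t a valid field name' % field)
        if field not in self.lookup_tables:  # build this field's table lazily
            for index, record in enumerate(self.records):
                # distinct name, so the searched-for `value` is not overwritten
                v = getattr(record, field)
                self.lookup_tables[field][v].append(index)
        return set(self.lookup_tables[field].get(value, []))

    def retrieve(self, **kwargs):
        """Fetch the records matching ALL keyword args (or None if none do)."""
        if not kwargs:
            raise ValueError('at least one fieldname/keyword argument required')
        index_sets = [self._indices(f, v) for f, v in kwargs.items()]
        common = set.intersection(*index_sets)
        # sorted() keeps the matches in file order
        matches = [self.records[i] for i in sorted(common)]
        return matches if matches else None


empdb = DataBase(io.StringIO(CSV_TEXT), 'Person')
print(empdb.retrieve(name='Ted Kingston'))
print(empdb.retrieve(age='27'))
print(empdb.retrieve(age='27', height='5ft 6in'))
print(empdb.retrieve(weight='150'))
```

All values are compared as strings because csv.reader does not convert types, so age='27' matches but age=27 would not. With a single keyword argument this behaves like the original retrieve.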
falsetru