
I understand what a class is: a bundle of attributes and methods stored together in one object. However, I don't think I have ever really grasped their full power. I taught myself to manipulate large volumes of data using 'dictionary of dictionaries' data structures. I'm now thinking that if I want to fit in with the rest of the world, I need to use classes in my code, but I just don't get how to make the transition.

I have a script that gets information about sales orders from a SQL query, performs operations on the data, and writes the result to a CSV file.

1) (the way I currently do it: store all the orders in a dictionary of dictionaries)

cursor.execute(querystring)

# create empty dictionary to hold orders
orders = {}

# define list of columns returned by query
columns = [col[0] for col in cursor.description]

for row in cursor:
    # create dictionary of dictionaries where key is order_id
    # this allows easy access of attributes given the order_id
    orders[row.order_id] = {}
    for i, v in enumerate(columns):
        # add each field to each order
        orders[row.order_id][v] = row[i]

# example operation
for order, fields in orders.items():
    fields['long'], fields['lat'] = getLongLat(fields['post_code'])

# example of another operation
cancelled_orders = getCancelledOrders()
for order_id in cancelled_orders:
    orders[order_id]['status'] = 'cancelled'

# Other similar operations go here...

# write to file here...

2) (the way I THINK I would do it if I were using classes)

class salesOrder():


    def __init__(self, cursor_row):
        for i, v in enumerate(columns):
            setattr(self, v, cursor_row[i])


    def getLongLat(self):
        # look up the coordinates with the same module-level getLongLat()
        # helper used in example 1, so the no-argument call below works
        self.long, self.lat = getLongLat(self.post_code)


    def cancelOrder(self):
        self.status = 'cancelled'


    # more methods here


cursor.execute(querystring)

# create empty dictionary to hold orders
orders = {}

# define list of columns returned by query
columns = [col[0] for col in cursor.description]

for row in cursor:
    orders[row.order_id] = salesOrder(row)
    orders[row.order_id].getLongLat()

# example of another operation
cancelled_orders = getCancelledOrders()
for order_id in cancelled_orders:
    orders[order_id].cancelOrder()

# other similar operations go here

# write to file here

I just get the impression that I'm not quite understanding the best way to use classes. Have I got completely the wrong idea about how to use classes? Is there some sense in what I'm doing, but it needs refactoring? Or am I trying to use classes for the wrong purpose?

teebagz
  • Think of a class as nothing more than a dictionary with custom methods. – Bryan Oakley Sep 27 '16 at 13:49
  • (a) It seems your `row` is a class object the way you present it in the first example (e.g. `row.order_id`). (b) `row.column` will not work. (c) Let's back up a little bit, I think the reason you put your data into a dictionary is to look it up easily, then update. However, that's what your database is for, why not query the database and update directly? – Hai Vu Sep 27 '16 at 13:58
  • @HaiVu (a) The row is a class object, but it's just a row returned by the cursor. I understand how to use existing classes (mostly); I'm more concerned about the best way to implement my own classes. (b) Good spot on row.column, my bad. This is why I used enumerate(columns) in the second example to get around it; I'll edit the question now. (c) That would be a better approach in the given situation, but I have simplified my problem for the purposes of the question, and a pure db solution is not necessarily appropriate for my wider project (hope that makes sense). – teebagz Sep 27 '16 at 14:15
  • Well, part of it is that you could put your dbcol-attrib mapping in, say, a method `dbset` in a class DBInst. Then SalesOrder can be a subclass of that and call `dbset` from `__init__`. Presto, db-aware classes throughout. – JL Peyret Sep 27 '16 at 15:07

2 Answers


Classes are mostly useful for coupling data to behaviour, and for providing structure (naming and documenting the association of certain properties, for example).

You're not doing either of those here - there's no real behaviour in your class (it doesn't do anything to the data), and all the structure is provided externally. The class instances are just used for their attribute dictionaries, so they're just a fancy wrapper around your old dictionary.

If you do add some real behaviour (beyond getLongLat and cancelOrder), or some real structure (other than a list of arbitrary column names and field values passed in from outside), then it makes sense to use a class.
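As a rough illustration of what "real structure" and "real behaviour" could look like, here is a minimal sketch. The field names quantity and unit_price, the total() method, and the rule that an order cannot be cancelled twice are all assumptions made up for this example; they do not come from the question.

```python
class SalesOrder(object):
    """One sales order with named fields and behaviour that uses them."""

    # Hypothetical status values, chosen only for this illustration.
    OPEN = 'open'
    CANCELLED = 'cancelled'

    def __init__(self, order_id, post_code, quantity, unit_price, status=OPEN):
        # The structure is declared here instead of being copied in from
        # an arbitrary list of column names.
        self.order_id = order_id
        self.post_code = post_code
        self.quantity = quantity
        self.unit_price = unit_price
        self.status = status

    def total(self):
        """Behaviour derived from the data: the value of the order."""
        return self.quantity * self.unit_price

    def cancel(self):
        """Behaviour that enforces a rule rather than just setting a field."""
        if self.status == self.CANCELLED:
            raise ValueError('order %s is already cancelled' % self.order_id)
        self.status = self.CANCELLED


# Made-up sample values, only to exercise the class.
order = SalesOrder(order_id=1, post_code='AB1 2CD', quantity=3, unit_price=10)
order_total = order.total()
order.cancel()
```

Here the class documents which fields an order has, and cancel() protects an invariant that a plain dictionary assignment would not.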

Useless

I am trying to guess what you are trying to do since I have no idea what your "row" looks like. I assume you have the variable columns which is a list of column names. If that is the case, please consider this code snippet:

class SalesOrder(object):
    def __init__(self, columns, row):
        """ Transfer all the columns from row to this object """
        # remember the column names so as_row() does not rely on a global
        self._columns = list(columns)
        for name in self._columns:
            value = getattr(row, name)
            setattr(self, name, value)
        self.long, self.lat = getLongLat(self.post_code)

    def cancel(self):
        self.status = 'cancelled'

    def as_row(self):
        return [getattr(self, name) for name in self._columns]

    def __repr__(self):
        return repr(self.as_row())

# Create the dictionary of class
orders = {row.order_id: SalesOrder(columns, row) for row in cursor}

# Cancel
cancelled_orders = getCancelledOrders()
for order_id in cancelled_orders:
    orders[order_id].cancel()

# Print all sales orders
for sales_order in orders.values():
    print(sales_order)

At the lowest level, we need to be able to create a new SalesOrder object from the row object by copying over all the attributes listed in columns. When initializing a SalesOrder object, we also calculate the longitude and latitude.

With that, the task of creating the dictionary of class objects becomes easier:

orders = {row.order_id: SalesOrder(columns, row) for row in cursor}

Our orders variable is a dictionary with order_id values as keys and SalesOrder objects as values. Finally, the task of cancelling the orders is the same as in your code.

In addition to what you have, I created a method called as_row(), which is handy if you later wish to write a SalesOrder object to a CSV file or a database. For now, I use it to display the "raw" row. Normally, the print statement/function invokes the __str__() method to get a string representation of an object; if the class does not define one, it falls back to the __repr__() method, which is what we have here.
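To make the CSV use of as_row() concrete, here is a small sketch written for Python 3. It is only an illustration under stated assumptions: the class is a cut-down stand-in that takes a plain list of values instead of a cursor row, the sample data is made up, and io.StringIO stands in for a real output file.

```python
import csv
import io


class SalesOrder(object):
    def __init__(self, columns, values):
        # Keep the column order so as_row() can reproduce it.
        self._columns = list(columns)
        for name, value in zip(self._columns, values):
            setattr(self, name, value)

    def as_row(self):
        return [getattr(self, name) for name in self._columns]

    def __repr__(self):
        return repr(self.as_row())


# Made-up sample data standing in for the query result.
columns = ['order_id', 'post_code', 'status']
orders = {
    1: SalesOrder(columns, [1, 'AB1 2CD', 'open']),
    2: SalesOrder(columns, [2, 'EF3 4GH', 'cancelled']),
}

# io.StringIO stands in for a real file opened for writing.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(columns)  # header line
for order_id in sorted(orders):
    writer.writerow(orders[order_id].as_row())

csv_text = buffer.getvalue()

# Only __repr__ is defined, so str() (and therefore print) falls back to it.
same_text = str(orders[1]) == repr(orders[1])
```

With a real file you would pass the file object to csv.writer instead of the buffer (in Python 3, open it with newline='' as the csv module documentation recommends).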

Hai Vu
  • Hi, this is very helpful. It gives me some ideas about how to better structure my code, and it also assures me that the way I am trying to use multiple instances of the same class is a 'valid' use of classes. BTW, the 'columns' variable is just a list of the fields I have selected in my SQL query. – teebagz Sep 27 '16 at 15:25
  • If you want to retain the ability to reference nested items by sequential brackets, see http://stackoverflow.com/questions/4014621/a-python-class-that-acts-like-dict – Kenny Ostrom Sep 27 '16 at 17:48
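For readers who do not follow the link, one possible way to keep the orders[order_id]['status'] style of access is to give the class __getitem__ and __setitem__ methods that forward to its attributes. This is only a sketch of that idea, not necessarily the approach taken in the linked question; the keyword-argument constructor and the sample values are assumptions made for this example.

```python
class SalesOrder(object):
    def __init__(self, **fields):
        # Hypothetical constructor: each keyword becomes an attribute.
        for name, value in fields.items():
            setattr(self, name, value)

    def __getitem__(self, name):
        # order['status'] reads the attribute called 'status'
        try:
            return getattr(self, name)
        except AttributeError:
            raise KeyError(name)

    def __setitem__(self, name, value):
        # order['status'] = 'open' sets the attribute called 'status'
        setattr(self, name, value)

    def cancel(self):
        self.status = 'cancelled'


orders = {7: SalesOrder(order_id=7, status='open')}
orders[7].cancel()
status_via_brackets = orders[7]['status']
orders[7]['status'] = 'open'
status_via_attribute = orders[7].status
```

Existing code written against the dictionary of dictionaries can then keep working while new code calls methods such as cancel().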