12

I'm designing a small GUI application to wrap an sqlite DB (simple CRUD operations). I have created three sqlalchemy models (m_person, m_card.py, m_loan.py, all in a /models folder) and had previously had the following code at the top of each one:

from sqlalchemy import Table, Column, create_engine
from sqlalchemy import Integer, ForeignKey, String, Unicode
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import backref, relation

engine = create_engine("sqlite:///devdata.db", echo=True)
declarative_base = declarative_base(engine)
metadata = declarative_base.metadata

This felt a bit wrong (DRY) so it was suggested that I move all this stuff out to the module level (into models/__init__.py).

from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy import Table, Column, Boolean, Unicode
from settings import setup

engine = create_engine('sqlite:///' + setup.get_db_path(), echo=False)
declarative_base = declarative_base(engine)
metadata = declarative_base.metadata

session = sessionmaker()
session = session()

..and import declarative_base like so:

from sqlalchemy import Table, Column, Unicode
from models import declarative_base


class Person(declarative_base):
    """
    Person model

    """
    __tablename__ = "people"

    id = Column(Unicode(50), primary_key=True)
    fname = Column(Unicode(50))
    sname = Column(Unicode(50))

However I've had a lot of feedback that executing code as the module imports like this is bad? I'm looking for a definitive answer on the right way to do it as it seems like by trying to remove code repetition I've introduced some other bad practises. Any feedback would be really useful.

(Below is the get_db_path() method from settings/setup.py for completeness as it is called in the above models/__init__.py code.)

def get_db_path():

    import sys
    from os import makedirs
    from os.path import join, dirname, exists
    from constants import DB_FILE, DB_FOLDER

    if len(sys.argv) > 1:
        db_path = sys.argv[1]
    else:
        #Check if application is running from exe or .py and adjust db path accordingly
        if getattr(sys, 'frozen', False):
            application_path = dirname(sys.executable)
            db_path = join(application_path, DB_FOLDER, DB_FILE)
        elif __file__:
            application_path = dirname(__file__)
            db_path = join(application_path, '..', DB_FOLDER, DB_FILE)

    #Check db path exists and create it if not
    def ensure_directory(db_path):
        dir = dirname(db_path)
        if not exists(dir):
            makedirs(dir)

    ensure_directory(db_path)

    return db_path
CharlesB
  • 86,532
  • 28
  • 194
  • 218
johnharris85
  • 17,264
  • 5
  • 48
  • 52
  • 1
    My 2c, while it's generally OK to put init code in `__init__.py` (that's what it's there for!), going so far as to set the database URL there seems wrong to me. It might make testing harder. It seems to me that what you should do is have a `init()` function for the module. – cha0site Feb 28 '12 at 17:09
  • I think the problem here is that the base class `declarative_base` is created dynamically and seems to depend on the database engine. I don't know how sqlalchemy works, but is it not possible to define the model classes before making the DB connection?? I can hardly believe that. – Niklas B. Feb 28 '12 at 17:34
  • @NiklasB. The `declarative_base` depends on the "metadata" which is basically the core sqlalchemy engine. It does not necessarily have any ties to the database, though you can choose to give it one for simplicity. – wberry Feb 28 '12 at 23:08
  • @wberry, OP: So I guess a clean approach would be to NOT give it a DB connection and leave that to the user of the API. – Niklas B. Feb 28 '12 at 23:12
  • I think this question is valuable for non-Python programmers, too. Particularly for Node and Ruby, but also a bit less so for PHP. – leemeichin Feb 29 '12 at 01:19

4 Answers4

5

Some popular frameworks (Twisted is one example) perform a good deal of initialization logic at import-time. The benefits of being able to build module contents dynamically do come at a price, one of them being that IDEs cannot always decide what is "in" the module or not.

In your specific case, you may want to refactor so that the particular engine is not supplied at import-time but later. You can create the metadata and your declarative_base class at import-time. Then during start-time, after all the classes are defined, you call create_engine and bind the result to your sqlalchemy.orm.sessionmaker. But if your needs are simple, you may not even need to go that far.

In general, I would say this is not Java or C. There is no reason to fear doing things at the module level other than define functions, classes and constants. Your classes are created when the application starts anyway, one after the other. A little monkey-patching of classes (in the same module!), or creating one or two global lookup tables, is OK in my opinion if it simplifies your implementation.

What I would definitely avoid is any code in your module that causes the order of imports to matter for your users (other than the normal way of simply providing logic to be used), or that modifies the behavior of code outside your module. Then your module becomes black magic, which is accepted in the Perl world (ala use strict;), but I find is not "pythonic".

For example, if your module modifies properties of sys.stdout when imported, I would argue that behavior should instead be moved into a function that the user can either call or not.

wberry
  • 18,519
  • 8
  • 53
  • 85
4

In principle there is nothing wrong with executing Python code when a module is imported, in fact every single Python module works that way. Defining module members is executing code, after all.

However, in your particular use case I would strongly advise against making a singleton database session object in your code base. You'll be losing out on the ability to do many things, for example unit test your model against an in-memory SQLite or other kind of database engine.

Take a look at the documentation for declarative_base and note how the examples are able to create the model with a declarative_base supplied class that's not yet bound to a database engine. Ideally you want to do that and then have some kind of connection function or class that will manage creating a session and then bind it to base.

Kamil Kisiel
  • 19,723
  • 11
  • 46
  • 56
  • If I understand correctly, you're advising not to create a session object and then just pass it around, but instead to create a new session each time I need to use it? – johnharris85 Feb 29 '12 at 11:40
  • No, you can create one and pass it around, just don't create it at the module level. You can have a class returned by `sessionmaker` or one of its wrappers at the module level, but don't have any module-level intances of that class. – Kamil Kisiel Feb 29 '12 at 15:48
3

There is nothing wrong with executing code at import time.

The guideline is executing enough so that your module is usable, but not so much that importing it is unnecessarily slow, and not so much that you are unnecessarily limiting what can be done with it.

Typically this means defining classes, functions, and global names (actually module level -- bit of a misnomer there), as well as importing anything your classes, functions, etc., need to operate.

This does not usually involve making connections to databases, websites, or other external, dynamic resources, but rather supplying a function to establish those connections when the module user is ready to do so.

Ethan Furman
  • 63,992
  • 20
  • 159
  • 237
2

I ran into this as well, and created a database.py file with a database manager class, after which I created a single global object. That way the class could read settings from my settings.py file to configure the database, and the first class that needed to use a base object (or session / engine) would initialize the global object, after which everyone just re-uses it. I feel much more comfortable having "from myproject.database import DM" at the top of each class using the SQLAlchemy ORM, where DM is my global database object, and then DM.getBase() to get the base object.

Here is my database.py class:

Session = scoped_session(sessionmaker(autoflush=True))

class DatabaseManager():
    """
    The top level database manager used by all the SQLAlchemy classes to fetch their session / declarative base.
    """
    engine = None
    base = None

    def ready(self):
        """Determines if the SQLAlchemy engine and base have been set, and if not, initializes them."""
        host='<database connection details>'
        if self.engine and self.base:
            return True
        else:
            try:
                self.engine = create_engine(host, pool_recycle=3600)
                self.base = declarative_base(bind=self.engine)
                return True
            except:
                return False

    def getSession(self):
        """Returns the active SQLAlchemy session."""
        if self.ready():
            session = Session()
            session.configure(bind=self.engine)
            return session
        else:
            return None

    def getBase(self):
        """Returns the active SQLAlchemy base."""
        if self.ready():
            return self.base
        else:
            return None

    def getEngine(self):
        """Returns the active SQLAlchemy engine."""
        if self.ready():
            return self.engine
        else:
            return None

DM = DatabaseManager()
Adam Morris
  • 8,265
  • 12
  • 45
  • 68