25

In order to handle a growing database table, we are sharding on table name. So we could have database tables that are named like this:

table_md5one
table_md5two
table_md5three

All tables have the exact same schema.

How do we use SQLAlchemy and dynamically specify the tablename for the class that corresponds to this? Looks like the declarative_base() classes need to have tablename pre-specified.

There will eventually be too many tables to manually specify derived classes from a parent/base class. We want to be able to build a class that can have the tablename set up dynamically (maybe passed as a parameter to a function.)

Suman
  • 9,221
  • 5
  • 49
  • 62

8 Answers8

18

OK, we went with the custom SQLAlchemy declaration rather than the declarative one.

So we create a dynamic table object like this:

from sqlalchemy import MetaData, Table, Column

def get_table_object(self, md5hash):
    metadata = MetaData()
    table_name = 'table_' + md5hash
    table_object = Table(table_name, metadata,
        Column('Column1', DATE, nullable=False),
        Column('Column2', DATE, nullable=False)
    )
    clear_mappers()
    mapper(ActualTableObject, table_object)
    return ActualTableObject

Where ActualTableObject is the class mapping to the table.

Suman
  • 9,221
  • 5
  • 49
  • 62
  • This is what I've done in the past, and I'd also like to see a different solution. – jkmacc Oct 03 '13 at 22:24
  • @jkmacc @Suman I have added a solution with a custom `Base` class which might be helpful, depending on the use case. – javex Oct 04 '13 at 12:15
  • @jkmacc try my solution. – lifei Jul 10 '15 at 03:43
  • Is there any builtin solution for sharding tables? – Ryan Chou Oct 30 '17 at 06:07
  • 1
    from the docs: "clear_mappers() is not for normal use, as there is literally no valid usage for it outside of very specific testing scenarios." https://docs.sqlalchemy.org/en/13/orm/mapping_api.html#sqlalchemy.orm.clear_mappers – knopch1425 Dec 24 '19 at 16:37
15

In Augmenting the Base you find a way of using a custom Base class that can, for example, calculate the __tablename__ attribure dynamically:

class Base(object):
    @declared_attr
    def __tablename__(cls):
        return cls.__name__.lower()

The only problem here is that I don't know where your hash comes from, but this should give a good starting point.

If you require this algorithm not for all your tables but only for one you could just use the declared_attr on the table you are interested in sharding.

bstovall
  • 101
  • 1
  • 8
javex
  • 7,198
  • 7
  • 41
  • 60
  • It is incorrect and throws `InvalidRequestError` if you try to do this on model definition rather than declarative base class. – kirpit Feb 15 '18 at 15:14
  • Added an edit, but just in case the edit doesn't get accepted, the updated URL for Augmenting the Base is no longer valid. Section is now available here: [Augmenting the Base](https://docs.sqlalchemy.org/en/13/orm/extensions/declarative/mixins.html#augmenting-the-base) – bstovall Mar 07 '21 at 04:02
8

you can write a function with tablename parameter and send back the class with setting appropriate attributes.

def get_class(table_name):

   class GenericTable(Base):

       __tablename__ = table_name

       ID= Column(types.Integer, primary_key=True)
       def funcation(self):
        ......
   return GenericTable

Then you can create a table using:

get_class("test").__table__.create(bind=engine)  # See sqlachemy.engine
Dvir Arad
  • 147
  • 1
  • 9
8

Because I insist to use declarative classes with their __tablename__ dynamically specified by given parameter, after days of failing with other solutions and hours of studying SQLAlchemy internals, I come up with the following solution that I believe is simple, elegant and race-condition free.

def get_model(suffix):
    DynamicBase = declarative_base(class_registry=dict())

    class MyModel(DynamicBase):
        __tablename__ = 'table_{suffix}'.format(suffix=suffix)

        id = Column(Integer, primary_key=True)
        name = Column(String)
        ...

    return MyModel

Since they have their own class_registry, you will not get that warning saying:

This declarative base already contains a class with the same class name and module name as mypackage.models.MyModel, and will be replaced in the string-lookup table.

Hence, you will not be able to reference them from other models with string lookup. However, it works perfectly fine to use these on-the-fly declared models for foreign keys as well:

ParentModel1 = get_model(123)
ParentModel2 = get_model(456)

class MyChildModel(BaseModel):
    __tablename__ = 'table_child'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    parent_1_id = Column(Integer, ForeignKey(ParentModel1.id))
    parent_2_id = Column(Integer, ForeignKey(ParentModel2.id))
    parent_1 = relationship(ParentModel1)
    parent_2 = relationship(ParentModel2)

If you only use them to query/insert/update/delete without any reference left such as foreign key reference from another table, they, their base classes and also their class_registry will be garbage collected, so no trace will be left.

kirpit
  • 4,419
  • 1
  • 30
  • 32
4

Try this

import zlib

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, BigInteger, DateTime, String

from datetime import datetime

BASE = declarative_base()
ENTITY_CLASS_DICT = {}


class AbsShardingClass(BASE):

    __abstract__ = True

def get_class_name_and_table_name(hashid):
    return 'ShardingClass%s' % hashid, 'sharding_class_%s' % hashid

def get_sharding_entity_class(hashid):
    """
    @param hashid: hashid
    @type hashid: int
    @rtype AbsClientUserAuth
    """

    if hashid not in ENTITY_CLASS_DICT:
        class_name, table_name = get_class_name_and_table_name(hashid)
        cls = type(class_name, (AbsShardingClass,),
                   {'__tablename__': table_name})
        ENTITY_CLASS_DICT[hashid] = cls

    return ENTITY_CLASS_DICT[hashid]

cls = get_sharding_entity_class(1)
print session.query(cls).get(100)
lifei
  • 235
  • 2
  • 10
0

Instead of using imperative creating Table object, you can use usual declarative_base and make a closure to set a table name as the following:

def make_class(Base, table_name):
    class User(Base):
        __tablename__ = table_name
        id = Column(Integer, primary_key=True)
        name= Column(String)

    return User

Base = declarative_base()
engine = make_engine()
custom_named_usertable = make_class(Base, 'custom_name')
Base.metadata.create_all(engine)

session = make_session(engine)
new_user = custom_named_usertable(name='Adam')
session.add(new_user)
session.commit()
session.close()
engine.dispose()
0

There is a way to do it with declarative base. You can directly modify the Table opbject mapped to the class.

class MyTableClass(Base)
  __tablename__ = "table"

  @classmethod
  def remap(cls,md5):
    cls.__tablename__ = f"table_{md5}"
    existing_mapper = class_mapper(cls)
    existing_mapper.mapped_table.name = f"table_{md5}"
    existing_mapper.local_table.name = f"table_{md5}"

  #other class methods

Now call the remap function before querying.

MyTableClass.remap(md5)
-1

just you need to create class object for Base.

from sqlalchemy.ext.declarative import declarative_base, declared_attr

class Base(object):
    @declared_attr
    def __tablename__(cls):
        return cls.__name.lower()

Base = declarative_base(cls=Base) 
jak bin
  • 380
  • 4
  • 8