0

I'm trying to create a simple representation for an entity deduplication schema using mysql, and using sqlalchemy for programmatic access.

I'm trying to achieve a specific effect which I think is kind of a self-referential query but i'm not sure:

Essentially I have an 'entities' table (with unique entity_id) and an associated Entity object, and then an entity_groups table which (for simplicity) has a 'group_id' and 'entity_id' columns, so that I 'register' an entity with a group by creating a row for that relation. this table too is associated with an ORM object - EntityGroup.

Question is, how do i get the EntityGroup object reference all entities in the group?

I expect I need to write something like:

mapper(EntityGroup, entity_groups_table, properties={ 'entities': relationship( Entity, .... ? ) },

and i'm alittle fuzzy on the details. Basically I need all the rows in entity_groups that have the same group_id as the row represented by the object. And then I need to materialize all the Entity objects associated those rows' entity_id column. This sounds like something achievable by a more verbose Query() operation in sqlalchemy, but i'm not sure how to combine that with the relationship() construct (if at all - perhaps go manual? )

Any help will be useful, I hope I was clear and to the point

Leaurus
  • 376
  • 3
  • 13
Jim
  • 1

1 Answers1

0

You really should not do it using a Query, as if you configure the relationships properly you will get this automatically. Assuming that you use entity_group table solely to store the relationship and nothing else, you should just configure many-to-many relationship as documented. Fully working example should help:

from sqlalchemy import create_engine, Column, Integer, String, MetaData, ForeignKey, Table
from sqlalchemy.orm import relationship, mapper, scoped_session, sessionmaker, backref
from sqlalchemy.ext.associationproxy import association_proxy

# Configure test DB
engine = create_engine(u'sqlite:///:memory:', echo=False)
session = scoped_session(sessionmaker(bind=engine, autoflush=False))
metadata = MetaData()

# tables
entities_table = Table('entities', metadata,
    Column('entity_id', Integer, primary_key=True),
)

groups_table = Table('groups', metadata,
    Column('group_id', Integer, primary_key=True),
)

entity_groups_table = Table('entity_groups', metadata,
    Column('entity_id', Integer, ForeignKey('entities.entity_id'), primary_key=True),
    Column('group_id', Integer, ForeignKey('groups.group_id'), primary_key=True),
)

# object model
class Group(object):
    def __repr__(self): return "<Group: %d>" % (self.group_id,)
class Entity(object):
    def __repr__(self): return "<Entity: %d>" % (self.entity_id,)

# mappers
mapper(Group, groups_table)
mapper(Entity, entities_table, 
       properties={'groups': relationship(Group, secondary=entity_groups_table, backref='entities')},
       )

# create db schema
metadata.create_all(engine)

# == TESTS
# create entities
e1 = Entity()
e2 = Entity()
g1 = Group()
g2 = Group()
g3 = Group()
g1.entities.append(e1)
g2.entities.append(e2)
g3.entities.append(e1)
g3.entities.append(e2)
session.add(e1)
session.add(e2)
session.commit()

# query...
session.expunge_all()

# check Peter
for g in session.query(Group).all():
    print "group: ", g, " has ", g.entities

should produce something like:

group:  <Group: 1>  has  [<Entity: 1>]
group:  <Group: 2>  has  [<Entity: 1>, <Entity: 2>]
group:  <Group: 3>  has  [<Entity: 2>]
van
  • 74,297
  • 13
  • 168
  • 171