I have a strange issue that I simply cannot resolve. Essentially, I have a model and system which works perfectly - except under a very specific (and seemingly arbitrary) set of circumstances.
I'll paste the model in a second but here's the idea. I want certain tables to be versioned. That means for a given table, I break it into two tables, the Master part which has the natural keys for the object, and the Version table which has all the associated data which may change. Then some of my models of course have a relationship, so I create a join table that links versions.
Here are the models:
class Versioned(object):
def __init__(self, **kwargs):
super(Versioned, self).__init__(**kwargs)
self.active = True
self.created_on = datetime.datetime.now()
active = Column(BOOLEAN)
created_on = Column(TIMESTAMP, server_default=func.now())
def __eq__(self, other):
return self.__class__ == other.__class__ and \
all([getattr(self, key) == getattr(other, key)
for key in self.comparison_keys
])
def __ne__(self, other):
return not self.__eq__(other)
comparison_keys = []
class Parent(Base):
__tablename__ = 'parent'
id = Column(INTEGER, primary_key=True)
name = Column(TEXT)
versions = relationship("ParentVersion", back_populates="master")
children = relationship("Child", back_populates="parent")
@property
def current_version(self):
active_versions = [v for v in self.versions if v.active==True]
return active_versions[0] if active_versions else None
class ParentVersion(Versioned, Base):
__tablename__ = 'parent_version'
id = Column(INTEGER, primary_key=True)
master_id = Column(INTEGER, ForeignKey(Parent.id))
address = Column(TEXT)
master = relationship("Parent", back_populates="versions")
children = relationship("ChildVersion",
secondary=lambda : Parent_Child.__table__
)
class Child(Base):
__tablename__ = 'child'
id = Column(INTEGER, primary_key=True)
parent_id = Column(INTEGER, ForeignKey(Parent.id))
name = Column(TEXT)
versions = relationship("ChildVersion", back_populates="master")
parent = relationship("Parent", back_populates="children")
@property
def current_version(self):
active_versions = [v for v in self.versions if v.active==True]
return active_versions[0] if active_versions else None
class ChildVersion(Versioned, Base):
__tablename__ = 'child_version'
id = Column(INTEGER, primary_key=True)
master_id = Column(INTEGER, ForeignKey(Child.id))
age = Column(INTEGER)
fav_toy = Column(TEXT)
master = relationship("Child", back_populates="versions")
parents = relationship("ParentVersion",
secondary=lambda: Parent_Child.__table__,
)
comparison_keys = [
'age',
'fav_toy',
]
class Parent_Child(Base):
__tablename__ = 'parent_child'
id = Column(INTEGER, primary_key=True)
parent_id = Column(INTEGER, ForeignKey(ParentVersion.id))
child_id = Column(INTEGER, ForeignKey(ChildVersion.id))
Okay, so I know the more recent SQLAlchemy models have some idea of versioning, it's possible that I'm doing this the wrong way. But this fits my use case very well. So humor me and let's assume the model is okay (in the general sense - if there's a minor detail causing the bug that would be good to fix)
Now suppose I want to insert data. I have data from some source, I take it in and build models. Ie, split things into Master/Version, assign the child relationships, assign the version relationships. Now I want to compare it against the data already in my database. For each master object, if I find it, I compare the versions. If the versions are different, you create a new version. The tricky part becomes, if a Child version is different, I want to insert a new Parent version, and update all its relationships. Maybe the code makes more sense to explain this part. search_parent
is the object I have created in my pre-parsing stage. It has a version, and children objects, which also have versions.
parent_conds = [
getattr(search_parent.__class__, name) == getattr(search_parent, name)
for name, column in search_parent.__class__.__mapper__.columns.items()
if not column.primary_key
]
parent_match = session.query(Parent).filter(*parent_conds).first()
# We are going to make a new version
parent_match.current_version.active=False
parent_match.versions.append(search_parent.current_version)
for search_child in search_parent.children[:]:
search_child.parent_id = parent_match.id
search_conds = [
getattr(search_child.__class__, name) == getattr(search_child, name)
for name, column in search_child.__class__.__mapper__.columns.items()
if not column.primary_key
]
child_match = session.query(Child).filter(*search_conds).first()
if child_match.current_version != search_child.current_version:
# create a new version: deactivate the old one, insert the new
child_match.current_version.active=False
child_match.versions.append(search_child.current_version)
else:
# copy the old version to point to the new parent version
children = parent_match.current_version.children
children.append(child_match.current_version)
children.remove(search_child.current_version)
session.expunge(search_child.current_version)
session.expunge(search_child)
session.expunge(search_parent)
session.add(parent_match)
session.commit()
Okay, so once again, this might not be the perfect or even best approach. But it does work. EXCEPT, and this is what I can't figure out. It doesn't work if I'm updating the child's age attribute to the integer value zero. If the child objects start with age 0, and I change it to something else, this works beautifully. If I start with some non-zero integer, and update the age to 0, I get this warning:
SAWarning: Object of type <ChildVersion> not in session, add operation along 'ParentVersion.children' won't proceed (mapperutil.state_class_str(child), operation, self.prop))
The updated version is inserted, however the insert into the parent_child join table doesn't happen. And it's not that it fails, it's that SQLAlchemy has determined the child object doesn't exist and can't create the join. But it does exist, I know it gets inserted.
Again, this only happens if I'm inserting a new version with age=0. If I'm inserting a new version with any other age, this works exactly as I want it to.
There are other odd things about the bug - it doesn't happen if you don't insert enough children (seems to be around 12 triggers the bug), it doesn't happen depending on other attributes sometimes. I don't think I fully understand the surface area of what causes it.
Thanks for taking the time to read this far. I have a fully working demo complete with source data I'd be happy to share, it just requires some setup so I didn't know if it was appropriate in this post. I hope someone has ideas for what to look at because at this point I'm totally out.
edit: Here is the full stack trace leading to the warning.
File "repro.py", line 313, in <module>
load_data(session, second_run)
File "repro.py", line 293, in load_data
session.commit()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 801, in commit
self.transaction.commit()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 392, in commit
self._prepare_impl()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 372, in _prepare_impl
self.session.flush()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2019, in flush
self._flush(objects)
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2101, in _flush
flush_context.execute()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 373, in execute
rec.execute(self)
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 487, in execute
self.dependency_processor.process_saves(uow, states)
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py", line 1053, in process_saves
False, uowcommit, "add"):
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py", line 1154, in _synchronize
(mapperutil.state_class_str(child), operation, self.prop))
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 1297, in warn
warnings.warn(msg, exc.SAWarning, stacklevel=2)
File "repro.py", line 10, in warn_with_traceback
traceback.print_stack()
/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py:1154: SAWarning: Object of type <ChildVersion> not in session, add operation along 'ParentVersion.children' won't proceed
(mapperutil.state_class_str(child), operation, self.prop))
edit2: Here is a gist with a python file you can run to see the strange behavior. https://gist.github.com/jbouricius/2ede420fb1f7a2deec9f557c76ced7f9