I'm querying my SQLAlchemy-mapped star schema directly into a pandas DataFrame
and am getting an annoying SAWarning from pandas
that I'd like to address. Here's a simplified version.
class School(Base):
__tablename__ = 'DimSchool'
id = Column('SchoolKey', Integer, primary_key=True)
name = Column('SchoolName', String)
district = Column('SchoolDistrict', String)
class StudentScore(Base):
__tablename__ = 'FactStudentScore'
StudentKey = Column('StudentKey', Integer, ForeignKey('DimStudent.StudentKey'), primary_key = True)
SchoolKey = Column('SchoolKey', Integer, ForeignKey('DimSchool.SchoolKey'), primary_key = True)
PointsPossible = Column('PointsPossible', Integer)
PointsReceived = Column('PointsReceived', Integer)
student = relationship("Student", backref='studentscore')
school = relationship("School", backref='studentscore')
I query the date with statements like this:
standard = session.query(StudentdScore, School).\
join(School).filter(School.name.like('%Dever%'))
testdf = pd.read_sql(sch.statement, sch.session.bind)
And then get this warning:
SAWarning: Column 'SchoolKey' on table <sqlalchemy.sql.selectable.Select at 0x1ab7abe0; Select object> being replaced by Column('SchoolKey', Integer(), table=<Select object>, primary_key=True, nullable=False), which has the same key. Consider use_labels for select() statements.
I get this error for every additional table (class) included in my join. The message always refers to the foreign key.
Anyone else encounter this error and determine root cause? Or have ya'll just been ignoring it as well?
EDIT/UPDATE:
Handling Duplicate Columns in Pandas DataFrame constructor from SQLAlchemy Join
These guys seem to be talking about a related issue, but they use a different pandas method to bring the dataframe in and want to keep duplicates, not drop them. Anyone have thoughts on how to implement a similar styled function, but drop the duplicates as the query comes back?