0

I'm dealing with timeseries data, and I'm trying to avoid having a many-to-one-relashionship on every row of a single record table where all records are mixed together (since the series all have different number of columns).

Instead I want to split the records between multiple tables to keep individual data series apart.

I have a lot of these tables:

# series 1 record tables:
CREATE TABLE records_1 (
    id BIGSERIAL PRIMARY KEY,
    level DOUBLE PRECISION
    #... more columns
);
CREATE TABLE records_2 (
    id BIGSERIAL PRIMARY KEY,
    level DOUBLE PRECISION
    #... more columns
);
CREATE TABLE records_34 (
    id BIGSERIAL PRIMARY KEY,
    level DOUBLE PRECISION
    #... more columns
);
# series 2 record tables
CREATE TABLE records_101 (
    id BIGSERIAL PRIMARY KEY,
    level DOUBLE PRECISION,
    height DOUBLE PRECISION
    #... more columns
);
CREATE TABLE records_102 (
    id BIGSERIAL PRIMARY KEY,
    level DOUBLE PRECISION,
    height DOUBLE PRECISION
    #... more columns
);

And then I have a class with a string representing the table names associated with that class:

class Serie(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    record_tables = db.Column(db.String(255))

Example:

s1 = Serie() 
s1.record_tables = 'records_1, records_2, records_34'
s2 = Serie()
s2.record_tables = 'records_101, records_102'

I'd like to be able to select all records associated with a given serie, based on the record_tables string. Splitting and stripping it of whitespace is easy, but how would I build the query dynamically?

Something like "for each table in record_tables_list, select all records from table and return all records from all tables in list" but in one sql query instead of appending results to a list in python (which I imagine is slower).

I guess I should use UNION, as I have recreated such a query manually in SQL. But how do do it with an unknown number of table names in Serie.record_tables?

Update:

Ok, so a raw query seems to work while joining the list of sanitized record_tables (it depends on user input (my own) so I have to clean it up to remove typos and such):

tables = ', '.join([x.strip() for x in record_tables.split(',')])
raw_query = 'SELECT * FROM ' + tables
results = db.engine.execute(raw_query)
for row in raw_query:
    print row    # (1L, 123.0, 1L, 456.0)

Edit: no this doesn't work. hm.

pjgb
  • 15
  • 4
  • Why don't you have all the data just in the one table, that is how PostgreSQL is supposed to be used. Code the serie in a VARCHAR column, with `'1'`, `'2'`, `'2a'` etc – Antti Haapala -- Слава Україні Feb 16 '15 at 19:51
  • Thanks Antti, I have considered that (but using a relationship back to the series instead of a string representation). It just seems so unnecessary to have all that "extra" data in a column instead of a separate table...? – pjgb Feb 16 '15 at 19:58
  • Also, the series have different columns. Not sure how I would be able to do that in a single table without having lots of generic column names (col1_name, col1_val, col2_name, col2_val etc.) – pjgb Feb 16 '15 at 20:08

1 Answers1

0
class Serie(Base):
    __tablename__ = 'serie'
    id = Column(Integer, primary_key=True)
    record_tables = Column(String(255))

    @property
    def get_records(self):
        sess = object_session(self)
        tables = [x.strip() for x in self.record_tables.split(',')]
        raw_query = '\nUNION ALL\n'.join(
            'SELECT * FROM ' + table for table in tables)
        results = sess.execute(raw_query).fetchall()
        return results

This assumes that all the record tables configured in the Serie have the same columns and columns order is the same.

van
  • 74,297
  • 13
  • 168
  • 171
  • OP stated that they didn't have same columns, and thus not necessarily in the same order either – Antti Haapala -- Слава Україні Feb 17 '15 at 06:27
  • This is not how I understand it: different series have different columns, but one series has data stored in multiple `records_xx` tables with the *same* structure. – van Feb 17 '15 at 07:07
  • @van you are correct, on both the table formats and the solution. Thank you very much, I had no idea it was so easy to union multiple queries. – pjgb Feb 17 '15 at 16:40