0

I have a situation where a third party stores data in daily tables, where if the record count exceeds two million a subsequent table is created, and so on, named [date]_x.

Now, we have a reporting requirement and need to consume this data. Manual UNION SQL and other ETL operations have been performed in the past, which I am trying to automate.

My first feeling was to throw everything into a data lake and map-reduce on AWS. However, looking at Tableau, I was hoping to make use of some of its automation to expedite the solution. Now, I'm not sure this is the best solution.

Suggestions please?

ElHaix
  • 12,846
  • 27
  • 115
  • 203
  • Is there a performance problem with the manual UNION SQL? Or is it just hard to generate? What MySQL version? There may be ways to migrate the tables into partitions to simplify the SQL. Did you want to try that before looking at other solutions? This is looking like a more general question at the moment. – danblack Feb 05 '20 at 23:05
  • Your first sentence seems to be the problem: "third party stores data in daily tables" – Rick James Feb 06 '20 at 01:02

1 Answers1

0

Assuming you won't replace the software that is generating "daily tables", let's devise an alternative.

I assume those tables are something like a Data Warehouse "Fact" table. That is, a large time-oriented set of data. Correct? (Please provide SHOW CREATE TABLE for one of them.)

And your reports need to digest and summarize that data in multiple different ways?

First, summarize each of the daily tables. Then build the reports from the summary tables. This involves designing and building a few "Summary tables". Initially summarize each daily Fact table into them. After that, it is a matter of taking each new Fact table (nightly?) and summarizing it into the summary tables.

http://mysql.rjweb.org/doc.php/summarytables

Rick James
  • 135,179
  • 13
  • 127
  • 222