
I am working on a project that uses TimescaleDB to store about 6 TB of data. It is set up on an AWS EC2 instance.

If I understand correctly, TimescaleDB has the concept of hypertables, which basically perform chunking behind the scenes to emulate a distributed environment.

I would like to know whether it is possible to create a truly distributed environment, for example a cluster of three instances, with data storage split across the nodes so that the 6 TB is distributed as 2 TB on each instance.

Is this something that is possible on the current version (1.7.2)?

Alpha Bing
  • I don't think so. TimescaleDB uses PostgreSQL partitioning and is not a sharding solution. But someone with more TimescaleDB fu may correct me. – Laurenz Albe Oct 05 '20 at 15:19
  • Do you think I can perform partitioning, given that it already works with hypertables? – Alpha Bing Oct 05 '20 at 16:32
  • Partitioning is splitting a table into several smaller tables *in a single database*. Sharding is splitting up data across several databases. – Laurenz Albe Oct 05 '20 at 16:45
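
To make the distinction in the comment above concrete, here is a minimal Python sketch. It is only an illustration, not how TimescaleDB or PostgreSQL route data internally: the table-name scheme, the server names in `SHARDS`, and both helper functions are hypothetical.

```python
import hashlib

# Hypothetical database servers; sharding spreads rows across several of them.
SHARDS = ["node-a", "node-b", "node-c"]


def partition_for(day: str) -> str:
    """Partitioning: choose a child table inside ONE database.

    Here the child table is picked by month from an ISO date string.
    """
    return "readings_" + day[:7].replace("-", "_")


def shard_for(key: str, shards=SHARDS) -> str:
    """Sharding: choose which DATABASE SERVER stores the row.

    A stable hash is used so the same key always maps to the same server.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


if __name__ == "__main__":
    print(partition_for("2020-10-05"))  # a table name within a single database
    print(shard_for("device-1"))        # a server name out of several databases
```

In the first case all child tables live on the same instance, so the 6 TB stays on one machine; only the second case spreads storage across machines.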

1 Answer

TimescaleDB 2.0 will support "horizontal sharding" of data, as you suggest, as part of its distributed hypertables.

We expect 2.0-RC1 to be out this week, and each release candidate will be "fully upgradeable" to the final 2.0 release.

For detailed information about TimescaleDB multi-node and distributed hypertables, please see this blog post from last year (since then, we've had six beta releases, starting last fall).
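
As a rough sketch of what a three-node setup might look like in 2.0, the Python snippet below builds the SQL that would be run on the access node. It assumes the `add_data_node` and `create_distributed_hypertable` functions of the TimescaleDB 2.0 multi-node API; the node names, host names, table name, and column names are all hypothetical placeholders, and the exact options may differ between release candidates.

```python
# Hypothetical data node names and hosts; replace with your own instances.
DATA_NODES = {
    "dn1": "node1.example.internal",
    "dn2": "node2.example.internal",
    "dn3": "node3.example.internal",
}


def multinode_setup_sql(table, time_column, space_column, data_nodes):
    """Build the SQL statements for a distributed hypertable setup.

    Assumes TimescaleDB 2.0's add_data_node() and
    create_distributed_hypertable(); the table must already exist
    as a regular table on the access node.
    """
    # Register each data node with the access node.
    statements = [
        f"SELECT add_data_node('{name}', host => '{host}');"
        for name, host in data_nodes.items()
    ]
    # Turn the table into a hypertable partitioned by time and spread
    # across the data nodes by the space (hash) column.
    statements.append(
        f"SELECT create_distributed_hypertable("
        f"'{table}', '{time_column}', '{space_column}');"
    )
    return statements


if __name__ == "__main__":
    for stmt in multinode_setup_sql("conditions", "time", "device_id", DATA_NODES):
        print(stmt)
```

The generated statements can be pasted into `psql` on the access node. The space column (here the assumed `device_id`) is what determines how rows are spread across the data nodes, so an even 2 TB per node split depends on how evenly that column's values hash.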

Mike Freedman
  • Thank you for the information. So for now, it is not possible to set up a multi-node system with version 1.7? – Alpha Bing Oct 05 '20 at 19:06
  • @alpha-bing The multi-node support built into 2.0 is not available in 1.7.2. You could build a multi-node system "manually" using foreign tables in 1.7, I guess, but that is quite a lot of work. – Mats Kindahl Oct 07 '20 at 05:46