4

We have Linux devices that send data such as battery percentage, CPU utilization, and RAM utilization at certain intervals. We want to run analytics on this data. Should we store it in MongoDB (https://www.mongodb.com/blog/post/time-series-data-and-mongodb-part-1-introduction) or use a dedicated time-series database like InfluxDB or TSDB? The data generated is around 100 GB per day, and we want to keep the last 3 months.

plr108
suraj shukla

3 Answers

2

TSDB benchmarks (TimescaleDB vs MongoDB, InfluxDB vs MongoDB) show that dedicated time-series databases outperform MongoDB. At 100 GB per day over 3 months, on-disk data compression also matters. VictoriaMetrics seems to lead in ingestion rate, query speed, and compression for typical use cases, although TimescaleDB has recently improved its data compression. Have a look at the Yandex ClickHouse benchmarks too.
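To put the volume in perspective, here is a rough back-of-the-envelope sketch of the steady-state disk footprint. It assumes "3 months" means 90 days and that 100 GB per day is the uncompressed size; the compression ratios are hypothetical placeholders, not measured figures for any of the databases mentioned, so substitute numbers from your own tests.

```python
DAILY_GB = 100        # from the question
RETENTION_DAYS = 90   # assumption: "3 months" taken as 90 days


def retained_tb(daily_gb, days, compression_ratio=1.0):
    """Steady-state on-disk size in TB (1 TB = 1000 GB).

    compression_ratio is raw size divided by stored size; 1.0 means no compression.
    """
    return daily_gb * days / compression_ratio / 1000


# Hypothetical compression ratios, for illustration only.
for ratio in (1, 5, 10):
    size_tb = retained_tb(DAILY_GB, RETENTION_DAYS, ratio)
    print(f"{ratio}x compression: {size_tb:.1f} TB on disk")
```

Without compression that is about 9 TB of retained data, so the compression ratio you actually get on your own metrics largely determines how much storage you need to provision.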

Yuri Lachin
1

For another alternative, check out QuestDB at Questdb.io. QuestDB outperforms all of the above-mentioned TSDBs and is SQL-based.

You can try it out for speed at http://try.questdb.io:9000/, which is a live instance loaded with 1.9B rows of data from the NYC Taxi dataset.

Davidgs
0

For time-series data, it's highly recommended to use a time-series database instead of an RDBMS or a NoSQL database, because a TSDB's storage and queries are optimized for time-series data.

Here I want to recommend TDengine, a lightweight, high-performance, open-source time-series database. TDengine is a distributed TSDB whose distributed solution is also open source, and it supports SQL for ease of use.

https://tdengine.com/