We have a number of Linux devices that send data such as battery percentage, CPU utilization, and RAM utilization at regular intervals. We want to run analytics on this data. Should we store it in MongoDB (https://www.mongodb.com/blog/post/time-series-data-and-mongodb-part-1-introduction) or use a dedicated time-series database such as InfluxDB or TSDB? The devices generate around 100 GB per day, and we want to keep the last 3 months of data.
3 Answers
TSDB benchmarks (TimescaleDB vs MongoDB, InfluxDB vs MongoDB) show that dedicated time-series databases outperform MongoDB. At 100 GB per day for 3 months, on-disk data compression also matters. VictoriaMetrics seems to lead in ingestion rate, query speed, and compression for typical use cases, although TimescaleDB has recently improved its data compression. Have a look at the Yandex ClickHouse benchmarks too.
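To get a feel for why compression matters at this volume, here is a rough back-of-the-envelope sketch. It assumes "3 months" means about 90 days, and the compression ratios are purely hypothetical placeholders, not measured results for any of the databases mentioned; `retained_gb` is just an illustrative helper name.

```python
# Rough storage estimate for the workload described in the question.
DAILY_INGEST_GB = 100   # from the question
RETENTION_DAYS = 90     # assumption: "3 months" is about 90 days


def retained_gb(daily_gb: float, days: int, compression_ratio: float = 1.0) -> float:
    """Return the on-disk size in GB for `days` of data at a given compression ratio."""
    return daily_gb * days / compression_ratio


raw = retained_gb(DAILY_INGEST_GB, RETENTION_DAYS)
print(f"uncompressed: {raw:.0f} GB")

# Hypothetical compression ratios, chosen only for illustration.
for ratio in (2, 5, 10):
    size = retained_gb(DAILY_INGEST_GB, RETENTION_DAYS, ratio)
    print(f"{ratio}x compression: {size:.0f} GB")
```

Uncompressed, the retention window is about 9 TB, so the compression ratio you actually get on your own data has a direct effect on disk cost. It is worth measuring on a sample of real device data before committing to a database.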

As another alternative, check out QuestDB at Questdb.io. QuestDB outperforms all of the above-mentioned TSDBs and is SQL-based.
You can try it out for speed at http://try.questdb.io:9000/ which is a live instance loaded with 1.9B rows of data from the NYC Taxi dataset.

What is your experience with it in terms of scalability and maintainability (the hassle of self-hosting)? – guyromb Sep 17 '21 at 11:17
For time-series data, it is highly recommended to use a time-series database instead of an RDBMS or a NoSQL database, because storage and queries in a TSDB are optimized for time-series data.
I would recommend TDengine, a lightweight, high-performance, open-source time-series database. TDengine is a distributed TSDB, its distributed solution is also open source, and it supports SQL for ease of use.
