2

I would like to store a large volume of time series data from devices. These time series have to be validated, can be modified by an operator, and have to be exported to other systems. Gaps in the time series must be detected. The time series must be shown in the UI, filtered by serial number and date range.

We are considering Hadoop, HBase, OpenTSDB and Spark for this scenario.

What do you think about it? Can Spark connect to OpenTSDB easily?

Thanks

Pablo Castilla

3 Answers

2

OpenTSDB is really great for storing large amounts of time series data. Internally, it is built on HBase, which means it had to work around HBase's limitations in order to perform well. As a result, the stored representation of a time series is highly optimized and not easy to decode. AFAIK, there is no out-of-the-box connector for fetching data from OpenTSDB into Spark.

The following GitHub project might provide you with some guidance:

Achak1987's connector
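Without a dedicated connector, one option is to skip the HBase encoding entirely and read through OpenTSDB's HTTP API (`/api/query`). The following is only a minimal sketch of that idea, not the approach of the project above. The metric name `device.reading`, the tag key `serial` and the host are assumptions made for illustration; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, metric, serial, start, end, aggregator="sum"):
    """Build an OpenTSDB /api/query URL for one device and time range.

    `metric` and the `serial` tag key are assumed names; use whatever
    your own ingestion writes.
    """
    # The m parameter has the form aggregator:metric{tagk=tagv}.
    m = f"{aggregator}:{metric}{{serial={serial}}}"
    params = urlencode({"start": start, "end": end, "m": m})
    return f"{base_url}/api/query?{params}"


def parse_dps(body):
    """Turn a /api/query JSON response into sorted (timestamp, value) pairs."""
    points = []
    for series in json.loads(body):
        # "dps" maps epoch timestamps (as strings) to numeric values.
        for ts, value in series["dps"].items():
            points.append((int(ts), float(value)))
    return sorted(points)
```

From a Spark job, a function like this could be called once per serial number (for example inside a `map` over the list of devices), with the HTTP request itself made by whatever client you prefer. Whether that scales well enough depends on how much data each query returns.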

If you are looking for libs that would help you with time series, have a look at spark-ts - it contains useful functions for missing data imputation as well.
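For the "find the holes" requirement specifically, the core check is simple once the timestamps are in hand. The sketch below is plain Python and does not use the spark-ts API; it assumes each device reports at a fixed, known interval, which the question does not state. The function names are hypothetical.

```python
def find_gaps(timestamps, expected_interval, tolerance=0):
    """Return (last_seen, next_seen) pairs where consecutive samples are
    further apart than expected_interval + tolerance (all in seconds)."""
    ordered = sorted(set(timestamps))
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > expected_interval + tolerance:
            gaps.append((prev, curr))
    return gaps


def missing_timestamps(timestamps, expected_interval):
    """List the timestamps a strictly regular series would contain
    inside each detected gap."""
    missing = []
    for start, end in find_gaps(timestamps, expected_interval):
        missing.extend(range(start + expected_interval, end, expected_interval))
    return missing
```

For a device sampled every 60 seconds, `find_gaps([0, 60, 120, 300, 360], 60)` reports the single hole between 120 and 300, and `missing_timestamps` lists the two samples that should have been there. The same function can be applied per serial number after grouping the data in Spark.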

bear911
1

Take a look at Axibase Time Series Database (ATSD), which has a versioning feature that maintains a history of value changes for the same timestamp. Once versioning is enabled for a metric, the database keeps track of the source, status and time of each value modification, for audit trails or data reconciliation.

We have customers streaming data from Spark apps using the Network API, typically once the data has been enriched with additional metadata (aka series tags) for downstream reporting.

You can query data from ATSD with the REST API or SQL.

Disclaimer: I work for Axibase.

kghamilton
Sergei Rodionov
1

Warp 10 offers the WarpScript language which can be used from Spark/Pig/Flink to manipulate time series and access data stored in Warp 10 via a Warp10InputFormat.

Warp 10 is open source and available at www.warp10.io.

Disclaimer: I'm CTO of Cityzen Data, maker of Warp 10.

herberts