Questions tagged [druid]

Druid is a column-oriented open-source distributed data store written in Java.

According to the Apache Druid website:

Apache Druid is a real-time analytics database designed for fast slice-and-dice analytics ("OLAP" queries) on large data sets. Most often, Druid powers use cases where real-time ingestion, fast query performance, and high uptime are important.

Druid is commonly used as the database backend for GUIs of analytical applications, or for highly-concurrent APIs that need fast aggregations. Druid works best with event-oriented data.

597 questions
2
votes
0 answers

Data load from Kafka not working in Druid

Recently I set up Apache Druid using Docker and now it is up and running. I am able to access it via http://localhost:8888. I have set up Apache Kafka on my local machine with ZooKeeper. What I need is to connect Druid to Kafka. I have enabled the…
thatGuy
  • 88
  • 7
2
votes
1 answer

How to parse json data in a column with Druid SQL?

I'm trying to parse JSON data in a column with Druid SQL in Superset SQL Lab. My table looks like this:

id  json_scores
0   {"foo": 20, "bar": 10}
1   {"foo": 30, "bar": 10}

I'm looking for something similar to json_extract in MySQL…
emmabyp
  • 21
  • 2
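
As a point of comparison for the question above, here is a minimal Python sketch of pulling one key out of a JSON string column on the client side, after the rows have been fetched. It is not Druid SQL; the function name `extract_score` and the in-memory `rows` list are made up for illustration and simply mirror the two sample rows in the excerpt. Newer Druid releases document a JSON_VALUE SQL function, but whether it is available in the asker's setup is not stated.

```python
import json

def extract_score(json_text, key):
    """Return the value stored under `key` in a JSON object string, or None if absent."""
    return json.loads(json_text).get(key)

# Hypothetical rows shaped like the table in the question: (id, json_scores)
rows = [
    (0, '{"foo": 20, "bar": 10}'),
    (1, '{"foo": 30, "bar": 10}'),
]

# Map each id to its "foo" score
foo_by_id = {row_id: extract_score(scores, "foo") for row_id, scores in rows}
print(foo_by_id)  # {0: 20, 1: 30}
```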
2
votes
0 answers

Find Rank Using Druid via druid-datasketches extension

I need to find the rank of a company, let's say McDonald's, in various scenarios in Druid. For this purpose I came across an extension called "druid-datasketches", which I think can be helpful. After going through the documentation I was able to form this…
anarchy
  • 551
  • 2
  • 23
2
votes
1 answer

JDBC exception when trying to use a prepared statement placeholder as an argument to an aggregation function

I'm attempting to use a PreparedStatement placeholder for an argument to a SQL aggregation function. The query works fine if I replace the ? placeholder with a numeric value and get rid of the setDouble call. public static String QUERY = """ …
marathon
  • 7,881
  • 17
  • 74
  • 137
2
votes
2 answers

Apache Druid – Ingesting multiple objects in flat JSON data returns only single row

I'm aiming to ingest this JSON data into Apache Druid as multiple rows. The data (about 10x more rows than this example) is served from a proprietary HTTP server that I have no control over. I cannot change how the data is presented from the HTTP…
Tell
  • 21
  • 4
2
votes
1 answer

Can I join two dataSources and permanently create a new dataSource in Druid?

Druid now supports joins, but I see it is still a little slower when we join a big fact table with a mid-size dimension table. Can we do the join, create a new dataSource, and store the resulting dataSource for further queries in Druid? If so, how can we do…
sunil
  • 1,259
  • 1
  • 14
  • 27
2
votes
1 answer

Give read-only access to a dashboard in Superset

I have a dashboard in Superset to which I want to give some users read-only access, and those users should also be able to view that dashboard in the Dashboard tab. I have created a read-only user and have given him access to the datasource used in that…
unknown
  • 53
  • 2
  • 9
2
votes
1 answer

Refresh data in Druid

I am using the index_parallel native batch method to ingest data into Druid from S3. I have done the initial ingestion using the Tasks tab in the Druid UI. I want to schedule another task to do delta ingestion daily. I have gone through a lot of…
unknown
  • 53
  • 2
  • 9
2
votes
1 answer

How to convert Unix time into a date in Druid (Apache Druid)

I want to use the date_diff function in Druid, where my input will be a Unix timestamp, for example unix_timestamp = 1597689016000. I want this to be converted into a timestamp so that I can find the difference between two times. Also please tell me, what is function…
marquez raj
  • 21
  • 1
  • 2
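
For reference, the example value 1597689016000 in the question above has 13 digits, which suggests milliseconds since the Unix epoch rather than seconds. The following is a minimal Python sketch of the conversion and of the difference between two such values. It runs outside Druid; the helper names and the second timestamp are made up for illustration. Inside Druid SQL, the documented MILLIS_TO_TIMESTAMP and TIMESTAMPDIFF functions appear to cover the same ground, though that has not been checked against the asker's version.

```python
from datetime import datetime, timedelta, timezone

# Start of the Unix epoch as a timezone-aware UTC datetime
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def millis_to_datetime(millis):
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)

def diff_seconds(start_millis, end_millis):
    """Return the number of seconds between two epoch-millisecond values."""
    delta = millis_to_datetime(end_millis) - millis_to_datetime(start_millis)
    return delta.total_seconds()

start = 1597689016000       # value quoted in the question
end = start + 90_000        # hypothetical second event, 90 seconds later

print(millis_to_datetime(start).isoformat())  # 2020-08-17T18:30:16+00:00
print(diff_seconds(start, end))               # 90.0
```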
2
votes
0 answers

Druid queries for calculating the distance between adjacent rows

Hi, I would like to calculate the distance between each two adjacent rows. For example, with row1 values x1, y1, z1 and row2 values x2, y2, z2, the result I want is the 3D distance sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2). Or does anyone know that there…
YihanBao
  • 501
  • 1
  • 4
  • 15
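
The 3D distance described in the question above can be sketched in Python as follows. This is only an illustration of the arithmetic on rows already fetched into memory, not a Druid query; the function name `adjacent_distances` and the sample rows are assumptions, and the rows are assumed to be in the order in which adjacency is defined.

```python
import math

def adjacent_distances(points):
    """Return the Euclidean distance between each pair of consecutive (x, y, z) rows."""
    distances = []
    for (x1, y1, z1), (x2, y2, z2) in zip(points, points[1:]):
        distances.append(math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2))
    return distances

# Hypothetical sample rows, already sorted in the desired order
rows = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0), (1.0, 2.0, 5.0)]
print(adjacent_distances(rows))  # [3.0, 3.0]
```

A list of n rows yields n - 1 distances, so a single row or an empty list yields an empty result.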
2
votes
0 answers

What is the best way to store daily prices on Apache Druid?

I'd like to store daily prices on Apache Druid in a way that makes it possible to group them by day, week, and month. I was thinking of defining open and close prices as metrics, not dimensions, but for that I would have to use first and last…
Ezer Fernandes
  • 131
  • 1
  • 6
2
votes
1 answer

How should I sink Apache Flink stream data into Druid?

I am wondering what the best way is to sink Apache Flink stream data directly into Druid. I know of Tranquility, but it does not support the latest versions of Flink and Druid. Does anyone know a better solution?
xksa
  • 87
  • 1
  • 8
2
votes
2 answers

How can Druid be configured in Dataproc?

Now that Druid is available as an optional component of Google Cloud Dataproc (https://cloud.google.com/dataproc/docs/concepts/components/druid), I am wondering how Druid can be configured at Dataproc cluster creation. I have tried the…
Yan Zhou
  • 15
  • 1
2
votes
1 answer

Druid SQL LATEST() and EARLIEST() returning zero

I have this data source in Druid: I'm trying to use LATEST() to return the latest coordinates of each user active in the last minute. My intention is to show their location in real time using a Mapbox chart in Superset. This is my query: SELECT…
Norbor Illig
  • 658
  • 6
  • 14
2
votes
2 answers

Apply aggregate function to certain columns using SQL in Apache Druid

I have a table like this:

code  sales  goal
----  -----  ----
b     7      20
b     12     20
a     9      15
c     4      3
a     4      15

And I want to perform an agg function to group by…
S_B04
  • 21
  • 3