Questions tagged [druid]

Druid is a column-oriented open-source distributed data store written in Java.

According to the Apache Druid website:

Apache Druid is a real-time analytics database designed for fast slice-and-dice analytics ("OLAP" queries) on large data sets. Most often, Druid powers use cases where real-time ingestion, fast query performance, and high uptime are important.

Druid is commonly used as the database backend for GUIs of analytical applications, or for highly-concurrent APIs that need fast aggregations. Druid works best with event-oriented data.

597 questions
3 votes, 1 answer

Resource limit exceeded in druid groupBy Query

I am trying to run a groupBy query above the limit of 500k data, and I am getting this error: { "error": "Resource limit exceeded", "errorMessage": "Not enough dictionary space to execute this query. Try increasing…
Salman S
3 votes, 1 answer

Parquet Data timestamp columns INT96 not yet implemented in Druid Overlord Hadoop task

Context: I am able to submit a MapReduce job from druid overlord to an EMR. My Data source is in S3 in Parquet format. I have a timestamp column (INT96) in parquet data which is not supported in Avroschema. Error is while parsing the timestamp…
Shiva Achari
3 votes, 0 answers

Metabase: Date Range filter to dashboards created by native query

I use Metabase for data visualization, with Druid (imply-2.2.3) as the data store. I put the Metabase questions I created on a dashboard and filter them all by Date Range. When I try to add a Date Range filter to a question created by a native query, Metabase just…
Bo.
3 votes, 0 answers

Tranquility server would not send data to druid

I'm using imply-2.2.3. Here is my tranquility server configuration: { "dataSources" : [ { "spec" : { "dataSchema" : { "dataSource" : "tutorial-tranquility-server", "parser" : { "type" : "string", …
Haonan Chen
3 votes, 1 answer

segmentGranularity in Druid indexing task; exact meaning & implication during indexing

I still don't quite get this "segmentGranularity" in Druid. This page is quite ambiguous: http://druid.io/docs/latest/design/segments.html . It goes on mentioning segmentGranularity but it talks more about intervals (in the first paragraph). Anyway,…
Cokorda Raka
3 votes, 1 answer

What are the differences between Druid and Elasticsearch? What are the advantages of each?

I am pretty new to Druid and I can't find answers regarding the comparison with Elasticsearch. I found this link: druid vs Elasticsearch but it does not give the differences and advantages. Can anyone explain that to me or give me some links that I…
user6134689
3 votes, 1 answer

How to perform a SELECT in the results returned from a GROUP BY Druid?

I am having a hard time converting this simple SQL Query below into Druid: SELECT country, city, Count(*) FROM people_data WHERE name="Mary" GROUP BY country, city; So I came up with this query so far: { "queryType": "groupBy", "dataSource"…
user5228393
3 votes, 1 answer

How to add Post Aggregation value fields as Metric in Druid io

I am using druid io 0.9.0. I am trying to add a post aggregation field as a metric spec. My Intention is to show the value of the post aggregation field similar to how a metric (measures) are shown (in Druid io using Pivot). My Druid io schema file…
3 votes, 1 answer

Java - java.lang.NoClassDefFoundError: com/google/inject/internal/util/$Preconditions

I'm working on an extension for druid that uses jclouds for Rackspace Cloud Files and I encountered a problem with Google guice and I'm not very confident with Java. I already saw this question, but it doesn't seem that there's a conflict in guice…
se7entyse7en
3 votes, 4 answers

How to input real-time data into Druid?

I have an analytics server (for example, a click counter). I want to send data to Druid using some API. How should I do that? Can I use it as a replacement for Google Analytics?
Aryan
2 votes, 0 answers

How to assign a default role automatically for all dashboards in Superset

We have DASHBOARD_RBAC enabled in Superset. It shows the roles menu, but no roles are populated, as shown below. I want to programmatically assign the Gamma role as the default role for every dashboard that is published in Superset. Whenever new dashboard…
kalyan4uonly
2 votes, 1 answer

How to properly write regexp_extract in druid ingestion spec?

I am trying to write a transformation spec in druid with regexp_extract. The regexp_extract function works fine in the query itself. The column data looks something like { "ID":"2",.... } SELECT regexp_extract(input, '(?<=\"ID\":\")(\d+)(?=\",)',…
user12331
2 votes, 3 answers

Apache Druid segment granularity

In the Apache Druid configuration you can select the granularity of the segments (hour/day/week/etc.). What will happen if you change the granularity later? Will the new settings be applied only to new data while old segments are left as they are, or…
user1258683
2 votes, 2 answers

How to use multiple count distinct in the same query with other columns in Druid SQL?

I'm trying to use three projections in same query like below in a Druid environment: select __time, count(distinct col1), count(distinct case when (condition1 and condition2 then (concat(col2,TIME_FORMAT(__time))) else 0 end ) from table where…
N S
2 votes, 0 answers

Apache Druid throwing network error in console

When we try to query a datasource, the query runs for 5+ minutes and throws a network error in the console. We are trying to fetch a huge result with millions of records. Is this a limitation in Druid, where we can't fetch huge numbers of records? Other aggregated…
Vinoth Karthick