Questions tagged [pydruid]

19 questions
3
votes
3 answers

Can we do transformations in Druid?

I have a scenario where I receive data in CSV files and need to derive some new columns from the existing ones. Example: Col_1 Col_2 Col_3 Col_4 abc 1 No 123 xyz 2 Yes 123 def 1 …
HEMANT PATEL
  • 77
  • 1
  • 11
3
votes
1 answer

No module named 'pydruid'

I'm following this tutorial from Druid on connecting a Jupyter notebook to Druid. When I run it, it keeps giving me ModuleNotFoundError: No module named 'pydruid', even though I have already installed the requirements.
Dzakirin
  • 173
  • 17
1
vote
1 answer

How to run Druid query in Python (Druid-Python connection)?

I want to run some Druid queries in Python. Can someone please tell me how to do that? I tried with a localhost Druid and it works, but it doesn't work when I use the production instance of Druid, which is hosted in the cloud. Here is what I did from…
Pankaj Kumar
  • 147
  • 11
1
vote
0 answers

Different default behaviors of skipping empty buckets in timeseries queries by Druid native query and Druid SQL

According to the Apache Druid documentation on native queries: Timeseries queries normally fill empty interior time buckets with zeroes. For example, if you issue a "day" granularity timeseries query for the interval 2012-01-01/2012-01-04, and no…
Jimmy Lu
  • 348
  • 1
  • 9
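
For the question above, a minimal sketch of a native timeseries query that turns off zero-filling through the `skipEmptyBuckets` query context flag described in the Druid documentation. The datasource name `events`, the `count` aggregator, and the helper `build_timeseries_query` are assumptions made for illustration; none of them come from the original question.

```python
import json


def build_timeseries_query(datasource, interval, skip_empty_buckets=True):
    """Build a native Druid timeseries query as a plain dict.

    `datasource` and `interval` are supplied by the caller; the single
    count aggregator is a placeholder, not part of the original question.
    """
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "day",
        "intervals": [interval],
        "aggregations": [{"type": "count", "name": "rows"}],
        # With skipEmptyBuckets set to "true", empty buckets are omitted
        # from the result instead of being zero-filled.
        "context": {
            "skipEmptyBuckets": "true" if skip_empty_buckets else "false"
        },
    }


# Hypothetical datasource name; the interval is the one quoted in the excerpt.
query = build_timeseries_query("events", "2012-01-01/2012-01-04")
print(json.dumps(query, indent=2))
```

The resulting JSON is what would be sent as the body of a native query request; comparing runs with the flag on and off is one way to see the bucket-filling behavior directly.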
1
vote
0 answers

How to handle a nested array in Druid

My json is as below: { "id":11966121, "employer_id":175, "account_attributes":[ { "id":155387028, "is_active":false, "created_at":"2018-06-06T02:12:25.243Z", "updated_at":"2021-03-15T17:38:04.598Z" }, { …
JDev
  • 1,662
  • 4
  • 25
  • 55
1
vote
1 answer

Protobuf ingestion in Druid stays in the running state but no datasource is created

I have done a simple Druid setup using the quickstart compose file. I want to ingest protobuf from Kafka into Druid. I followed this link, but no matter what I use for the path in the descriptor file URL, it doesn't get picked up, yet the task shows as running. this is…
Risabh Sharma
  • 634
  • 5
  • 15
1
vote
0 answers

pydruid: use a query as datasource

I am trying to write a query in pydruid that uses another query as its datasource. Druid itself supports this by setting dataSource like this: "dataSource": { "type": "query", "query": { "queryType": "groupBy", "dataSource":…
samad montazeri
  • 1,203
  • 16
  • 28
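
For the question above, a minimal sketch of the nested `"type": "query"` datasource shape quoted in the excerpt, built as plain Python dicts rather than through pydruid. The datasource name `events`, the dimension `user`, the interval, the aggregators, and the helper `wrap_as_datasource` are all hypothetical; the excerpt is truncated and does not show the asker's actual inner query.

```python
import json


def wrap_as_datasource(inner_query):
    """Wrap a native query dict so it can be used as an outer dataSource."""
    return {"type": "query", "query": inner_query}


# Hypothetical inner groupBy: row count per user.
inner = {
    "queryType": "groupBy",
    "dataSource": "events",
    "granularity": "all",
    "intervals": ["2020-01-01/2020-02-01"],
    "dimensions": ["user"],
    "aggregations": [{"type": "count", "name": "rows"}],
}

# Hypothetical outer groupBy: sum the per-user counts from the inner result.
outer = {
    "queryType": "groupBy",
    "dataSource": wrap_as_datasource(inner),
    "granularity": "all",
    "intervals": ["2020-01-01/2020-02-01"],
    "dimensions": [],
    "aggregations": [
        {"type": "longSum", "name": "total_rows", "fieldName": "rows"}
    ],
}

print(json.dumps(outer, indent=2))
```

This only shows the JSON structure; whether a given pydruid version can express it through its query helpers is not something the sketch assumes.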
0
votes
0 answers

Apache Druid throws "OSError: HTTP Error 400: Bad Request"

I get an error when I try to reach an Apache Druid datasource. Here is the code sample I used to load the Druid datasource into a pandas DataFrame: from pydruid.client import * from pydruid.utils.aggregators import doublesum from…
ByUnal
  • 90
  • 1
  • 8
0
votes
1 answer

How to write logs and data to Druid deep storage in AWS S3

We have a Druid cluster set up, and now I am trying to write the indexing logs and data to S3 deep storage. Following are the details: druid.storage.type=s3 druid.storage.bucket=bucket-name druid.storage.baseKey=druid/segments # For…
0
votes
1 answer

How to get quantile/median values in pydruid

My goal is to query the median value of the column height in my Druid datasource. I was able to use other aggregations like count and count distinct. Here's my query so far: group = query.groupby( datasource=datasource, …
Clover
  • 11
  • 6
0
votes
2 answers

How to get results of multiple aggregations in a single druid query?

Say I have the following table named t_student_details: Name Age Marks Sport City ........ (multiple columns) ====== ===== ======= ======= ====== Jason 11 45 tennis New York Mark 12 42 …
Vinay
  • 699
  • 4
  • 22
0
votes
1 answer

How to run this Druid SQL query from pydruid using "timeseries"?

The following is the code in Druid SQL. My goal is to run this code from Python. I'm able to do so using DB API, but I'm wondering if there's a way to do this with the pydruid function "timeseries" because it would go better with the rest of my…
carsof
  • 83
  • 1
  • 8
0
votes
0 answers

How to execute this Druid query with a Python script (code included)?

This is my SQL query; it takes the average "value" over an hour for a specific device/metric combination for a certain time period. SELECT TIME_FLOOR(__time, 'PT1h') AS "__time_time_floor", AVG("value"), COUNT(*) AS "Count" FROM…
carsof
  • 83
  • 1
  • 8
0
votes
3 answers

Not able to connect Superset to Druid

I have Druid and Superset running locally, but I am not able to connect them. I have the sample data wikiticker in Druid. I already installed pydruid with pip3: pip3 install pydruid (I am not sure if I need to install this to any particular…
Ankit
  • 131
  • 1
  • 8
0
votes
1 answer

Druid search query does not return case-sensitive results

I don't want to return records whose dimension (name) does not match a value, say "ALEX", case-sensitively. However, such records are still returned. For example: { …
AnujaP
  • 127
  • 1
  • 11