I wrote two scripts with the same goal. The first submits a single query to Apache Drill through pydrill. The query is a series of SELECT statements joined with UNION ALL, and I save the result in a DataFrame. The second submits many queries (each one a single SELECT) and appends each result to a DataFrame.

Neither solution preserves the column order. For instance, for the query

select column[1] as A, column[2] as B

the result is a big DataFrame whose header is B, A.

In my case, the DataFrame has 7 columns, and their order differs from the order in the SELECT clause. One of the columns is the fqn attribute (from Apache Drill), which I use to get the path of the current file.
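To make the expected result concrete, here is a minimal pandas-only sketch of forcing the header back into the SELECT order after the fact. It assumes the rows arrive as dicts keyed by column alias. The names `SELECT_ORDER`, `to_ordered_frame`, and the sample rows are hypothetical, used for illustration and not part of pydrill.

```python
import pandas as pd

# Hypothetical: the aliases in the same order as the SELECT clause.
SELECT_ORDER = ["A", "B"]


def to_ordered_frame(records, columns=SELECT_ORDER):
    """Build a DataFrame from row dicts and force the given column order."""
    df = pd.DataFrame(records)
    # reindex returns the columns in the listed order; a listed column
    # that is missing from the result would be filled with NaN.
    return df.reindex(columns=columns)


# Sample rows whose keys arrive in the "wrong" order (B before A).
rows = [{"B": 2, "A": 1}, {"B": 4, "A": 3}]
ordered = to_ordered_frame(rows)
print(list(ordered.columns))
```

This only works around the symptom on the pandas side. It does not explain why the order is lost in the first place.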
-
What version of Drill do you have? – Vitalii Diravka Feb 11 '19 at 10:48
-
drill 1.15.0, pandas 0.23.4, pydrill 0.3.4 and python 3.7.1 – ltito Feb 12 '19 at 07:30
-
I think it is an issue with PyDrill. Could you try the same query in Drill SqlLine or in the Drill Web UI? – Vitalii Diravka Feb 12 '19 at 14:48