
I need to import a CSV file that contains several fields, and then loop over the fields of interest to retrieve the data they contain.

The file has a field named query that contains SQL queries. Each query must be executed, and the results stored in another CSV file that contains the fields to retrieve along with the result of each query.

Below is my code so far:

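// step 1: read the parameter file (semicolon-delimited, with a header row)
val table_requete = spark.read.format("com.databricks.spark.csv").option("header","true").option("delimiter", ";").load("/user/swychowski/ClientAnlytics_Controle/00_Params/filtre.csv")
// register the DataFrame itself; the name `req` was never defined as a value
table_requete.createOrReplaceTempView("req")
// step 2: loop over the rows and run each query (not written yet)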

However, I don't know how to loop over the rows and write the results to another file at the same time.

Shwabster
dkh
  • Can you add a sample of the input and the output that you want? – koiralo Jul 02 '19 at 14:51
  • What do you mean by adding some input and output? – dkh Jul 02 '19 at 14:59
  • If you add a sample of the CSV, the query file, and the expected output, it would be easier to help. – koiralo Jul 02 '19 at 15:07
  • This is the input file:
      controled tab;USD_ID;NCA_ID;CONTROLE_ID;Criteria;query;Actif;Prometheus;OK
      AGG_CONTACT;USD-5;NCA-15;1;nbr_visits_L3Y = 0 for prospect ;select count(*) from agg_contact where nbr_visits_l3y <> 0 and typology_id='07';1;;
      AGG_HIERARCHIE_MAGASIN;USD-11;NCA-23;1;nbr_client_ytm is not null ;select count(*) from agg_hie_magasin where nbr_client_ytm is null;1;;
    – dkh Jul 02 '19 at 15:12
  • For the output file I only want the fields (controled tab;USD_ID;NCA_ID;CONTROLE_ID;result_query). – dkh Jul 02 '19 at 15:15
  • So where should these queries be executed? RDBMS or Spark/Hive? – abiratsis Jul 03 '19 at 15:32
  • The queries are executed on a Hive database. – dkh Jul 03 '19 at 16:10
  • OK, I have two more questions: are the queries similar to this one, `select count(*) from agg_contact where nbr_visits_l3y <> 0 and typology_id='07'`? And what is the execution engine of Hive (it might be Spark or Hadoop MapReduce)? – abiratsis Jul 03 '19 at 19:35
  • @AlexandrosBiratsis Yes, the queries are similar to that one, and the execution engine is Spark. – dkh Jul 04 '19 at 08:52
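
Based on the details given in the comments, the loop could be sketched roughly as follows. This is only a minimal sketch under several assumptions, not a tested solution: it assumes Spark 2.x with Hive support, that the parameter file is small enough to collect to the driver, and that every query is a `select count(*)` returning a single numeric value. The object name `RunControls`, the helper `cleanQuery`, and the output path `/tmp/controls_result` are hypothetical names chosen for illustration.

```scala
import org.apache.spark.sql.SparkSession

object RunControls {
  // Columns copied from the input file to the output (as listed in the comments).
  val keptColumns: Seq[String] = Seq("controled tab", "USD_ID", "NCA_ID", "CONTROLE_ID")

  // Hypothetical helper: drop surrounding whitespace and a trailing semicolon,
  // which some Spark versions reject when passed to spark.sql.
  def cleanQuery(q: String): String = q.trim.stripSuffix(";").trim

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("run-controls")
      .enableHiveSupport() // assumption: the queried tables live in Hive
      .getOrCreate()
    import spark.implicits._

    // Step 1: read the semicolon-delimited parameter file.
    val params = spark.read
      .option("header", "true")
      .option("delimiter", ";")
      .csv("/user/swychowski/ClientAnlytics_Controle/00_Params/filtre.csv")

    // Step 2: bring the rows to the driver. spark.sql can only be called on
    // the driver, so the queries cannot be run inside a map over the DataFrame.
    val rows = params
      .filter($"query".isNotNull)
      .select((keptColumns :+ "query").map(params.col): _*)
      .collect()

    // Step 3: run each query and keep its single count value (assumed Long).
    val results = rows.map { r =>
      val count = spark.sql(cleanQuery(r.getString(4))).first().getLong(0)
      (r.getString(0), r.getString(1), r.getString(2), r.getString(3), count)
    }

    // Step 4: write the kept fields plus the result to a CSV directory.
    results.toSeq
      .toDF(keptColumns :+ "result_query": _*)
      .coalesce(1) // one part file; acceptable only because the result is tiny
      .write
      .option("header", "true")
      .option("delimiter", ";")
      .mode("overwrite")
      .csv("/tmp/controls_result") // hypothetical output path
  }
}
```

In `spark-shell` the `spark` session already exists, so only the body of `main` from step 1 onward would be needed. Collecting to the driver is a deliberate choice here: the parameter file holds one row per control, while the heavy work (the counts) still runs on the cluster.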
