Questions tagged [apache-spark-2.2]

36 questions
0 votes • 2 answers

Spark 2.2 extracting date not working from unix_timestamp

In Spark 2.2, extracting a date from unix_timestamp is not working. Input Data: +-------------------------+ |UPDATE_TS …
marjun • 696
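A minimal Scala sketch of the usual unix_timestamp → from_unixtime → to_date chain; the column name UPDATE_TS comes from the excerpt, while the sample value and the yyyy-MM-dd HH:mm:ss format are assumptions (in Spark 2.2 a format mismatch makes unix_timestamp return null, which often looks like the extraction "not working"):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_unixtime, to_date, unix_timestamp}

object UnixTimestampDateSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("unix_timestamp-date").master("local[*]").getOrCreate()
    import spark.implicits._

    // Hypothetical sample row; the real UPDATE_TS values and their format are not shown in the question.
    val df = Seq("2018-02-15 10:30:00").toDF("UPDATE_TS")

    // unix_timestamp parses the string to epoch seconds (null if the format does not match),
    // from_unixtime turns the seconds back into a timestamp string, and to_date keeps the date part.
    val withDate = df.withColumn(
      "UPDATE_DATE",
      to_date(from_unixtime(unix_timestamp(col("UPDATE_TS"), "yyyy-MM-dd HH:mm:ss")))
    )

    withDate.show(false)
    spark.stop()
  }
}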
0 votes • 0 answers

I am facing an issue with the PySpark 2.2 CSV writer output

I want to migrate my PySpark code from 1.6 to 2.x. In 1.6 I was using the syntax
input_df.repartition(number_of_files) \
    .write.mode(file_saveMode) \
    .format(file_format) \
    .option("header", "true") \
    .save(nfs_path)
and was getting…
SB07 • 76
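The DataFrameWriter chain keeps the same shape in 2.x; the sketch below is a Scala equivalent under assumed inputs (the sample data and output path are illustrative stand-ins for the question's variables). One difference that is often relevant when migrating: in 1.6 CSV came from the external spark-csv package, while 2.x ships a built-in csv format.

import org.apache.spark.sql.{SaveMode, SparkSession}

object CsvWriterSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("csv-writer-2x").master("local[*]").getOrCreate()
    import spark.implicits._

    // Illustrative stand-ins for the question's input_df, number_of_files and nfs_path.
    val inputDf = Seq((1, "a"), (2, "b")).toDF("id", "value")
    val numberOfFiles = 1
    val nfsPath = "/tmp/csv-writer-sketch"

    // Same writer chain as in 1.6; in 2.x the built-in "csv" format replaces the spark-csv package.
    inputDf.repartition(numberOfFiles)
      .write
      .mode(SaveMode.Overwrite)
      .format("csv")
      .option("header", "true")
      .save(nfsPath)

    spark.stop()
  }
}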
0 votes • 1 answer

Create a Vertical Table in Spark 2

How do I create a vertical table in Spark 2 SQL? I am building an ETL using Spark 2 / SQL / Scala. I have data in a normal table structure like the following.
Input Table:
| ID | A  | B  | C  | D  |
| 1  | A1 | B1 | C1 | D1 |
| 2  | A2 | B2 | C2 | D2 |
Output…
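One way to unpivot in Spark 2 SQL is the stack generator; a minimal Scala sketch using the input table from the excerpt (the output column names COL_NAME and COL_VALUE are assumptions, since the expected output is truncated):

import org.apache.spark.sql.SparkSession

object VerticalTableSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("vertical-table").master("local[*]").getOrCreate()
    import spark.implicits._

    // Input table from the question.
    val input = Seq(
      (1, "A1", "B1", "C1", "D1"),
      (2, "A2", "B2", "C2", "D2")
    ).toDF("ID", "A", "B", "C", "D")

    // stack(4, ...) emits one (COL_NAME, COL_VALUE) row per original column, keyed by ID.
    val vertical = input.selectExpr(
      "ID",
      "stack(4, 'A', A, 'B', B, 'C', C, 'D', D) as (COL_NAME, COL_VALUE)"
    )

    vertical.show(false)
    spark.stop()
  }
}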
0 votes • 1 answer

Explode Cassandra UDT with flatmap in Spark 2.x (Scala)

I have data in Cassandra (3.11.2), which is also my df. Data in Cassandra:
id | some_data
-- | ---------
1  | [{s1:"str11", s2:"str12"},{s1:"str13", s2:"str14"}]
2  | [{s1:"str21", s2:"str22"},{s1:"str23", s2:"str24"}]
3  | [{s1:"str31",…
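A minimal Scala sketch of the flatMap approach, with hypothetical case classes standing in for the Cassandra UDT and row shapes (in practice the Dataset would be read through the spark-cassandra-connector rather than built from a local Seq):

import org.apache.spark.sql.SparkSession

// Hypothetical stand-ins for the Cassandra UDT and the source/target row shapes.
case class SomeData(s1: String, s2: String)
case class SourceRow(id: Int, some_data: Seq[SomeData])
case class FlatRow(id: Int, s1: String, s2: String)

object ExplodeUdtSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("explode-udt").master("local[*]").getOrCreate()
    import spark.implicits._

    // Sample data in the shape shown in the question.
    val ds = Seq(
      SourceRow(1, Seq(SomeData("str11", "str12"), SomeData("str13", "str14"))),
      SourceRow(2, Seq(SomeData("str21", "str22"), SomeData("str23", "str24")))
    ).toDS()

    // flatMap emits one flat row per element of the UDT collection.
    val exploded = ds.flatMap(r => r.some_data.map(d => FlatRow(r.id, d.s1, d.s2)))

    exploded.show(false)
    spark.stop()
  }
}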
0 votes • 1 answer

Spark History Server - Identify log file that a job writes to

I want to use the Spark History Server API (http://127.0.0.1:18080/api/v1/applications/) to identify the log file in /tmp/spark-events/ that certain jobs write to. I can see that the job ID is the same as the log file name, so I was thinking that if I had a…
runnerpaul • 5,942
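A minimal Scala sketch under that assumption: list the application ids from the REST endpoint and map each onto its event log path (the crude regex is a stand-in for a proper JSON client, and a running history server is assumed):

import scala.io.Source

object HistoryServerSketch {
  def main(args: Array[String]): Unit = {
    // REST endpoint from the question.
    val apiUrl = "http://127.0.0.1:18080/api/v1/applications/"

    val src = Source.fromURL(apiUrl)
    val json = try src.mkString finally src.close()

    // Pull out every "id" field; a real client would parse the JSON properly.
    val idPattern = "\"id\"\\s*:\\s*\"([^\"]+)\"".r
    val appIds = idPattern.findAllMatchIn(json).map(_.group(1)).toList

    // Each application id corresponds to an event log file under /tmp/spark-events/.
    appIds.foreach(id => println(s"/tmp/spark-events/$id"))
  }
}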
-1 votes • 1 answer

Spark 2.2 dataframe [scala]

OrderNo | Status1   | Status2     | Status3
123     | Completed | Pending     | Pending
456     | Rejected  | Completed   | Completed
789     | Pending   | In Progress | Completed
Above is the input data set, and the expected output is below.…
Ansip • 73
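The expected output is truncated in the excerpt, so the sketch below only rebuilds the input table and adds a purely hypothetical derived column (Completed only when all three status columns are Completed) to illustrate the when/otherwise pattern in Scala:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.when

object OrderStatusSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("order-status").master("local[*]").getOrCreate()
    import spark.implicits._

    // Input table from the question.
    val orders = Seq(
      ("123", "Completed", "Pending", "Pending"),
      ("456", "Rejected", "Completed", "Completed"),
      ("789", "Pending", "In Progress", "Completed")
    ).toDF("OrderNo", "Status1", "Status2", "Status3")

    // Hypothetical derived column (the real expected output is not visible in the excerpt).
    val withOverall = orders.withColumn(
      "OverallStatus",
      when($"Status1" === "Completed" && $"Status2" === "Completed" && $"Status3" === "Completed", "Completed")
        .otherwise("Not Completed")
    )

    withOverall.show(false)
    spark.stop()
  }
}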