Write a Hive SQL query that turns the row below into the output shown underneath it.

id  name   dob
-------------------------
1   anjan  10-16-1989

output:

id  name  dob
-------------------------
1   a     10-16-1989
1   n     10-16-1989
1   j     10-16-1989
1   a     10-16-1989
1   n     10-16-1989

Also solve the same scenario in Spark, producing the same output as above.

Shaido

1 Answer

Assuming you have a dataframe (name it data) that comes from Hive like this:

+---+-----+----------+
| id| name|       dob|
+---+-----+----------+
|  1|anjan|10-16-1989|
+---+-----+----------+

You can define a user-defined function (UDF) in Spark that transforms a string into an array of single-character strings:

import org.apache.spark.sql.functions.{explode, udf}
val toArray = udf((name: String) => name.toArray.map(_.toString))
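
To see what the function inside the UDF does on its own, here is a minimal plain-Scala sketch with no Spark involved. The name `splitChars` is made up for illustration; the body is the same lambda that `udf(...)` wraps above.

```scala
// Hypothetical name for illustration: the same lambda that udf(...) wraps.
val splitChars: String => Array[String] = name => name.toArray.map(_.toString)

// Each character of the input becomes its own one-character string.
val letters = splitChars("anjan")
println(letters.mkString("[", ", ", "]"))  // prints [a, n, j, a, n]
```

Note that this lambda would fail on a null name, so a column that can contain nulls would probably need a guard before the UDF is applied.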

With that in place, we can apply it to the name column:

val df = data.withColumn("name", toArray(data("name")))

+---+---------------+----------+
| id|           name|       dob|
+---+---------------+----------+
|  1|[a, n, j, a, n]|10-16-1989|
+---+---------------+----------+

We can now use the explode function on the name column:

df.withColumn("name", explode(df("name")))

+---+----+----------+
| id|name|       dob|
+---+----+----------+
|  1|   a|10-16-1989|
|  1|   n|10-16-1989|
|  1|   j|10-16-1989|
|  1|   a|10-16-1989|
|  1|   n|10-16-1989|
+---+----+----------+
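
Putting the steps together, here is a self-contained sketch that can be pasted into spark-shell. It makes several assumptions that are not in the original answer: a local SparkSession, an in-memory DataFrame standing in for the Hive table, and a temporary view called `people` (a made-up name). The second half is one possible SQL formulation using `LATERAL VIEW explode(split(...))`, offered as a starting point for the Hive part of the question rather than a tested Hive query.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, udf}

// Assumption: a local session for experimenting. In spark-shell,
// getOrCreate() returns the session that already exists.
val spark = SparkSession.builder().appName("explode-name").master("local[*]").getOrCreate()
import spark.implicits._

// Assumption: in-memory stand-in for the Hive table from the question.
val data = Seq((1, "anjan", "10-16-1989")).toDF("id", "name", "dob")

// DataFrame route: the UDF step followed by the explode step.
val toArray = udf((name: String) => name.toArray.map(_.toString))
val withArray = data.withColumn("name", toArray(col("name")))
val result = withArray.withColumn("name", explode(col("name")))
result.show()

// SQL route: "people" is a hypothetical view name.
// The WHERE clause drops any empty string that split may emit
// when given an empty pattern.
data.createOrReplaceTempView("people")
val viaSql = spark.sql(
  """SELECT id, ch AS name, dob
    |FROM people
    |LATERAL VIEW explode(split(name, '')) t AS ch
    |WHERE ch <> ''""".stripMargin)
viaSql.show()
```

The filter on the empty string is a precaution: splitting on an empty pattern can leave a trailing empty element depending on the engine and version, and filtering it out keeps the result the same either way.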
dumitru