root
 |-- _id: string (nullable = true)
 |-- h: string (nullable = true)
 |-- inc: string (nullable = true)
 |-- op: string (nullable = true)
 |-- ts: string (nullable = true)
 |-- webhooks: struct (nullable = false)
 |    |-- index: string (nullable = false)
 |    |-- failed_at: string (nullable = true)
 |    |-- status: string (nullable = true)
 |    |-- updated_at: string (nullable = true)

How can I remove columns from the webhooks struct based on an input list, e.g. filterList: List[String] = List("index", "status")? Is there a way to do this by iterating over the rows, so that only the intermediate schema changes and not the final schema?

Expected schema:

root
 |-- _id: string (nullable = true)
 |-- h: string (nullable = true)
 |-- inc: string (nullable = true)
 |-- op: string (nullable = true)
 |-- ts: string (nullable = true)
 |-- webhooks: struct (nullable = false)
 |    |-- index: string (nullable = false)
 |    |-- status: string (nullable = true)
vasu seth

2 Answers

Check the code below. It rebuilds the webhooks struct from only the fields that are not in the removal list.

scala> df.printSchema
root
 |-- _id: string (nullable = true)
 |-- h: string (nullable = true)
 |-- inc: string (nullable = true)
 |-- op: string (nullable = true)
 |-- ts: string (nullable = true)
 |-- webhooks: struct (nullable = true)
 |    |-- index: string (nullable = true)
 |    |-- failed_at: string (nullable = true)
 |    |-- status: string (nullable = true)
 |    |-- updated_at: string (nullable = true)

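scala> // Names of all fields currently inside the webhooks struct
scala> val actualColumns = df.select("webhooks.*").columns

scala> // Fields to drop from the struct
scala> val removeColumns = Seq("index","status")

scala> // Rebuild the struct from the remaining fields only
scala> val webhooks = struct(actualColumns.filter(c => !removeColumns.contains(c)).map(c => col(s"webhooks.${c}")):_*).as("webhooks")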

Output

scala> df.withColumn("webhooks",webhooks).printSchema
root
 |-- _id: string (nullable = true)
 |-- h: string (nullable = true)
 |-- inc: string (nullable = true)
 |-- op: string (nullable = true)
 |-- ts: string (nullable = true)
 |-- webhooks: struct (nullable = false)
 |    |-- failed_at: string (nullable = true)
 |    |-- updated_at: string (nullable = true)
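
The same idea can be wrapped in a small reusable function. The following is only a sketch: the object name StructFieldFilter, the helper filterStructFields, and the sample values are made up for illustration and assume field names without dots. It takes a predicate, so the list can be used either to drop fields (as above) or to keep only the listed fields (as in the question's second schema). Note that in the output above webhooks changed from nullable = true to nullable = false, because struct(...) always builds a non-null struct; the when(...) wrapper below is one way to keep null structs null.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct, when}

object StructFieldFilter {

  // Hypothetical helper (not from the answer above): rebuilds the struct column
  // `structCol`, keeping only the fields for which `keep` returns true.
  // Assumes field names contain no dots or backticks.
  def filterStructFields(df: DataFrame, structCol: String)(keep: String => Boolean): DataFrame = {
    val kept: Array[Column] = df
      .select(s"$structCol.*") // expand the struct to list its field names
      .columns
      .filter(keep)
      .map(f => col(s"$structCol.$f").as(f))

    // when(...) without otherwise(...) yields null, so rows whose struct was
    // null stay null instead of becoming a struct of nulls.
    df.withColumn(structCol, when(col(structCol).isNotNull, struct(kept: _*)))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("struct-field-filter")
      .getOrCreate()
    import spark.implicits._

    // Tiny stand-in for the DataFrame in the question (made-up values).
    val df = Seq(("1", "0", "t1", "failed", "t2"))
      .toDF("_id", "index", "failed_at", "status", "updated_at")
      .select(col("_id"), struct("index", "failed_at", "status", "updated_at").as("webhooks"))

    val filterList = List("index", "status")

    // Drop the listed fields, as in the answer above.
    filterStructFields(df, "webhooks")(f => !filterList.contains(f)).printSchema()

    // Keep only the listed fields, as in the question's second schema.
    filterStructFields(df, "webhooks")(f => filterList.contains(f)).printSchema()

    spark.stop()
  }
}
```

Passing a predicate rather than a fixed list keeps one code path for both the "drop these" and "keep these" readings of filterList.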
Srinivas

You can also look at https://stackoverflow.com/a/39943812/2204206, which can be more convenient when removing deeply nested columns.
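
As a separate option (this is not the code from the linked answer), Spark 3.1 and later provide Column.dropFields, which removes fields from a struct without rebuilding it by hand. A minimal sketch, assuming Spark 3.1+; the object DropNestedFields and the helper dropStructFields are hypothetical names:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

object DropNestedFields {

  // Hypothetical helper; requires Spark 3.1+ for Column.dropFields.
  // Field names that do not exist in the struct are ignored by dropFields.
  def dropStructFields(df: DataFrame, structCol: String, fields: Seq[String]): DataFrame =
    if (fields.isEmpty) df // nothing to drop, return the input unchanged
    else df.withColumn(structCol, col(structCol).dropFields(fields: _*))
}
```

For the schema in the question, DropNestedFields.dropStructFields(df, "webhooks", Seq("index", "status")) would leave failed_at and updated_at inside webhooks. dropFields also accepts dotted names such as "inner.field" to reach into nested structs, and it raises an error if the list would remove every field of the struct.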

Lior Chaga