0

I have a DeltaTable with several columns that are ArrayTypes, containing StructTypes.

I'm trying to add an extra column into the StructType, but I am running into issues because it is wrapped in an ArrayType.

Hoping someone has a way to do this, or similar method that doesn't involve me backing up the table, creating the table again from scratch and restoring the data from the back-up.

I am currently using this command:

spark.sql(f"ALTER TABLE {db}.ads ADD COLUMNS (response.monkey boolean AFTER dq_conveyed)")

Which would work if it was only a StructType, but is failing because it's an array

This is the current Schema I'm trying to make the change to:

StructField(response,ArrayType(StructType(
    StructField(encounter_uid,StringType,true),
    StructField(patient_uid,StringType,true),
    StructField(call_sign,StringType,true),
    StructField(time_resource_allocated,StringType,true),
    StructField(time_resource_arrived_at_receiving_location,StringType,true),
    StructField(time_of_patient_handover,StringType,true),
    StructField(time_clear,StringType,true),
    StructField(response_type,StringType,true),
    StructField(time_resource_mobilised,StringType,true),
    StructField(time_resource_arrived_on_scene,StringType,true),
    StructField(time_stood_down,StringType,true),
    StructField(time_resource_left_scene,StringType,true),
    StructField(highest_skill_level_on_vehicle,StringType,true),
    StructField(responding_organisation_type,StringType,true),
    StructField(dq_conveyed,BooleanType,true)),
true),true)

I'm following this documentation:

https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-ddl-alter-table.html

1 Answers1

3

You can do:

spark.sql(f"ALTER TABLE {db}.ads ADD COLUMNS (response.element.monkey boolean AFTER dq_conveyed)")

Notice the element keyword. Let me know if this doesn't work for you!

user12671287
  • 131
  • 1
  • 2
  • Unfortunately that hasn't worked either. I get the following error: _Error in SQL statement: AnalysisException: Couldn't find column response.element in_ And then in prints out the schema of the table, which does contain and element as you've suggested |-- response: array (nullable = true) | |-- element: struct (containsNull = true) – Lester Drake Sep 13 '21 at 09:52
  • which version of DBR are you using? – user12671287 Sep 14 '21 at 15:38
  • Hi, I have long since discovered that this is due to using an older version of the DBR. – Lester Drake Apr 11 '23 at 10:52
  • Just wanted to confirm that I have now had an opportunity to test the solution in DBR 10.4 and this solution including ".element." where there is an array does work! – Lester Drake Jul 04 '23 at 11:40