I have two tables with the example schemas below. The keys for table A are nested in a list column (keyAs) in table B. I would like to join table A to table B on table A's key to produce table C, where the matching table A rows appear as a nested array (valueAs) alongside the keyAs list from table B. How can I do this in PySpark? Thanks!
Table A
root
|-- item1: string (nullable = true)
|-- item2: long (nullable = true)
|-- keyA: string (nullable = true)
Table B
root
|-- item1: string (nullable = true)
|-- item2: long (nullable = true)
|-- keyB: string (nullable = true)
|-- keyAs: array (nullable = true)
| |-- element: string (containsNull = true)
Table C
root
|-- item1: string (nullable = true)
|-- item2: long (nullable = true)
|-- keyB: string (nullable = true)
|-- keyAs: array (nullable = true)
| |-- element: string (containsNull = true)
|-- valueAs: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- item1: string (nullable = true)
| | |-- item2: long (nullable = true)
| | |-- keyA: string (nullable = true)