I have this DataFrame.

joinedDF schema

root
 |-- installment_id: long (nullable = true)
 |-- payment_date: date (nullable = true)
 |-- payment_method: string (nullable = true)
 |-- payment_id: string (nullable = true)
 |-- paid_amount: double (nullable = true)
 |-- loan_id: string (nullable = true)
 |-- period: string (nullable = true)
 |-- accepted_at: string (nullable = true)
 |-- payday: string (nullable = true)
 |-- interest_rate: string (nullable = true)

I need to create another DataFrame that collects the payment columns into an array:

 payments ARRAY<STRUCT<id: INT, payment_date: STRING, method: STRING, amount: DOUBLE>>

My final DataFrame should match this table:

CREATE EXTERNAL TABLE loan_documents (
  loan_id INT,
  period INT,
  accepted_at TIMESTAMP,
  payday INT,
  interest_rate DOUBLE,
  payments ARRAY<STRUCT<id: INT, payment_date: STRING, method: STRING, amount: DOUBLE>>
)
marc_s
    A DataFrame within a DataFrame is not a good idea, as it reduces the efficiency of the code. Refer here for more details: https://stackoverflow.com/questions/17954520/pandas-dataframe-within-dataframe – illusionx Apr 13 '20 at 13:48

1 Answer

To get a nested structure, you can define the schema like this:

# Import the types that are needed
from pyspark.sql.types import (
    ArrayType, StructType, StructField, StringType,
    LongType, DateType, DoubleType, DecimalType,
)

# Define the payments schema first
paymentsSchema = ArrayType(
    StructType([
        StructField("id", LongType(), False),
        StructField("payment_date", StringType(), False),
        StructField("method", StringType(), False),
        StructField("amount", DoubleType(), False),
    ])
)

# Use the payments schema in the loan schema
loanSchema = StructType([
    StructField("loan_id", LongType(), False),
    StructField("period", LongType(), False),
    StructField("accepted_at", DateType(), False),
    StructField("payday", DecimalType(), False),
    StructField("interest_rate", DecimalType(), False),
    StructField("payments", paymentsSchema, False),
])

H Roy