An API call returns a dict response similar to the output below:
{'Account': {'id': 123, 'externalIdentifier': None, 'name': 'test acct', 'accountNumber': None, 'Rep': None, 'organizationId': 123, 'streetAddress': '123 Main Road', 'streetAddressCity': 'Town City', 'streetAddressState': 'Texas', 'streetAddressZipCode': '76123', 'contact': [{'id': 10001, 'name': 'Test test', 'extID': '9999999999'}]}}
I am trying to build a DataFrame from the returned Account record, but I keep getting TypeError: StructType can not accept object 'id' in type <class 'str'>. I have tried other approaches, including adding .item(), mapping with a lambda, and converting types, but I always come back to the same error.
from pyspark.sql.types import StructType, StructField, StringType

# Schema for the Account record; `spark` is an existing SparkSession.
account_schema = StructType([
    StructField('id', StringType(), True),
    StructField('externalIdentifier', StringType(), True),
    StructField('name', StringType(), True),
    StructField('Account_number', StringType(), True),
    StructField('Rep', StructType([
        StructField('firstName', StringType(), True),
        StructField('lastName', StringType(), True),
        StructField('email', StringType(), True),
        StructField('id', StringType(), True),
    ])),
    StructField('streetAddress', StringType(), True),
    StructField('streetAddressCity', StringType(), True),
    StructField('streetAddressState', StringType(), True),
    StructField('streetAddressZipCode', StringType(), True),
])

# This is the call that raises the TypeError.
df = spark.createDataFrame(account_response['Account'], schema=account_schema)
Any direction would be appreciated.
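One likely cause, offered as a sketch rather than a confirmed diagnosis: createDataFrame expects a collection of rows, and iterating a single dict yields its keys, so the first "row" Spark sees is the string 'id'. Wrapping the record in a list gives Spark one row instead. The helper to_row and the list FIELD_NAMES below are hypothetical names introduced for illustration. The sketch also assumes the schema field 'Account_number' is renamed to 'accountNumber' to match the payload key, and that scalar values such as the integer id are cast to str because the schema declares StringType. The row preparation is plain Python; the Spark call is left as a comment since it needs a live session.

```python
# Sample payload copied from the question.
account_response = {'Account': {
    'id': 123, 'externalIdentifier': None, 'name': 'test acct',
    'accountNumber': None, 'Rep': None, 'organizationId': 123,
    'streetAddress': '123 Main Road', 'streetAddressCity': 'Town City',
    'streetAddressState': 'Texas', 'streetAddressZipCode': '76123',
    'contact': [{'id': 10001, 'name': 'Test test', 'extID': '9999999999'}],
}}

# Hypothetical: schema field names in declaration order, with
# 'Account_number' renamed to 'accountNumber' (an assumption, see above).
FIELD_NAMES = [
    'id', 'externalIdentifier', 'name', 'accountNumber', 'Rep',
    'streetAddress', 'streetAddressCity', 'streetAddressState',
    'streetAddressZipCode',
]


def to_row(account, field_names=FIELD_NAMES):
    """Hypothetical helper: keep only schema fields and cast scalars to str."""
    row = {}
    for name in field_names:
        value = account.get(name)  # keys missing from the payload become None
        # Leave None and nested containers alone; cast other scalars to str
        # so they match the StringType fields in the schema.
        if value is not None and not isinstance(value, (dict, list)):
            value = str(value)
        row[name] = value
    return row


# Iterating a dict yields its keys, not a record.
print(list(account_response['Account'])[:3])  # ['id', 'externalIdentifier', 'name']

# A list containing ONE row dict.
rows = [to_row(account_response['Account'])]
print(rows[0]['id'])  # '123'

# With a live SparkSession named `spark` and the renamed schema, the call
# would then look like this (not executed here):
# df = spark.createDataFrame(rows, schema=account_schema)
```

If 'Rep' comes back as a nested dict rather than None, its inner values may also need casting to str; this sketch does not handle that case. An alternative to casting is to declare id with a numeric type in the schema.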