I am reading a JSON object from Salesforce. The object is irregular in the sense that some nested arrays are empty and some are not. How to deal with this in tidyjson?
I am setting up an API with Salesforce in R. The objective is to get meaningful data out of Salesforce to process in R.
json <- '
{
"totalSize": [
355710
],
"done": [
false
],
"nextRecordsUrl": [
"/services/data/v45.0/query/01gc000001L8zdkAAB-749"
],
"records": [
{
"attributes": {
"type": "Order_Line__c",
"url": "/services/data/v45.0/sobjects/Order_Line__c/a0T1N000009aZ9lUAE"
},
"Id": "a0T1N000009aZ9lUAE",
"Name": "OrderLine-1099369",
"SO_Number_Formula__c": "548402-2.3",
"Ship_From_Inventory__c": "XXX",
"RMA_Number__c": "548402",
"Part_Number__c": "01t1N00000JNeAQQA1",
"Marketing_Part__c": "XXXXXXXXXXX",
"Family__c": "XXXXXXXX",
"Serial_Numbers__r": {
"records": {}
}
},
{
"attributes": {
"type": "Order_Line__c",
"url": "/services/data/v45.0/sobjects/Order_Line__c/a0T1N000009aZ9mUAE"
},
"Id": "a0T1N000009aZ9mUAE",
"Name": "OrderLine-1099370",
"SO_Number_Formula__c": "962816-1.1",
"Ship_From_Inventory__c": "XXX",
"RMA_Number__c": "962816",
"Part_Number__c": "01t1N00000JNc3qQAD",
"Marketing_Part__c": "XXXXXXXXXX",
"Family__c": "XXXXXXX",
"RMA_Received_Date__c": "2019-02-18",
"Serial_Numbers__r": {
"totalSize": 1,
"done": true,
"records": [
{
"attributes": {
"type": "Serial_Number__c",
"url": "/services/data/v45.0/sobjects/Serial_Number__c/a0X1N00000NoyAjUAJ"
},
"Id": "a0X1N00000NoyAjUAJ",
"Name": "SN217426",
"Legacy_Line_Id__c": "962816SN217426",
"Customer_Name__c": "XXXXXX",
"Original_Shipment_Date__c": "2018-06-26",
"Disposition__c": "Pending",
"Status__c": "FailureVerification"
}
]
}
}
]
}
'
mydata <- json %>%
as.tbl_json %>%
enter_object("records") %>%
gather_array() %>%
spread_values(
Id = jstring("Id"),
Name = jstring("Name"),
SO_Number_Formula = jstring("SO_Number_Formula__c"),
Ship_From_Inventory = jstring("Ship_From_Inventory__c"),
RMA_Number = jstring("RMA_Number__c"),
Part_Number = jstring("Part_Number__c"),
Marketing_Part = jstring("Marketing_Part__c"),
Family = jstring("Family__c")) %>%
enter_object("Serial_Numbers__r") %>%
enter_object("records") %>%
gather_ %>%
spread_values(
Id = jstring("Id"))
The irregularity is in the [records][Serial_Numbers__r][records]. In this example, the first occurrence is empty {} and the second occurrence is not empty. The code generates the following error when executing gather_keys or gather _array: Error in gather_keys(.) : 1 records are values not objects Error in gather_array(.) : 1 records are values not arrays
I am thinking that this is caused by the empty array [records]. There is plenty of such irregularity in the Salesforce output: some records have detailed nested data, some do not. How can I deal with this?