I have been using R to scrape XML tables from a Microsoft SharePoint page, and I would like to use the 'rs:name' values buried in the Schema as the column names, instead of the attribute names in rs:data. I am having trouble accessing these names because they sit very deep in the XML tree.
I want these names for two reasons. First, they are the full column names shown in the table on the SharePoint page, not just the XML-encoded names. Second, when I load the data, any row with missing values has its remaining entries shifted across to fill the gaps, often wrapping back to the start of the row, so values end up under the wrong columns.
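To illustrate the shifting, here is a small sketch using made-up rows (the `ows_*` names and values are invented for illustration, not taken from my real list). It assumes each row arrives as a named character vector of attributes, which is what `xmlToList()` appears to give me:

```r
# Made-up rows standing in for the attribute vectors returned by xmlToList().
rows <- list(
  c(ows_Title = "Task A", ows_Due = "2020-01-01", ows_Owner = "Ann"),
  c(ows_Title = "Task B", ows_Owner = "Bob")  # ows_Due is missing in this row
)

# rbind() ignores the names of later vectors and recycles the shorter one
# (with a warning) instead of inserting NA for the missing attribute.
m <- suppressWarnings(do.call(rbind, rows))
print(m)
```

In the second row, "Bob" lands under `ows_Due` and "Task B" wraps around into `ows_Owner`, which matches what I see with the real data.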
Here is a link that I have been following for inspiration: Using R to connect to a sharepoint list
Here is a pretty similar example of the XML (just with the names changed): https://pastebin.com/Ks2LmBS3
My code looks like:
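library(httr)        # GET, authenticate, verbose, content
library(xml2)        # xml_structure
library(XML)         # xmlParse, xmlRoot, xmlToList, HUGE
library(data.table)  # data.table
library(magrittr)    # %>%
# Fetch the page using NTLM authentication
page <- GET(url, verbose(), authenticate(username, password, type = "ntlm"))
src <- httr::content(page)
src %>% xml_structure()
xmlData <- xmlParse(src, options = HUGE, useInternalNodes = TRUE)
# One list element per row under the "data" node, then stack the rows
dataList <- xmlToList(xmlRoot(xmlData)[["data"]])
dataMatrix <- do.call(rbind, dataList)
df <- data.table(dataMatrix)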
However, I would like to read the rs:name values from the Schema, use them as the column names, and then populate the table with the remaining data.
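To make the goal concrete, here is a rough sketch of the kind of result I am after, written with xml2 against a made-up snippet. The snippet assumes the usual SharePoint rowset layout (`s:AttributeType` elements carrying both `name` and `rs:name`, and `z:row` elements under `rs:data`); the field names and values are invented, and I have not confirmed this against the real page:

```r
library(xml2)

# Hypothetical stand-in for the SharePoint response; real field names differ.
sample_xml <- '<xml xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
     xmlns:rs="urn:schemas-microsoft-com:rowset"
     xmlns:z="#RowsetSchema">
  <s:Schema id="RowsetSchema">
    <s:ElementType name="row" content="eltOnly">
      <s:AttributeType name="ows_Title" rs:name="Title" rs:number="1"/>
      <s:AttributeType name="ows_Due_x0020_Date" rs:name="Due Date" rs:number="2"/>
      <s:AttributeType name="ows_Owner" rs:name="Owner" rs:number="3"/>
    </s:ElementType>
  </s:Schema>
  <rs:data>
    <z:row ows_Title="Task A" ows_Due_x0020_Date="2020-01-01" ows_Owner="Ann"/>
    <z:row ows_Title="Task B" ows_Owner="Bob"/>
  </rs:data>
</xml>'

doc <- read_xml(sample_xml)
ns  <- xml_ns(doc)  # prefix -> URI map, needed to tell name from rs:name

# Schema: encoded attribute names and their full display names.
attr_types <- xml_find_all(doc, "//s:AttributeType", ns)
encoded    <- xml_attr(attr_types, "name", ns)
full       <- xml_attr(attr_types, "rs:name", ns)

# Data: look up each encoded attribute by name on every row,
# so a missing attribute becomes NA instead of shifting the row.
rows <- xml_find_all(doc, "//z:row", ns)
cols <- lapply(encoded, function(a) xml_attr(rows, a, ns))
names(cols) <- full

df <- data.frame(cols, check.names = FALSE, stringsAsFactors = FALSE)
print(df)
```

Looking each attribute up by name, rather than stacking the rows with `rbind()`, is what keeps a missing value as `NA` in its own column.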
Please let me know if anything is unclear or needs more explanation. Thank you very much in advance for your help!