I am taking a course that includes learning how to read XML files into R. I'm trying to read this XML document into an R object:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml
When I follow the instructions I was given, an R object is created, but indexing it prints the whole document instead of the names of the nodes below the root node. When I index one level further, RStudio (Posit) Cloud crashes. I am extremely new to R and XML, so I have no clue what's wrong.
I did the following:
library(XML)
fileUrl<-"http://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml"
doc<-xmlTreeParse(fileUrl, useInternalNodes = TRUE)
rootNode<-xmlRoot(doc)
xmlName(rootNode)
It returned "response" as expected.
(Notice I took the "s" out of https in the URL. I don't know why, but the call doesn't work with the s in.)
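A likely explanation for the https issue is that the parser's built-in URL fetching does not handle https, though that is not confirmed here. A minimal sketch of a possible workaround, assuming a recent version of R where download.file() can fetch https URLs: download the file first, then parse the local copy. The helper name read_xml_via_download is hypothetical and not part of the XML package.

```r
library(XML)

# Hypothetical helper (not part of the XML package): fetch the file with
# download.file(), then parse the local copy into an internal document.
read_xml_via_download <- function(url) {
  tmp <- tempfile(fileext = ".xml")
  on.exit(unlink(tmp))  # remove the temporary copy once parsing is done
  download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  xmlParse(tmp)         # same as xmlTreeParse(..., useInternalNodes = TRUE)
}

# Intended usage (needs network access, so left commented out):
# doc <- read_xml_via_download(
#   "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml")
```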
Then I did:
names(rootNode)
And all that returned was
row
"row"
So I indexed the first child, hoping it would give me the first record (i.e. the row with _id="1"):
rootNode[[1]]
and it printed literally the entire document.
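That output would be consistent with every record being wrapped in a single outer row element, although this is an assumption about the file rather than something confirmed here. A minimal sketch of how the structure could be checked by counting children instead of printing large nodes, using a tiny stand-in string (the second record's values are made up):

```r
library(XML)

# Tiny stand-in for the real file. The nesting (one outer <row> wrapping the
# individual <row> records) is an assumption; the second record is made up.
sample_xml <- paste0(
  '<response><row>',
  '<row _id="1"><name>410</name><zipcode>21206</zipcode></row>',
  '<row _id="2"><name>Example Cafe</name><zipcode>00000</zipcode></row>',
  '</row></response>'
)

doc <- xmlParse(sample_xml, asText = TRUE)
rootNode <- xmlRoot(doc)

xmlSize(rootNode)            # number of children directly under <response>
xmlSize(rootNode[[1]])       # number of children under the outer <row>
table(names(rootNode[[1]]))  # child names as counts, not a full printout
```

Counting with xmlSize() and tabulating names() keeps the console output small even when a node holds the whole document.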
Then (DO NOT TRY THIS) I tried:
rootNode[[1]][[1]]
I was hoping it would give me the first individual record, i.e.:
<row _id="1" _uuid="93CACF6F-C8C2-4B87-95A8-8177806D5A6F" _position="1" _address="http://data.baltimorecity.gov/resource/k5ry-ef3g/1">
<name>410</name>
<zipcode>21206</zipcode>
<neighborhood>Frankford</neighborhood>
<councildistrict>2</councildistrict>
<policedistrict>NORTHEASTERN</policedistrict>
<location_1 human_address="{&quot;address&quot;:&quot;4509 BELAIR ROAD&quot;,&quot;city&quot;:&quot;Baltimore&quot;,&quot;state&quot;:&quot;MD&quot;,&quot;zip&quot;:&quot;&quot;}" needs_recoding="true"/>
</row>
Instead, it crashed the project on RStudio Cloud, and the project still won't open.
Before I delete the project, I need to understand what I did wrong and how to index the parsed document so that it gives me individual records like the one I pasted above.
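For reference, this is a sketch of the kind of access being asked about, run on a tiny stand-in string rather than the real file. It assumes the individual row records sit inside one outer row element; the second record's values are placeholders, not data from the file.

```r
library(XML)

# Stand-in document; the nesting is assumed and the second record is made up.
sample_xml <- paste0(
  '<response><row>',
  '<row _id="1"><name>410</name><zipcode>21206</zipcode></row>',
  '<row _id="2"><name>Example Cafe</name><zipcode>00000</zipcode></row>',
  '</row></response>'
)

doc <- xmlParse(sample_xml, asText = TRUE)
rootNode <- xmlRoot(doc)

# First individual record: first child of the outer <row>.
first <- rootNode[[1]][[1]]

xmlAttrs(first)                        # attributes of the record, e.g. _id
xmlValue(first[["name"]])              # text of the <name> child
fields <- xmlSApply(first, xmlValue)   # all child values as a named vector

# XPath alternative: collect every restaurant name in one call.
all_names <- xpathSApply(doc, "//row/row/name", xmlValue)
```

The XPath form avoids nested [[ ]] indexing entirely, which may be easier to reason about on a large document.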