I have a very large XML file that contains latitude/longitude values. My goal is to convert the XML file to polygons. To do that, I am first trying to convert the XML to a data frame. I tried the xml2 and dplyr packages, but without success. I also tried the XML package, but I still cannot convert the XML to a data frame. My data looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
xmlns:atom="http://www.w3.org/2005/Atom" xmlns:georss="http://www.georss.org/georss">
<channel>
<title>raisen:raisengp</title>
<description>null</description>
<link>
<![CDATA[http://geoportal.mp.gov.in:8080/geoserver/raisen/wms
service=wms&request=GetMap&version=1.1.1&format=application%2Frss%2Bxml&layers=raisen%3Araisengp&styles=poly_gp&height=545&width=768&transparent=false&bbox=8611854.0%2C2606654.0%2C8775218.0%2C2722591.0&srs=EPSG%3A3857]]>
</link>
<atom:link rel="self"
href="http://geoportal.mp.gov.in:8080/geoserver/raisen/wms?service=wms&request=GetMap&version=1.1.1&format=application%2Frss%2Bxml&layers=raisen%3Araisengp&styles=poly_gp&height=545&width=768&transparent=false&bbox=8611854.0%2C2606654.0%2C8775218.0%2C2722591.0&srs=EPSG%3A3857"/>
<item>
<title>raisengp.1</title>
<link>
<![CDATA[http://geoportal.mp.gov.in:8080/geoserver/raisen/wms/reflect?format=application%2Fatom%2Bxml&layers=raisen%3Araisengp&featureid=raisengp
</link>
<guid>
<![CDATA[http://geoportal.mp.gov.in:8080/geoserver/raisen/wms/reflect?format=application%2Fatom%2Bxml&layers=raisen%3Araisengp&featureid=raisengp.1]]>
</guid>
<description>
<![CDATA[<h4>raisengp</h4>
<ul class="textattributes">
<li><strong><span class="atr-name">districtc</span>:</strong> <span class="atr-value">446</span></li>
  <li><strong><span class="atr-name">tehsilcode</span>:</strong> <span class="atr-value">3593</span></li>
<li><strong><span class="atr-name">blockcode</span>:</strong> <span class="atr-value">1172</span></li>
</description>
<georss:polygon>23.18939277468125 77.57356245525989 23.18919469824253
77.5738162614824 23.18898075873695 77.57405868357523 23.188783235268893
77.57426764701316 23.188552245459086 77.57444602814941 23.188233192400194
77.57456604367773 23.187997858030833 77.57465301739674 23.18782708762609
77.57470568856189 23.18773716304003 77.57475944413376 23.187614345311996
77.57479779220925 23.187560141307777 77.57482903021003 23.18743125870021
77.57487370595457 23.18733905673292 77.57489457266112 23.187148669743586
77.57495472144764 23.18697715655688 77.57499546065583 23.18681329439572
77.57506290929337 23.18666518442996 77.57510904621871 </georss:polygon>

Here is my code. Kindly help me in doing this.

library(xml2)
library(dplyr)
dat <- "D:/Prakshep_project/raisen-raisengp.xml"
doc <- read_xml(dat)
docdf <- bind_rows(lapply(xml_find_all(doc, "//georss:polygon"), function(x) {
  parent <- data.frame(as.list(xml_attrs(x)), stringsAsFactors = FALSE)
  kids <- bind_rows(lapply(xml_children(x), function(x) as.list(xml_attrs(x))))
  cbind.data.frame(parent, kids, stringsAsFactors = FALSE)
}))

Here is the code using the XML package:

library(XML)
data <- xmlTreeParse("D:/Prakshep_project/raisen-raisengp.xml", useInternalNodes = TRUE)
guid <- xpathSApply(data, "//guid", xmlValue)
ply <- xpathSApply(data, "//georss:polygon", xmlValue)
df <- data.frame(guid = unlist(guid),
                 ply = unlist(ply))
Deepthi
  • If this is a valid GeoRSS file (you don't provide us with a complete file or file fragment so I can't tell but it looks like it) then you can read it in using the `st_read` function of the `sf` package, into a spatial data frame. Please provide us with a complete file for a full answer. – Spacedman Mar 09 '19 at 18:03
  • Maybe this can help you: https://www.rdocumentation.org/packages/ggplot2/versions/3.1.0/topics/fortify – LocoGris Mar 09 '19 at 18:42
  • @Spacedman My GeoRSS file is extremely large. I cannot find a way to attach the complete data to my question. The data above is a snippet of my XML file. It contains coordinates that have to be converted to polygons. – Deepthi Mar 10 '19 at 10:09
  • @JonnyCrunch I tried that but it did not work out for my dataset. – Deepthi Mar 10 '19 at 10:40
  • Try (replace filename with path to your file) `library(sf); polys = st_read("filename.xml"); plot(polys$geom)` - install `sf` if not found. – Spacedman Mar 10 '19 at 12:06
  • If you can't share the whole file and you don't know how to subset it then show us the start up to the first `` tag, a couple of sections of `` `` tags, and everything after the very last `` tag. That should make a complete subset valid XML file. – Spacedman Mar 10 '19 at 12:08
  • How large is "extremely large"?? How many megabytes, how many "items", how complex are the polygons? – Spacedman Mar 10 '19 at 12:09
  • My data is 36.3 MB; it contains the shapefiles of the districts of an entire state. The approach I had previously adopted to convert XML to a shapefile was to first convert the XML to a data frame, then the data frame to a SpatialPolygonsDataFrame, and then to SpatialPolygons. I was going to follow this code to build the polygons after converting to a data frame: https://stackoverflow.com/questions/52669779/convert-sets-of-spatial-coordinates-to-polygons-in-r-using-sf?noredirect=1&lq= – Deepthi Mar 10 '19 at 13:06
  • @Spacedman I tried the code suggested above using the `sf` package. It worked fine. Thank you. But I have trouble exporting the shapefile. I am not able to convert to a `SpatialPolygonsDataFrame`, and the `writeOGR` function isn't working. – Deepthi Mar 10 '19 at 14:18
  • Are you confusing `sf` and `sp` classes? The `sf` package has its own data classes that don't work with functions from `sp` and `rgdal`, which is probably where your `writeOGR` is coming from. If you are struggling to convert between sf and sp then start a new question and detail your process. – Spacedman Mar 10 '19 at 14:58
  • Yes, it worked out for me. I used the `st_write` function to export the shapefile. Thank you for your help @Spacedman – Deepthi Mar 11 '19 at 13:50

1 Answer

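I was confusing sf and sp classes. The sf package has its own data classes that don't work with functions from sp and rgdal, which is why `writeOGR` failed on the object returned by `st_read`. Reading the file with `st_read` and exporting it with `st_write`, both from sf, worked.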
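
For readers who want to see what the coordinate text turns into, here is a minimal sketch, not the code used for the real file. It assumes each `georss:polygon` element holds whitespace-separated latitude/longitude pairs (latitude first) in WGS84, so the CRS code 4326 is an assumption. The helper `georss_to_matrix`, the `id` value, and the temporary output file are hypothetical names introduced only for illustration. The sf step is skipped if sf is not installed.

```r
# Hypothetical helper (not from the original post): turn the text of one
# <georss:polygon> element into a two-column lon/lat matrix forming a closed ring.
georss_to_matrix <- function(txt) {
  # scan() splits on any whitespace, including the line breaks inside the element
  vals <- scan(text = txt, what = numeric(), quiet = TRUE)
  stopifnot(length(vals) %% 2 == 0)          # values must come in pairs
  m <- matrix(vals, ncol = 2, byrow = TRUE)  # each row is (lat, lon)
  m <- m[, c(2, 1), drop = FALSE]            # reorder to (lon, lat), i.e. (x, y)
  colnames(m) <- c("lon", "lat")
  # A polygon ring must end where it starts; append the first point if needed
  if (!all(m[1, ] == m[nrow(m), ])) {
    m <- rbind(m, m[1, , drop = FALSE])
  }
  m
}

# First three coordinate pairs from the snippet in the question
txt <- "23.18939277468125 77.57356245525989 23.18919469824253
77.5738162614824 23.18898075873695 77.57405868357523"
ring <- georss_to_matrix(txt)

# Build an sf object and export it with sf's own writer (st_write, not writeOGR)
if (requireNamespace("sf", quietly = TRUE)) {
  poly <- sf::st_sf(
    id = "raisengp.1",
    geometry = sf::st_sfc(sf::st_polygon(list(ring)), crs = 4326)  # assumed WGS84
  )
  out <- tempfile(fileext = ".shp")  # hypothetical output location
  sf::st_write(poly, out, quiet = TRUE)
}
```

For the full file, `st_read` already does the parsing, so the helper is only useful if you need the raw coordinates as a data frame. The key point is that an object created by sf should be written with `st_write`; `writeOGR` expects sp classes.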

Dharman
Deepthi