I have an OSM file that captures a small neighborhood. http://pastebin.com/xeWJsPeY
I have Python code that does a lot of extra parsing, but an example of the main problem can be seen here:
import xml.etree.cElementTree as CET

osmfile = open('osm_example.osm', 'r')
for event, elem in CET.iterparse(osmfile, events=('start',)):
    if elem.tag == 'way':
        if elem.get('id') == "21850789":
            for child in elem:
                print CET.tostring(child, encoding='utf-8')
    elem.clear()
Here, and elsewhere, I noticed that the tags for a specific entry are missing (where a tag is an element that looks like <tag k="highway" v="residential" />). All of the <nd .../> elements were read correctly, as far as I can see.
One other thing I noticed when processing the file: when I call tostring() on a 'way' element whose <tag .../> elements were not read, the output does not end with a newline. For example, when running
for event, elem in CET.iterparse(osmfile, events=('start',)):
    if elem.tag == 'way':
        print CET.tostring(elem, encoding='utf-8')
    elem.clear()
The output for an entry with missing <tag .../> elements is
<nd ref="235476200" />
<nd ref="1865868598" /></way><way changeset="12727901" id="21853023" timestamp="2012-08-14T15:23:13Z" uid="451048" user="bbmiller" version="8" visible="true">
<nd ref="1865868557" />
versus an entry that is read correctly:
<tag k="tiger:zip_left" v="60061" />
<tag k="tiger:zip_right" v="60061" />
</way>
<way changeset="15851022" id="21874389" timestamp="2013-04-24T16:33:28Z" uid="451693" user="bot-mode" version="3" visible="true">
<nd ref="235666887" />
<nd ref="235666891" />
What is going on here?
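For comparison, here is a minimal sketch of the same kind of loop listening for 'end' events instead of 'start'. The ElementTree documentation says that at a 'start' event the children of an element may or may not be present yet, and that a fully populated element is only guaranteed at 'end'. The inline XML fragment and the helper name children_at_end are made up for illustration and are not taken from my actual file. The sketch uses xml.etree.ElementTree and Python 3 syntax, since cElementTree is not available in recent Python versions.

```python
import io
import xml.etree.ElementTree as ET

# Made-up fragment for illustration only; not from the real OSM file.
SAMPLE = b"""<osm>
  <way id="1">
    <nd ref="10" />
    <nd ref="11" />
    <tag k="highway" v="residential" />
  </way>
</osm>"""


def children_at_end(source, way_id):
    """Return the child tag names of the <way> with the given id.

    Listens for 'end' events, so the element has been fully parsed
    (all <nd> and <tag> children attached) when it is inspected.
    Returns None if no matching <way> is found.
    """
    for event, elem in ET.iterparse(source, events=('end',)):
        if elem.tag == 'way' and elem.get('id') == way_id:
            return [child.tag for child in elem]
    return None


print(children_at_end(io.BytesIO(SAMPLE), "1"))
```

With 'end' events the <tag .../> child shows up alongside the <nd .../> children in this small fragment.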