
I have a text file and would like to extract the <gml:pos>73664.300 836542.700</gml:pos> elements from it. More precisely, I would like to get the GPS coordinates [73664.300 836542.700] from each pos tag. The file contains multiple <wfs:member> elements, and each of them has a <gml:pos> at the deepest level.

<?xml version='1.0' encoding='UTF-8'?>
<wfs:FeatureCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wfs/2.0 http://schemas.opengis.net/wfs/2.0/wfs.xsd http://www.opengis.net/gml/3.2 http://schemas.opengis.net/gml/3.2.1/gml.xsd http://www.deegree.org/app https://web.de/feature_descr?SERVICE=WFS&amp;VERSION=2.0.0&amp;REQUEST=DescribeFeatureType&amp;OUTPUTFORMAT=application%2Fgml%2Bxml%3B+version%3D3.2&amp;TYPENAME=app:lsa_data&amp;NAMESPACES=xmlns(app,http%3A%2F%2Fwww.deegree.org%2Fapp)" xmlns:wfs="http://www.opengis.net/wfs/2.0" timeStamp="2020-11-18T15:01:17Z" xmlns:gml="http://www.opengis.net/gml/3.2" numberMatched="unknown" numberReturned="0">
  <!--NOTE: numberReturned attribute should be 'unknown' as well, but this would not validate against the current version of the WFS 2.0 schema (change upcoming). See change request (CR 144): https://portal.opengeospatial.org/files?fact_id=6798.-->
  <wfs:member>
    <app:dat_set xmlns:app="http://www.deegree.org/app" gml:id="app:dat_set_1">
      <app:point>2</app:point>
      <app:art>K         </app:art>
      <app:L_Name>westt / woustest             </app:L_Name>
      <app:geom>
        <!--Inlined geometry 'data_1_APP_GEOM'-->
        <gml:MultiPoint gml:id="data_1_APP_GEOM" srsName="EPSG:25832">
          <gml:pointMember>
            <gml:Point gml:id="GEOMETRY_ad608059-f297-4554-8464-cdde248cb531" srsName="EPSG:25832">
              <gml:pos>73664.300 836542.700</gml:pos>
            </gml:Point>
          </gml:pointMember>
        </gml:MultiPoint>
      </app:geom>
    </app:dat_set>
  </wfs:member>
  <wfs:member>
    <app:dat_set xmlns:app="http://www.deegree.org/app" gml:id="app:dat_set_2">
      <app:point>3</app:point>
      <app:art>K         </app:art>
      <app:L_Name>route / riztr        </app:L_Name>
      <app:geom>
        <!--Inlined geometry 'data_2_APP_GEOM'-->
        <gml:MultiPoint gml:id="data_2_APP_GEOM" srsName="EPSG:25832">
          <gml:pointMember>
            <gml:Point gml:id="GEOMETRY_440d8630-b674-4768-a5b7-3fab46d9ac8c" srsName="EPSG:25832">
              <gml:pos>74354.900 837456.300</gml:pos>
            </gml:Point>
          </gml:pointMember>
        </gml:MultiPoint>
      </app:geom>
    </app:dat_set>
  </wfs:member>
  <wfs:member>
    ...
...

How can I get those GPS coordinates?
Thank you in advance.

Kyv

1 Answer


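You can use lxml with an XPath-style path expression. The sample below embeds the XML as bytes only so that the example is self-contained.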

data = b'''\
<?xml version='1.0' encoding='UTF-8'?>
<wfs:FeatureCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wfs/2.0 http://schemas.opengis.net/wfs/2.0/wfs.xsd http://www.opengis.net/gml/3.2 http://schemas.opengis.net/gml/3.2.1/gml.xsd http://www.deegree.org/app https://web.de/feature_descr?SERVICE=WFS&amp;VERSION=2.0.0&amp;REQUEST=DescribeFeatureType&amp;OUTPUTFORMAT=application%2Fgml%2Bxml%3B+version%3D3.2&amp;TYPENAME=app:lsa_data&amp;NAMESPACES=xmlns(app,http%3A%2F%2Fwww.deegree.org%2Fapp)" xmlns:wfs="http://www.opengis.net/wfs/2.0" timeStamp="2020-11-18T15:01:17Z" xmlns:gml="http://www.opengis.net/gml/3.2" numberMatched="unknown" numberReturned="0">
  <!--NOTE: numberReturned attribute should be 'unknown' as well, but this would not validate against the current version of the WFS 2.0 schema (change upcoming). See change request (CR 144): https://portal.opengeospatial.org/files?fact_id=6798.-->
  <wfs:member>
    <app:dat_set xmlns:app="http://www.deegree.org/app" gml:id="app:dat_set_1">
      <app:point>2</app:point>
      <app:art>K         </app:art>
      <app:L_Name>westt / woustest             </app:L_Name>
      <app:geom>
        <!--Inlined geometry 'data_1_APP_GEOM'-->
        <gml:MultiPoint gml:id="data_1_APP_GEOM" srsName="EPSG:25832">
          <gml:pointMember>
            <gml:Point gml:id="GEOMETRY_ad608059-f297-4554-8464-cdde248cb531" srsName="EPSG:25832">
              <gml:pos>73664.300 836542.700</gml:pos>
            </gml:Point>
          </gml:pointMember>
        </gml:MultiPoint>
      </app:geom>
    </app:dat_set>
  </wfs:member>
  <wfs:member>
    <app:dat_set xmlns:app="http://www.deegree.org/app" gml:id="app:dat_set_2">
      <app:point>3</app:point>
      <app:art>K         </app:art>
      <app:L_Name>route / riztr        </app:L_Name>
      <app:geom>
        <!--Inlined geometry 'data_2_APP_GEOM'-->
        <gml:MultiPoint gml:id="data_2_APP_GEOM" srsName="EPSG:25832">
          <gml:pointMember>
            <gml:Point gml:id="GEOMETRY_440d8630-b674-4768-a5b7-3fab46d9ac8c" srsName="EPSG:25832">
              <gml:pos>74354.900 837456.300</gml:pos>
            </gml:Point>
          </gml:pointMember>
        </gml:MultiPoint>
      </app:geom>
    </app:dat_set>
  </wfs:member>
</wfs:FeatureCollection>
'''

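from io import BytesIO

from lxml import etree

# Wrap the sample bytes in a file-like object so etree.parse() can read them.
f = BytesIO(data)

# Map the "gml" prefix to the namespace URI declared in the document.
ns = {"gml": "http://www.opengis.net/gml/3.2"}
tree = etree.parse(f)

# ".//gml:pos" matches every gml:pos element at any depth below the root.
for e in tree.findall(".//gml:pos", namespaces=ns):
    print(e.text)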
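
If you need numbers rather than the raw string, each `gml:pos` text can be split on whitespace and converted. Here is a minimal sketch, assuming two space-separated values per element as in the sample. It uses the standard library's `xml.etree.ElementTree` instead of lxml (the prefix map is passed the same way), and `parse_positions` and `SAMPLE` are names I made up for illustration. Which value is easting and which is northing is not stated in the question, so the tuples simply keep the order found in the file.

```python
import xml.etree.ElementTree as ET

# Prefix map: "gml" must point at the namespace URI declared in the document.
GML_NS = {"gml": "http://www.opengis.net/gml/3.2"}

# Trimmed-down stand-in for the file from the question (assumption: same nesting idea).
SAMPLE = b"""<?xml version='1.0' encoding='UTF-8'?>
<wfs:FeatureCollection xmlns:wfs="http://www.opengis.net/wfs/2.0"
                       xmlns:gml="http://www.opengis.net/gml/3.2">
  <wfs:member>
    <gml:Point gml:id="p1" srsName="EPSG:25832">
      <gml:pos>73664.300 836542.700</gml:pos>
    </gml:Point>
  </wfs:member>
  <wfs:member>
    <gml:Point gml:id="p2" srsName="EPSG:25832">
      <gml:pos>74354.900 837456.300</gml:pos>
    </gml:Point>
  </wfs:member>
</wfs:FeatureCollection>
"""


def parse_positions(xml_bytes):
    """Return one tuple of floats per gml:pos element, in document order."""
    root = ET.fromstring(xml_bytes)
    coords = []
    for pos in root.iterfind(".//gml:pos", GML_NS):
        # "73664.300 836542.700" -> (73664.3, 836542.7)
        coords.append(tuple(float(value) for value in pos.text.split()))
    return coords


print(parse_positions(SAMPLE))
```

The result is a plain list of tuples, so it can be passed on to whatever does the further processing.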


  • Thank you so much for your reply, @Justin. It works fine. However, the text file is huge, so I cannot copy and paste it into a Python script. So I added `data = b'''\` at the beginning and ` ''' ` at the end of the data as you did, then tried to open it with `data_set = open("path_data/col_data.txt","a")`, and I get the error message `TypeError: a bytes-like object is required, not '_io.TextIOWrapper'` at `f = BytesIO(data_set)`. Is there a way to handle this? Thanks – Kyv Nov 18 '20 at 18:05
  • The data does not need to be included in your Python script. I embedded it only so that I would not have to create a separate file for this answer; it was for demonstration purposes. –  Nov 18 '20 at 18:25
  • OK, I see. However, I get an error while reading the file with `data_set = open("path_data/col_data.txt","r")`. `data_set` outputs `<_io.TextIOWrapper name='path_data/col_data.txt' mode='r' encoding='UTF-8'>`. Could you please help me with that as well? – Kyv Nov 18 '20 at 18:34
  • Have you tried `tree = etree.parse(data_set)` after you do `data_set = open("path_data/col_data.txt","r")`? –  Nov 18 '20 at 18:37
  • See https://lxml.de/tutorial.html#parsing-from-strings-and-files, in particular, the section about "The parse() function" –  Nov 18 '20 at 18:39
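
To spell out what the comments above are getting at: `BytesIO` was only needed to turn the inline bytes into something file-like, so for a real file you can hand the path straight to `etree.parse()`. Below is a minimal sketch under that assumption. `read_positions` is a hypothetical helper name, and the `try`/`except` import is there only so the sketch also runs where lxml is not installed (the standard library module accepts the same calls used here).

```python
try:
    from lxml import etree  # the library used in the answer
except ImportError:
    # Assumption: fall back to the stdlib module, which supports the same
    # parse()/findall() calls used below.
    import xml.etree.ElementTree as etree

GML_NS = {"gml": "http://www.opengis.net/gml/3.2"}


def read_positions(path):
    """Return the text of every gml:pos element in the XML file at `path`."""
    # Passing the path lets the parser open the file itself and honour the
    # encoding declared in the XML prolog; no BytesIO or open() is needed.
    tree = etree.parse(path)
    return [pos.text for pos in tree.findall(".//gml:pos", namespaces=GML_NS)]


# Usage with the path mentioned in the comments:
# for text in read_positions("path_data/col_data.txt"):
#     print(text)
```

If you do want to open the file yourself, binary mode (`"rb"`) is the safer choice, since the parser then decodes the bytes according to the XML declaration.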