
I have an eye-tracker that uses TCP/IP and XML to exchange data between a client (my application) and a server (the eye-tracker). Below is an example of the XML data I receive continuously while the eye-tracker is on. I would like to use the FPOGX and FPOGY values as input to another function I have. The problem is that they are attributes inside an XML string, not variables, so I can't simply refer to them by name. How do I parse this data stream? This is the first time I've worked with XML, so examples would be greatly appreciated. Thanks!

CLIENT SEND: <SET ID="ENABLE_SEND_COUNTER" STATE="1" />
SERVER SEND: <ACK ID="ENABLE_SEND_COUNTER" STATE="1" />
CLIENT SEND: <SET ID="ENABLE_SEND_POG_FIX" STATE="1" />
SERVER SEND: <ACK ID="ENABLE_SEND_POG_FIX" STATE="1" />
CLIENT SEND: <SET ID="ENABLE_SEND_DATA" STATE="1" />
SERVER SEND: <ACK ID="ENABLE_SEND_DATA" STATE="1" />
SERVER SEND: <REC CNT="72" FPOGX="0.5065" FPOGY="0.4390"
FPOGD="0.078" FPOGID="468" FPOGV="1"/>
SERVER SEND: <REC CNT="73" FPOGX="0.5071" FPOGY="0.4409"
FPOGD="0.094" FPOGID="468" FPOGV="1"/>
SERVER SEND: <REC CNT="74" FPOGX="0.5077" FPOGY="0.4428"
FPOGD="0.109" FPOGID="468" FPOGV="1"/>

Here is a snippet of some parts of the code:

import xml.etree.cElementTree as ET
import cv2
import cv
import socket

# Code to grab different data from eye-tracker
'...'
# Code to create window and initialize camera
'...'
def xmlParse():
    rxdat = s.recv(1024)  # Syntax from eye-tracker to grab XML data stream of <REC />
    if(rxdat.find("ACK") == 1):  # First two XML have the <ACK /> tag but I don't need those
        pass
    else: # Here is the part where it parses and converts the data to float
        rxdat = '<data>' + rxdat + '</data>' 
        xml = ET.fromstring(rxdat)
        for element in xml:
            X = float(xml[0].attrib['FPOGX'])
            Y = float(xml[0].attrib['FPOGY'])
        return (X, Y)

# Def to average samples of incoming X and Y
'...'
# Def that uses xmlParse() and average() to return the averages of X and Y
'...'
# Def for mouse click events
'...'
# Some code that makes our window graphics
'...'
for i in range(0,2):    # Round-about way to get rid of the first two "NoneType"
    xmlParse()

while True:
    Img = cv.QueryFrame(capture) # capture defined earlier
    drawarrow(polyF, polyB, polyL, polyR) # Our window graphics definition
    cv.ShowImage("window", Img)
    (X, Y) = gazeCoordinates() # Def that uses xmlParse and average to return the averages of X and Y
    if cv.WaitKey(20) & 0xFF == 27:
        break

cv2.destroyAllWindows()

The error is ParseError: not well-formed (invalid token), and the traceback points to the line xml = ET.fromstring(rxdat).

The xmlParse() function works on its own when I just print the results. But once I add the window, the graphics, and the code that uses the data, it starts raising that error.

1 Answer

Assuming you don't need to parse all of the text above at once (taken together it is not well-formed XML) but rather a single XML element at a time, I would suggest trying something like the following. You end up with a dictionary of attributes, which contains the key/value pairs you are after. Note that the values are strings.

>>> import xml.etree.cElementTree as ET
>>> xml_string = '<REC CNT="72" FPOGX="0.5065" FPOGY="0.4390" FPOGD="0.078" FPOGID="468" FPOGV="1"/>'
>>> xml = ET.fromstring(xml_string)
>>> xml.attrib  # a dict
{'CNT': '72', 'FPOGV': '1', 'FPOGY': '0.4390', 'FPOGX': '0.5065', 'FPOGD': '0.078', 'FPOGID': '468'}
>>> xml.attrib['FPOGX'], xml.attrib['FPOGY']
('0.5065', '0.4390')

For more detail, see the Python documentation for xml.etree.ElementTree.
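Since the attribute values come back as strings, they need converting before they can be passed to a numeric function. Here is a minimal sketch of that step, written for Python 3 with `xml.etree.ElementTree`; the helper name `gaze_point` is made up for illustration and is not part of the eye-tracker API.

```python
import xml.etree.ElementTree as ET


def gaze_point(record):
    """Parse one <REC .../> string and return (FPOGX, FPOGY) as floats.

    `gaze_point` is a hypothetical helper name used only for this example.
    """
    element = ET.fromstring(record)
    return float(element.attrib["FPOGX"]), float(element.attrib["FPOGY"])


record = ('<REC CNT="72" FPOGX="0.5065" FPOGY="0.4390" '
          'FPOGD="0.078" FPOGID="468" FPOGV="1"/>')
x, y = gaze_point(record)
print(x, y)  # x and y are now ordinary floats usable by other functions
```

This assumes the string holds exactly one complete element; a missing attribute raises `KeyError`.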

EDIT

Regarding your comment, you could try to wrap your string in an xml element before parsing it in order to code around any junk that might be included after (or before) the xml. For example, you could try this (note the "junk" that I have added to the end of the first xml string):

>>> xml_string = '<REC CNT="72" FPOGX="0.5065" FPOGY="0.4390" FPOGD="0.078" FPOGID="468" FPOGV="1"/>here is some junk that should not be here and that does not fit into xml.'
>>> xml_string = '<data>' + xml_string + '</data>'  # makes sure that the xml has an outer tag
>>> xml = ET.fromstring(xml_string)
>>> for element in xml:  # now need to iterate through <data> tag
...     print(element.attrib)  # a dict
...
{'CNT': '72', 'FPOGV': '1', 'FPOGY': '0.4390', 'FPOGX': '0.5065', 'FPOGD': '0.078', 'FPOGID': '468'}
>>> xml[0].attrib['FPOGX'], xml[0].attrib['FPOGY']  # or you can find attributes by indices (like a list)
('0.5065', '0.4390')
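The wrapper approach can also be used to pick out only the `<REC>` elements, so that `<ACK>` replies in the same chunk are skipped without a separate string test. This is a rough sketch in Python 3; the function name `rec_points` and the newline separators in the sample chunk are assumptions, not something the eye-tracker documentation is known to specify.

```python
import xml.etree.ElementTree as ET


def rec_points(chunk):
    """Return a list of (x, y) floats, one per <REC/> element in `chunk`.

    `rec_points` is a hypothetical helper name for this example.
    """
    # Wrap the chunk so that several sibling elements form one document.
    root = ET.fromstring("<data>" + chunk + "</data>")
    # findall("REC") matches only direct <REC> children of <data>,
    # so <ACK> elements and text between elements are ignored.
    return [(float(rec.get("FPOGX")), float(rec.get("FPOGY")))
            for rec in root.findall("REC")]


# Sample chunk; the "\n" separators are assumed for illustration.
chunk = (
    '<ACK ID="ENABLE_SEND_DATA" STATE="1" />\n'
    '<REC CNT="72" FPOGX="0.5065" FPOGY="0.4390" FPOGD="0.078" FPOGID="468" FPOGV="1"/>\n'
    '<REC CNT="73" FPOGX="0.5071" FPOGY="0.4409" FPOGD="0.094" FPOGID="468" FPOGV="1"/>\n'
)
points = rec_points(chunk)
print(points)
```

The chunk still has to contain only complete elements; a record that is cut off partway will raise `ParseError` even inside the wrapper.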

EDIT 2

Your Python looks just fine. The problem is with a character (or characters) that you are receiving in the xml string. (The <data></data> element is also fine.) You could figure out which token is giving you trouble by replacing this:

xml = ET.fromstring(rxdat)

with this:

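try:
    xml = ET.fromstring(rxdat)
except ET.ParseError:
    print(repr(rxdat))  # show the exact string that failed to parse, including non-printing characters
    raise  # re-raise so the code below does not run with xml undefined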
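The same test can be made a little more informative by using the `position` attribute of `ParseError`, which holds the line and column where the parser gave up. A small Python 3 sketch follows; `describe_parse_error` is a name invented for this example.

```python
import xml.etree.ElementTree as ET


def describe_parse_error(text):
    """Return None if `text` parses, else (message, (line, column), repr of text).

    `describe_parse_error` is a hypothetical helper name for this example.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the failure.
        return str(err), err.position, repr(text)
    return None


good = '<data><REC CNT="72" FPOGX="0.5065"/></data>'
truncated = '<data><REC CNT="72" FPOGX="0.50</data>'  # record cut off mid-attribute

print(describe_parse_error(good))
print(describe_parse_error(truncated))
```

Comparing the reported column with the printed string shows which character the parser rejected.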

You may need to escape a character or group of characters, depending on what you find out from this test.
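One possible cause that this thread does not confirm: TCP delivers a byte stream rather than whole messages, so a single `recv(1024)` can return part of a record or several records at once, and a record cut off partway is not well-formed XML. If the test above shows truncated strings, buffering the incoming text and parsing only complete `<REC .../>` elements may help. The sketch below is Python 3 and makes assumptions: the class name `RecordBuffer` is made up, and it assumes every record is a self-closing `<REC .../>` element as in the sample output.

```python
import re
import xml.etree.ElementTree as ET

# Matches one complete self-closing <REC .../> element (assumed record format).
REC_PATTERN = re.compile(r"<REC\b[^<>]*/>")


class RecordBuffer:
    """Hypothetical helper: accumulates received text, yields complete records."""

    def __init__(self):
        self._pending = ""

    def feed(self, text):
        """Add newly received text; return (x, y) floats for each complete <REC/>."""
        self._pending += text
        points = []
        last_end = 0
        for match in REC_PATTERN.finditer(self._pending):
            rec = ET.fromstring(match.group(0))
            points.append((float(rec.get("FPOGX")), float(rec.get("FPOGY"))))
            last_end = match.end()
        # Drop everything already parsed, then keep only a possibly
        # incomplete element that starts at the last "<".
        tail = self._pending[last_end:]
        start = tail.rfind("<")
        self._pending = tail[start:] if start != -1 else ""
        return points


buffer = RecordBuffer()
# Simulate one record arriving split across two reads.
first = buffer.feed('<ACK ID="ENABLE_SEND_DATA" STATE="1" />\n<REC CNT="72" FPOGX="0.50')
second = buffer.feed('65" FPOGY="0.4390" FPOGD="0.078" FPOGID="468" FPOGV="1"/>\n<REC CNT="73"')
print(first, second)
```

With a real socket, the text passed to `feed` would come from something like `s.recv(1024).decode("ascii", errors="replace")`, where `s` is the socket from the question and the ASCII encoding is an assumption.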

Justin O Barber
  • This would give me the error `Traceback (most recent call last): File "C:\Users\Jenny\Desktop\Team Design\GazeXMLtest.py", line 26, in xml = ET.fromstring(rxdat) File "", line 124, in XML ParseError: junk after document element: line 2, column 0` most of the times. Sometimes it will run perfectly and then sometimes it would run and give the error at a later time. Any ideas why this is happening? – user3121062 Jan 18 '14 at 22:33
  • @user3121062 You can try to add a wrapper around the data before you parse the xml. I will update the answer with an example in just a moment. – Justin O Barber Jan 18 '14 at 22:42
  • @user3121062 I updated the answer, which might help catch the "junk". – Justin O Barber Jan 18 '14 at 22:50
  • Thanks for responding super fast. That worked perfectly, until we combined it to our main code and then it started giving this error `ParseError: not well-formed (invalid token)` most of the times. We are using it along with opencv, might that have something to do with it? – user3121062 Jan 19 '14 at 01:44
  • @user3121062 Can you post the relevant code (where you are adding this data to your main script) to your question? – Justin O Barber Jan 19 '14 at 02:22
  • Added a snippet of the code. Let me know if any of those that I commented out would be relevant to the error. – user3121062 Jan 19 '14 at 03:23
  • @user3121062 I added a second edit to my answer. I offered a means for troubleshooting the xml. The Python code is fine; the problem is with the xml, and you can probably discover that problem using the test I suggest above in edit 2. – Justin O Barber Jan 19 '14 at 03:27
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/45560/discussion-between-user3121062-and-275365) – user3121062 Jan 19 '14 at 03:52