Reading XML on Python 2.1

Question

I'm trying to process an XML file in Python.

I'm using minidom because I'm on Python 2.1 and there's no change to updating to 3.6. Currently, I have this:

import xml.dom.minidom as minidom
import socket
print 'Getting the xml file'
# Get the xml contents
file = open('<filepath>')
#print file
# Get the root of the configuration file
print 'Parsing the xml'
procs = minidom.parse(file)

But I'm getting this error

Any ideas? Or, better yet, is there another way to parse XML without having to write my own parser?

Can you install 3rd-party modules? It'd be interesting to look into compatibility with old versions of lxml... — Charles Duffy, Sep 15 '16 at 21:40
Did you name any of your own files anything that might conflict with a built-in module name? — user2357112, Sep 15 '16 at 21:47
At the very least, is updating to 2.1.3 an option? Your line numbers don't seem to match the line numbers in the 2.1 repository branch, which probably indicates you're not even on 2.1.3. (That, or you've edited the standard library files, which would be even worse.) — user2357112, Sep 15 '16 at 21:56
@PadraicCunningham Well, when retrieving the Python version, 2.1 is the only thing that's shown — CJLopez, Sep 16 '16 at 12:20
@cricket_007 Nope. Given that this script is run on the client's server, there's no chance of upgrading their Python version, but I'll pitch it to them — CJLopez, Sep 16 '16 at 12:22
@user2357112 Well, I'll try to pitch it to them, but I wouldn't hold out much hope, because I installed 2.1.3 on my own machine to match the server's version, and in my dev environment it works flawlessly — CJLopez, Sep 16 '16 at 12:23
@CJLopez, I'm not sure you *did* respond to my question. You indicated that you can't install a newer Python interpreter, but didn't say anything either way about 3rd-party modules. — Charles Duffy, Sep 16 '16 at 12:48
@CharlesDuffy My bad, Charles, but I don't think I'm allowed to. Besides, all the 3rd-party XML parsers I found required Python 2.7 or newer, and as I stated, updating Python on the servers doesn't look viable, given that there are hundreds of Python scripts used daily and they don't want to risk an update breaking them — CJLopez, Sep 16 '16 at 12:53
@CJLopez, that's why I suggested installing an **old version** of a 3rd-party XML parser. Yes, current releases will require 2.7, but 3rd-party XML parsers existed before Python 2.7 did, and their code hasn't disappeared. — Charles Duffy, Sep 16 '16 at 12:57
@CharlesDuffy Really? If you can point me to any of those, I might convince my client and their IT department to let me use them. I haven't been able to find any. Looks like I need to improve my Google-fu — CJLopez, Sep 16 '16 at 13:48
so, the place I'd start is by taking your really longstanding XML libraries, looking at when they were introduced (or hit 1.0), and which Python interpreter releases they supported at that time. There's a bit of digging required, to be sure. — Charles Duffy, Sep 16 '16 at 13:50
Looks to me like lxml 2.0 requires Python 2.2, whereas lxml 1.3 doesn't specify that as a requirement -- so I'd suggest taking a close look at 1.3. — Charles Duffy, Sep 16 '16 at 17:27
"and there's no change to updating to 3.6" I can't understand what this is supposed to mean. Was "change to" supposed to be "chance of"? — Karl Knechtel, Aug 14 '22 at 18:46
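The two checks raised in the comments above (which exact interpreter is running, and whether a local file is shadowing a standard library module) can be sketched roughly as follows. This is only an illustration, not something from the thread: the helper name `describe_environment` is hypothetical, it assumes the `minidom` import itself succeeds, and some interpreters may not expose `__file__` on every module, hence the fallback.

```python
import sys
import xml.dom.minidom as minidom


def describe_environment():
    """Return the interpreter version string and where minidom was loaded from."""
    # Fall back to a placeholder if this interpreter does not expose __file__.
    location = getattr(minidom, '__file__', '<no __file__ attribute>')
    return sys.version, location


version, location = describe_environment()
print(version)   # full version string, more detailed than just "2.1"
print(location)  # a path inside your own project would suggest a name conflict
```

If the printed path points at a file in the script's own directory rather than the standard library, a local module is probably shadowing the built-in one.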

score 2 · Accepted Answer · answered Sep 16 '16 at 18:08

So, I was able to get this working.

For starters, after trying to convince them to either update Python or install a plugin, I was told that all Python scripts are run on Jython, which means I have several Java libraries at my disposal (I wish they had told me this sooner).

So, after some investigation into XML processing on Jython, I found that using Xerces and SAX was the key.

This is the code I finally used, in case anyone would like to know:

from java.io import StringReader

import org.xml.sax as sax
import org.apache.xerces.parsers.DOMParser as domparser

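# Read the whole XML file into a string
xml_file = open('<path to file>')
document = xml_file.read()
xml_file.close()

# Parse the string with the Xerces DOM parser
parser = domparser()
parser.reset()
documentIS = sax.InputSource(StringReader(document))
parser.parse(documentIS)
domtree = parser.getDocument()

# Read one attribute from every element with the given tag name
results = domtree.getElementsByTagName('<tag name>')
for ix in range(results.getLength()):
    item = results.item(ix).getAttribute("<attribute name>")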

Hope someone else finds this useful.
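
For readers on a CPython interpreter where minidom does load, the same tag-and-attribute lookup can be sketched with the standard library alone. This is a minimal sketch and not the code from the answer: the sample XML, the `proc` and `name` names, and the helper `attribute_values` are all hypothetical, and it assumes a modern Python 3 rather than 2.1 or Jython.

```python
from xml.dom import minidom

# Hypothetical sample document; the tag and attribute names are made up.
SAMPLE_XML = '<procs><proc name="alpha"/><proc name="beta"/></procs>'


def attribute_values(xml_text, tag_name, attribute_name):
    """Collect one attribute from every element with the given tag name."""
    document = minidom.parseString(xml_text)
    values = []
    for element in document.getElementsByTagName(tag_name):
        # getAttribute returns an empty string when the attribute is missing.
        values.append(element.getAttribute(attribute_name))
    return values


names = attribute_values(SAMPLE_XML, 'proc', 'name')
print(names)
```

The DOM calls (`getElementsByTagName`, `getAttribute`) mirror the ones used through Xerces above, so the traversal logic carries over between the two approaches even though the parser setup differs.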
