
I am trying to read a directory tree and write it to an XML file, without much success:

# -*- coding: utf-8 -*-
"""
Created on Tue Jan 31 13:30:22 2012

@author: Jean-Patrick Pommier
"""
import lxml.etree as et
import os
'''
Read the tree of a project directory
                    P
                  / | \
                 A  B  C
               / |\ |\  |\
              a  b ca b c e
Store it in an XML file
<P>
    <A>
        <a>
        <b>
        <c>
    </A>
    <B>
        <a>
        <b>
    </B>
    <C>
        <c>
        <e>
    </C>
</P>
'''
def makeNodes(parentxml, leveldirlist):
    # Append one child element per directory name to the given parent.
    print('children', leveldirlist)
    for d in leveldirlist:
        child = et.Element(d)
        parentxml.append(child)

if __name__ == '__main__':
    topdir = '/home/claire/Applications/ProjetPython/testxml/biblio'
    projetxml = et.Element('Project')  # root element
    parent = projetxml

    for roots, dirs, files in os.walk(topdir):
        print(roots)
        makeNodes(parent, dirs)

    # encoding='unicode' returns text rather than bytes, so it prints cleanly.
    print(et.tostring(projetxml, pretty_print=True, encoding='unicode'))

All the subdirectories become children of the root:

<Project>
  <Roman/>
  <Cuisine/>
  <Essais/>
  <Science/>
  <r20s/>
  <r19s/>
  <Amerique/>
  <France/>
  <Asie/>
  <Religion/>
  <Politique/>
  <maths/>
  <physique/>
</Project>

whereas Amerique, France and Asie should be subdirectories of Cuisine.

Thank you for your help. Jean-Patrick

Jean-Pat
  • Just as a note, valid folder names and valid XML tag names are not the same: '4' is a valid folder name, but an XML tag can't start with a digit. The same goes for names that begin with most punctuation or with 'xml', or that contain spaces. This might need to be considered. – Gareth Latty Feb 05 '12 at 20:39
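
To illustrate the comment above: one way to sidestep the mismatch is to keep a fixed tag and store the folder name in an attribute. This is only a minimal sketch. The helper `dir_element` and the `dir`/`name` tag and attribute pair are hypothetical choices that do not appear in the original post, and it uses the standard library's `xml.etree.ElementTree` so that it runs without lxml (`lxml.etree.Element` accepts the same arguments for this call).

```python
import xml.etree.ElementTree as ET


def dir_element(folder_name):
    """Return an element for a folder without using its name as the tag.

    Hypothetical helper: the fixed 'dir' tag and the 'name' attribute are
    illustrative choices, not something the original post prescribes.
    """
    # Attribute values can hold spaces, leading digits or a leading 'xml',
    # none of which are allowed in an XML tag name.
    return ET.Element('dir', name=folder_name)


root = ET.Element('Project')
for folder in ['4', 'my folder', 'xml-notes']:
    root.append(dir_element(folder))

print(ET.tostring(root, encoding='unicode'))
```

The trade-off is that queries then have to match on the attribute (for example `root.findall("dir[@name='4']")`) instead of on the tag.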

1 Answer


You need to keep track of the element created for each folder, so that every directory visited by os.walk is appended to its own parent's element rather than to the root.

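# -*- coding: utf-8 -*-
import lxml.etree as et
import os


def makeNodes(current, parents, leveldirlist):
    """Create one element per subdirectory of `current` and attach it to the element of `current`."""
    new = {}
    for d in leveldirlist:
        child = et.Element(d)
        # Store the element under the path os.walk will later report for this subdirectory.
        new[os.path.join(current, d)] = child
        # This must stay inside the loop: `child` is undefined when leveldirlist is empty.
        parents[current].append(child)
    return new


if __name__ == '__main__':
    topdir = 't1'
    projectxml = et.Element('Project')

    # Map each directory path to its XML element, starting with the root.
    parents = {topdir: projectxml}
    for current, dirs, files in os.walk(topdir):
        parents.update(makeNodes(current, parents, dirs))

    # encoding='unicode' returns text rather than bytes, so it prints cleanly.
    print(et.tostring(projectxml, pretty_print=True, encoding='unicode'))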

This produced:

<Project>
  <t2>
    <t6/>
  </t2>
  <t3>
    <t5/>
    <t4>
      <t7/>
    </t4>
  </t3>
</Project>
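
For anyone who wants to try the idea without an existing `t1` folder, here is a self-contained sketch of the same path-to-element bookkeeping. It makes a few assumptions that are not in the answer: the function name `tree_to_xml` is hypothetical, it uses the standard library's `xml.etree.ElementTree` instead of lxml so that nothing needs installing, it sorts directory names to get a stable order, and it builds a throwaway tree with folder names borrowed from the question.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def tree_to_xml(topdir, root_tag='Project'):
    """Mirror the directory tree under topdir as nested XML elements."""
    root = ET.Element(root_tag)
    # Map every directory path reported by os.walk to its XML element.
    parents = {topdir: root}
    for current, dirs, _files in os.walk(topdir):
        for d in sorted(dirs):
            # Attach each subdirectory to the element of the directory
            # that contains it, then register it as a future parent.
            parents[os.path.join(current, d)] = ET.SubElement(parents[current], d)
    return root


# Build a small throwaway tree to try the function on.
with tempfile.TemporaryDirectory() as top:
    for path in ('Cuisine/France', 'Cuisine/Asie', 'Roman'):
        os.makedirs(os.path.join(top, *path.split('/')))
    project = tree_to_xml(top)

print(ET.tostring(project, encoding='unicode'))
```

This relies on os.walk visiting directories top-down (its default), so a parent's element is always registered before its children are visited.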
Gareth Latty
  • @Jean-Pat No worries, feel free to accept the answer if you've found it solves your problem. – Gareth Latty Feb 05 '12 at 21:01
  • I get an error when I replace topdir='t1' with topdir='/home/claire/Applications/ProjetPython/testlmx':
        Traceback (most recent call last):
          File "/home/claire/Applications/ProjetPython/testlmx/correction-dirToxml.py", line 25, in <module>
            parents.update(makeNodes(current, parents, dirs))
          File "/home/claire/Applications/ProjetPython/testlmx/correction-dirToxml.py", line 16, in makeNodes
            parents[current].append(child)
        UnboundLocalError: local variable 'child' referenced before assignment
    – Jean-Pat Feb 06 '12 at 11:12
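
The traceback above is consistent with the `parents[current].append(child)` line sitting outside the `for` loop in the copied script. This is an assumption, since the modified file is not shown. In that case a directory without subdirectories never assigns `child`. A minimal sketch that reproduces the error, using a plain list in place of an XML element and the hypothetical name `make_nodes_dedented`:

```python
def make_nodes_dedented(children, leveldirlist):
    """Hypothetical mis-indented variant: the append sits outside the loop."""
    for d in leveldirlist:
        child = d.upper()
    # With an empty leveldirlist the loop body never runs, so `child`
    # was never assigned when this line executes.
    children.append(child)


try:
    make_nodes_dedented([], [])  # a leaf directory has no subdirectories
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
```

If that is what happened, indenting the append so that it runs once per subdirectory, inside the loop, should avoid the error.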