
I am trying to parse the sample XML file below to get some data out of it:

<Benchmark xmlns="http://checklists.nist.gov/xccdf/1.1" xmlns:xsi="www.w3.org/2001/XMLSchema-instance" id="SAP-HANA" resolved="1" xml:lang="en-US">
 <status date="2016-03-17">draft</status>
 <title xmlns:xhtml="http://www.w3.org/1999/xhtml" xml:lang="en-US">Guide to the Secure Configuration of SAP HANA</title>
 <version>0.1.28</version>

 <Profile id="profile1">
    <title xmlns:xhtml="http://www.w3.org/1999/xhtml" xml:lang="en-US">text1</title>
    <select idref="This is rule 1" selected="true"/>
    <set-value idref="ssfs_master_key_timeout">20</set-value>
 </Profile> 

 <Profile id="profile2">
    <title xmlns:xhtml="http://www.w3.org/1999/xhtml" xml:lang="en-US">text2</title>
    <select idref="this is rule1" selected="true"/>
    <select idref="this is rule1" selected="true"/>
    <select idref="this is rule1" selected="true"/>
 </Profile>
</Benchmark>

From this XML file I need to get all the profiles (profile1, profile2, ...) and, for each profile, the text content of its title tag. I am trying to achieve something like this:

for all profile in XML{
    get its attribute "id".
    get its <title> tag's text content.
}

For example below is the expected output:

profile1
text1
profile2
text2  // but my code always prints text1 here; I don't know where to place [i]

I am able to get the id, but not the text content of its title tag. Here is my code:

var fs = require('fs'); 
var et = require('elementtree'); 
var XML = et.XML;
var ElementTree = et.ElementTree;
var element = et.Element;
var subElement = et.SubElement;

var data, etree;

data = fs.readFileSync('my.xml').toString();
etree = et.parse(data);
var length = etree.findall('./Profile').length;
for (var i = 0; i < length; i++) {
    console.log(etree.findall('./Profile')[i].get('id'));
    console.log(etree.findtext('./Profile/title'));  // don't know where to place [i]

//  var profile = etree.findall('./Profile')[i].get('id');
//  console.log(etree.findtext('./Profile'[@id=’profile’]'/title'));

    //console.log(etree.findall('./Profile'[i]'/title'));
    //console.log(list[i]);
}
Hemant Yadav

1 Answer


You can get the text like this:

console.log(etree.findall('./Profile')[i].find('title').text);

But I would also refactor the code a bit so that .findall is not called multiple times:

var profiles = etree.findall('./Profile');

for (var i = 0; i < profiles.length; i++) {
 var profile = profiles[i];

 console.log(profile.get('id'));
 console.log(profile.find('title').text);
}
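
As a side note, the original etree.findtext('./Profile/title') always printed text1 because findtext returns the text of the first match only, and when it is called on the whole tree the first match is always the title of profile1. Calling it on each profile element scopes the search to that profile. Below is a minimal sketch of that idea, assuming the elementtree npm package is installed; listProfiles and listProfilesFromXml are hypothetical helper names of my own, not part of the library:

```javascript
// Hypothetical helper: collects { id, title } for every <Profile> directly
// under the root. `tree` is expected to be the object returned by et.parse().
function listProfiles(tree) {
    var profiles = tree.findall('./Profile');
    var result = [];

    for (var i = 0; i < profiles.length; i++) {
        var profile = profiles[i];
        result.push({
            id: profile.get('id'),
            // findtext is called on this profile, so it only sees its own <title>
            title: profile.findtext('title')
        });
    }
    return result;
}

// Hypothetical convenience wrapper: parses an XML string, then lists its profiles.
function listProfilesFromXml(xml) {
    var et = require('elementtree'); // assumes `npm install elementtree`
    return listProfiles(et.parse(xml));
}

// Usage, with the file name from the question:
//   var fs = require('fs');
//   var xml = fs.readFileSync('my.xml').toString();
//   listProfilesFromXml(xml).forEach(function (p) {
//       console.log(p.id);
//       console.log(p.title);
//   });
```

Returning plain objects instead of logging inside the loop is just a design choice here; it keeps the lookup separate from the printing, which may help if you need to do more with the profiles later.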

Hope this helps.

Antonio Narkevich
  • Hi Antonio, I ran into one more problem with the elementtree module. Can you suggest a good Node.js module for the problem asked in the link below? I may need to dive deeper into XML parsing later and could run into problems again. Or is there any good documentation for the elementtree module? http://stackoverflow.com/questions/42553153/nodejs-elementtree-npm-xml-parsing-and-merging – Hemant Yadav Mar 02 '17 at 10:46
  • @HemantYadav Answered that question. I would go for xml2js. I have not tried it myself, but it has about 2.3k stars on GitHub, whereas elementtree has only 88. – Antonio Narkevich Mar 03 '17 at 07:34
  • Hi @Antonio Narkevich, could you help with this question? https://stackoverflow.com/questions/45104116/nodejs-elementtree-npm-doesnt-handle-comments-in-xml – Hemant Yadav Jul 14 '17 at 13:25