I have this simple but poorly formatted HTML page, mistakes included:

<HTML>
<head>
  <title>Official game sheet</title>
</head>
<body class="sheet">
</BODY>
</HTML>

I tried to apply the XPath expression //title to the document parsed from this HTML:

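// Imports assumed from the names used below (xmlser is taken to be the xmlserializer package).
const parse5 = require('parse5');
const xmlser = require('xmlserializer');
const dom = require('xmldom').DOMParser;
const xpath = require('xpath');

// xmlString holds the HTML shown above.
const document = parse5.parse(xmlString);
const xhtml = xmlser.serializeToString(document);
const doc = new dom().parseFromString(xhtml);
const select = xpath.useNamespaces({
  "x": "http://www.w3.org/1999/xhtml"
});
const nodes = select("//title", doc);
console.log(nodes);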

I tried the solution from here without success: the returned node list is empty.

Here you can see the problem.

neptune

2 Answers

Here you go, @neptune: you don't need parse5 or xmlser. All you need is xpath and xmldom.

const xpath = require('xpath');
const dom = require('xmldom').DOMParser;

// Same markup as in the question, plus a <custom> element to query.
const xmlString = `
<HTML>
<head>
  <title>Official game sheet</title>
  <custom>Here we are</custom>
</head>
<body class="sheet">
</BODY>
</HTML>`;

// Parse the string directly with xmldom; parse5 and xmlser are not needed.
const doc = new dom().parseFromString(xmlString);
// The markup declares no namespace, so a plain name test matches.
const nodes = xpath.select("//custom", doc);

console.log(nodes[0].localName + ": " + nodes[0].firstChild.data);
console.log("Node: " + nodes[0].toString());
Joseph

Correct these lines so the query uses the x: prefix, which gets the title:

const nodes = select("//x:title//text()", doc);
console.log(nodes[0].data)
Abdulla Thanseeh