I have HTML strings, from which I need to extract HTML substrings (summary, keywords, ...). The strings look like:
const content = `<p>
<strong>Summary</strong><br />Some text with <strong>HTML</strong> tags...<br /><br />
<strong>Keywords</strong> keyword1, keyword2,...<br /><br />
...
</p>`;
The aim is to get:
summary = "<br />Some text with <strong>HTML</strong> tags...<br /><br />"
keywords = "keyword1, keyword2,..."
For the parsing I use the Cheerio library, which lets me use jQuery methods on the parsed HTML. I have tried the following approaches, among others, but neither of them works:
Simple nextUntil():
const cheerio = require('cheerio');
const $ = cheerio.load(content);
console.log($("strong:contains('Summary')").nextUntil("strong:contains('Keywords')").html());
// Returns: "Summary"
nextUntil() with a for loop:
const $ = cheerio.load(content);
let container = $('<container/>');
for (let i = 0; i < $("strong:contains('Summary')").nextUntil("strong:contains('Keywords')").length; i++) {
  container.append($("strong:contains('Summary')").nextUntil("strong:contains('Keywords')")[i]);
}
}
console.log('container: ', container.html());
// Returns: "<strong>Summary</strong>"
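As far as I understand, nextUntil() only returns element siblings, so the bare text between the tags is skipped, and .html() only gives the inner HTML of the first matched element. For comparison, here is a minimal sketch that sidesteps Cheerio and slices the raw string instead. It is not a Cheerio technique: sliceSection is a hypothetical helper, and it assumes the section markers always appear exactly as <strong>Label</strong> and that the sample ends after the Keywords section (the "..." placeholder is left out).

```javascript
// Hypothetical helper (not part of Cheerio): returns the raw HTML between
// <strong>label</strong> and the next marker, or the closing </p> if none.
function sliceSection(html, label, nextLabel) {
  const startMarker = `<strong>${label}</strong>`;
  const start = html.indexOf(startMarker);
  if (start === -1) return null; // label not present in the string
  const from = start + startMarker.length;
  // Prefer the next section marker as the end; otherwise fall back to </p>.
  let to = nextLabel ? html.indexOf(`<strong>${nextLabel}</strong>`, from) : -1;
  if (to === -1) to = html.indexOf('</p>', from);
  if (to === -1) to = html.length;
  return html.slice(from, to).trim();
}

// Sample input, assumed to contain only the two sections shown in the question.
const content = `<p>
<strong>Summary</strong><br />Some text with <strong>HTML</strong> tags...<br /><br />
<strong>Keywords</strong> keyword1, keyword2,...<br /><br />
</p>`;

const summary = sliceSection(content, 'Summary', 'Keywords');
// The wanted keywords value has no trailing <br /> tags, so strip them here.
const keywords = sliceSection(content, 'Keywords')
  .replace(/(?:<br\s*\/?>\s*)+$/i, '')
  .trim();

console.log(summary);
console.log(keywords);
```

This only holds up while the markup stays this regular; if the markers can carry attributes or different whitespace, walking the parsed child nodes (text nodes included) would be the safer route.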
` elements the sibling elements? **Edit:** doh, I thought you meant my answer; ignore me! – user7290573 Aug 28 '19 at 15:00
`<br />`s are siblings of the `<strong>` elements. But, unfortunately, they are of little help when you need to extract the HTML content between them. – Carsten Massmann Aug 28 '19 at 15:03