Let's say I have an ordered or unordered list in string form. I only need to return the text contained within each list item of the parent list element:

<ul>
    <li>Example One</li>
    <li>Example Two</li>
</ul>

I have made this work, but it is obviously not very efficient:

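// Step 1: replace each line that contains <li>...</li> with just the item text.
var first = string.replace(/.*<li>(.*)<\/li>.*/g, '$1');
// Step 2: strip the remaining list wrapper tags.
var second = first.replace(/(<ul>|<ol>|<\/ol>|<\/ul>)/g, '');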

The output is what I expect, but I know there is a regex that will accomplish this in a single pass. I am still pretty green with regex, so I am not sure what I am doing wrong. This is what I thought would work:

var newString = string.replace(/(<ul>|<ol>).*<li>(.*)<\/li>.*(<\/ul>|\/<ol>)/g, '$2');

However, this returns the entire HTML structure as a string:

<ul>
    <li>Example One</li>
    <li>Example Two</li>
</ul>
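
A minimal reproduction of the above, as a sketch rather than a definitive fix: `input` is the sample list as a string, and `extractItems` is a hypothetical helper, not something from my actual code. One thing it shows is that `.` does not match line breaks in JavaScript unless the `s` flag is set, so the single pattern never matches across the lines of the list and `replace` hands the string back untouched. Collecting the `<li>` contents with a global match is one possible way to do it in one pass, assuming flat items with no nested tags and no attributes on `<li>`.

```javascript
// Sample input from the question, as a multi-line string.
const input = [
  '<ul>',
  '    <li>Example One</li>',
  '    <li>Example Two</li>',
  '</ul>',
].join('\n');

// The single-pattern attempt: "." does not match "\n" without the s flag,
// so nothing matches and replace() returns the input unchanged.
const attempt = input.replace(
  /(<ul>|<ol>).*<li>(.*)<\/li>.*(<\/ul>|\/<ol>)/g,
  '$2'
);
console.log(attempt === input); // true

// Hypothetical helper: collect the text of every <li>...</li> pair.
// Assumes flat list items with no nested tags or attributes on <li>.
function extractItems(html) {
  const items = [];
  const pattern = /<li>(.*?)<\/li>/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    items.push(match[1].trim());
  }
  return items;
}

console.log(extractItems(input)); // [ 'Example One', 'Example Two' ]
```

Returning an array instead of a replaced string is a design choice for the sketch; `extractItems(input).join('\n')` would give a single string if that is what the template needs.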

As always my friends, thank you in advance.

Austin737
  • https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 ;) – Thomas Apr 09 '20 at 07:24
  • If you're using Javascript, why would you use regex and not a more sensible method like DOM parsing? – CertainPerformance Apr 09 '20 at 07:26
  • "ordered or unordered list that is in string form"...string form – Austin737 Apr 09 '20 at 07:26
  • @CertainPerformance I guess I should have included more context. This is utilizing a javascript templating engine, very much similar to Mustache, so server-side I need to manipulate the string form of the HTML element before it is subsequently rendered client-side. Otherwise I would absolutely love to parse DOM and do what I need :) – Austin737 Apr 09 '20 at 07:29
  • Server-side, you can use Cheerio to parse HTML strings – CertainPerformance Apr 09 '20 at 07:29
  • @CertainPerformance external libraries are not an option for me unfortunately, but I thank you for the recommendation as I love Cheerio – Austin737 Apr 09 '20 at 07:32
  • Why are external libraries not an option? Are you against them? They need no special inclusion other than your host provides unless they use native machine code libraries. The ones mention in comments only require an include. No headaches. Besides if you want `<li>` elements inside a `<ul>` perhaps regex is the wrong way. – GetSet Apr 09 '20 at 07:40
  • @GetSet I am very much for external libraries but I am constrained by the CMS platform that I am building for. Regex is necessary in my scenario as I need to extract the text from within each `<li>` element from within `<ul>` or `<ol>` that is in string form and set to server-side vars to subsequently rendered in a template as DOM elements – Austin737 Apr 09 '20 at 07:49
  • Ok OP. But you do realize your problem is *not* solved with the answer you accepted? Be diligent and test the code we post as answers. .... `Hello<ul><li>Example One</li><li>Example Two</li></ul>World` as input fails the test because Hello and World shows. – GetSet Apr 09 '20 at 07:50
  • @GetSet yes I see that you are correct, and again it is my own fault for not providing enough details in my original question. The reason that the accepted answer works for me is due to that fact that the content that will be entered within the CMS that I am building for will be in a field that constrains it as so it can only begin with either a `<ul>` or `<ol>` and end with a `</ul>` or `</ol>`...although if I am still mistaken I am very much appreciative of your input as it is already helping me to have a better understanding of regex – Austin737 Apr 09 '20 at 08:00
  • Ok Austin, that makes sense now. Ok my fault. Somebody downvoted. I restored it because u absolutely make sense now. – GetSet Apr 09 '20 at 08:08
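
The case raised in the comments, where text sits before and after the list, can be checked with a short sketch. The exact layout of that input is an assumption here (one tag per line, with `Hello` and `World` on the same lines as the list tags), and `extractListItems` is a hypothetical helper. It runs the two-step approach from the question first, then shows one possible way to ignore anything outside the list by isolating the `<ul>` or `<ol>` block before reading its items.

```javascript
// Assumed input: text before and after the list, one tag per line.
const wrapped = [
  'Hello<ul>',
  '    <li>Example One</li>',
  '    <li>Example Two</li>',
  '</ul>World',
].join('\n');

// The two-step approach from the question keeps the outside text.
const twoStep = wrapped
  .replace(/.*<li>(.*)<\/li>.*/g, '$1')
  .replace(/(<ul>|<ol>|<\/ol>|<\/ul>)/g, '');
console.log(twoStep.includes('Hello'), twoStep.includes('World')); // true true

// Hypothetical helper: isolate the first <ul>/<ol> block, then read its items.
// Assumes no nested lists and no attributes on the list or item tags.
function extractListItems(html) {
  // [\s\S] matches any character including line breaks; \1 requires the
  // closing tag to match the opening one (ul with ul, ol with ol).
  const list = /<(ul|ol)>([\s\S]*?)<\/\1>/.exec(html);
  if (list === null) return [];
  const items = [];
  const itemPattern = /<li>([\s\S]*?)<\/li>/g;
  let match;
  while ((match = itemPattern.exec(list[2])) !== null) {
    items.push(match[1].trim());
  }
  return items;
}

console.log(extractListItems(wrapped)); // [ 'Example One', 'Example Two' ]
```

When the CMS field guarantees that the string begins and ends with the list tags, the isolation step is redundant but harmless.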