I noticed that rapidxml parses the illegal input <<element/> as an element named <element, instead of reporting an error.

I think the problem is the definition of lookup_node_name. The comment is

//  Node name (anything but space \n \r \t / > ? \0)

As I understand the W3C XML specification, a name may contain only letters, digits, and a few other characters, so < should never be accepted as part of a name.
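To make that reading of the specification concrete, here is a minimal sketch of a stricter name check. It covers only the ASCII part of the XML 1.0 NameStartChar and NameChar productions and, as a simplifying assumption, accepts every byte >= 0x80 in place of the non-ASCII ranges. The function names are hypothetical and are not part of rapidxml.

```cpp
#include <string>

// ASCII subset of the XML 1.0 NameStartChar production: letters, '_' and ':'.
// Assumption: bytes >= 0x80 are accepted as a rough stand-in for the
// non-ASCII ranges listed in the specification.
inline bool is_name_start_char(unsigned char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           c == '_' || c == ':' || c >= 0x80;
}

// NameChar additionally allows digits, '-' and '.'.
inline bool is_name_char(unsigned char c) {
    return is_name_start_char(c) || (c >= '0' && c <= '9') ||
           c == '-' || c == '.';
}

// Hypothetical helper: true if s is a well-formed name under the rules above.
inline bool is_valid_xml_name(const std::string& s) {
    if (s.empty() || !is_name_start_char(static_cast<unsigned char>(s[0])))
        return false;
    for (std::string::size_type i = 1; i < s.size(); ++i) {
        if (!is_name_char(static_cast<unsigned char>(s[i])))
            return false;
    }
    return true;
}
```

Under these rules "element" is accepted and "<element" is rejected, which is the behaviour I expected from the parser.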

I'm not sure what the correct fix would be. Any suggestions?

Roddy
ModdyFire

1 Answer

From looking at the rapidxml code, lookup_node_name is a lookup table of valid name characters, and as the comment says, only a specific few are prohibited.

I'd try adding '<' to the list of prohibited characters by changing the lookup entry for ASCII char 0x3C from 1 to 0. That is, on the line covering chars 0x30..0x3F, change it from this...

      // 0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
...
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  0,  // 3

to this:

         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  1,  0,  0,  // 3

That may work for you, but I haven't tried it. I see you've tried to contact the developer via sourceforge, which is probably the best approach...
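To show what the table change does, here is a small standalone sketch that does not use rapidxml itself. It builds a 256-entry table from the rule in the comment (1 = may appear in a name, 0 = ends the name) and scans a name the way a table-driven parser would. NameTable and scan_name are hypothetical names made up for this illustration.

```cpp
#include <cstddef>
#include <string>

// A 256-entry table in the spirit of the lookup_node_name comment:
// every byte is allowed in a name except space \n \r \t / > ? \0.
struct NameTable {
    unsigned char allowed[256];

    // prohibit_less_than = true applies the proposed fix (entry 0x3C set to 0).
    explicit NameTable(bool prohibit_less_than) {
        for (int i = 0; i < 256; ++i)
            allowed[i] = 1;
        const char stops[] = {' ', '\n', '\r', '\t', '/', '>', '?', '\0'};
        for (std::size_t i = 0; i < sizeof(stops); ++i)
            allowed[static_cast<unsigned char>(stops[i])] = 0;
        if (prohibit_less_than)
            allowed[0x3C] = 0;  // '<'
    }
};

// Returns the run of allowed characters starting at text[pos].
inline std::string scan_name(const NameTable& table, const std::string& text,
                             std::size_t pos) {
    std::size_t end = pos;
    while (end < text.size() &&
           table.allowed[static_cast<unsigned char>(text[end])])
        ++end;
    return text.substr(pos, end - pos);
}
```

With NameTable(false), scanning "<<element/>" from index 1 (just after the opening '<') yields the name "<element", which matches the behaviour described in the question. With NameTable(true) the scan stops immediately and yields an empty name, which a parser can then reject. Note that this only blocks '<'; other characters that the XML specification does not allow in names would still get through.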

Roddy
  • It will work for this particular input, true, ... but I'd still love a response from the developer; especially if I want to submit a fix. – ModdyFire Jun 13 '12 at 09:40