I am using XML::Simple in one of my Perl scripts to extract some messages. The problem is that I am getting this error:

No semi-colon found after entity name [Ln: 1, Col: 151]

when I use XMLin( $msg ) and $msg contains an invalid character like '&'

I know I can use a regex to remove them, but I don't want to. I can replace & with &amp;.

Is there a simple way in Perl to deal with this kind of invalid character in strings when I use XMLin( $msg )?

An example of $msg: <Error>Exception Invalid address (&F5F5F5F5)</Error>

SomeDude

1 Answer

No.

XML::Simple doesn't actually parse XML itself. It delegates to one of several other parsers. Even if it had such an option, it couldn't work reliably, if at all.

XML::Simple, therefore, doesn't provide such an option.
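Since no such option exists, any repair has to happen before the string reaches XMLin. As a rough sketch only, the following escapes each `&` that does not begin something shaped like an entity or character reference. The helper name `escape_bare_ampersands` is hypothetical and not part of XML::Simple, and the pattern is a heuristic: it cannot tell whether a named entity such as `&foo;` is actually defined, so it is a workaround for known-bad input rather than a general fix.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Simple): escape every '&' that is not
# followed by a name, '#digits', or '#xhex' and then a semicolon.
sub escape_bare_ampersands {
    my ($xml) = @_;
    $xml =~ s/&(?!(?:[A-Za-z_:][\w.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    return $xml;
}

# The example message from the question.
my $msg   = '<Error>Exception Invalid address (&F5F5F5F5)</Error>';
my $fixed = escape_bare_ampersands($msg);

print "$fixed\n";
```

The repaired string could then be passed on as `XMLin( escape_bare_ampersands($msg) )`. References that are already well-formed, such as `&amp;` or `&#38;`, are left untouched.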

ikegami
  • Thanks. I think the root cause in my case is this: I get the string from a Java program that parses another XML document. It calls node.getTextContent(), and that call does not preserve `&amp;`; it converts it to `&` before giving me the string. Later, when I feed the string to XMLin, it fails. How can I get the text from the node as it is? Do you have any idea? – SomeDude Mar 17 '16 at 15:11
  • @svasa You should ask a new question in the _java_ tag, but make sure to be specific. – simbabque Mar 17 '16 at 16:59
  • That's not the root cause. `getTextContent` is behaving correctly. It resolves entities because the caller shouldn't have to care about how the text was encoded. Why are you passing the text of an XML node to `XMLin`? Do you (weirdly) have XML embedded in your XML? If so, the original should look like `<Error>...(&amp;F5F5F5F5)</Error>` instead of `<Error>...(&F5F5F5F5)</Error>`. – ikegami Mar 17 '16 at 17:14
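The last comment implies that whatever builds the embedded `<Error>` document should escape the message text before wrapping it in tags. A minimal sketch of that step, assuming the message starts out as plain text; the helper name `xml_escape_text` is hypothetical, and how the real producer assembles its XML is not known from this thread:

```perl
use strict;
use warnings;

# Hypothetical helper: escape plain text for use as XML element content.
sub xml_escape_text {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # must run first, or the entities added below get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Plain message text, before it is placed inside an element.
my $text = 'Exception Invalid address (&F5F5F5F5)';
my $xml  = '<Error>' . xml_escape_text($text) . '</Error>';

print "$xml\n";
```

A document built this way is well-formed, so a parser on the receiving side can turn `&amp;` back into `&` without any special handling.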