I'm using the Xerces-C++ library to serialize my data to XML. How can I make it emit UTF-8? The output is always generated as UTF-16, and I can't find a way to change that.

xercesc::DOMImplementation *gRegistry = xercesc::DOMImplementationRegistry::getDOMImplementation(X("Core"));
xercesc::DOMDocument *doc = gRegistry->createDocument(
        0,                      // root element namespace URI.
        X(oDocumentName.c_str()),       // root element name
        0);                 // document type object (DTD).
doc->setXmlStandalone(true);
// ... prepare the document ...
serializer = ((xercesc::DOMImplementationLS *)gRegistry)->createLSSerializer();
serializer->setNewLine(xercesc::XMLString::transcode("\n"));

XMLCh *xmlresult = serializer->writeToString(doc);
char *temp = xercesc::XMLString::transcode(xmlresult);
std::string result(temp);

xercesc::XMLString::release(&temp);
xercesc::XMLString::release(&xmlresult);
doc->release();
serializer->release();
getStream() << result.c_str();

When I deserialize the result with JAXB on the Java side, I always get a "Content is not allowed in prolog" error, and so far the encoding is the only difference I can see in the XML. XML that I generate locally on the Java side deserializes fine in JAXB; the XML produced by Xerces-C++ triggers the error. When I try to format it in Notepad++ with the XML plugin, it also reports an error but gives no details.

Devolus
  • I'm not sure, but perhaps you should build Xerces-C++ with the ICU library. See the transcoder options in the build instructions: http://xerces.apache.org/xerces-c/build-3.html – Sergei Nikulov Jun 23 '13 at 16:16

1 Answer

Take a look at `DOMLSOutput`; it should give you exactly what you want. `DOMLSSerializer::writeToString` returns an `XMLCh` string, which is always UTF-16, so instead you create a `DOMLSOutput` object, set its encoding to UTF-8, and write to it with `DOMLSSerializer::write`.

Robert
  • Yes, I'm currently looking into this. According to this doc http://www.ibm.com/developerworks/library/x-serial.html, the serializer can only do UTF-16. I'm trying to implement the `MemBufFormatTarget` but I don't know how to serialize the document with that. – Devolus Jun 23 '13 at 16:40
  • Pretty much you'd create a `MemBufFormatTarget` and hook it up to the `DOMLSOutput`, then after that you do `DOMLSSerializer::write` to the `DOMLSOutput` object (with DOMNode being the root element of the document). Then you call `MemBufFormatTarget::getRawBuffer` and use `TranscodeFromStr` to get your XMLCh string. – Robert Jun 23 '13 at 17:00
  • Thanks. That `TranscodeFromStr` was the missing bit. It works now. :) – Devolus Jun 23 '13 at 17:50