I have an XML file from which I need to extract data. I am using tinyxml2 and tried to get the text of an element in the file. The data in the file looks like this:

<Tasks>

<!-- \GoogleUpdateTaskMachineCore -->
<?xml version="1.0" encoding="UTF-16"?>

<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">

  <RegistrationInfo>

    <Version>1.3.33.7</Version>

    <Description>Keeps your Google software up to date. If this task is disabled or stopped, your Google software will not be kept up to date, meaning security vulnerabilities that may arise cannot be fixed and features may not work. This task uninstalls itself when there is no Google software using it.</Description>

  </RegistrationInfo>

  <Actions Context="Author">

    <Exec>

      <Command>C:\Program Files (x86)\Google\Update\GoogleUpdate.exe</Command>

      <Arguments>/c</Arguments>

    </Exec>

  </Actions>

</Task>

</Tasks>

I tried to extract the text of the `Description` element with the following code, but it doesn't work:

tinyxml2::XMLDocument xmlDoc;
xmlDoc.LoadFile("raw");

XMLNode *pLoadTask = xmlDoc.FirstChild();
XMLElement * pTask = pLoadTask->FirstChildElement("Task")->FirstChildElement("RegistrationInfo")->FirstChildElement("Description");

What am I doing wrong?

  • The first child element is `Tasks`. `Task` is nested within it. – Jonathan Potter Apr 26 '18 at 10:40
  • I have already created an `XMLNode` before that: `XMLNode *pLoadTask = xmlDoc.FirstChild();` – Poseidon Security Apr 26 '18 at 16:53
  • Use `FirstChildElement`, see https://stackoverflow.com/questions/43285461/parsing-xml-files-root-nodes-has-no-child-nodes/43289066#43289066 for more info. – stanthomas Apr 26 '18 at 22:32
  • Also, what is `<Tasks>` doing outside the document? `<?xml version="1.0" encoding="UTF-16"?>` should be first for a properly formed XML document. – stanthomas Apr 26 '18 at 22:35
  • I think it is `Root`. Because there are many `` and `` as well in the document. – Poseidon Security Apr 27 '18 at 07:38
  • I think it would be best to modify the XML to make it look like everyone else's. In particular, have one, and only one, XML declaration as the very first line. See https://en.wikipedia.org/wiki/XML for details. Once you get your program working you can go back to the original. – stanthomas Apr 27 '18 at 11:06
  • "The document type declaration MUST appear before the first element in the document." : https://www.w3.org/TR/xml/#sec-well-formed . You might *like* `<Tasks>` to be the root element but you don't know how tinyxml2 has parsed the document. It should have rejected it but maybe it's being helpful :) – stanthomas Apr 27 '18 at 11:13
  • Hi @stanthomas, so as you said, the root is then `Task`? Could you give an example? – Poseidon Security May 02 '18 at 09:22
  • This command use to get root element `XMLNode *pLoadTask = xmlDoc.FirstChild();`, however, the program crashes. – Poseidon Security May 02 '18 at 12:18
  • You need to provide more information. Have you corrected the XML so that it is well-formed? You say "This command use to get root element..." but it is quite clearly a `node`; everything in `tinyxml2` is a `node`, only some of them are `elements`. Where, exactly, does the program crash? You might try printing out the node name as you step down through them, starting with the document node. – stanthomas May 02 '18 at 18:57
  • I tried to get the element with `XMLNode *pLoadTask = xmlDoc.FirstChild();` and `XMLElement *pEle = xmlDoc.FirstChildElement("Task")`, but these functions also return a NULL pointer. This is the full raw file: http://www.mediafire.com/file/d2wg8qc353ai3x3/raw – Poseidon Security May 03 '18 at 06:31
