I am writing a C++ program to process an old dataset. I have already converted the files from SGML to XML using James Clark's sx tool. Since I have past experience using vtd-xml from MATLAB (which is Java-based), and since vtd-xml has a C++ port, I decided to use it for this project. I am using vtd-xml version 2.12 because that was the newest version of the C++ port I could find. I got it to compile in Visual Studio 2019 by changing all calls to wcsdup to _wcsdup and by adding the _CRT_SECURE_NO_WARNINGS preprocessor definition.

My program below appears to give correct output, but an EOFException is also thrown while the XML file is being parsed (a test XML file is included below). I don't see anything obviously wrong with my XML files, and the error also reproduces with the test file, which is not one I converted from SGML. My intuition is that if this were a bug in the C++ port, searching for "vtd-xml EOFException" would turn up more information about it. So the changes I made to get the library to compile seem the likely culprit, but I can't figure out how to get rid of the exception. Any ideas would be welcome. If it comes to it, I am willing to switch to a different XML library, as long as it is free.
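For reference, the wcsdup rename could also be expressed as a small shim instead of edits at every call site. This is only a sketch of the idea: VTD_WCSDUP and duplicate_wide are hypothetical names made up for illustration and are not part of vtd-xml.

```cpp
#include <cassert>   // assert
#include <cstdlib>   // std::free
#include <wchar.h>   // wcsdup (POSIX) / _wcsdup (MSVC), wcscmp

// Hypothetical macro for this sketch; not part of vtd-xml.
// MSVC spells the function _wcsdup, POSIX systems spell it wcsdup.
#if defined(_MSC_VER)
#define VTD_WCSDUP _wcsdup
#else
#define VTD_WCSDUP wcsdup
#endif

// Hypothetical helper: duplicates a wide string with the platform's wcsdup.
// The result is malloc-allocated, so the caller releases it with std::free.
inline wchar_t* duplicate_wide(const wchar_t* s) {
    assert(s != nullptr);  // passing a null pointer to wcsdup is not valid
    return VTD_WCSDUP(s);
}
```

Either way the duplicated string comes from the same underlying C runtime function, so the rename by itself should not change what the parser sees.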
My code:
#include <iostream>
#include <fstream>
#include "VTDGen.h"
#include "autoPilot.h"
#include "customTypes.h"

using namespace std;
using namespace com_ximpleware;

int main() {
    // Read the whole XML file into a byte buffer.
    ifstream xml(".\\cd_catalog_short.xml", ios::binary | ios::ate);
    if (!xml) {
        cerr << "Could not open the XML file" << endl;
        return 1;
    }
    ifstream::pos_type pos = xml.tellg();
    long int length = static_cast<long int>(pos);
    char* pChars = new char[length];
    xml.seekg(0, ios::beg);
    xml.read(pChars, length);
    xml.close();

    UCSChar node_path[] = L"/CATALOG/CD/TITLE";
    UCSChar* title;

    // Parse the buffer without namespace awareness.
    VTDGen vg;
    vg.setDoc(pChars, length);
    vg.parse(false);

    // Select every TITLE element and print its text.
    AutoPilot ap;
    ap.selectXPath(node_path);
    VTDNav* vn = vg.getNav();
    ap.bind(vn);
    while (ap.evalXPath() != -1) {
        int ind = vn->getText();
        if (ind != -1) {
            title = vn->toNormalizedString(ind);
            wcout << title << endl;
            delete[] title;
        }
    }
    return 0;
}
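Separately from the parser, the file-loading step at the top of main can be checked in isolation. Below is a minimal sketch using only the standard library; read_whole_file is a hypothetical helper name, not a vtd-xml function. It reports an error if the file cannot be opened or if fewer bytes are read than the reported size, and it yields a pointer and length (buffer.data() and buffer.size()) playing the same role as pChars and length above.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper for this sketch; not part of vtd-xml.
// Reads an entire file into memory as raw bytes.
std::vector<char> read_whole_file(const std::string& path) {
    // Open in binary mode, positioned at the end, so tellg() reports the size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    const std::streamoff size = static_cast<std::streamoff>(in.tellg());
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }

    std::vector<char> buffer(static_cast<std::size_t>(size));
    in.seekg(0, std::ios::beg);
    in.read(buffer.data(), static_cast<std::streamsize>(size));

    // gcount() is the number of bytes the last read actually delivered.
    if (in.gcount() != static_cast<std::streamsize>(size)) {
        throw std::runtime_error("short read on " + path);
    }
    // Debug-only sanity check: buffer length matches the reported file size.
    assert(buffer.size() == static_cast<std::size_t>(size));
    return buffer;
}
```

If a buffer like this were handed to the parser, the vector would presumably need to stay alive for as long as the parser uses the bytes; that is an assumption about lifetime, not something checked against vtd-xml.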
A test XML file:
<?xml version="1.0" encoding="UTF-8"?>
<CATALOG>
  <CD>
    <TITLE>For the good times</TITLE>
    <ARTIST>Kenny Rogers</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>Mucik Master</COMPANY>
    <PRICE>8.70</PRICE>
    <YEAR>1995</YEAR>
  </CD>
  <CD>
    <TITLE>Big Willie style</TITLE>
    <ARTIST>Will Smith</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1997</YEAR>
  </CD>
  <CD>
    <TITLE>Tupelo Honey</TITLE>
    <ARTIST>Van Morrison</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>Polydor</COMPANY>
    <PRICE>8.20</PRICE>
    <YEAR>1971</YEAR>
  </CD>
</CATALOG>
My program output:
Exception thrown at 0x00007FF96A36A839 in em.exe: Microsoft C++ exception: com_ximpleware::EOFException at memory location 0x0000005498B6F350.
For the good times
Big Willie style
Tupelo Honey
C:\Users\Joe\source\repos\em\x64\Release\em.exe (process 16308) exited with code 0.
To automatically close the console when debugging stops, enable Tools->Options->Debugging-> Automatically close the console when debugging stops.
Press any key to close this window . . .