The Lemur code relies on some quirks of the version of gcc/C++ standard it was built for (g++ 4.4, I believe).
These are the changes I needed to build lemur-4.12 under gcc 7.4. This isn't based on any deep understanding of the Lemur code; I'm just logging the changes as I make them:
Most of the changes just require adding explicit casts or this-> qualifiers:
utility/include/CSet.hpp:63 int idx = this->operator[](u);
utility/include/ISet.hpp:90 int hashval = this->computeHash(u);
utility/include/ISet.hpp:104 const int hashval = this->computeHash(sn->u);
utility/include/ISet.hpp:105 typename PSet<ObjType>::SET_NODE *snNew = this->createNode(sn->u);
utility/include/ISet.hpp:109 this->deleteNode(sn);
retrieval/src/ResultFile.cpp:134 return (bool)(*inStr >> curQID >> dummy1 >> curDID >> dummy2 >> curSC >> dummy3);
retrieval/src/ResultFile.cpp:136 return (bool)(*inStr >> curQID >> curDID >> curSC);
utility/src/BasicDocStream.cpp:78 moreDoc = (bool)(*ifs >> buf);
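For anyone wondering why these two kinds of fix are needed, here is a minimal sketch of both. It does not use Lemur's actual classes: BaseSet, DerivedSet, bucketFor, and readTwoFields are made-up names for illustration. As I understand it, newer g++ does not look up unqualified names in a base class that depends on a template parameter, and since C++11 a stream's conversion to bool is explicit, so it no longer happens implicitly in a return statement.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a templated base class such as PSet<ObjType>.
template <typename T>
class BaseSet {
protected:
    int computeHash(const T& value) const {
        return static_cast<int>(value % 7);
    }
};

// Hypothetical stand-in for a derived template such as ISet<ObjType>.
template <typename T>
class DerivedSet : public BaseSet<T> {
public:
    int bucketFor(const T& value) const {
        // A bare computeHash(value) is not found here, because BaseSet<T>
        // is a dependent base class. this-> defers the lookup until the
        // template is instantiated.
        return this->computeHash(value);
    }
};

// Returning the stream from a bool function needs an explicit cast,
// because the stream's operator bool is explicit since C++11.
bool readTwoFields(std::istream& in, std::string& first, int& second) {
    return static_cast<bool>(in >> first >> second);
}

// Small usage example.
void exampleUsage() {
    DerivedSet<int> set;
    assert(set.bucketFor(10) == 3);

    std::istringstream line("q1 42");
    std::string id;
    int score = 0;
    assert(readTwoFields(line, id, score));
    assert(id == "q1" && score == 42);
}
```

The (bool) casts in the list above do the same thing as static_cast<bool> here; I only used static_cast because it is easier to search for.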
This one was comparing a non-pointer to NULL. Fortunately, they left a comment saying what they were trying to do.
utility/src/WordSet.cpp:42 if (ifstr.fail()) {
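A rough sketch of the same pattern outside Lemur, assuming the original check was something like comparing the stream object itself to NULL (I didn't record the exact original line). canOpenForReading is a made-up helper name.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper, not part of Lemur.
bool canOpenForReading(const std::string& path) {
    std::ifstream ifstr(path.c_str());
    // Older compilers presumably accepted a stream-vs-NULL comparison through
    // the stream's pre-C++11 conversion to void*. That conversion no longer
    // exists, so ask the stream for its state directly.
    return !ifstr.fail();
}

// Small usage example: a path that should not exist fails to open.
void exampleUsage() {
    assert(!canOpenForReading("/nonexistent-dir-for-example/words.txt"));
}
```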
For these two, they were returning false as a pointer; (bool)NULL is false, so NULL is probably what they meant.
utility/src/BulkTree.cpp:571 return NULL;
utility/src/BulkTree.cpp:587 return NULL;
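To illustrate, here is a small sketch with a made-up Node type and findNode function (neither is from BulkTree). As far as I know, false stopped counting as a null pointer constant in C++11, which is why the old code no longer compiles.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type, not a BulkTree type.
struct Node {
    int key;
};

// Returns a pointer to the first node with the given key,
// or NULL when no node matches.
Node* findNode(Node* nodes, std::size_t count, int key) {
    for (std::size_t i = 0; i < count; ++i) {
        if (nodes[i].key == key) {
            return &nodes[i];
        }
    }
    // 'return false;' used to be accepted here as a null pointer,
    // but newer compilers reject the bool-to-pointer conversion.
    return NULL;
}

// Small usage example.
void exampleUsage() {
    Node nodes[2] = {{1}, {2}};
    assert(findNode(nodes, 2, 2) == &nodes[1]);
    assert(findNode(nodes, 2, 5) == NULL);
}
```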
This one's just a bug. They were comparing a pointer to the NUL character. I suspect they meant to compare the character it pointed to to NUL, but since it should probably stop reading if either c == NULL or *c == '\0', I just made it check for both.
parsing/src/OffsetAnnotationAnnotator.cpp:194 for ( const char* c = str; i < n && c && *c != '\0'; c++, i++ )
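Here is the same loop condition pulled out into a standalone function so the behaviour is easy to check. boundedLength is a made-up name; the real loop in OffsetAnnotationAnnotator.cpp does other work in its body that I've left out.

```cpp
#include <cassert>
#include <cstddef>

// Counts characters in str, stopping at whichever comes first:
// the limit n, a null pointer, or the terminating NUL character.
std::size_t boundedLength(const char* str, std::size_t n) {
    std::size_t i = 0;
    for (const char* c = str; i < n && c && *c != '\0'; c++, i++) {
        // The body is intentionally empty; only the stopping rule matters here.
    }
    return i;
}

// Small usage example.
void exampleUsage() {
    assert(boundedLength("hello", 10) == 5);  // stops at the NUL terminator
    assert(boundedLength("hello", 3) == 3);   // stops at the limit
    assert(boundedLength(nullptr, 3) == 0);   // null pointer is never dereferenced
}
```

The c check has to come before *c in the condition, since && evaluates left to right and stops at the first false operand.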
You need to run make twice for it to build everything. Not sure why.
I also recommend setting export CXXFLAGS='-Wno-write-strings -Wno-deprecated' before running configure. There are too many warnings of those types to fix them all, and they're potentially hiding more critical warnings. Suppressing them reveals:
Possible issue: That \0 doesn't get stored in qChar; it just terminates the format string. I'm leaving it alone for now, since that would never have worked differently no matter the version of gcc, so the code that uses qChar probably doesn't need the extra \0 to be there.
site-search/cgi/DBInterface.cpp:628 sprintf(qChar, "#q1=%s\0", query.c_str());
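A quick sketch of why the \0 has no effect. visibleFormatLength and formatQuery are made-up names, and I've used snprintf with a fixed-size buffer instead of the original sprintf call so the example can't overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Length of the format string as the printf family sees it:
// everything before the first NUL, so the explicit \0 is never reached.
std::size_t visibleFormatLength() {
    return std::strlen("#q1=%s\0");
}

// Formats the query without the stray \0. snprintf already writes a
// terminating NUL after the formatted text.
std::string formatQuery(const std::string& query) {
    char qChar[256];
    std::snprintf(qChar, sizeof qChar, "#q1=%s", query.c_str());
    return std::string(qChar);
}

// Small usage example.
void exampleUsage() {
    assert(visibleFormatLength() == 6);           // "#q1=%s" is 6 characters
    assert(sizeof("#q1=%s\0") == 8);              // 6 + explicit \0 + implicit \0
    assert(formatQuery("lemur") == "#q1=lemur");
}
```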
They were using the wrong format specifier for size_t:
site-search/cgi/IndriSearchInterface.cpp:815 fprintf(oQueryLog, "(%lu results)\n", (unsigned long)finalResults.size());
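The same fix in isolation, as a sketch. describeCount is a made-up function, and it formats into a string instead of a log file so the result can be checked. %zu would also work on compilers that support it, but I stuck with the cast to unsigned long since that is what I used above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Formats a container size. size() returns size_t, so it is cast to
// unsigned long to match the %lu specifier.
std::string describeCount(const std::vector<int>& results) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "(%lu results)",
                  static_cast<unsigned long>(results.size()));
    return std::string(buf);
}

// Small usage example.
void exampleUsage() {
    std::vector<int> results(3);
    assert(describeCount(results) == "(3 results)");
}
```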
Note that these are just the changes needed to make it compile. It's entirely possible that it also depends on undefined behavior behaving the way it did under the older gcc it was written for. I'd definitely recommend turning on --enable-assert when running configure.