I've written a program that reads a set of source files and converts them into XML files using the SrcML tool. The basic procedure is as follows:
    for (...) {
        // ...
        String xmlUri = GetXmlFile(sourceFileUri); // create xml file and get its uri
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(xmlUri);
        // ...
    }
For each source file, the program creates an XML file in the same location (overwriting the previously created file) and then reads that XML file. For some source files this procedure works fine, but for most of them it throws SAXParseExceptions such as the following:
- Premature end of file.
- Content is not allowed in prolog.
- The element type "argcl" must be terminated by the matching end-tag "</argcl>". (This XML file doesn't even contain an element named "argcl".)
- XML document structures must start and end within the same entity.
The SrcML tool creates valid XML documents. When I inspect the XML file after one of these exceptions, I can't find anything wrong with its format. All of the exceptions point to the same line in the code:
    Document doc = dBuilder.parse(xmlUri);
I have gone through a number of discussions on this topic on Stack Overflow as well as in other forums, but none of them gave me a clue about how to overcome this problem.
I would really appreciate it if someone could help me solve this problem. Thank you.
Here's the source code written to read the XML file:
    private static Document GetXmlDom(String xmlFilePath)
            throws SAXException, ParserConfigurationException, IOException {
        File tempFile;
        try {
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(xmlFilePath);
            if (doc.hasChildNodes()) {
                return doc;
            }
        }
        catch (IOException e) {
            e.printStackTrace();
            throw e;
        }
        catch (SAXParseException e) {
            e.printStackTrace();
            throw e;
        }
        return null;
    }

    private static String GetXmlFile(String inputFile) throws IOException {
        if (new File(inputFile).isFile()) {
            String outFile = FileNameHandler.GetNextNumberedFileName(FileNameHandler.getXmlFlePath(), "outFile.xml");
            Process process = new ProcessBuilder("srcML\\src2srcml.exe", inputFile,
                    "-o", outFile).start();
            return outFile;
        }
        else {
            System.out.println("\nNo XML file is created. File does not exist: " + inputFile);
        }
        return null;
    }

    public static List<Tag> SourceToXML(String inputFile)
            throws SAXException, ParserConfigurationException, IOException {
        List<Tag> tagList = new LinkedList<Tag>();
        String xmlUri = GetXmlFile(inputFile);
        Document doc = GetXmlDom(xmlUri);
        if (doc != null) {
            LinkedList<Integer> id = new LinkedList<Integer>();
            id.add(1);
            TagHierarchy.CreateStructuredDom(new TagId(id), doc.getFirstChild(), tagList);
            tagList.get(0).setAncestor(null);
            TagHierarchy.SetTagHierarchy(tagList);
        }
        return tagList;
    }
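A point that may be worth checking in GetXmlFile: ProcessBuilder.start() returns as soon as the process has been launched, so the method can return outFile before src2srcml.exe has finished writing it. Below is a minimal sketch of waiting for the converter before the file is parsed. The class and method names (WaitForConverter, runAndWait) are hypothetical, and treating exit code 0 as success is an assumption about the tool rather than something confirmed here.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class WaitForConverter {

    // Starts the given command and blocks until the process has exited.
    // inheritIO() forwards the child's output to this JVM's console so the
    // child cannot stall on a full, unread output pipe.
    public static int runAndWait(List<String> command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO()
                .start();
        return process.waitFor(); // exit code of the finished process
    }

    public static void main(String[] args) throws Exception {
        String inputFile = args[0];
        String outFile = args[1];
        // Same command line as in GetXmlFile above.
        int exitCode = runAndWait(Arrays.asList(
                "srcML\\src2srcml.exe", inputFile, "-o", outFile));
        // Assumption: a non-zero exit code signals a failed conversion.
        if (exitCode != 0) {
            throw new IOException("Converter exited with code " + exitCode);
        }
        // Only at this point is it safe to assume the process is done with outFile.
        System.out.println("Finished writing " + outFile);
    }
}
```

With something like this in place, the call to dBuilder.parse would only run after the external process has terminated.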
Here's the exception thrown:
[Fatal Error] outFile.xml:461:300: The element type "argcl" must be terminated by the matching end-tag "</argcl>".
org.xml.sax.SAXParseException; systemId: file:/E:/srcML/Output/outFile.xml; lineNumber: 461; columnNumber: 300; The element type "argcl" must be terminated by the matching end-tag "</argcl>".
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
    at vocab.util.file.FileConverter.SourceToXML(FileConverter.java:188)
    at vocab.CodeVocabulary.Create(CodeVocabulary.java:59)
    at vocab.CodeVocabulary.<init>(CodeVocabulary.java:53)
    at vocab.util.DataAcccessUtil.GetCodeVocabularies(DataAcccessUtil.java:331)
    at vocab.TestMain.main(TestMain.java:57)
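For comparison, the same kind of parser failure can be produced without SrcML by parsing a document that has been cut off partway, which is what a reader would see if the file were still being written. The sketch below is only an illustration: the class name TruncatedXmlDemo, the helper parseError and the sample XML are made up, and the exact messages depend on the parser implementation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class TruncatedXmlDemo {

    // Returns null if the text parses as XML, otherwise the parser's error message.
    public static String parseError(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing "[Fatal Error]" lines.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new ByteArrayInputStream(
                    xml.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document, not real SrcML output.
        String complete =
                "<unit><argument_list><argument>x</argument></argument_list></unit>";
        // Simulates reading the file while it is still being written:
        // the text is cut in the middle of the second tag name.
        String truncated = complete.substring(0, 26);

        System.out.println("complete:  " + parseError(complete));
        System.out.println("truncated: " + parseError(truncated));
        System.out.println("empty:     " + parseError(""));
    }
}
```

If the truncated and empty inputs produce messages similar to the ones listed in the question, that would be consistent with the parser reading outFile.xml before it is complete, although it does not prove that this is the cause.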