
I get a strange error when reading an xlsx file. On my Windows computer there is no problem, but on Heroku I get the following error: unable to parse shared strings table. The file I am trying to read is the same in both cases.

My code:

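private fun readExcelContent(file: InputStream): Sequence<Row> {
    // Open the workbook as an OPC (zip) package; use {} closes it afterwards.
    val pkg = OPCPackage.open(file)
    pkg.use {
        val reader = XSSFReader(pkg)
        // Shared strings table, used to resolve the text of string cells.
        val sst = reader.sharedStringsTable
        val parser = XMLHelper.newXMLReader()
        // ExcelSheetHandler is my own SAX handler that collects the rows.
        val handler = ExcelSheetHandler(sst)
        parser.contentHandler = handler
        // Only the first sheet is read.
        val sheet = reader.sheetsData.next()
        val source = InputSource(sheet)
        try {
            parser.parse(source)
        } catch (e: ConsecutiveBlanksException) {
            // Thrown by my handler to stop parsing early; the rows read so far are kept.
            logger.info("The file contained several consecutive blank lines that have been skipped.")
        }

        return handler.content.asSequence()
    }
}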

Stack trace:

Feb 16 21:56:15.622 scheduler.1 at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327) ~[na:na]
Feb 16 21:56:15.622 scheduler.1 at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400) ~[na:na]
Feb 16 21:56:15.622 scheduler.1 at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:178) ~[na:na]
Feb 16 21:56:15.622 scheduler.1 at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:204) ~[na:na]
Feb 16 21:56:15.622 scheduler.1 Caused by: org.xml.sax.SAXParseException: The element type "t" must be terminated by the matching end-tag "".
Feb 16 21:56:15.622 scheduler.1 ... 36 common frames omitted
Feb 16 21:56:15.622 scheduler.1 at org.apache.poi.xssf.model.SharedStringsTable.readFrom(SharedStringsTable.java:123) ~[poi-ooxml-4.1.2.jar!/:4.1.2]
Feb 16 21:56:15.622 scheduler.1 at org.openxmlformats.schemas.spreadsheetml.x2006.main.SstDocument$Factory.parse(Unknown Source) ~[poi-ooxml-schemas-4.1.2.jar!/:4.1.2]
Feb 16 21:56:15.622 scheduler.1 at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345) ~[xmlbeans-3.1.0.jar!/:na]
Feb 16 21:56:15.622 scheduler.1 at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1259) ~[xmlbeans-3.1.0.jar!/:na]
Feb 16 21:56:15.622 scheduler.1 at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1272) ~[xmlbeans-3.1.0.jar!/:na]
Feb 16 21:56:15.622 scheduler.1 at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3448) ~[xmlbeans-3.1.0.jar!/:na]
Feb 16 21:56:15.622 scheduler.1 Caused by: org.apache.xmlbeans.XmlException: error: The element type "t" must be terminated by the matching end-tag "".
Feb 16 21:56:15.622 scheduler.1 ... 34 common frames omitted
Feb 16 21:56:15.622 scheduler.1 at org.apache.poi.ooxml.POIXMLFactory.createDocumentPart(POIXMLFactory.java:61) ~[poi-ooxml-4.1.2.jar!/:4.1.2]
Feb 16 21:56:15.622 scheduler.1 at org.apache.poi.xssf.model.SharedStringsTable.<init>(SharedStringsTable.java:111) ~[poi-ooxml-4.1.2.jar!/:4.1.2]
Feb 16 21:56:15.622 scheduler.1 at org.apache.poi.xssf.model.SharedStringsTable.readFrom(SharedStringsTable.java:134) ~[poi-ooxml-4.1.2.jar!/:4.1.2]
Feb 16 21:56:15.622 scheduler.1 Caused by: java.io.IOException: unable to parse shared strings table
Feb 16 21:56:15.622 scheduler.1 ... 29 common frames omitted

Alessandro
  • Ideally, you should provide the stacktrace. – PJ Fanning Feb 16 '21 at 20:41
  • It does look like something is truncating the stream, or that the sharedStrings.xml in the xlsx file has been corrupted (xlsx files are just zip files when they are not password protected): org.xml.sax.SAXParseException: The element type "t" must be terminated by the matching end-tag "" – PJ Fanning Feb 16 '21 at 21:33
  • OK, but how is it possible that I can read this file locally? – Alessandro Feb 16 '21 at 21:55
  • When I open the file with Google Sheets, everything seems OK. – Alessandro Feb 16 '21 at 21:57
  • According to the stack trace, a really old or buggy version of `org.xml.sax` must be in use. The error `The element type "t" must be terminated by the matching end-tag "".` makes no sense at all. The end tag must be `"</t>"` and not `""`, although an empty `t` element `<t/>` is also possible. So what `Java` version is used on `heroku`? [POI 4.0 and later require JDK version 1.8 or later.](http://poi.apache.org/devel/) – Axel Richter Feb 17 '21 at 04:58
  • Version 11.0.10. – Alessandro Feb 17 '21 at 11:50
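
One way to test the truncation/corruption hypothesis from the comments is to bypass POI and inspect the raw zip entry in both environments. The following is only a minimal sketch using JDK classes; `EntryReport` and `inspectZipEntry` are hypothetical names introduced for illustration, and `xl/sharedStrings.xml` is assumed to be the entry name (it is the conventional location, but a given file may store the part elsewhere). Running it locally and on Heroku and comparing the size and hash shows whether the same bytes arrive in both places.

```kotlin
import java.io.ByteArrayInputStream
import java.io.InputStream
import java.security.MessageDigest
import java.util.zip.ZipInputStream
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.SAXException
import org.xml.sax.helpers.DefaultHandler

// Hypothetical report type: what was found for one entry of the zip.
data class EntryReport(
    val name: String,
    val size: Int,            // uncompressed size in bytes
    val sha256: String,       // hex digest, for comparing environments
    val wellFormed: Boolean,  // true if the JDK SAX parser accepts the entry
    val parseError: String?   // parser message when not well formed
)

// Reads the xlsx as a plain zip and inspects a single entry.
// Returns null if the entry is not present in the stream.
// Assumption: the shared strings part is stored as "xl/sharedStrings.xml".
fun inspectZipEntry(
    xlsx: InputStream,
    entryName: String = "xl/sharedStrings.xml"
): EntryReport? {
    ZipInputStream(xlsx).use { zip ->
        var entry = zip.nextEntry
        while (entry != null) {
            if (entry.name == entryName) {
                // readBytes() stops at the end of the current zip entry.
                val bytes = zip.readBytes()
                val digest = MessageDigest.getInstance("SHA-256")
                    .digest(bytes)
                    .joinToString("") { "%02x".format(it) }

                // Check well-formedness with the JDK's default SAX parser.
                var error: String? = null
                try {
                    val factory = SAXParserFactory.newInstance()
                    factory.isNamespaceAware = true
                    factory.newSAXParser()
                        .parse(ByteArrayInputStream(bytes), DefaultHandler())
                } catch (e: SAXException) {
                    error = e.message ?: e.toString()
                }
                return EntryReport(entry.name, bytes.size, digest, error == null, error)
            }
            entry = zip.nextEntry
        }
    }
    return null
}
```

For example, logging `inspectZipEntry(File("report.xlsx").inputStream())` in both environments (the file name is a placeholder) would show whether the entry differs. If the upload itself is cut short, `ZipInputStream` may throw an `IOException` before the entry is reached, which would also point at the stream rather than at POI.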

0 Answers