I needed to parse a very large Microsoft Excel file, and after searching, I found this:

java.lang.OutOfMemoryError: GC overhead limit exceeded when loading an xlsx file

This is great. Cayman gave a great answer. I can now load the XML file and parse it.

However, how do I find the names of the columns? I don't mean the "c" for CELL and the "v" for VALUE; I mean the actual column names (in my case "Country", "Address", "Phone", "Company").

I investigated this line:

SharedStringsTable sst = xssfReader.getSharedStringsTable();

Maybe this is supposed to reveal something about the names of the columns? I could not find any examples.

Can someone point me to the documentation on this? How do I get the column names?

JeffGallant
  • 1) `XSSFReader` doesn't read XML files. It reads `.xlsx` files, which are not XML files. 2) Think about the column names you want. Where are they in the Excel spreadsheet? That's right, *they are in row 1*. So read row 1 and get the text values of the cells in that row, because they are just text cells like any other text cell in the spreadsheet. – Andreas Nov 26 '16 at 01:28
  • Andreas, you have no idea what you are talking about. The xlsx format is an XML format. – JeffGallant Nov 26 '16 at 04:45
  • And I am successfully parsing it as XML, I only need to find the names of the columns, which are not in the main XML file. – JeffGallant Nov 26 '16 at 04:45
  • @JeffGallant: "The xlsx format is an XML format." No, it isn't. The `xlsx` format is a `ZIP` archive **containing** `XML` files, but also other files (pictures for example), in a specific directory structure. Rename the `*.xlsx` file to `*.zip` and you can simply unzip it and have a look into it. – Axel Richter Nov 26 '16 at 08:43
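
Putting the comments above together: the column names are ordinary text cells in the first row of the sheet XML, and a text cell (`t="s"`) stores only an index into the shared strings part, which is what the `SharedStringsTable` from the question represents. Here is a minimal sketch of that lookup using only the JDK's StAX parser rather than POI. The class name `HeaderRowSketch` and both method names are made up for illustration. It assumes the first `<row>` element is the header row, and it ignores inline strings and empty header cells.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Apache POI.
public class HeaderRowSketch {

    private static XMLStreamReader open(InputStream in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent text/CDATA chunks so each text node arrives whole.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        return factory.createXMLStreamReader(in);
    }

    /** Streams the shared strings part and returns the text of each <si> entry, in order. */
    static List<String> readSharedStrings(InputStream in) {
        List<String> strings = new ArrayList<>();
        try {
            XMLStreamReader r = open(in);
            StringBuilder current = null;
            boolean inText = false;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    if ("si".equals(r.getLocalName())) current = new StringBuilder();
                    else if ("t".equals(r.getLocalName())) inText = true;
                } else if (event == XMLStreamConstants.CHARACTERS && inText && current != null) {
                    current.append(r.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if ("t".equals(r.getLocalName())) inText = false;
                    else if ("si".equals(r.getLocalName()) && current != null) {
                        strings.add(current.toString());
                        current = null;
                    }
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse shared strings", e);
        }
        return strings;
    }

    /** Streams the sheet part, resolves the cells of the first <row>, then stops reading. */
    static List<String> readHeaderRow(InputStream sheet, List<String> shared) {
        List<String> header = new ArrayList<>();
        try {
            XMLStreamReader r = open(sheet);
            String cellType = null;
            boolean inValue = false;
            StringBuilder value = new StringBuilder();
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    if ("c".equals(r.getLocalName())) {
                        cellType = r.getAttributeValue(null, "t"); // "s" means shared string
                    } else if ("v".equals(r.getLocalName())) {
                        inValue = true;
                        value.setLength(0);
                    }
                } else if (event == XMLStreamConstants.CHARACTERS && inValue) {
                    value.append(r.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if ("v".equals(r.getLocalName())) {
                        inValue = false;
                        String raw = value.toString();
                        // For t="s" the <v> content is an index into the shared strings.
                        header.add("s".equals(cellType) ? shared.get(Integer.parseInt(raw)) : raw);
                    } else if ("row".equals(r.getLocalName())) {
                        break; // only the first row is needed
                    }
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse sheet", e);
        }
        return header;
    }
}
```

With a real file, the two streams could come from `java.util.zip.ZipFile` entries. `xl/sharedStrings.xml` and `xl/worksheets/sheet1.xml` are the usual paths, but the sheet path is defined by the workbook's relationship files, so treat it as an assumption. Note that this sketch still loads every shared string into memory, which may matter for a very large workbook.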
