1

So I have an excel file which looks something like this:

enter image description here

What I want is to be able to load the excel file with apache poi where I can filter specific cellvalues from specific columns.

for eg. I only want to get rows where for eg. the "Spuranzahl" is 2 and "Fahrbahnbreite" <34.

Right now I load the ENTIRE Excelfile using XSSF Worksheet, but is it possible to only load SPECIFIC rows. smth like an query before I load the file?

As my file is quite large it takes some time to load everyting. That is the reason I want to load specific data.

XtremeBaumer
  • 6,275
  • 3
  • 19
  • 65
plshm
  • 288
  • 3
  • 20
  • 2
    You can't filter content before reading, there is no way around it – XtremeBaumer Jul 13 '18 at 11:43
  • @XtremeBaumer hmm ok. is it possible to ave two different cell ranges for autofilter. for example. for example : i want all rows from range A1:C1 and E1:G1. so at the end I get all rows only containing only 6 cells? – plshm Jul 13 '18 at 11:55
  • Again, not before the file is entirely loaded. After that you can collapse/delete the columns you don't need – XtremeBaumer Jul 13 '18 at 12:03

1 Answers1

1

you can filter while reading, although might be slower. Maybe index it in chunks with temporary index files (I haven't used apache so I don't know how apache does this).

Save the items of the first line in a map for fast lookup (header row -> for keyname and index respectively), then iterate every row, split it, and check if the index of key X matches your filter. Save that row and break or continue for other rows.