How to parse HDF files (.h5) using Apache Tika?

Apache Tika provides a parser for .h5 files, but I'm not able to get the data out of the file with it.

Parser parser = new HDFParser();
Metadata metadata = new Metadata();
ContentHandler handler = new BodyContentHandler();
FileInputStream fileInputStream = new FileInputStream(path + h5File);

parser.parse(fileInputStream, handler, metadata, new ParseContext());

I can see the file's metadata, but I can't get any content through the handler.

If anyone has done this, please help me with it.

ketankk

1 Answer

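Simply put, you can't, because of the nature of the HDF file format: it is a binary container for datasets rather than a text document, so Tika's HDFParser exposes the file's attributes as metadata and writes no text to the content handler.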

To retrieve the information you want, call metadata.get("field-name"), passing the field name as a string.
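If you don't know which field names your file carries, one option is to list everything the parser collected. The following is a minimal sketch, assuming Tika's parsers (including `HDFParser`) are on the classpath; the class name `HdfMetadataDump` and the helper `toMap` are made up for illustration, and the file path is taken from the first command-line argument.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.hdf.HDFParser;
import org.xml.sax.helpers.DefaultHandler;

public class HdfMetadataDump {

    // Hypothetical helper: copies every metadata field into a sorted map.
    // Multi-valued fields are joined with ", ".
    static Map<String, String> toMap(Metadata metadata) {
        Map<String, String> fields = new TreeMap<>();
        for (String name : metadata.names()) {
            fields.put(name, String.join(", ", metadata.getValues(name)));
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        Metadata metadata = new Metadata();
        // The handler is a no-op here because only the metadata is of interest.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            new HDFParser().parse(in, new DefaultHandler(), metadata, new ParseContext());
        }
        toMap(metadata).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Once you have seen the printed names, you can fetch a single value with `metadata.get(name)` as described above.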

Alternatively, you can use the NetCDF Java library directly (Tika uses it under the hood).
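A rough sketch of reading the datasets with NetCDF-Java follows. It assumes a 4.x/5.x release of the library on the classpath, where `NetcdfFile.open(String)` is available (newer releases prefer `NetcdfFiles.open`) and `NetcdfFile` can be used in try-with-resources. The class name `HdfContentDump` and the helper `describe` are made up for illustration, and not every HDF5 feature is necessarily readable this way.

```java
import java.io.IOException;
import java.util.Arrays;

import ucar.ma2.Array;
import ucar.nc2.NetcdfFile;
import ucar.nc2.Variable;

public class HdfContentDump {

    // Hypothetical helper: one summary line per variable (name, data type, shape).
    static String describe(Variable v) {
        return v.getFullName() + " : " + v.getDataType() + " " + Arrays.toString(v.getShape());
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the .h5 file.
        try (NetcdfFile file = NetcdfFile.open(args[0])) {
            for (Variable v : file.getVariables()) {
                System.out.println(describe(v));
                // read() loads the whole variable into memory; for large
                // datasets, read a section of it instead.
                Array data = v.read();
                System.out.println("  elements read: " + data.getSize());
            }
        }
    }
}
```

This gives you the actual array data, which the Tika handler never receives.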

Nicomedes E.