I want to read a PPT file, so I tried opening it with the Apache POI library. This is what I tried:

POIFSFileSystem posF = new POIFSFileSystem(fileInputStream);

It throws the following error:

java.io.IOException: Invalid header signature; read 4851293027410584380, expected -2226271756974174256
at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:112)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:151)

This question has been asked on Stack Overflow many times, and I tried all the suggested solutions, but none of them worked.

Wolfgang Fahl
manishKumarSingh
  • Thanks for the reply. I tried the following code to check with Apache Tika: POIFSContainerDetector detector = new POIFSContainerDetector(); MediaType mdType = detector.detect(fis, new Metadata()); but it throws java.io.IOException: mark/reset not supported at java.io.InputStream.reset(Unknown Source) at org.apache.tika.parser.microsoft.POIFSContainerDetector.detect(POIFSContainerDetector.java:158). Any further help would be highly appreciated. – manishKumarSingh Dec 13 '12 at 08:18

1 Answer

That error tells you that your file isn't actually a PPT file after all: it isn't an OLE2 file, which is the underlying container format that .ppt is based on.

To work out what your file actually is, I'd suggest either running the file utility on a nearby Unix box, or using Apache Tika via TikaCLI with --detect. Once you know what the file really is (hint: it's not a .ppt, hence the error!), you can pick the right library to open it with.
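If neither tool is handy, a quick check of the first 8 bytes can at least confirm whether the file is OLE2. The sketch below is only an illustration using the plain JDK: the class name Ole2Sniffer and the method looksLikeOle2 are made up for this example and are not part of POI or Tika. The magic bytes are the little-endian form of the "expected" value in the exception message above. The method requires a stream that supports mark/reset, which a bare FileInputStream does not; that is also why the Tika call in the comment fails with "mark/reset not supported".

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Apache POI or Tika.
final class Ole2Sniffer {

    // The 8-byte signature at the start of every OLE2 compound document.
    // Read as a little-endian long, this is -2226271756974174256,
    // the "expected" value in the POI exception message.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, (byte) 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, (byte) 0x1A, (byte) 0xE1
    };

    // Returns true if the stream starts with the OLE2 signature.
    // The stream is rewound afterwards, so it can still be handed to a parser.
    static boolean looksLikeOle2(InputStream in) throws IOException {
        if (!in.markSupported()) {
            // e.g. a raw FileInputStream; wrap it in a BufferedInputStream first
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(OLE2_MAGIC.length);
        try {
            byte[] head = new byte[OLE2_MAGIC.length];
            int n = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
            return n == head.length && Arrays.equals(head, OLE2_MAGIC);
        } finally {
            in.reset();
        }
    }
}
```

To use it on a file, wrap the stream first, for example new BufferedInputStream(new FileInputStream(path)), and only pass it on to POIFSFileSystem when looksLikeOle2 returns true. A false result says nothing about what the file actually is, so the file utility or Tika is still the way to identify it.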

Gagravarr