I have already seen the question "Getting A File's Mime Type In Java", but the selected answer is only a link.

I'm looking for a way to determine the MIME type of a file in Java. I need a tool that can recognize many different types, because I'm writing a web crawler that has to handle a wide range of MIME types.

I tried JMimeMagic, which seems simple and good, but it is in beta and crashes in some cases. Apache Tika does a lot of things, including MIME detection, but it is big. The same is true of several other libraries.

Is there a library dedicated to MIME detection (like JMimeMagic, but working) that recognizes many MIME types and does not rely only on the file extension? If not, are bigger libraries like Apache Tika the right choice?

Renato Dinhani

1 Answer

Apache Tika is the most comprehensive choice so far. I would suggest going with it.
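
For comparison, here is a minimal sketch of content-based detection using only the JDK, with no third-party library. It relies on `URLConnection.guessContentTypeFromStream`, which inspects the leading bytes of a stream rather than the file name. The class name `MimeSniff` and the method name `sniff` are made up for this illustration. The JDK method knows only a small, fixed set of signatures, which is the gap a larger library such as Tika is meant to fill.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of any library.
final class MimeSniff {

    /** Guesses a MIME type from the leading bytes; returns null if unrecognized. */
    static String sniff(byte[] head) {
        // ByteArrayInputStream supports mark/reset, which the JDK method requires.
        try (InputStream in = new ByteArrayInputStream(head)) {
            return URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Same check for a file on disk; the file extension is never consulted. */
    static String sniff(Path file) throws IOException {
        // BufferedInputStream adds the mark/reset support a raw file stream lacks.
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            return URLConnection.guessContentTypeFromStream(in);
        }
    }

    public static void main(String[] args) {
        byte[] png = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
        byte[] gif = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        // A PDF header: an example of a common type the JDK sniffing
        // is not expected to recognize.
        byte[] pdf = "%PDF-1.4".getBytes(StandardCharsets.US_ASCII);

        System.out.println("png: " + sniff(png));
        System.out.println("gif: " + sniff(gif));
        System.out.println("pdf: " + sniff(pdf));
    }
}
```

For a crawler, a sketch like this could serve as a cheap first pass, with a fuller detector consulted whenever the result is `null`.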

Drona
  • I didn't know Tika before, but a brief read of the site leads me to think Tika is neither comprehensive nor appropriate for this case. It appears to be used to extract content from a fairly limited set of file types, not to determine the MIME type of a file, whereas `magic`, and specifically JMimeMagic, supports many hundreds of types. I had looked at JMimeMagic before, and agree it's too broken to use in a production system. – Stephen P Mar 27 '12 at 16:45
  • I have personally used Tika for a similar use case, found it to be fairly good, and strongly recommend using it. – Drona Mar 27 '12 at 16:53
  • Tika does lots of other things in addition to MIME type detection. It supports most file types. It is used for MIME detection and content analysis in Apache's Lucene project. – Drona Mar 28 '12 at 03:02