-1

I have a large number of image files and I want to find specific files by name. I'm looking for suggestions on how to implement this in Java.

Note: I have tried Apache Lucene, but it didn't work for image files. I think it searches by file content.

Please suggest the best technology for searching files by name across a large volume of image files (terabytes).


EDIT

Example:

If the user enters 'Engine', the results should look like:

    X60_031004_P05_16_AJ126SC_ENGINE_COVER_AWD_2.jt
    X60_031004_P05_16_AJ127SC_ENGINE_COVER.jt

tgr
Anupam Pawar

2 Answers

0

You can use org.apache.commons.io.FileUtils (from Apache Commons IO), like so:

    import java.io.File;
    import java.util.Collection;
    import org.apache.commons.io.FileUtils;

    File root = new File("C:\\");
    String fileName = "Engine";
    String[] extensions = {"jt"};
    boolean recursive = true;
    // Collect every .jt file under root, then filter by name (case-insensitive).
    Collection<File> files = FileUtils.listFiles(root, extensions, recursive);
    String needle = fileName.toLowerCase();
    for (File file : files) {
        if (file.getName().toLowerCase().contains(needle)) {
            System.out.println(file.getAbsolutePath());
        }
    }
  • I have not tried FileUtils, but I doubt it will scale in terms of search time, given the drive size and the volume of files. I will give it a try, though. – Anupam Pawar Oct 13 '17 at 16:41
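On the scaling concern in the comment above: FileUtils.listFiles builds the whole collection of candidate files before any name filtering happens. A minimal JDK-only sketch that filters while walking, using java.nio.file.Files.find, might look like the following. The class name FileNameSearch and its methods are hypothetical, and unreadable directories will surface as exceptions here. It still visits every directory on every query, so it mainly saves memory rather than search time.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of Commons IO.
final class FileNameSearch {

    // Case-insensitive "contains" test on a bare file name.
    static boolean matches(String fileName, String term) {
        return fileName.toLowerCase(Locale.ROOT).contains(term.toLowerCase(Locale.ROOT));
    }

    // Walks root lazily and collects only the regular files whose name contains term.
    static List<Path> search(Path root, String term) {
        try (Stream<Path> hits = Files.find(root, Integer.MAX_VALUE,
                (path, attrs) -> attrs.isRegularFile()
                        && matches(path.getFileName().toString(), term))) {
            return hits.collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: current directory and the term from the question.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        String term = args.length > 1 ? args[1] : "Engine";
        search(root, term).forEach(System.out::println);
    }
}
```

For repeated queries over terabytes of files, walking the tree each time is likely to stay slow, which is the argument for building an index once and querying that instead.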
0

You can use Lucene to search file names or, more generally, image metadata. It's probably a better solution than FileUtils, especially if you want all the nice "search engine" features.

I don't have any experience with this kind of requirement, but I would do it like this:

  • Metadata extraction with Apache Tika (https://tika.apache.org/)
  • Metadata indexing and searching with Apache Lucene
  • Dedicated storage for the images themselves, with a reference to each image inside the Lucene index
dom
  • Thanks. I tried indexing and searching with Apache Lucene, but it didn't work when I pointed the data directory to the drive path of the image/jt files. I tried it on PDF files and it worked. I think Lucene returns file names based on file contents, and since image files have no text content, it does not return their file names/paths. – Anupam Pawar Oct 13 '17 at 16:45
  • Well, you can index the path as a separate field. As I understand it, you can define your own index structure: take the information from Tika and index it in whatever fields you like, for example a StringField that you call path or something similar. And if you don't need to search on a piece of information, you can just use a StoredField. – dom Oct 16 '17 at 11:27
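To make the idea in the last comment concrete, here is a plain-Java stand-in, not Lucene code: the file name is split into tokens that are searchable (playing the role of an indexed field), while the full path is only kept for display (playing the role of a stored field). The class FileNameIndex and the D:/cad/ paths are hypothetical. Note that this sketch matches whole tokens, so 'Engine' hits because ENGINE sits between underscores. How Lucene tokenizes such underscore-separated names depends on the analyzer you choose, so that is worth checking before relying on it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration of "searchable name, stored path"; not the Lucene API.
final class FileNameIndex {

    // token (searchable) -> paths (only returned, never searched)
    private final Map<String, Set<String>> pathsByToken = new HashMap<>();

    // Splits the file name on anything other than ASCII letters and digits
    // and registers the path under each resulting lower-case token.
    void add(String path) {
        int lastSeparator = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String name = path.substring(lastSeparator + 1);
        for (String token : name.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                pathsByToken.computeIfAbsent(token, k -> new LinkedHashSet<>()).add(path);
            }
        }
    }

    // Looks up one term; no file system access happens at query time.
    List<String> search(String term) {
        Set<String> hits = pathsByToken.getOrDefault(
                term.toLowerCase(Locale.ROOT), Collections.emptySet());
        return new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        FileNameIndex index = new FileNameIndex();
        // Hypothetical paths built from the file names in the question.
        index.add("D:/cad/X60_031004_P05_16_AJ126SC_ENGINE_COVER_AWD_2.jt");
        index.add("D:/cad/X60_031004_P05_16_AJ127SC_ENGINE_COVER.jt");
        index.search("Engine").forEach(System.out::println);
    }
}
```

The point of the design is that the expensive directory walk happens once, at indexing time, and each query becomes a map lookup. A real Lucene index adds persistence, ranking, and richer queries on top of the same principle.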