1

I'm trying to group a bunch of files together based on RecipeID and StepID. Instead of storing all of the filenames in a table, I've decided to just use glob to get the images for the requested recipe. I feel like this will be more efficient and involve less data handling, keeping in mind that the directory will eventually contain many thousands of images. If I'm wrong about this, then the question below is not necessary lol

So let's say I have RecipeID #5 (nachos, mmmm) and it has 3 preparation steps. The naming convention I've decided on would be as such:

5_1_getchips.jpg
5_2_laycheese.jpg
5_2_laytomatos.jpg
5_2_laysalsa.jpg
5_3_bake.jpg
5_finishednachos.jpg
5_morefinishedproduct.jpg

The files may be generated by a camera, so DSC###.jpg...or the person may have actually named each picture as I have above. Multiple images can exist per step. I'm not sure how I'll handle dupe filenames, but I feel that's out of scope.

I want to get all of the "5_" images...but filter them by all the ones that DON'T have any step # (grouped in one DIV), and then get all the ones that DO have a step (grouped in their respective DIVs).

I'm thinking of something like

foreach ( glob( $IMAGES_RECIPE . $RecipeID . "_*.*") as $image)

and then using a substr to filter out the step# but I'm concerned about getting the logic right because what if the original filename already has _#_ in it for some reason. Maybe I need to have a strict naming convention that always includes _0_ if it doesn't belong to a step.
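
A minimal sketch of that filtering idea, using preg_match instead of substr. The function name groupImagesByStep and the choice of key 0 for images without a step are assumptions made for illustration; they are not part of any existing code.

```php
<?php
// Hypothetical helper: splits the file names of one recipe into per-step groups.
// "5_2_laycheese.jpg" goes under step 2; "5_finishednachos.jpg" has no step
// number and goes under key 0.
function groupImagesByStep(array $paths, int $recipeId): array
{
    $groups = [];
    foreach ($paths as $path) {
        $name = basename($path);
        if (preg_match('/^' . $recipeId . '_(\d+)_(.+)$/', $name, $m)) {
            // Step image: <recipe>_<step>_<rest>
            $groups[(int) $m[1]][] = $name;
        } elseif (preg_match('/^' . $recipeId . '_(.+)$/', $name)) {
            // Recipe-level image: <recipe>_<rest>
            $groups[0][] = $name;
        }
        // Anything else (e.g. recipe 50 when asking for recipe 5) is skipped.
    }
    ksort($groups);
    return $groups;
}

$names = ['5_1_getchips.jpg', '5_2_laycheese.jpg', '5_2_laysalsa.jpg',
          '5_3_bake.jpg', '5_finishednachos.jpg', '50_1_other.jpg'];
$groups = groupImagesByStep($names, 5);
```

In practice the first argument would be the result of the glob call, and each entry of $groups would become one DIV. The sketch cannot tell a step image from a recipe-level image whose original name happened to start with digits and an underscore, which is the ambiguity the strict _0_ convention would remove.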

Thoughts?

hek2mgl
Bodi

2 Answers

2

Globbing through thousands of files will never be faster than indexing those files in a database (of whatever type) and running a query against it. That's what databases are meant for.
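
As a rough sketch of what such an index could look like, assuming PDO with the SQLite driver is available: the table name recipe_images, its columns, and the use of NULL for images without a step are all illustrative assumptions, not something from the question.

```php
<?php
// Assumed schema: one row per image; step_id is NULL for recipe-level images.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE recipe_images (
    id INTEGER PRIMARY KEY,
    recipe_id INTEGER NOT NULL,
    step_id INTEGER,
    filename TEXT NOT NULL
)');
// The index lets the lookup by recipe avoid scanning every row.
$db->exec('CREATE INDEX idx_recipe_step ON recipe_images (recipe_id, step_id)');

$insert = $db->prepare(
    'INSERT INTO recipe_images (recipe_id, step_id, filename) VALUES (?, ?, ?)'
);
foreach ([[5, 1, 'getchips.jpg'], [5, 2, 'laycheese.jpg'],
          [5, null, 'finishednachos.jpg'], [6, 1, 'other.jpg']] as $row) {
    $insert->execute($row);
}

// Fetch every image of recipe 5, ordered by step.
$select = $db->prepare(
    'SELECT step_id, filename FROM recipe_images
     WHERE recipe_id = ? ORDER BY step_id, id'
);
$select->execute([5]);
$images = $select->fetchAll(PDO::FETCH_ASSOC);
```

With a real database the connection string would differ, but the query shape stays the same: one indexed lookup by recipe_id instead of a scan of the whole image directory.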

hek2mgl
  • @Bodi Yeah, believe me. I don't know of a filesystem that is optimized for such queries. Globbing always means iterating through all the directory entries, while an indexed database solution could offer far better performance. – hek2mgl May 05 '15 at 21:09
  • So then should I store the filenames as "RecipeID_filename.jpg" so that duplicates are less of an issue, or should I rename all uploaded files to an incremental ID and store the original filename in that row? Edit: I will also have a folder for each of the resized image sizes (30px, 40, 60, 100, 150). The filename should be identical. – Bodi May 05 '15 at 21:36
  • It depends. I prefer to keep the original file names, at least as a prefix with an id appended. This way it is easier to browse through a directory manually if something fails... – hek2mgl May 05 '15 at 21:42
0

I had a similar issue with 15,000 mp3 songs.

In the Windows command line, I used dir to list them:

dir *.mp3 /b /s > mp3.bat

I then used a regex search and replace in Notepad++ that converted the file names, prefixing and appending text to turn each line into a rename statement, and ran mp3.bat.


Something like this might work for you in PHP:

  1. Use preg_replace with a regex to extract the digits
  2. Create a logic table(s) to create the words for the new file names
  3. Apply the new filename with rename()

Here is some simplified and UNTESTED Example code to show what I am suggesting.

Example Logic Table:

 $translation[x][y][z] = "phrase";
 $translation[x][y][z] = "phrase";
 $translation[x][y][z] = "phrase";
 $translation[x][y][z] = "phrase";

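  $folder = '/home/user/public_html/recipies/';
  $files = [];
  $dir = opendir($folder);
  while (false !== ($found = readdir($dir))) {
    // PATHINFO_EXTENSION returns the extension without the leading dot
    if (pathinfo($found, PATHINFO_EXTENSION) == 'jpg')
     {
       $files[] = $found;
     }
  }
  closedir($dir);
  foreach ($files as $key => $filename) {
    // Pull each of the three digits out of a camera name such as DSC123.jpg
    $digit1 = preg_replace('/DSC(\d)\d\d\.jpg/', "$1", $filename);
    $digit2 = preg_replace('/DSC\d(\d)\d\.jpg/', "$1", $filename);
    $digit3 = preg_replace('/DSC\d\d(\d)\.jpg/', "$1", $filename);
    // Skip files that have no entry in the logic table
    if (!isset($translation[$digit1][$digit2][$digit3])) continue;
    $newName = $translation[$digit1][$digit2][$digit3];
    rename($folder . $filename, $folder . $newName);
  }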
Misunderstood