
I want to extract only the images from a zip file, including images found in subfolders. How can I achieve this based on my code below? Note: I am not trying to preserve the directory structure here; I just want to extract any image found in the zip.

//extract files in zip
for ($i = 0; $i < $zip->numFiles; $i++) {
    $file_name = $zip->getNameIndex($i);
    $file_info = pathinfo($file_name);
    //if ( substr( $file_name, -1 ) == '/' ) continue; // skip directories - need to improve
    if (in_array($file_info['extension'], $this->config->getValidExtensions())) {
        //extract only images
        copy("zip://" . $zip_path . "#" . $file_name, $this->tmp_dir . '/images/' . $file_info['basename']);
    }
}
$zip->close();

Edit

My code works fine; all I need to know is how to make ZipArchive go into subdirectories as well.

Oliveira
user2650277
  • Why do you think your code doesn't go into subdirectories? I have created `a.zip` with files `a/b/c.png`, `d.png`. Your code extracted both `d.png` and `c.png` from `a.zip` into the destination directory. Then it is unclear what is the expected behavior. – Ruslan Osmanov Nov 05 '16 at 04:58
  • @Ruslan Osmanov you were right, the code works fine... the error I was getting was completely unrelated – user2650277 Nov 05 '16 at 07:31
  • @Ruslan Osmanov post as an answer so that I can accept – user2650277 Nov 05 '16 at 07:32

4 Answers


Your code is correct. I created a.zip containing the files a/b/c.png and d.png:

$ mkdir -p a/b
$ zip -r a.zip d.png a
  adding: d.png (deflated 4%)
  adding: a/ (stored 0%)
  adding: a/b/ (stored 0%)
  adding: a/b/c.png (deflated 8%)

$ unzip -l a.zip 
Archive:  a.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
   122280  11-05-2016 14:45   d.png
        0  11-05-2016 14:44   a/
        0  11-05-2016 14:44   a/b/
    36512  11-05-2016 14:44   a/b/c.png
---------                     -------
   158792                     4 files

The code extracted both d.png and c.png from a.zip into the destination directory:

$arch_filename = 'a.zip';
$dest_dir = './dest';
if (!is_dir($dest_dir)) {
  if (!mkdir($dest_dir, 0755, true))
    die("failed to make directory $dest_dir\n");
}

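$zip = new ZipArchive;
// open() returns true on success and an integer error code on failure
if ($zip->open($arch_filename) !== true)
  die("failed to open $arch_filename");

for ($i = 0; $i < $zip->numFiles; ++$i) {
  $path = $zip->getNameIndex($i);
  $ext = pathinfo($path, PATHINFO_EXTENSION);
  // anchor the pattern so that only the exact extensions match
  if (!preg_match('/^(?:jpg|png)$/i', $ext))
    continue;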
  $dest_basename = pathinfo($path, PATHINFO_BASENAME);
  echo $path, PHP_EOL;
  copy("zip://{$arch_filename}#{$path}", "$dest_dir/{$dest_basename}");
}

$zip->close();

Testing

$ php script.php
d.png
a/b/c.png

$ find ./dest -type f
./dest/d.png
./dest/c.png

So the code is correct, and the issue must be somewhere else.
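
One thing to watch when flattening: two entries in different folders can share a basename (say a/logo.png and b/logo.png), and the second copy then silently overwrites the first. Below is a minimal sketch of one way to handle that. The helper name extract_images_flat() and the numeric-suffix renaming scheme are assumptions made for illustration, not part of the question, and the code assumes PHP 7 or later with the zip extension loaded.

```php
<?php
// Hypothetical helper: copies every entry with an allowed extension out of
// $zip_path into $dest_dir, ignoring the folder structure inside the archive.
// Returns a map of archive entry name => path written on disk.
function extract_images_flat(string $zip_path, string $dest_dir, array $exts = ['jpg', 'jpeg', 'png', 'gif']): array
{
    $zip = new ZipArchive();
    // open() returns true on success and an integer error code otherwise.
    if ($zip->open($zip_path) !== true) {
        throw new RuntimeException("failed to open $zip_path");
    }
    if (!is_dir($dest_dir) && !mkdir($dest_dir, 0755, true)) {
        throw new RuntimeException("failed to create $dest_dir");
    }

    $written = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        // Directory entries end with a slash and carry no data to copy.
        if (substr($name, -1) === '/') {
            continue;
        }
        $info = pathinfo($name);
        // Entries without an extension simply do not match.
        $ext = strtolower($info['extension'] ?? '');
        if (!in_array($ext, $exts, true)) {
            continue;
        }
        // Assumed naming scheme: append _1, _2, ... while the name is taken.
        $target = $dest_dir . '/' . $info['basename'];
        for ($n = 1; file_exists($target); $n++) {
            $target = $dest_dir . '/' . $info['filename'] . '_' . $n . '.' . $info['extension'];
        }
        $data = $zip->getFromIndex($i);
        if ($data !== false && file_put_contents($target, $data) !== false) {
            $written[$name] = $target;
        }
    }
    $zip->close();
    return $written;
}
```

Calling extract_images_flat('a.zip', './dest') on an archive that contains both d.png and a/d.png would write d.png and d_1.png under this scheme. Files that already exist in the destination directory also trigger the renaming, which may or may not be what you want.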

Ruslan Osmanov

If you are happy to filter by file extension (not necessarily the most reliable method), you might find the following helpful.

/* source zip file and target location for extracted files */
$file='c:/temp2/experimental.zip';
$destination='c:/temp2/extracted/';

/* Image file extensions to allow */
$exts=array('jpg','jpeg','png','gif','JPG','JPEG','PNG','GIF');
$files=array();

/* create the ZipArchive object */
$zip = new ZipArchive();
/* open() returns true on success and an integer error code on failure */
$status = $zip->open( $file );

if( $status === true ){

    /* how many files are in the archive */
    $count = $zip->numFiles;

    for( $i=0; $i < $count; $i++ ){
        try{

            $name = $zip->getNameIndex( $i );
            $ext = pathinfo( $name, PATHINFO_EXTENSION );
            $basename = pathinfo( $name, PATHINFO_BASENAME );

            /* store a reference to the file name for extraction or copy */
            if( in_array( $ext, $exts ) ) {
                $files[]=$name;

                /* To extract files and ignore directory structure */
                $res = copy( 'zip://'.$file.'#'.$name, $destination . $basename );
                echo ( $res ? 'Copied: '.$basename : 'unable to copy '.$basename ) . '<br />';
            }

        }catch( Exception $e ){
            echo $e->getMessage();
            continue;
        }
    }
    /* To extract files, with original directory structure, uncomment below */
    if( !empty( $files ) ){
        #$zip->extractTo( $destination, $files );
    }
    $zip->close();

} else {
    echo 'Unable to open archive, error code: ' . $status;
}
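
Since the file extension is not the most reliable signal, here is a rough sketch of checking the content of each entry instead. The helper name list_image_entries() is hypothetical, and the code assumes PHP 7 or later with the zip and fileinfo extensions enabled. Note that it reads each entry fully into memory, which may be too costly for very large archives.

```php
<?php
// Hypothetical helper: returns the names of archive entries whose *content*
// is detected as an image, regardless of their file extension.
function list_image_entries(string $zip_path): array
{
    $zip = new ZipArchive();
    // open() returns true on success and an integer error code otherwise.
    if ($zip->open($zip_path) !== true) {
        throw new RuntimeException("failed to open $zip_path");
    }
    // finfo is provided by the fileinfo extension (assumed to be enabled).
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $images = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        $data = $zip->getFromIndex($i);
        if ($data === false || $data === '') {
            continue; // unreadable entries, directory entries and empty files
        }
        // Detect the MIME type from the bytes rather than from the name.
        $mime = $finfo->buffer($data);
        if (is_string($mime) && strpos($mime, 'image/') === 0) {
            $images[] = $name;
        }
    }
    $zip->close();
    return $images;
}
```

The returned names can then be copied out with the zip:// wrapper as above, or passed to extractTo() if the directory structure should be kept.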
Professor Abronsius

This will let you traverse all of the directories under a path and collect anything that is an image, that is, anything with one of the extensions you have defined. Since you told the other user that you have the ZipArchive portion done, I have omitted that.

<?php

function traverse($path, $images = [])
{
    $files = array_diff(scandir($path), ['.', '..']);

    foreach ($files as $file) {
        // check if the file is an image
        if (in_array(strtolower(pathinfo($file, PATHINFO_EXTENSION)), ['jpg', 'jpeg', 'png', 'gif'])) {
            $images[] = $file;
        }

        if (is_dir($path . '/' . $file)) {
            $images = traverse($path . '/' . $file, $images);
        }
    }

    return $images;
}

$images = traverse('/Users/kyle/Downloads');

You want to follow this process:

  1. Get all of the files in the current working directory
  2. If a file in the CWD is an image add it to the images array
  3. If a file in the CWD is a directory, recursively call the traverse function to look for images in that directory
  4. In the new CWD, look for images; if a file is a directory, recurse again, and so on

It is important to keep track of the current path so that you can call is_dir on the file. Also make sure not to descend into '.' or '..', or the recursion will never reach its base case and will run forever.

Also, this will not keep the directory path for the image! If you want that, use $images[] = $path . '/' . $file; instead. You may want to do that and then read the file contents once the function finishes running. I wouldn't recommend storing the contents in the $images array because it could use an absurd amount of memory.
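
As a rough sketch of the full-path variant described above: the function name find_images() and its $exts parameter are hypothetical, and the code assumes PHP 7 or later. It also checks is_dir before the extension, so a directory that happens to be named like an image is not reported as one.

```php
<?php
// Hypothetical variant of traverse(): returns full paths instead of bare names.
function find_images(string $path, array $exts = ['jpg', 'jpeg', 'png', 'gif']): array
{
    $images = [];
    $entries = scandir($path);
    if ($entries === false) {
        return $images; // unreadable directory
    }
    // Skip '.' and '..' so the recursion cannot loop on itself.
    foreach (array_diff($entries, ['.', '..']) as $entry) {
        $full = $path . '/' . $entry;
        if (is_dir($full)) {
            // Recurse and merge the results from the subdirectory.
            $images = array_merge($images, find_images($full, $exts));
        } elseif (in_array(strtolower(pathinfo($entry, PATHINFO_EXTENSION)), $exts, true)) {
            $images[] = $full;
        }
    }
    return $images;
}
```

Only the paths are collected here; the file contents can be read one at a time afterwards, which keeps memory use low.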

kyle

To follow a folder, your code first has to take it into account, and it does not do this.

There are no folders in a ZIP (in fact, even in the file system a "folder" IS a file, just a special one). Each file (data) has a name, which may contain a path (most likely a relative one). If by "go in subdirectories" you mean that you want the same relative folder structure of the zipped files in your file system, you must write code to create these folders. I think copy won't do that for you automatically.

I modified your code and added the creation of folders. Mind the config variables I had to add to make it runnable; adjust them to your environment. I also left all my debug output in. The code works for me standalone on Windows 7, PHP 5.6.

error_reporting(-1 );
ini_set('display_errors', 1);
$zip_path = './test/cgiwsour.zip';
$write_dir = './test'; // base path for output

$zip = new ZipArchive();
// open() returns true on success and an integer error code on failure
if ($zip->open($zip_path) !== true)
    die('could not open zip file '.PHP_EOL);
$valid_extensions = ['cpp'];
$create_subfolders = true;

//extract files in zip
for ($i = 0; $i < $zip->numFiles; $i++) {
    $file_name = $zip->getNameIndex($i);var_dump($file_name, $i);
    $file_info = pathinfo($file_name);//print_r($file_info);
    //if ( substr( $file_name, -1 ) == '/' ) continue; // skip directories - need to improve
    if (isset($file_info['extension']) && in_array(strtolower($file_info['extension']), $valid_extensions)) {

        $tmp_dir = $write_dir;
        if ($create_subfolders) {
            $dir_parts = explode('/', $file_info['dirname']);
            print_r($dir_parts);
            foreach($dir_parts as $folder) {
                $tmp_dir = $tmp_dir . '/' . $folder;
                var_dump($tmp_dir);
                if (!file_exists($tmp_dir)) { 
                    $res = mkdir($tmp_dir);
                    var_dump($res);
                    echo 'created '.$tmp_dir.PHP_EOL;
                }
            }
        }
        else {
            $tmp_dir .= '/' . $file_info['dirname']; 
        }
        //extract only images

        $res = copy("zip://" . $zip_path . "#" . $file_name,  $tmp_dir . '/' . $file_info['basename']);
        echo 'match : '.$file_name.PHP_EOL;
        var_dump($res);
    }
}
$zip->close();

Note that the mkdir() calls may fail on some systems due to access or permission restrictions.
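
The folder-by-folder loop can also be replaced by passing true as the third argument of mkdir(), which creates all missing parent folders in one call. A compact sketch under a few assumptions: the helper name extract_with_folders() is made up for illustration, the code needs PHP 7 or later (unlike the PHP 5.6 version above), and the check for ".." path segments is an extra precaution that the code above does not have.

```php
<?php
// Hypothetical helper: extracts matching entries while recreating their
// relative folders under $write_dir. Returns the list of files written.
function extract_with_folders(string $zip_path, string $write_dir, array $exts): array
{
    $zip = new ZipArchive();
    // open() returns true on success and an integer error code otherwise.
    if ($zip->open($zip_path) !== true) {
        throw new RuntimeException("could not open $zip_path");
    }
    $written = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        if (substr($name, -1) === '/') {
            continue; // directory entry, nothing to copy
        }
        $info = pathinfo($name);
        if (!in_array(strtolower($info['extension'] ?? ''), $exts, true)) {
            continue;
        }
        // Skip names that try to climb out of $write_dir (e.g. "../x.png").
        if (in_array('..', explode('/', str_replace('\\', '/', $name)), true)) {
            continue;
        }
        $dir = $info['dirname'] === '.' ? $write_dir : $write_dir . '/' . $info['dirname'];
        // The third argument makes mkdir() create all missing parent folders.
        if (!is_dir($dir) && !mkdir($dir, 0755, true)) {
            continue; // e.g. insufficient permissions
        }
        $target = $dir . '/' . $info['basename'];
        if (copy('zip://' . $zip_path . '#' . $name, $target)) {
            $written[] = $target;
        }
    }
    $zip->close();
    return $written;
}
```

For example, extract_with_folders('./test/cgiwsour.zip', './test', ['cpp']) would mirror the relative folders of every .cpp entry below ./test.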

Honk der Hase