
I have thousands of images that I optimize weekly with a cron job. My problem is that each run also rescans images that have already been optimized, which wastes CPU time. How can I record the time of the last scan/optimization and only optimize files and folders changed after that date?

My Code

find . -name '*.jpg' | xargs jpegoptim --strip-all
find . -name '*.jpg' | xargs jpegoptim --all-progressive
chmod -R 777 *
karadayi
  • `find | xargs` is actually unsafe unless using `-print0` on the `find` side and `-0` on the `xargs` side (unfortunately, these are both non-POSIX options, but fortunately, they're found in modern releases on both GNU and BSD sides of the world). `xargs` treats quotes and spaces in filenames as syntactic, so using the original code if your filenames aren't closely controlled could lead to, at minimum, content being missed. – Charles Duffy Aug 16 '16 at 01:22
  • BTW, `chmod 777` is **very** bad practice. Giving every user on your system -- which includes `nobody` -- write permission to a file means that software that the operating system is running as an untrusted user **because that software is considered to be handling potentially malicious data** can write to your content. `ssh`, for instance, runs code as such a user early in the handshake process (before the remote system's credentials are authenticated). As a rule, `o+w` should *never* be done, and that's doubly true for `o+wx`. – Charles Duffy Aug 16 '16 at 01:31
  • (Now, granted, there's often other sandboxing in place -- `chroot` and the like -- but chroot on Linux is far easier to bypass than it should be -- if that weren't true, after all, there'd be no point to running untrusted components of sandboxed daemons as `nobody`). – Charles Duffy Aug 16 '16 at 01:35
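
The first comment's point about `-print0` and `-0` can be seen in a small sketch. The directory and filenames below are made up for the demo, and `echo` stands in for `jpegoptim` so nothing real is touched; it only counts how many arguments `xargs` ends up passing along.

```shell
#!/bin/bash
# Throwaway directory with a deliberately awkward filename (hypothetical test data).
demo_dir=$(mktemp -d)
touch "$demo_dir/holiday photo.jpg" "$demo_dir/plain.jpg"

# Newline-delimited pipe: xargs splits "holiday photo.jpg" at the space,
# so two files turn into three arguments.
unsafe_count=$(find "$demo_dir" -name '*.jpg' | xargs -n 1 echo | wc -l)

# NUL-delimited pipe: each filename arrives as exactly one argument.
safe_count=$(find "$demo_dir" -name '*.jpg' -print0 | xargs -0 -n 1 echo | wc -l)

echo "unsafe: $unsafe_count arguments, safe: $safe_count arguments"
rm -rf "$demo_dir"
```

Applied to the question's code, that would be `find . -name '*.jpg' -print0 | xargs -0 jpegoptim --strip-all`, assuming a `find` and `xargs` that support those options.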

1 Answer


The easy thing to do is to touch a file to track the most recent processing time, and to tell find to limit itself to content newer than that file.

To keep the prior semantics, where we were running two separate passes, completing all invocations of jpegoptim in one mode before going on to the other:

#!/bin/bash

extra_args=( )
[[ -e last-scan ]] && extra_args=( -newer last-scan )

find . -name '*.jpg' "${extra_args[@]}" -exec jpegoptim --strip-all '{}' +
find . -name '*.jpg' "${extra_args[@]}" -exec jpegoptim --all-progressive '{}' + 
touch last-scan
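
To see the marker-file mechanism in isolation, here's a minimal sketch. Everything in it is a stand-in: the temporary directory, the file names, and the explicit timestamps are invented for the demo, and `echo` replaces `jpegoptim`. It just shows which files `-newer` lets through.

```shell
#!/bin/bash
demo_dir=$(mktemp -d)
marker="$demo_dir/last-scan"

# Explicit timestamps (POSIX `touch -t`) so the demo doesn't depend on timing.
touch -t 202001010000 "$demo_dir/old.jpg"   # modified before the last scan
touch -t 202101010000 "$marker"             # marker left by the previous run
touch -t 202201010000 "$demo_dir/new.jpg"   # modified after the last scan

extra_args=( )
[[ -e $marker ]] && extra_args=( -newer "$marker" )

# `echo` is a stand-in for jpegoptim; only new.jpg should be listed.
selected=$(find "$demo_dir" -name '*.jpg' "${extra_args[@]}" -exec echo '{}' +)
echo "$selected"
```

On the very first run the marker doesn't exist yet, so `extra_args` stays empty and every `*.jpg` is processed.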

As an alternative, consider:

#!/bin/bash

extra_args=( )
[[ -e last-scan ]] && extra_args=( -newer last-scan )

find . -name '*.jpg' "${extra_args[@]}" \
  -exec sh -c 'jpegoptim --strip-all "$@"; jpegoptim --all-progressive "$@"' _ '{}' +

touch last-scan

In this latter approach, we're doing only one find pass, and then passing each batch of files to a shell, which is responsible for running jpegoptim in each mode in turn for that batch.
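
The `sh -c '...' _ '{}' +` idiom can be tried out without `jpegoptim`. In this sketch (test directory and file names are made up, and the two `printf` calls stand in for the two `jpegoptim` runs), `_` fills `$0` of the inner shell and the filenames `find` appends show up as `"$@"`:

```shell
#!/bin/bash
demo_dir=$(mktemp -d)
touch "$demo_dir/a.jpg" "$demo_dir/b c.jpg"

# `_` becomes $0 of the inner shell; the filenames find appends become "$@".
# Each printf stands in for one jpegoptim mode and reports how many
# filenames it received; "b c.jpg" counts as a single argument.
batch_log=$(find "$demo_dir" -name '*.jpg' \
  -exec sh -c 'printf "pass1:%s\n" "$#"; printf "pass2:%s\n" "$#"' _ '{}' +)
echo "$batch_log"
rm -rf "$demo_dir"
```

Both stand-ins see the same batch, one after the other, which is what keeps the two modes in order within each batch.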


Finally: if jpegoptim is safe for concurrent usage, you could do the following:

#!/bin/bash

extra_args=( )
[[ -e last-scan ]] && extra_args=( -newer last-scan )

find . -name '*.jpg' "${extra_args[@]}" \
  -exec jpegoptim --strip-all '{}' + \
  -exec jpegoptim --all-progressive '{}' + 
touch last-scan

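Here, we have a single find pass directly starting both copies of jpegoptim. As far as I know, find waits for each command it launches, so the two shouldn't literally overlap; but each -exec ... + action collects and flushes its own batch of filenames, so the order in which the two modes reach a given file is no longer guaranteed. If jpegoptim --strip-all and jpegoptim --all-progressive can't safely operate on the same files in either order, this may behave badly.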
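
To watch how a single find drives two `-exec ... +` actions, here's a small sketch with stand-ins instead of `jpegoptim` (the directory, file names, and messages are invented for the demo). Each action gets its own batch of filenames; the order in which find flushes the two batches is left to find, so the sketch only reports that both ran.

```shell
#!/bin/bash
demo_dir=$(mktemp -d)
touch "$demo_dir/a.jpg" "$demo_dir/b.jpg"

# Two independent -exec ... + actions in one traversal. Each inner shell is a
# stand-in for one jpegoptim mode and reports the size of the batch it was given.
run_log=$(find "$demo_dir" -name '*.jpg' \
  -exec sh -c 'echo "strip-all stand-in: $# file(s)"' _ '{}' + \
  -exec sh -c 'echo "all-progressive stand-in: $# file(s)"' _ '{}' +)
echo "$run_log"
rm -rf "$demo_dir"
```

If the order of the two modes matters for your files, the single `sh -c` variant shown earlier keeps it explicit.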

Charles Duffy