
I'm using ack as part of a bash script to build a list of MP4 files from a mysqldump. The dump file is about 15 MB.

Here's my line:

    ack -o "https??://cdn.host.com\S+?\.mp4" /home/me/.dump.sql > /home/me/.mp4-matches.txt

It works fine when I run the bash script by hand. We get this in .mp4-matches.txt:

    https://cdn.host.com/url/t_Foo/Foo.mp4
    https://cdn.host.com/url/t_Bar/Bar.mp4

But when running the very same command by itself as a cronjob, it produces an empty file.

I can't figure out why it's not working in cron.

I've tried fiddling with PATH, SHELL, etc. in the crontab to try to ensure the environment is the same as when running it by hand. Nothing made a difference.

I've tried using absolute paths throughout the crontab, including /usr/bin/ack for ack, just to be sure. Didn't make a difference.

I've tried using bash -l to start the script. Didn't make a difference.

What am I doing wrong?

Edits for further info:

  • I am running Debian 7, but I also tested on Ubuntu 18.04 and the same issue appears there too.

  • The server is running exim4, all working. Nothing is being sent to the MAILTO address about this line in the cron.

  • There's nothing in /var/log/syslog or /var/log/messages concerning errors from cron.

  • The script uses bash (#!/bin/bash).

  • There are no percent signs (%) anywhere in the crontab.

  • Crontab looks like this:

    MAILTO=myemail@address.com
    BASH=/bin/bash
    SHELL=/bin/bash
    PATH=/usr/local/bin:/usr/bin:/bin
    
    27 9 * * * /usr/bin/ack -o "https??://cdn.host.com\S+?\.mp4" /home/me/.dump.sql > /home/me/.mp4-matches.txt
    
  • Yes, it's my crontab, not root.

  • There is no .ackrc file being read. It does not exist anywhere on my system. find / -iname ".ackrc" returned nothing.

  • A simple expression "https://cdn.host.com/url/t_Foo/Foo.mp4" still returns an empty file in cron.

  • Both single quotes and double quotes produce an empty file in cron.

  • stderr seems to provide nothing either. Running /usr/bin/ack -o 'https://cdn.host.com/url/t_Foo/Foo.mp4' /home/me/.dump.sql > /home/me/.mp4-matches.txt 2> /home/me/.ackerror.txt left both .mp4-matches.txt and .ackerror.txt at 0 bytes.

nooblag
  • I'd first try to see if the cronjob produces output on an ack search of a simple expression, without the special characters. Which shell does cron run this in btw.? Also, while debugging this, of course leave all the paths explicitly as full paths. –  Apr 20 '19 at 22:44
  • I presume you don't have any percent signs (`%`) in the crontab... – Mark Setchell Apr 20 '19 at 22:47
  • Do you have an ~/.ackrc file that isn't read by the cron job, but that is read when you perform the cli command? –  Apr 20 '19 at 23:02
  • Try having the ack regex in single quotes, not double quotes. –  Apr 20 '19 at 23:08
  • And finally, you might want to check out this elegant piece of advice: https://raymii.org/s/tutorials/Better_cron_env_and_shell_control_with_the_SHELL_variale.html –  Apr 20 '19 at 23:16
  • Did you put it in your own crontab, or in root's crontab? Did you use tilde (`~`) in place of `/home/me`? Where is the cronjob time specified? – Mark Setchell Apr 20 '19 at 23:20
  • Have you checked stderr from that command? I think it shows up in the cron log, or you could redirect stderr to a file (`2> filename`) – Anish Goyal Apr 20 '19 at 23:21
  • I find it a bit peculiar you did `find / -iname ".ackrc"` whereas you could have sufficed with `find $HOME -iname ".ackrc"`. Anyway... –  Apr 20 '19 at 23:21
  • Try changing the redirection to `>| .mp4-matches.txt` (i.e. overwrite file if existing). –  Apr 20 '19 at 23:31
  • I think mailto needs to be a local user (i.e. simply just your username), unless you have a full fledged mail server running. So just something like `MAILTO=`. On second thought, I think I'm prob. mistaken on this. –  Apr 20 '19 at 23:45
  • To clarify: the local user prob. works for MAILTO w/o a mail server running, for cron will in all likelihood just place a mail-file in /var/spool/mail/ –  Apr 20 '19 at 23:56
  • You can add `2>&1` to the end of your cron line to redirect the error output to the log file as well. Then, look at or post the log file. Right now, the error is probably in the email. – Keldorn Apr 21 '19 at 00:01
  • A comment on a debugging approach: when trying to get to the bottom of something like this, I usually would take the command line out of the crontab and put it in a script in my homedir, let's say doit.sh (chmod 700, etc.), and cleanly just call that with full path in the crontab. Oftentimes it's easier that way to add error reporting to that script, make it create run files (e.g. "~/.doit-ran-ok), etc. So in general a bit of separation. –  Apr 21 '19 at 00:03
  • @roadowl `>|` is not valid `sh` syntax and is thus unlikely to work in `cron` (at least not portably). The standard POSIX redirection to join stdout and stderr to the same file is `>filename 2>&1` – tripleee Apr 21 '19 at 07:25
  • You say 'Using bash #!/bin/bash', but do you have 'SHELL=/bin/bash' somewhere near the top in your crontab? –  Apr 21 '19 at 13:56
  • For now I think your best bet is to use `grep -o` instead of ack. That works for me. I couldn't get ack to work. –  Apr 21 '19 at 14:50
  • Thanks for that. I did start out using `grep`, but it wasn't working cleanly. Seemed to have troubles matching lines ending with `,')` in the mysqldump and so returning huge blobs of junk in the `.mp4-matches.txt`. Switched to `ack` and it was working very well all the way up until running the script as a cronjob. – nooblag Apr 21 '19 at 18:18
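
The wrapper-script approach suggested in the comments could look something like the sketch below. It is only an illustration: the helper name `run_logged`, the script name `doit.sh`, and the log location are hypothetical, and the paths in the example call are the ones from the question.

```shell
#!/bin/bash
# Hypothetical wrapper, e.g. saved as /home/me/doit.sh and called from the crontab.
# LOG is an assumed log location; override it in the environment if needed.
LOG="${LOG:-/tmp/doit.log}"

# Run a command with stdin detached from whatever cron provides,
# append its stderr to $LOG, and record its exit status there too.
run_logged() {
    "$@" </dev/null 2>>"$LOG"
    local status=$?
    echo "$(date '+%F %T') exit=$status cmd=$*" >>"$LOG"
    return "$status"
}

# Example call (stdout of the wrapped command still goes to the caller's stdout):
# run_logged /usr/bin/ack -o 'https??://cdn.host.com\S+?\.mp4' /home/me/.dump.sql > /home/me/.mp4-matches.txt
```

After a cron run, the log shows whether the command ran at all, what it wrote to stderr, and what it exited with, which narrows down whether the problem is the environment or the command itself.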

1 Answer

  1. This should work:

    * * * * * ack -o "https??://cdn.host.com\S+?\.mp4" /home/me/.dump.sql </dev/null > /home/me/.mp4-matches.txt
    
  2. Alternatively, your original line will probably work if you add the --nofilter option.
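
Both suggestions change how ack sees its standard input, so it may help to check what stdin actually is in each context. The following is a minimal sketch, assuming the difference between the manual run and the cron run is the type of stdin the command inherits; `describe_stdin` is a hypothetical helper name, not part of ack or cron.

```shell
#!/bin/bash
# Report what kind of standard input the current process has.
# describe_stdin is a hypothetical helper, not part of ack or cron.
describe_stdin() {
    if [ -t 0 ]; then
        echo "tty"      # interactive terminal
    elif [ -p /dev/stdin ]; then
        echo "pipe"     # stdin is a pipe (FIFO)
    else
        echo "other"    # e.g. /dev/null or a regular file
    fi
}

# Print the result for the current context.
describe_stdin
```

Running this once by hand and once from the crontab (with its output redirected to a file) lets you compare the two. If the cron run reports `pipe` while the manual run reports `tty`, that would be consistent with ack reading from stdin instead of the named file, which is what `</dev/null` and `--nofilter` work around.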

tripleee
wuseman
  • 1. Worked! Any idea why?? – nooblag Apr 21 '19 at 08:19
  • `--[no]filter`: Force ack to treat standard input as a pipe (`--filter`) or tty (`--nofilter`). It has always been this way, so I don't know why it behaves like this; there is some info about it in ack's issues on GitHub. – wuseman Apr 21 '19 at 16:27