Previous

This is a follow-up to this question.

Specs

My system is a dedicated server running Ubuntu Desktop 12.04 (precise), 64-bit, with kernel 3.14.32-xxxx-std-ipv6-64. Neither the release nor the kernel can be upgraded, but I can install any package.

Problem

The problem described in the question above seems to be solved there, but the solution doesn't work for me. I've installed the latest lftp and parallel packages, and each seems to work fine on its own.

  • Running lftp works fine.
  • Running ./job.sh ftp.microsoft.com works fine, but I needed to chmod +x the script first.
  • Running sed 's/|.*$//' end_unique.txt | xargs parallel -j20 ./job.sh ::: does not work and produces bash errors of the form /bin/bash: <server>: command not found.

To simplify things, I cleaned the input file end_unique.txt; it now has the following format for each line:

<server>

Each line ends in CRLF because the file was imported from a Windows server.

Edit 1:

This is the job.sh script:

#!/bin/sh
server="$1"
lftp -e "find .; exit" "$server" >"$server-files.txt"

Edit 2:

I ran the file through fromdos, so it should now be in standard Unix format, one server per line. Keep in mind that the server entries in the file can vary in format:

ftp.server.com
www.server.com
server.com
123.456.789.190

etc. All of these are FTP servers, accessible at ftp://<serverfromfile>/.
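For reference, the line-ending conversion amounts to roughly the following. This is only a sketch: it uses tr rather than fromdos because tr is available everywhere, the helper name strip_cr is made up for illustration, and a throwaway file stands in for end_unique.txt.

```shell
# Hypothetical helper: delete every carriage return so that each line
# holds only the server name.
strip_cr() {
    tr -d '\r' < "$1"
}

# Throwaway file with CRLF endings, standing in for end_unique.txt.
demo=$(mktemp)
printf 'ftp.server.com\r\nserver.com\r\n' > "$demo"

# Write the converted copy next to the original.
strip_cr "$demo" > "$demo.unix"

# Count the lines that still contain a CR; grep exits non-zero when the
# count is zero, hence the "|| true".
grep -c "$(printf '\r')" "$demo.unix" || true
```

If the count printed at the end is not zero, the file still carries Windows line endings and each server name will arrive with a trailing CR.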

turbo
  • Please make this question self-contained by including the definition of `job.sh`. Also, if you removed the `|` from the input file, then the sed command will not do anything; in particular, it won't remove the CR characters, which are likely to cause problems later on. – rici Jun 13 '15 at 22:07
  • Fix the import; if you're using FTP, set file mode ASCII before importing. Otherwise, it is sensible to remove the carriage returns from the CRLF endings before using the imported file by any of the many options available (`tr`, `dos2unix`, …). – Jonathan Leffler Jun 13 '15 at 22:12

1 Answer

With :::, parallel expects the arguments for the commands it is going to run to appear on its own command line, as in

parallel -j20 ./job.sh ::: server1 server2 server3

Without ::: it reads the arguments from stdin, which serves us better in this case. You can simply say

parallel -j20 ./job.sh < end_unique.txt
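To see which commands this composes without contacting any server, echo can stand in for the real script. The following is a sketch under assumptions: the three server names are made up, the temporary file stands in for end_unique.txt, and the block falls back to a message if GNU parallel is not the parallel on the PATH.

```shell
# Three made-up server names stand in for end_unique.txt.
list=$(mktemp)
printf '%s\n' ftp.server.com www.server.com server.com > "$list"

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # One command per input line; echo shows what would be run instead of
    # running ./job.sh. Jobs finish in arbitrary order, so sort the output.
    result=$(parallel -j20 echo job.sh < "$list" | sort)
else
    result='GNU parallel not found'
fi
printf '%s\n' "$result"
```

Each line of the input file becomes the argument of one invocation, which is exactly what job.sh expects as its $1.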

Addendum: Things that can go wrong

Make certain of two things:

  1. That you are using GNU parallel and not another version (such as the one from moreutils), because only (as far as I'm aware) the GNU version supports reading an argument list from stdin, and
  2. That GNU parallel is not configured to disable the GNU extensions. It turned out, after a lengthy discussion in the comments, that they are disabled by default on Ubuntu 12.04, so it is not inconceivable that this sort of thing might be found elsewhere (particularly downstream from Ubuntu). Such a configuration can hide in

    • The environment variable $PARALLEL,
    • /etc/parallel/config, or
    • ~/.parallel/config
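Both points can be checked from a shell. The following is a minimal sketch: the function name show_parallel_config is made up for illustration, and it only prints what it finds in the three locations listed above rather than deciding which option is to blame.

```shell
# Hypothetical helper: print every place where default options for
# parallel can hide.
show_parallel_config() {
    printf 'PARALLEL=%s\n' "${PARALLEL:-}"
    for f in /etc/parallel/config "$HOME/.parallel/config"; do
        if [ -r "$f" ]; then
            printf '== %s ==\n' "$f"
            cat "$f"
        else
            printf '== %s (not present) ==\n' "$f"
        fi
    done
}

# GNU parallel identifies itself on the first line of --version; if
# nothing is printed here, the installed parallel is probably not GNU.
if command -v parallel >/dev/null 2>&1; then
    parallel --version 2>/dev/null | head -n 1
fi

show_parallel_config
```

Any option that shows up in this output is applied to every invocation, so it is worth reading each one against the man page of the installed version.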

If the GNU version of parallel is not available to you, and if your argument list is not too long for the shell and none of the arguments in it contain whitespace, the equivalent with the moreutils parallel is

parallel -j20 ./job.sh -- $(cat end_unique.txt)

This did not work for OP because the file contained more servers than the shell was willing to put into a command line, but it might work for others with similar problems.
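When the list is too long for a single command line and GNU parallel is not an option, xargs is a different tool that may serve as a fallback, since it reads the arguments from stdin. This is a sketch, not something that was tried in the discussion: it assumes an xargs that supports -P (GNU and BSD versions do; it is not in POSIX), and it uses a stub script and made-up server names so that it runs without contacting any FTP server.

```shell
# Stand-in for ./job.sh: writes a marker file instead of calling lftp.
workdir=$(mktemp -d)
cat > "$workdir/job.sh" <<'EOF'
#!/bin/sh
echo "would list $1" > "$1-files.txt"
EOF
chmod +x "$workdir/job.sh"

# Made-up server names standing in for end_unique.txt.
printf '%s\n' ftp.server.com www.server.com server.com > "$workdir/end_unique.txt"

# -n 1: pass one server per invocation.
# -P 20: run up to 20 invocations at the same time.
(cd "$workdir" && xargs -n 1 -P 20 ./job.sh < end_unique.txt)

# One <server>-files.txt per input line should now exist.
ls "$workdir"
```

Note that xargs gives quotes and backslashes in its input a special meaning, which is harmless for plain host names but matters for other kinds of arguments.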

Wintermute
  • Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackoverflow.com/rooms/80828/discussion-on-answer-by-wintermute-bash-unexpected-parallel-behavior-when-readi). – Martijn Pieters Jun 17 '15 at 20:33