I use an HPC cluster. The compute nodes have no internet access; only the front-end node ("frontal") does.

So I want to wrap every command that needs internet access so that it is executed on the front-end node instead.

For example, for wget:

#!/bin/bash
ssh frontal /bin/wget "$@"

-> works fine

I have to wrap this bq (Google BigQuery) command:

bq --format=json query "SELECT * FROM [bigquery-public-data:cloud_storage_geo_index.sentinel_2_index] WHERE sensing_time LIKE '2016%' AND mgrs_tile == '32ULU' ORDER BY sensing_time ASC LIMIT 1000;"

I managed to requote the command and run it successfully from the command line:

ssh frontal '~/downloads_and_builds/builds/google-cloud-sdk/bin/bq --format=json query "SELECT * FROM [bigquery-public-data:cloud_storage_geo_index.sentinel_2_index] WHERE sensing_time LIKE '"'"'2016%'"'"' AND mgrs_tile == '"'"'32ULU'"'"' ORDER BY sensing_time ASC LIMIT 1000;"'

Now I want to write a wrapper named bq that takes the parameters and launches this command through ssh. Here is what I have tried:

#!/bin/bash
set -eu

# all parameters in an array
args=("$@")

# disable globbing (there's a * in the SELECT clause)
set -f

# managing inner quotes
arg2=`echo "${args[2]}" | perl -pe 's/'\''/'\''"'\''"'\''/g'`

# put back double quotes (") suppressed by bash
args="${args[0]} ${args[1]} \"${arg2}\""

# build command with parameters
cmd="~/downloads_and_builds/builds/google-cloud-sdk/bin/bq $args"

echo ""
echo "command without external quotes"
echo "$cmd"
echo ""

echo "testing it ..."
ssh hpc-login1 "$cmd"
echo ""

# wrapping the command in single quotes (like on the CLI)
cmd="'"'~/downloads_and_builds/builds/google-cloud-sdk/bin/bq '"$args""'"
echo "commande with external quotes"
echo "$cmd"
echo ""

echo "testing it ..."
ssh hpc-login1 $cmd
echo "done"

Here is the output of this script:

$ bq --format=json query "SELECT * FROM [bigquery-public-data:cloud_storage_geo_index.sentinel_2_index] WHERE sensing_time LIKE '2016%' AND mgrs_tile == '32ULU' ORDER BY sensing_time ASC LIMIT 1000;"

command without external quotes
~/downloads_and_builds/builds/google-cloud-sdk/bin/bq --format=json query "SELECT * FROM [bigquery-public-data:cloud_storage_geo_index.sentinel_2_index] WHERE sensing_time LIKE '"'"'2016%'"'"' AND mgrs_tile == '"'"'32ULU'"'"' ORDER BY sensing_time ASC LIMIT 1000;"

testing it ...
Waiting on bqjob_r102b0c22cdd77c2d_000001629b8391a3_1 ... (0s) Current status: DONE   

commande with external quotes
'~/downloads_and_builds/builds/google-cloud-sdk/bin/bq --format=json query "SELECT * FROM [bigquery-public-data:cloud_storage_geo_index.sentinel_2_index] WHERE sensing_time LIKE '"'"'2016%'"'"' AND mgrs_tile == '"'"'32ULU'"'"' ORDER BY sensing_time ASC LIMIT 1000;"'

testing it ...
bash: ~/downloads_and_builds/builds/google-cloud-sdk/bin/bq --format=json query "SELECT * FROM [bigquery-public-data:cloud_storage_geo_index.sentinel_2_index] WHERE sensing_time LIKE '2016%' AND mgrs_tile == '32ULU' ORDER BY sensing_time ASC LIMIT 1000;": Aucun fichier ou dossier de ce type (in English: "No such file or directory")

As you can see, I managed to build a correct command string, identical to the one that works on the command line, but it doesn't work in my script:

  1. The first attempt succeeds but gives no output (I tried redirecting it to a file: the file was created but is empty).
  2. In the second attempt (with outer single quotes, just like the CLI command that worked), bash takes the whole quoted argument as a single word and doesn't find the command.

Does anybody have an idea how to launch a complex command like this one (with quotes, wildcards, ...) through ssh using a wrapper script?

(i.e., a wrapper named foo that can replace a foo command and execute it correctly through ssh with the arguments provided)

DuGNu
  • Is reading the query from a file not an option? `bq query < query_text.sql`. It would be better not to mess with escaping at all. – Elliott Brossard Apr 06 '18 at 17:35

3 Answers

ssh has the same semantics as eval: all arguments are concatenated with spaces and then evaluated as a shell command.
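
The re-parsing can be observed locally without a remote host. The following is a minimal sketch that uses `sh -c "$*"` as a stand-in for ssh; the function name `fake_ssh` is hypothetical, and a real ssh would hand the joined string to the remote user's login shell instead:

```shell
#!/bin/bash
# Hypothetical local stand-in for ssh: join all arguments with spaces and
# let a second shell parse the resulting string, as the remote shell would.
fake_ssh() {
  sh -c "$*"
}

# Locally this is ONE argument with three spaces inside. After the
# join-and-reparse step the quoting is gone and echo receives two words.
fake_ssh echo "a   b"

# Shell syntax inside an argument is executed by the second shell:
# this runs two commands, not one echo with a semicolon in its text.
fake_ssh echo 'x; echo y'
```

The first call prints the words separated by a single space, and the second prints x and y on separate lines, which is why quotes that looked correct on the client side do not survive the trip.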

You can have it work with execve semantics (like sudo) by having a wrapper escape the arguments:

remotebq() { 
  ssh yourhost "~/downloads_and_builds/builds/google-cloud-sdk/bin/bq $(printf '%q ' "$@")"
}

This quotes thoroughly and consistently, so you no longer have to worry about adding additional escaping. It'll run exactly what you tell it (as long as your remote shell is bash):

remotebq --format=json query "SELECT * FROM [bigquery-public-data:cloud_storage_geo_index.sentinel_2_index] WHERE sensing_time LIKE '2016%' AND mgrs_tile == '32ULU' ORDER BY sensing_time ASC LIMIT 1000;"
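
To check that the escaping really survives a second parse, the round trip can be tried locally. This is only a sketch: `quote_args` and `roundtrip` are hypothetical helper names, and `bash -c` stands in for the remote side on the assumption that the remote shell is bash:

```shell
#!/bin/bash
# Escape every argument with bash's printf %q so that a second shell
# parse reproduces the original argument list.
quote_args() {
  printf '%q ' "$@"
}

# Local stand-in for the remote side (assumption: the remote shell is bash).
# The inner printf shows each argument the second shell actually receives.
roundtrip() {
  bash -c "printf '<%s>\n' $(quote_args "$@")"
}

roundtrip --format=json query "SELECT * FROM [t] WHERE x LIKE '2016%' LIMIT 1;"
```

Each argument comes back on its own line between angle brackets, with the spaces, the `*`, the brackets and the single quotes of the query intact.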

However, the downside to running exactly what you tell it is that now you need to know exactly what you want to run.

For example, you can no longer pass '~/foo' as an argument because this is not a valid file: ~ is a shell feature and not a directory name, and when it's correctly escaped it will not be replaced by your home directory.
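
The difference is easy to see locally. In this sketch `bash -c` again stands in for the remote shell (an assumption), and the two function names are made up for illustration:

```shell
#!/bin/bash
# The tilde is escaped by printf %q, so the second shell sees a literal
# "~/foo" and performs no tilde expansion.
escaped_tilde() {
  bash -c "echo $(printf '%q' '~/foo')"
}

# The tilde reaches the second shell unescaped, so it is expanded to the
# home directory there.
unescaped_tilde() {
  bash -c 'echo ~/foo'
}

escaped_tilde
unescaped_tilde
```

This is why the `remotebq` function above keeps the path to bq outside the `printf '%q '` part: that path still needs the remote shell to expand `~`.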

that other guy
  • Excellent! It completely solves my problem! printf '%q' ... I didn't know it. That's the trick I've wanted to know for years, thanks a lot! And thanks too for the advice about ~ – DuGNu Apr 09 '18 at 09:10
  • @DuGNu please accept this answer if it helped you solve your problem. – VictorGGl Apr 23 '18 at 10:44

The basic way to do this is with a shell here-document:

#!/bin/bash

ssh -t server<<'EOF'
bq --format=json query "SELECT * FROM [bigquery-public-data:cloud_storage_geo_index.sentinel_2_index] WHERE sensing_time LIKE '2016%' AND mgrs_tile == '32ULU' ORDER BY sensing_time ASC LIMIT 1000;"
command2
command3
...
EOF
Gilles Quénot
  • Yes, it could work with the complete path for bq (the wrapper is first in the PATH), but it doesn't fix the problem, because what you suggest is hard-coded. The request is not the same each time, and you cannot put a variable in the here-document section since it is not evaluated – DuGNu Apr 06 '18 at 15:49
  • I'm using another account; I asked for both to be linked and am still waiting – Gilles Quénot Apr 06 '18 at 16:14
  • @DuGNu If you don't quote the here doc token (i.e. use `EOF` instead of `'EOF'`), then variables will be expanded on the client side. You need to take care to escape any expansions that you want to happen on the server side. – that other guy Apr 06 '18 at 17:05
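
The difference between the two delimiter forms can be sketched locally, with a plain `bash` reading the here-document in place of `ssh server` (an assumption made so the example runs without a remote host). The variable `query` and the function names are hypothetical:

```shell
#!/bin/bash
query="SELECT 1;"   # hypothetical client-side variable, not exported

# Quoted delimiter: the text is sent verbatim, so $query is expanded by
# the receiving shell, where it is not set.
quoted_heredoc() {
  bash <<'EOF'
echo "query=[$query]"
EOF
}

# Unquoted delimiter: $query is expanded on the client side before the
# text is sent to the receiving shell.
unquoted_heredoc() {
  bash <<EOF
echo "query=[$query]"
EOF
}

quoted_heredoc
unquoted_heredoc
```

Note that the unquoted form pastes the value into the script as-is, so a query containing double quotes would still need escaping (for instance with `printf '%q'`) before being interpolated.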

I see you are already using Perl so...

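use strict;
use warnings;
use Net::OpenSSH;

my $host  = 'frontal';  # assumed host name, taken from the question
my $query = q(SELECT * FROM [bigquery-public-data:cloud_storage_geo_index.sentinel_2_index] WHERE sensing_time LIKE '2016%' AND mgrs_tile == '32ULU' ORDER BY sensing_time ASC LIMIT 1000;);

my $ssh = Net::OpenSSH->new($host);
$ssh->error and die "SSH connection failed: " . $ssh->error;

# With a list of arguments, each one is quoted for the remote shell
$ssh->system('bq', '--format=json', 'query', $query)
  or die "remote command failed: " . $ssh->error;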

Net::OpenSSH would take care of quoting everything.

salva