I am trying to parametrise a Confluent connector script. I have already externalised the variables, which I read from a file, and now I want to externalise the data part as well.

I am writing a wrapper shell script that reads the parameter file, which has the format var=value.

How do I set up my data file so that I can use variable substitution in it?

Is it possible to do it that way? How can I achieve it?

My current curl command is

curl --cert /clientcertslocation/certificate.pem --key /clientcertslocation/priv.key -k -X PUT -H "${HEADER}" --data @data.json "${CONNECT_SERVER_REST_API_PROTOCOL}://${CONNECT_SERVER}":8083/connectors/${CONNECTOR_NAME}/config

And the data.json looks like this

{
  "connector.class": "io.confluent.connect.s3.S3SinkConnector",
  "errors.log.enable": "true",
  "errors.log.include.messages": "false",
  "errors.tolerance": "all",
  "flush.size": "1",
  "locale": "en-US",
  "name": "${CONNECTOR_NAME}",
  "partition.duration.ms": "3600000",
  "partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
  "path.format": "'event_creation_time='YYYY-MM-dd",
  "s3.region": "us-east-1",
  "s3.bucket.name": "${S3_BUCKET_NAME}",
  "s3.part.size": "76350000",
  "rotate.interval.ms":"90000",
  "schema.compatibility": "NONE",
  "schema.generator.class": "io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator",
  "schema.registry.url": "${SR_URL}",
  "storage.class": "io.confluent.connect.s3.storage.S3Storage",
  "tasks.max": "${NO_OF_TASKS}",
  "timestamp.extractor": "RecordField",
  "timestamp.field": "EventCreatedTime",
  "timezone": "UTC",
  "topics": "${topics_list}",
  "topics.dir": "${SOURCE_SYSTEM_NAME}",
  "format.class": "io.confluent.connect.s3.format.avro.AvroFormat",
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.schema.registry.url": "${SR_URL}"
}

Thank you

adbdkb
  • you should use `jq` for JSON generation from the shell – Fravadona Jul 29 '22 at 08:15
  • Didn't get your point. I have the JSON format, I just want to update the variable values in the shell script before executing the curl command – adbdkb Jul 29 '22 at 09:44
  • I mean that expanding shell variables inside a JSON can break it (for example when a variable contains a `"`); that's why it's almost mandatory to use JSON aware tools that will take care of the correct escaping – Fravadona Jul 29 '22 at 09:50
  • So, in the below example how can I do it using jq with this construct - eval "cat < – adbdkb Jul 29 '22 at 21:20
  • for example with `jq --arg var1 "$CONNECTOR_NAME" --arg var2 "$SR_URL" '.name = $var1 | ."schema.registry.url" = $var2' data.json` – Fravadona Jul 29 '22 at 22:40

1 Answer

Running the data through a here-document will expand variables whilst preserving quotes:

#!/bin/sh

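# Load the var=value pairs; the ./ stops "." from searching PATH.
. ./parameter-file

# Expand the ${VAR} references in data.json through a here-document
# and pipe the result to curl.
eval "cat <<EOF
$(cat data.json)
EOF" |

curl \
--cert /clientcertslocation/certificate.pem \
--key /clientcertslocation/priv.key \
-k \
-X PUT \
-H "${HEADER}" \
--data @- \
"${CONNECT_SERVER_REST_API_PROTOCOL}://${CONNECT_SERVER}:8083/connectors/${CONNECTOR_NAME}/config"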

--data @- tells curl to read the data from stdin.
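
Before sending anything to the Connect server, it can help to look at what the here-document expansion actually produces. The following is a minimal dry-run sketch, not part of the original script: `render_template`, the two-key template, and the value `s3-sink-demo` are hypothetical names used only for illustration. It prints the expanded JSON to stdout instead of piping it to curl.

```shell
#!/bin/sh
# Dry run: expand a template through a here-document and print the result.

# render_template FILE: expand ${VAR} references in FILE using the current
# shell's variables. This relies on eval, so FILE must be trusted, and it
# must not contain a line consisting only of EOF.
render_template() {
    eval "cat <<EOF
$(cat "$1")
EOF
"
}

# Hypothetical two-key template standing in for data.json.
tmpl=$(mktemp)
cat >"$tmpl" <<'TEMPLATE'
{
  "name": "${CONNECTOR_NAME}",
  "path.format": "'event_creation_time='YYYY-MM-dd"
}
TEMPLATE

CONNECTOR_NAME=s3-sink-demo   # would normally come from the parameter file
render_template "$tmpl"       # quotes in the template are passed through as-is
rm -f "$tmpl"
```

The expansion is purely textual, so a value that itself contains a double quote or a backslash would produce invalid JSON, which is the concern raised in the comments on the question.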

To avoid the possible security problems of eval, remove the need for indirection by hardcoding the data directly in the wrapper script:

curl ... --data @- <<EOF
{
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
...
    "value.converter.schema.registry.url": "${SR_URL}"
}
EOF
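
If the data has to stay in an external file, another way to avoid eval is the `jq` approach suggested in the comments on the question. The sketch below assumes `jq` is installed and that `data.json` is kept as an ordinary JSON file whose variable keys get overwritten; `build_config` is a hypothetical helper name, and the three keys shown are just a subset of the ones in the question.

```shell
#!/bin/sh
# Sketch: keep data.json as plain JSON and let jq set the keys that differ
# per connector. jq handles the JSON escaping of the values.

# build_config BASE_JSON: print BASE_JSON with selected keys replaced by the
# values of shell variables (assumed to be set from the parameter file).
build_config() {
    jq \
        --arg name   "$CONNECTOR_NAME" \
        --arg bucket "$S3_BUCKET_NAME" \
        --arg topics "$topics_list" \
        '.name = $name
         | ."s3.bucket.name" = $bucket
         | .topics = $topics' \
        "$1"
}

# Usage, with the same curl options as above:
#   . ./parameter-file
#   build_config data.json | curl ... --data @- <url>
```

Because the values are passed with `--arg` rather than pasted into the text, a bucket or topic name containing quotes should still yield valid JSON, and one base file can serve many connectors that differ only in the parameter file.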
dan
  • The current setup is such that it has the data hardcoded, but we have to create many connectors with minor changes, only to bucket name or topic name, hence my attempt to externalize that part – adbdkb Jul 29 '22 at 09:42