My goal is to set up a ConfigMap and then use the config file in my Spark application. Here are the details:

I have a config file (test_config.cfg) that looks like this:

[test_tracker]
url = http://localhost:8080/testsomething/
username = TEST
password = SECRET
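
For context, the application reads this file with Python's configparser once it is mounted; a simplified sketch of that part (the real script is longer, and /mnt/config-maps is the mount path I configure below):

import configparser

# Parse the config file from the ConfigMap mount path set in the SparkApplication spec
config = configparser.ConfigParser()
config.read("/mnt/config-maps/test_config.cfg")

url = config["test_tracker"]["url"]
username = config["test_tracker"]["username"]
password = config["test_tracker"]["password"]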

I created the ConfigMap by running:

kubectl create configmap testcfg1 --from-file test_config.cfg
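
For what it's worth, the ConfigMap itself can be inspected with the standard kubectl commands:

kubectl get configmap testcfg1 -o yaml
kubectl describe configmap testcfg1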

Now, I have a YAML file (testprog.yaml) with a SparkApplication spec that looks like this:

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: testprog
  namespace: default
spec:
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: "<ip-url>:5000/schemamatcher/schemamatcher-spark-py:latest"
  imagePullPolicy: Always
  mainApplicationFile: local:///opt/spark/dependencies/testprog.py
  arguments: ['s3a://f1.parquet', 's3a://f2.parquet', '--tokenizer-type', 'param']
  sparkVersion: "3.0.0"
  restartPolicy:
    type: OnFailure
    onFailureRetries: 3
    onFailureRetryInterval: 10
    onSubmissionFailureRetries: 5
    onSubmissionFailureRetryInterval: 20
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "16g"
    labels:
      version: 3.0.0
    serviceAccount: default
    configMaps:
      - name: testcfg1
        path: /mnt/config-maps
  executor:
    cores: 1
    instances: 2
    memory: "20g"
    labels:
      version: 3.0.0
  hadoopConf:
    "fs.s3a.access.key": minio
    "fs.s3a.secret.key": minio123
    "fs.s3a.endpoint": http://<ip-url>:9000

Now, I am able to run the program using:

kubectl apply -f testprog.yaml
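
and the application status can be checked through the CRD that the operator installs (standard kubectl queries against the sparkapplications resource):

kubectl get sparkapplication testprog -n default
kubectl describe sparkapplication testprog -n default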

The pod runs fine and doesn't throw any errors, but I am unable to see my config file at the given path and I don't understand why. While the pod is running I do:

kubectl exec --stdin --tty test-driver -- /bin/bash

and when I look for the config file under /mnt/config-maps, I don't see anything. I have tried a couple of things with no luck. Besides, some of the documentation says a mutating admission webhook must be set up; I think the previous engineer did that, but I am not sure how to verify it (I believe it is there).
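
From what I can tell, these standard kubectl queries should show whether a webhook is registered and whether the operator injected the volume into the driver pod (I am guessing at what to look for here):

kubectl get mutatingwebhookconfigurations
kubectl get pod test-driver -o jsonpath='{.spec.volumes}'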

Any help would be great as I am new and I am still learning about k8s.

Update: I have also tried updating the spec like this (mounting the ConfigMap as a plain volume) and running again, and still no luck.

  volumes:
    - name: config
      configMap:
        name: testcfg1
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "16g"
    labels:
      version: 3.0.0
    serviceAccount: default
    volumeMounts:
      - name: config
        mountPath: /opt/spark
  executor:
    cores: 1
    instances: 2
    memory: "20g"
    labels:
      version: 3.0.0
    volumeMounts:
      - name: config
        mountPath: /opt/spark
yguw
  • @kamol-hasan can you please help me with this? I was trying to follow your answer https://stackoverflow.com/questions/64274200/azure-kubernetes-python-to-read-configmap on Stack Overflow, but no luck :(. – yguw Mar 03 '21 at 04:28

2 Answers

Not sure if this issue was solved in Spark v3.0.0 (which you seem to be using), but there was a bug in Spark on Kubernetes that prevented ConfigMaps from mounting properly. See this discussion: https://stackoverflow.com/a/58508313/8570169

yoda_droid

Try this:

kubectl apply -f manifest/spark-operator-with-webhook.yaml

This enables the mutating admission webhook. It creates a deployment named sparkoperator and a service named spark-webhook in the spark-operator namespace; you can verify both exist as shown below.
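
A quick way to confirm those pieces are actually there (standard kubectl commands, using the names above):

kubectl get deployment sparkoperator -n spark-operator
kubectl get service spark-webhook -n spark-operator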

msoler
  • I tried this and it's already enabled, but it doesn't solve my problem. – yguw Mar 03 '21 at 18:57
  • Is ConfigMap mounting working for spark-operator 1.3.1-3.1.1? I still have a problem and am unable to upgrade the operator for my k8s 1.19. – anand babu Jan 06 '22 at 04:16