
Perspectives


I need to configure two systemd service files: one for the Spark Master and another for the Spark Slave (Worker) node. The environment and service configurations are as follows:

Configurations


/opt/cli/spark-3.3.0-bin-hadoop3/etc/env


JAVA_HOME="/usr/lib/jvm/java-17-openjdk-amd64"
SPARK_HOME="/opt/cli/spark-3.3.0-bin-hadoop3"
PYSPARK_PYTHON="/usr/bin/python3"
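
To sanity-check these paths before wiring them into systemd, you can source the env file and probe each binary (a quick sketch; adjust the paths if your layout differs):

```
# Export every variable defined in the env file, then verify each one resolves
set -a; . /opt/cli/spark-3.3.0-bin-hadoop3/etc/env; set +a
"$JAVA_HOME/bin/java" -version
"$SPARK_HOME/bin/spark-submit" --version
"$PYSPARK_PYTHON" --version
```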

/etc/systemd/system/spark-master.service


[Unit]
Description=Apache Spark Master
Documentation=https://spark.apache.org/docs/3.3.0
Wants=network-online.target
After=network-online.target

[Service]
User=spark
Group=spark
Type=forking

WorkingDirectory=/opt/cli/spark-3.3.0-bin-hadoop3/sbin
EnvironmentFile=/opt/cli/spark-3.3.0-bin-hadoop3/etc/env
ExecStartPost=/bin/bash -c "echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-master.pid"
ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-master.sh
ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-master.sh

[Install]
WantedBy=multi-user.target

/etc/systemd/system/spark-slave.service


[Unit]
Description=Apache Spark Slave
Documentation=https://spark.apache.org/docs/3.3.0
Wants=network-online.target
After=network-online.target

[Service]
User=spark
Group=spark
Type=forking

WorkingDirectory=/opt/cli/spark-3.3.0-bin-hadoop3/sbin
EnvironmentFile=/opt/cli/spark-3.3.0-bin-hadoop3/etc/env
ExecStartPost=/bin/bash -c "echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-slave.pid"
ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-slave.sh spark://spark.cdn.chorke.org:7077
ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-slave.sh

[Install]
WantedBy=multi-user.target
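
With both unit files in place, the usual systemd workflow loads and starts them (nothing Spark-specific here; the status output below shows the units were left `disabled`, so they are started manually rather than enabled at boot):

```
sudo systemctl daemon-reload
sudo systemctl start spark-master.service
sudo systemctl start spark-slave.service
```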

Outcome


Both services start successfully but fail to stop cleanly: stopping the Apache Spark Master or Slave via systemd leaves the unit in a failed state.

Spark Master Stop Status


× spark-master.service - Apache Spark Master
     Loaded: loaded (/etc/systemd/system/spark-master.service; disabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2022-09-26 18:43:39 +08; 8s ago
       Docs: https://spark.apache.org/docs/3.3.0
    Process: 488887 ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-master.sh (code=exited, status=0/SUCCESS)
    Process: 489000 ExecStartPost=/bin/bash -c echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-master.pid (code=exited, status=0/SUCCESS)
    Process: 489484 ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-master.sh (code=exited, status=0/SUCCESS)
   Main PID: 488903 (code=exited, status=143)
        CPU: 4.813s

Spark Slave Stop Status


× spark-slave.service - Apache Spark Slave
     Loaded: loaded (/etc/systemd/system/spark-slave.service; disabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2022-09-26 18:38:22 +08; 15s ago
       Docs: https://spark.apache.org/docs/3.3.0
    Process: 489024 ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-slave.sh spark://ns12-pc04:7077 (code=exited, status=0/SUCCESS)
    Process: 489145 ExecStartPost=/bin/bash -c echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-slave.pid (code=exited, status=0/SUCCESS)
    Process: 489174 ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-slave.sh (code=exited, status=0/SUCCESS)
   Main PID: 489040 (code=exited, status=143)
        CPU: 4.306s
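
Note the pattern common to both status blocks: every `Exec*` step exits with `0/SUCCESS`, yet the main PID exits with `status=143`. By Unix convention 143 is 128 + 15, i.e. the JVM was terminated by SIGTERM (signal 15), which is what the stop scripts send. A quick shell demonstration of that convention:

```
# A process killed by SIGTERM (signal 15) exits with 128 + 15 = 143
sleep 60 &
kill -TERM $!
wait $!
echo $?    # prints 143
```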

Expected Behavior


Your guidance on shutting down both the Master and Slave nodes without any error would be appreciated.

  • The following links might be helpful for passing the `systemctl` exit-status check: [Service Exit Code Unit Test Failed](https://stackoverflow.com/questions/45953678) & [Services remain in failed state after stopped with systemctl](https://serverfault.com/questions/695849/) – Śhāhēēd Oct 19 '22 at 08:21
  • `SuccessExitStatus=143` under `[Service]` – Śhāhēēd Oct 19 '22 at 08:46

1 Answer


Theoretical Solution


In this case you could write your own wrapper script around the shutdown to force exit code 0 instead of 143. If you are lazy enough like me, you can instead change `SuccessExitStatus` from 0 to 143. By default, systemd's exit check treats only status 0 as success; since the stop scripts terminate the JVM with SIGTERM (exit code 143), we need to change that default behavior.
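
The minimal change is the single `SuccessExitStatus=143` line in each `[Service]` section, as the full unit files below show. Equivalently, if you would rather not edit the unit files in place, the same line can live in a drop-in override (a sketch; `systemctl edit` creates the override directory and file for you):

```
# /etc/systemd/system/spark-master.service.d/override.conf
# created with: sudo systemctl edit spark-master.service
[Service]
SuccessExitStatus=143
```

Repeat the same override for `spark-slave.service`.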

Practical Solution


/etc/systemd/system/spark-master.service


[Unit]
Description=Apache Spark Master
Documentation=https://spark.apache.org/docs/3.3.0
Wants=network-online.target
After=network-online.target

[Service]
User=spark
Group=spark
Type=forking
SuccessExitStatus=143

WorkingDirectory=/opt/cli/spark-3.3.0-bin-hadoop3/sbin
EnvironmentFile=/opt/cli/spark-3.3.0-bin-hadoop3/etc/env
ExecStartPost=/bin/bash -c "echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-master.pid"
ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-master.sh
ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-master.sh

[Install]
WantedBy=multi-user.target

/etc/systemd/system/spark-slave.service


[Unit]
Description=Apache Spark Slave
Documentation=https://spark.apache.org/docs/3.3.0
Wants=network-online.target
After=network-online.target

[Service]
User=spark
Group=spark
Type=forking
SuccessExitStatus=143

WorkingDirectory=/opt/cli/spark-3.3.0-bin-hadoop3/sbin
EnvironmentFile=/opt/cli/spark-3.3.0-bin-hadoop3/etc/env
ExecStartPost=/bin/bash -c "echo $MAINPID > /opt/cli/spark-3.3.0-bin-hadoop3/etc/spark-slave.pid"
ExecStart=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/start-slave.sh spark://spark.cdn.chorke.org:7077
ExecStop=/opt/cli/spark-3.3.0-bin-hadoop3/sbin/stop-slave.sh

[Install]
WantedBy=multi-user.target
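
After updating both unit files, reload systemd and cycle the services; with `SuccessExitStatus=143` in place, a stop should now leave each unit `inactive (dead)` instead of `failed`:

```
sudo systemctl daemon-reload
sudo systemctl restart spark-master.service spark-slave.service
sudo systemctl stop spark-master.service spark-slave.service
systemctl status spark-master.service spark-slave.service
```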