
Does anyone know how to install the latest version of ZooKeeper on Dataproc in quorum mode at cluster creation time?

I am on the latest Dataproc image version, 2.0 (Debian 10, Hadoop 3.2, Spark 3.1).

There are two ways to install ZooKeeper on Dataproc: with an initialization-action script, or by selecting ZOOKEEPER under optional components when creating the cluster.

The problem I am facing is that neither way installs the latest version, 3.6.3. Both install version 3.4 instead. For my use case I need the latest version running in quorum mode.

One odd thing I noticed: with the ZOOKEEPER optional component, no quorum is formed. ZooKeeper is installed in standalone mode on the master and on every worker node. With the initialization-action script, a quorum is formed at cluster creation, but the version is still 3.4 and not 3.6.3. I would appreciate any help.

TBA
kashif

1 Answer


There are several options:

  1. Create a custom image: uninstall the default ZooKeeper in the image and install your own.

  2. Ignore the default ZooKeeper and don't select the optional component. Instead, use an init action similar to this one to install and start your own ZooKeeper service.
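For option 2, the quorum part mostly comes down to giving every node the same `zoo.cfg` with one `server.N` line per member, plus a `myid` file in `dataDir` holding that node's own N. Below is a minimal sketch of two helpers an init action could call. The function names, the timing values, and the host names in the usage note are illustrative assumptions, not taken from the Dataproc init action; ports 2181, 2888 and 3888 are the usual ZooKeeper defaults.

```shell
# Sketch only: helper functions for an init action (names are hypothetical).
set -eu

# Print a quorum zoo.cfg. Usage: render_zoo_cfg DATA_DIR HOST1 HOST2 ...
render_zoo_cfg() {
  data_dir="$1"; shift
  # Timing values are common sample defaults (an assumption); tune as needed.
  printf 'tickTime=2000\ninitLimit=10\nsyncLimit=5\n'
  printf 'dataDir=%s\nclientPort=2181\n' "$data_dir"
  id=1
  for host in "$@"; do
    # 2888 = follower-to-leader port, 3888 = leader election port.
    printf 'server.%d=%s:2888:3888\n' "$id" "$host"
    id=$((id + 1))
  done
}

# Print the id to store in DATA_DIR/myid. Usage: zk_myid THIS_HOST HOST1 HOST2 ...
# Returns non-zero if THIS_HOST is not a quorum member.
zk_myid() {
  me="$1"; shift
  id=1
  for host in "$@"; do
    if [ "$host" = "$me" ]; then
      echo "$id"
      return 0
    fi
    id=$((id + 1))
  done
  return 1
}
```

On each node the init action would then run something like `render_zoo_cfg /var/lib/zookeeper mycluster-m mycluster-w-0 mycluster-w-1 > /opt/ZooKeeper/conf/zoo.cfg` and `zk_myid "$(hostname)" mycluster-m mycluster-w-0 mycluster-w-1 > /var/lib/zookeeper/myid`. The host list must be passed in the same order on every node, and the `mycluster-*` names and the paths here are placeholders.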

To create your own ZooKeeper service, check the default ZooKeeper unit file, and modify accordingly:

$ systemctl cat zookeeper-server.service

[Unit]
Documentation=man:systemd-sysv-generator(8)
SourcePath=/etc/init.d/zookeeper-server
Description=LSB: ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
Before=multi-user.target
Before=multi-user.target
Before=multi-user.target
Before=graphical.target
After=network-online.target
After=nss-lookup.target
Wants=network-online.target

[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=no
PIDFile=/var/run/zookeeper/zookeeper_server.pid
SuccessExitStatus=5 6
ExecStart=/etc/init.d/zookeeper-server start
ExecStop=/etc/init.d/zookeeper-server stop

# /etc/systemd/system/zookeeper-server.service.d/restart.conf
[Unit]
StartLimitIntervalSec=0

[Service]
Restart=on-failure

Put your own unit file at, e.g., `/usr/lib/systemd/system/zookeeper-server.service`, then run `sudo systemctl daemon-reload` and `sudo systemctl start zookeeper-server`.
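For a ZooKeeper unpacked under its own directory, the unit does not need the init.d wrapper at all. Here is a rough sketch that prints such a unit. It assumes the install directory is `/opt/ZooKeeper` (adjust to yours) and, unlike the `Type=forking` unit above, it runs `zkServer.sh start-foreground` under `Type=simple` so systemd tracks the process directly without a PID file. The function name and the `Description` text are made up for illustration.

```shell
# Sketch only: print a systemd unit for a self-installed ZooKeeper.
set -eu

# Usage: render_zk_unit [ZK_HOME]   (ZK_HOME defaults to /opt/ZooKeeper, an assumption)
render_zk_unit() {
  zk_home="${1:-/opt/ZooKeeper}"
  cat <<EOF
[Unit]
Description=Apache ZooKeeper (self-managed install)
After=network-online.target
Wants=network-online.target

[Service]
# start-foreground keeps the JVM attached, so Type=simple is enough.
Type=simple
ExecStart=${zk_home}/bin/zkServer.sh start-foreground
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}
```

To install it, something like `render_zk_unit /opt/ZooKeeper | sudo tee /usr/lib/systemd/system/zookeeper-server.service` followed by the `daemon-reload` and `start` commands should work. With no `User=` line the service runs as root; add one if you created a dedicated account.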

Dagang
  • Yes Dagang, I created my own init action script for ZooKeeper and it works well on Dataproc, but my ZooKeeper is installed in `/opt/ZooKeeper`. Any idea how to create a systemd service file for it? I tried a few ways but none of them worked. The OS is Ubuntu 18 LTS. – kashif Dec 14 '21 at 05:34
    I updated the answer with systemd related info. – Dagang Dec 14 '21 at 17:46