Spatial packages in R often depend on C and C++ system libraries for their numerical computation. This is a problem when R cannot install those libraries with its default permissions, and Databricks clusters appear to present exactly that obstacle. I see two ways around it: 1) build a Docker container with the scripts that install the packages, or 2) install them through an init script. I figured the latter would be easier, but I'm having problems: the clusters fail to start because my init script fails to execute. The script is below (I've also tried with sudo):
set -euxo pipefail
apt install libgeos-dev
apt install libudunits2-dev
apt install libgdal-dev
Relatedly, should these be installed only on the driver node? I don't see a reason why they need to be on the worker nodes. I think the script above installs them on both the driver and the workers. To install on just the driver, I suppose it would be:
if [[ "$DB_IS_DRIVER" = "TRUE" ]]; then
  apt install libgeos-dev
  apt install libudunits2-dev
  apt install libgdal-dev
fi
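For reference, here is a minimal sketch of a non-interactive version. It assumes the failure comes from `apt install` waiting for a confirmation prompt that nobody can answer during cluster startup, or from a stale package index; I have not confirmed either from the init script logs. The function names `install_geo_libs` and `on_driver`, and the optional `echo` dry-run argument, are my own illustrative additions, not anything provided by Databricks.

```shell
#!/bin/bash
set -euo pipefail

# Keep debconf from opening interactive dialogs during package configuration.
export DEBIAN_FRONTEND=noninteractive

# Install the system libraries the R spatial packages link against.
# The optional first argument is a hypothetical dry-run hook: pass `echo`
# to print the commands instead of executing them.
install_geo_libs() {
  local run="${1:-}"
  # Refresh the package index first (assumed, not confirmed, to be needed).
  $run apt-get update
  # -y answers the confirmation prompt that would otherwise abort the install.
  $run apt-get install -y libgeos-dev libudunits2-dev libgdal-dev
}

# Succeeds only on the driver node. DB_IS_DRIVER is the same variable used
# in the driver-only snippet above; it is treated as unset-safe here.
on_driver() {
  [[ "${DB_IS_DRIVER:-}" == "TRUE" ]]
}
```

To install on every node, the script would end with a plain `install_geo_libs` call. To restrict it to the driver, it would end with `if on_driver; then install_geo_libs; fi`. I would use the `if` form rather than `on_driver && install_geo_libs`, because on a worker the `&&` form leaves a non-zero exit status as the last command of the script. Driver-only installation is presumably enough only if no R code that loads these packages ever runs on the workers.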