The R package arrow installed with install.packages('arrow') does not have lz4 support:

codec_is_available('lz4')
# [1] FALSE

The package version is:

packageVersion('arrow')
# [1] ‘0.17.1’

This is on Ubuntu 20.04.

How can I get an R arrow package with lz4 support?

James Hirschorn

2 Answers

According to the docs, you can set export LIBARROW_MINIMAL=false when building from source to get a build that supports compression. The relevant passage:

You can also install the R package from a git checkout:

git clone https://github.com/apache/arrow
cd arrow/r
R CMD INSTALL .

If you don't already have the Arrow C++ libraries on your system, when installing the R package from source, it will also download and build the Arrow C++ libraries for you. To speed installation up, you can set

export LIBARROW_BINARY=true

to look for C++ binaries prebuilt for your Linux distribution/version. Alternatively, you can set

export LIBARROW_MINIMAL=false

to build the Arrow libraries with optional features such as compression libraries enabled. This will increase the build time but provides many useful features. Prebuilt binaries are built with this flag enabled, so you get the full functionality by using them as well.
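
As a rough illustration of the passage above, the environment variable and the source install can be combined in one shell function. This is a minimal sketch, not a verified fix: the function name rebuild_arrow_full and the RSCRIPT override are hypothetical helpers introduced here, and the repos argument is an assumption added so that install.packages does not stop for lack of a CRAN mirror in a non-interactive session.

```shell
#!/bin/sh
# Sketch only: rebuild the R arrow package from source with optional
# features (including compression libraries) requested.

# RSCRIPT is a hypothetical override hook for dry runs; it is not part of
# arrow or R. By default the real Rscript binary is used.
RSCRIPT="${RSCRIPT:-Rscript}"

rebuild_arrow_full() {
    # Per the quoted docs, this asks the bundled C++ build to enable
    # optional features such as compression libraries.
    export LIBARROW_MINIMAL=false
    # Assumption: an explicit CRAN mirror, so the call also works when
    # no default repository is configured.
    "$RSCRIPT" -e "install.packages('arrow', repos = 'https://cloud.r-project.org')"
}
```

Running rebuild_arrow_full after sourcing the snippet starts the build. As the docs note, a non-minimal build takes longer, and whether the result actually includes lz4 may still depend on the system.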

Nick ODell
  • I would add that, at least on Linux, `install_arrow` can reinstall arrow assuming `arrow` is already installed. Then `install_arrow(binary = FALSE, minimal = FALSE)` will rebuild the package from source (since `binary == FALSE`) with the optional dependencies. – James Hirschorn Jul 26 '20 at 16:22
  • Is `install_arrow` provided by the R arrow package? – Nick ODell Jul 26 '20 at 16:24
  • Yes, see the discussion just below the docs you quoted. – James Hirschorn Jul 26 '20 at 16:25
  • `install_arrow(binary = FALSE, minimal = FALSE)` rebuilt the package but did not fix this issue for me on Ubuntu 18.04. – Fons MA Feb 14 '21 at 20:42
  • In fact, `codec_is_available` reports `FALSE` for both LZ4 and ZSTD, so that's pretty useless for working with `pyarrow`. – Fons MA Feb 14 '21 at 21:58
  • Also doesn't work on my system (Ubuntu 20.04). I've tried different ways of installing/compiling arrow and it was never built with lz4 support (yes, the libs are installed; yes, Python can use lz4). – Paul Mar 15 '21 at 16:41

The answer from Nick ODell did not work for me, running Ubuntu 18.04 in a Docker container.

What worked: (1) first install the libraries listed under the subheading "Debian GNU/Linux and Ubuntu" at https://arrow.apache.org/install/, then (2) install the R arrow package normally:

sudo apt update
sudo apt install -y -V ca-certificates lsb-release wget
wget https://apache.bintray.com/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-archive-keyring-latest-$(lsb_release --codename --short).deb
sudo apt install -y -V ./apache-arrow-archive-keyring-latest-$(lsb_release --codename --short).deb
sudo apt update
sudo apt install -y -V libarrow-dev # For C++
sudo apt install -y -V libarrow-glib-dev # For GLib (C)
sudo apt install -y -V libarrow-dataset-dev # For Arrow Dataset C++
sudo apt install -y -V libarrow-flight-dev # For Flight C++
# Notes for Plasma related packages:
#   * You need to enable "non-free" component on Debian GNU/Linux
#   * You need to enable "multiverse" component on Ubuntu
#   * You can use Plasma related packages only on amd64
sudo apt install -y -V libplasma-dev # For Plasma C++
sudo apt install -y -V libplasma-glib-dev # For Plasma GLib (C)
sudo apt install -y -V libgandiva-dev # For Gandiva C++
sudo apt install -y -V libgandiva-glib-dev # For Gandiva GLib (C)
sudo apt install -y -V libparquet-dev # For Apache Parquet C++
sudo apt install -y -V libparquet-glib-dev # For Apache Parquet GLib (C)

R -e "install.packages('arrow')"
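
To confirm whether an install actually picked up the codecs, the check from the question can be repeated for each codec of interest. This is a minimal sketch: check_arrow_codecs and the RSCRIPT override are hypothetical helpers introduced here, and the codec list (lz4 and zstd) is simply the pair mentioned in this thread.

```shell
#!/bin/sh
# Sketch only: report whether the installed R arrow package supports the
# compression codecs discussed in this thread.

# RSCRIPT is a hypothetical override hook for dry runs; it is not part of
# arrow or R. By default the real Rscript binary is used.
RSCRIPT="${RSCRIPT:-Rscript}"

check_arrow_codecs() {
    for codec in lz4 zstd; do
        # codec_is_available() returns TRUE or FALSE for the named codec.
        "$RSCRIPT" -e "cat('$codec:', arrow::codec_is_available('$codec'), '\n')"
    done
}
```

Calling check_arrow_codecs prints one line per codec. A FALSE for lz4 means the installed build still lacks it.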
user3357177