We are having trouble deploying our GCP Dataflow pipeline. After some analysis, I found that the latest version of apache-beam fails to install. To reproduce the issue, I created a virtualenv and ran the following:

pip install apache-beam==2.32.0

The following errors appeared while installing the 'orjson' dependency:

  Using cached orjson-3.6.3.tar.gz (548 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... error
    ERROR: Command errored out with exit status 1:
     command: 'c:\temp\virtu\scripts\python.exe' 'c:\temp\virtu\lib\site-packages\pip\_vendor\pep517\in_process\_in_process.py' prepare_metadata_for_build_wheel 'C:\Users\prasasaw\AppData\Local\Temp\tmpmekx1jjj'
         cwd: C:\Users\prasasaw\AppData\Local\Temp\pip-install-0yofoe55\orjson_62f1ca2674934a7f8c45b08e87e05a4b
    Complete output (6 lines):

    Cargo, the Rust package manager, is not installed or is not on PATH.
    This package requires Rust and Cargo to compile extensions. Install it through
    the system's package manager or via https://rustup.rs/

Note that earlier versions of Apache Beam, such as 2.30.0, do not depend on 'orjson', and the following works just fine:

pip install apache-beam==2.30.0

I tried to install Rust, but it failed on some 'pysam' dependency. So I would like to know the correct way to install the 'orjson' dependency.

I saw this GitHub issue about installing orjson, but could not get much out of it: https://github.com/readthedocs/readthedocs.org/issues/7687

//Prasad.

Prasad Sawant
  • What Python version are you currently using? – p13rr0m Sep 23 '21 at 19:00
  • If you can, go with Python version 3.7 or 3.8. – p13rr0m Sep 23 '21 at 19:13
  • python --version -> 3.7.4 – Prasad Sawant Sep 24 '21 at 07:06
  • Okay, my assumption was that you were still using a Python version below 3.7. That would explain why pip is trying to build orjson from source. Using one of the supported Python versions would have allowed you to install the package without building it first (see the list of pre-built wheels and their Python versions at https://pypi.org/project/orjson/#files). I can only give you some more hints that will hopefully solve your issue. 1. Try installing with `pip install --no-cache-dir apache-beam==2.32.0`. 2. Try installing orjson directly with `pip install orjson==3.6.3` in a clean virtual env. – p13rr0m Sep 24 '21 at 09:33
  • I tried both of these commands in a clean virtual env; both still fail with the same error. – Prasad Sawant Sep 27 '21 at 09:29

3 Answers

1

I'm a bit late to the party, but I ran into this issue as well today. I solved it by switching to a 64-bit Python environment (I had accidentally installed the 32-bit one).
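
If you are not sure which build you have, the standard library can tell you. Below is a minimal sketch; the helper name `interpreter_bits` is made up for illustration. When pip finds no pre-built wheel that matches the running interpreter, it falls back to the source distribution, which is what leads to the Cargo error in the question.

```python
import platform
import struct
import sys


def interpreter_bits() -> int:
    """Return the pointer width (in bits) of the running Python build."""
    # struct.calcsize("P") is the size in bytes of a C pointer for this build:
    # 4 on a 32-bit interpreter, 8 on a 64-bit one.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print(f"Python {platform.python_version()} is a {interpreter_bits()}-bit build")
    # Cross-check: sys.maxsize only exceeds 2**32 on 64-bit builds.
    print(f"sys.maxsize > 2**32: {sys.maxsize > 2**32}")
```

Run it with the same `python.exe` that lives inside the virtualenv, since that is the interpreter pip resolves wheels for.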

CaptainNabla
0

The orjson dependency was introduced in https://github.com/apache/beam/pull/14690/files. According to the comment there, orjson is only available on Python 3.6 and above, so you may want to check your Python version.
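
A quick way to run that check from inside the virtualenv, as a rough sketch: the `(3, 6)` threshold is taken from the comment quoted above, and `meets_minimum` is just an illustrative helper name.

```python
import sys

# Minimum version taken from the comment quoted above; adjust if it changes.
MIN_VERSION = (3, 6)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Return True if (major, minor) of version_info is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    print(f"Running Python {sys.version.split()[0]}")
    print(f"Meets minimum {MIN_VERSION}: {meets_minimum()}")
```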

Andy Xu
0

According to the orjson docs, you need to upgrade pip to version 20.3 or later:

pip install --upgrade "pip>=20.3" # manylinux_x_y, universal2 wheel support
pip install --upgrade orjson

(docs)
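
To see whether the pip inside your environment already meets that requirement, something like the following should work. It is only a sketch: `pip_at_least` is a hypothetical helper, it compares just the major and minor numbers, and it therefore treats a pre-release such as `20.3b1` as good enough.

```python
import re


def pip_at_least(version: str, minimum=(20, 3)) -> bool:
    """Return True if a pip version string is at least `minimum` (major, minor)."""
    # Only the leading "major.minor" part is compared; suffixes are ignored.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognised pip version: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)


if __name__ == "__main__":
    import pip  # imported here so the helper stays usable without pip installed

    print(f"pip {pip.__version__}: new enough = {pip_at_least(pip.__version__)}")
```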