4

we would like to install Spark-Alchemy to use it within Pyspark inside foundry (we would like to use their hyperloglog functions). While I know how to install a pip package, I am not sure what it is needed to install this kind of package.

Any help or alternative solutions related to the use of hyperloglog with pyspark will be appreciated, thanks!

fmsf
  • 36,317
  • 49
  • 147
  • 195
pietro
  • 91
  • 3

2 Answers2

1

PySpark Transform repositories in Foundry are connected to conda. You can use the coda_recipe/meta.yml to pull packages into your transforms. If a package you want is not available in your channels, I would recommend you reach out to your administrators to ask if it's possible to add it. Adding a custom jar that extends spark is something that needs to be reviewed by your platform administrators since it can represent a security risk.

I did a $ conda search spark-alchemy and couldn't find anything related and reading through these instructions https://github.com/swoop-inc/spark-alchemy/wiki/Spark-HyperLogLog-Functions#python-interoperability it makes me guess that there isn't a conda package available.

fmsf
  • 36,317
  • 49
  • 147
  • 195
0

I can't comment about the use of this specific library but in general, Foundry support Conda channels and if you have a Conda repo and configure foundry to connect to that channel you can add this library or others and reference them in your code.

Eran Witkon
  • 4,042
  • 4
  • 19
  • 20