I am using Spark on a Google Cloud Dataproc cluster and I would like to write to Bigtable in a PySpark job. Since a Google connector for this is not available, I am simply using the google-cloud-bigtable client to insert the data and relying on Spark for parallelism. However, I am not able to bundle the google-cloud-python package so that it is accessible on the Dataproc cluster. I downloaded the wheel (.whl) for google-cloud-bigtable and converted it to an egg, but it still doesn't work.
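For reference, here is a minimal sketch of what my job does (the project, instance, table, and column family names are placeholders; I create the Bigtable client inside the partition function since it is not serializable):

```python
from pyspark import SparkContext

def write_partition(records):
    # Import inside the function so the dependency only has to
    # resolve on the executors, not when the driver pickles the closure.
    from google.cloud import bigtable

    # Placeholder project/instance/table IDs.
    client = bigtable.Client(project="my-project")
    instance = client.instance("my-instance")
    table = instance.table("my-table")

    for key, value in records:
        row = table.row(key.encode("utf-8"))
        # "cf1" is a placeholder column family that already exists on the table.
        row.set_cell("cf1", b"col", value.encode("utf-8"))
        row.commit()

sc = SparkContext()
data = sc.parallelize([("row-1", "a"), ("row-2", "b")])
data.foreachPartition(write_partition)
```

This fails on the executors because the google.cloud import cannot be resolved there.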
Is there an example of using the Google Python client in a PySpark job? It would also be really helpful to know how the library can be made available on the cluster.
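For context, this is roughly how I am submitting the job with the converted egg (the cluster name and egg filename are placeholders):

```
gcloud dataproc jobs submit pyspark my_job.py \
    --cluster=my-cluster \
    --py-files=google_cloud_bigtable-<version>.egg
```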