I'm using AWS SageMaker connected to an EMR cluster via Livy. With a "normal" session (the default session config) the connection is created and the Spark context works fine, but the session fails when I add
"spark.pyspark.python": "./ANACONDA/env_name/bin/python3",
"spark.yarn.dist.archives": "s3://<path>/env_name.tar.gz#ANACONDA"
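For reference, this is roughly how I'm submitting the full session config (via sparkmagic's `%%configure` magic in the SageMaker notebook; the `s3://<path>` and `env_name` values are placeholders for my real ones):

```
%%configure -f
{
  "conf": {
    "spark.pyspark.python": "./ANACONDA/env_name/bin/python3",
    "spark.yarn.dist.archives": "s3://<path>/env_name.tar.gz#ANACONDA"
  }
}
```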
The session is not created and an error is thrown:
Neither SparkSession nor HiveContext/SqlContext is available
If I remove the spark.pyspark.python line, it takes some time (because it is distributing the .tar.gz file to the executors) but it works: the session and Spark context are created (though then I cannot use the environment from the .tar.gz). So I guess it has something to do with spark.pyspark.python.
I'm trying to debug what's happening, and for that I want to check the Livy logs, but I cannot find them. I know they should be in S3 (https://aws.amazon.com/premiumsupport/knowledge-center/spark-driver-logs-emr-cluster/), yet they are nowhere to be found. Can anyone point me to the logs' location, or suggest another way to debug the issue?
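In case it helps, these are the places I'd expect the logs to be (assuming a standard EMR setup; the bucket, prefix, cluster-id, and application-id values below are placeholders to substitute):

```
# On the EMR master node (via SSH) - Livy server logs, assuming the default EMR layout:
less /var/log/livy/livy-livy-server.out

# YARN container logs for the failed Spark application
# (application id taken from the YARN ResourceManager UI):
yarn logs -applicationId <application_id> | less

# If the cluster was created with an S3 log URI, logs are synced there
# (with a delay of a few minutes), e.g. under:
#   s3://<log-bucket>/<prefix>/<cluster-id>/node/<instance-id>/applications/livy/
#   s3://<log-bucket>/<prefix>/<cluster-id>/containers/<application_id>/
aws s3 ls s3://<log-bucket>/<prefix>/<cluster-id>/ --recursive
```

Is this the right layout, or does Livy log somewhere else on EMR?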