I start spark-shell with the following command:
spark-shell --packages org.apache.hadoop:hadoop-aws:3.1.1,com.amazonaws:aws-java-sdk-pom:1.11.392,org.wso2.orbit.joda-time:joda-time:2.9.4.wso2v1
Then I run the following code to access a file in S3:
val accessKeyId = "myid"
val secretAccessKey = "mykey"
// S3A reads its credentials from fs.s3a.access.key and fs.s3a.secret.key
sc.hadoopConfiguration.set("fs.s3a.access.key", accessKeyId)
sc.hadoopConfiguration.set("fs.s3a.secret.key", secretAccessKey)
sc.hadoopConfiguration.set("fs.s3a.impl","org.apache.hadoop.fs.s3a.S3AFileSystem")
val lines = sc.textFile("s3a://bucket-name/path-to-file")
Running the following then fails with this error:
scala> lines.count()
java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StreamCapabilities
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
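
To narrow this down, here is a minimal sketch of a check that could be run in the same spark-shell session to see whether the class named in the error is visible on the classpath at all. The class org.apache.hadoop.fs.StreamCapabilities belongs to hadoop-common, which hadoop-aws 3.1.1 expects to find at runtime. The helper name classAvailable is hypothetical and is not part of Spark or Hadoop; it only wraps the standard Class.forName call.

```scala
// Hypothetical helper (not part of Spark or Hadoop): reports whether a class
// can be loaded by the current class loader.
def classAvailable(name: String): Boolean =
  try {
    Class.forName(name)
    true
  } catch {
    case _: ClassNotFoundException => false // class is not on the classpath
    case _: LinkageError           => false // class was found but failed to link
  }

// The class named in the stack trace above.
val hasStreamCapabilities = classAvailable("org.apache.hadoop.fs.StreamCapabilities")
println(s"StreamCapabilities on classpath: $hasStreamCapabilities")
```

Assuming hadoop-common is on the classpath, as it normally is in spark-shell, evaluating org.apache.hadoop.util.VersionInfo.getVersion in the same session would also show which Hadoop version the shell itself is running, which can then be compared with the 3.1.1 requested for hadoop-aws in --packages.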