
I am trying to set up checkpointing for a Spark Streaming application on Azure Storage. I was previously using S3 and the code was working fine.

Here is the latest code showing how I set checkpointing to Azure:

    sc.hadoopConfiguration
      .set("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
    sc.hadoopConfiguration
      .set(
        "fs.azure.account.key.[name].blob.core.windows.net",
        [key]
      )
    ssc.checkpoint(
      "https://[name].blob.core.windows.net/[blob]")

Here is the error message that I am getting on startup:

    Exception in thread "main" java.io.IOException: No FileSystem for scheme: https


1 Answer


See here; it's for Databricks, but it should still apply.

val df = spark.read.parquet("wasbs://<container-name>@<storage-account-name>.blob.core.windows.net/<directory-name>")

So, use the wasbs scheme instead of https.
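Applied to your checkpoint setup, the change would look roughly like the sketch below. The account name, key, and container are placeholders, the Hadoop configuration keys are the ones from your question, and the exact property names can vary with your Hadoop/Spark version, so treat this as an assumption rather than a verified configuration:

    // Sketch: checkpointing to Azure Blob Storage via the wasbs scheme
    // (placeholders <account>, <key>, <container> must be replaced)
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("azure-checkpoint-example")
    val ssc  = new StreamingContext(conf, Seconds(10))
    val sc   = ssc.sparkContext

    // Same Hadoop settings as in the question: point fs.azure at the
    // NativeAzureFileSystem driver and supply the storage account key.
    sc.hadoopConfiguration
      .set("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
    sc.hadoopConfiguration
      .set("fs.azure.account.key.<account>.blob.core.windows.net", "<key>")

    // Checkpoint directory addressed with wasbs:// instead of https://,
    // so Hadoop resolves a registered FileSystem for the scheme.
    ssc.checkpoint("wasbs://<container>@<account>.blob.core.windows.net/checkpoints")

The key point is that Hadoop looks up a FileSystem implementation by URI scheme; there is no implementation registered for https, which is exactly what the "No FileSystem for scheme: https" error is saying, while wasbs maps to the Azure driver provided by hadoop-azure.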
