
Background

  • I was planning to use S3 to store Flink's checkpoints, using the FsStateBackend. But somehow I was getting the following error.

Error

org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 's3'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.

Flink version: I am using Flink 1.10.0.

Keshav Lodhi

1 Answer


I have found the solution to this issue, so I am listing the required steps below.

Steps

  1. We need to add some configuration to the flink-conf.yaml file, as listed below.
state.backend: filesystem
state.checkpoints.dir: s3://s3-bucket/checkpoints/ #"s3://<your-bucket>/<endpoint>"
state.backend.fs.checkpointdir: s3://s3-bucket/checkpoints/ #"s3://<your-bucket>/<endpoint>"


s3.access-key: XXXXXXXXXXXXXXXXXXX #your-access-key
s3.secret-key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx #your-secret-key

s3.endpoint: http://127.0.0.1:9000 #your-endpoint-hostname (I have used Minio) 
  2. After completing the first step, we need to copy the respective JAR files (flink-s3-fs-hadoop-1.10.0.jar and flink-s3-fs-presto-1.10.0.jar) from the opt directory to the plugins directory of your Flink installation (see the shell sketch after these steps).

    • E.g.:
      1. Copy /flink-1.10.0/opt/flink-s3-fs-hadoop-1.10.0.jar to /flink-1.10.0/plugins/s3-fs-hadoop/flink-s3-fs-hadoop-1.10.0.jar // Recommended for StreamingFileSink
      2. Copy /flink-1.10.0/opt/flink-s3-fs-presto-1.10.0.jar to /flink-1.10.0/plugins/s3-fs-presto/flink-s3-fs-presto-1.10.0.jar // Recommended for checkpointing
  3. Add this to the checkpointing code (a fuller, runnable sketch follows these steps):

env.setStateBackend(new FsStateBackend("s3://s3-bucket/checkpoints/"))
  4. After completing all the above steps, restart Flink if it is already running.
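
As a concrete sketch of step 2, the copy can be done from a shell. The paths assume the Flink 1.10.0 installation directory used in the example above; adjust them to your setup.

# Plugins must each live in their own subdirectory under plugins/.
mkdir -p /flink-1.10.0/plugins/s3-fs-hadoop /flink-1.10.0/plugins/s3-fs-presto
# Hadoop variant: recommended for the StreamingFileSink.
cp /flink-1.10.0/opt/flink-s3-fs-hadoop-1.10.0.jar /flink-1.10.0/plugins/s3-fs-hadoop/
# Presto variant: recommended for checkpointing.
cp /flink-1.10.0/opt/flink-s3-fs-presto-1.10.0.jar /flink-1.10.0/plugins/s3-fs-presto/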
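
To make step 3 concrete, here is a minimal, self-contained job that checkpoints to S3. The checkpoint interval, the placeholder pipeline, and the job name are illustrative assumptions, not part of the original answer.

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class S3CheckpointExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 60 seconds (illustrative; tune for your job).
        env.enableCheckpointing(60_000);

        // Point the state backend at the same bucket configured in flink-conf.yaml.
        env.setStateBackend(new FsStateBackend("s3://s3-bucket/checkpoints/"));

        // Placeholder pipeline so the job is runnable end to end.
        env.fromElements(1, 2, 3)
           .map(i -> i * 2)
           .print();

        env.execute("s3-checkpoint-example");
    }
}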

Note:

  • If you are using both flink-s3-fs-hadoop and flink-s3-fs-presto in Flink, then please use s3p:// specifically for flink-s3-fs-presto and s3a:// for flink-s3-fs-hadoop, instead of s3:// (see the example below).
  • For more details, see the Flink documentation on the S3 file system plugins.
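
For example, with both plugins installed, the checkpoint settings from step 1 and step 3 would use the explicit scheme (same bucket as above):

state.checkpoints.dir: s3p://s3-bucket/checkpoints/ # flink-s3-fs-presto, for checkpointing

env.setStateBackend(new FsStateBackend("s3p://s3-bucket/checkpoints/"))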
Keshav Lodhi
  • One more thing: it is recommended to use `flink-s3-fs-presto` for checkpointing, and not `flink-s3-fs-hadoop`. The Hadoop S3 tries to imitate a real filesystem on top of S3, and as a consequence, it has high latency when creating files and it hits request rate limits quickly. This is because before writing a key, it checks to see if the "parent directory" exists, which can involve a bunch of expensive S3 HEAD requests (which have very low request rate limits). – David Anderson Oct 06 '20 at 17:15
  • Also, with Hadoop S3 you may come to a situation where you fail restore operations because it looks like a state file is not there (a HEAD request leading to false caching in an S3 load balancer). Only after a while will the file be visible, and only then will the restore succeed. – David Anderson Oct 06 '20 at 17:17
  • So I encountered the same issue as you and followed the steps you recommended, but I got a weird error message saying "Caused by: java.lang.IllegalArgumentException: Cannot use the root directory for checkpoints". Did you have it as well? – Shalom Balulu Oct 07 '20 at 08:06