
I've just started learning PySpark, running standalone on my local machine, and I can't get checkpointing to work. I've boiled the script down to this:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("PyTest").master("local[*]").getOrCreate()

spark.sparkContext.setCheckpointDir("/RddCheckPoint")
df = spark.createDataFrame(["10","11","13"], "string").toDF("age")
df.checkpoint()

and I get this output:

>>> spark = SparkSession.builder.appName("PyTest").master("local[*]").getOrCreate()
>>>
>>> spark.sparkContext.setCheckpointDir("/RddCheckPoint")
>>> df = spark.createDataFrame(["10","11","13"], "string").toDF("age")
>>> df.checkpoint()
20/01/24 15:26:45 WARN SizeEstimator: Failed to check whether UseCompressedOops is set; assuming yes
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "N:\spark\python\pyspark\sql\dataframe.py", line 463, in checkpoint
    jdf = self._jdf.checkpoint(eager)
  File "N:\spark\python\lib\py4j-0.10.8.1-src.zip\py4j\java_gateway.py", line 1286, in __call__
  File "N:\spark\python\pyspark\sql\utils.py", line 98, in deco
    return f(*a, **kw)
  File "N:\spark\python\lib\py4j-0.10.8.1-src.zip\py4j\protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o71.checkpoint.
: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
        at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
        at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:645)
        at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1230)
        at org.apache.hadoop.fs.FileUtil.list(FileUtil.java:1435)
        at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:493)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
        at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:678)
        at org.apache.spark.rdd.ReliableCheckpointRDD.getPartitions(ReliableCheckpointRDD.scala:74)
        at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
        at org.apache.spark.rdd.ReliableCheckpointRDD$.writeRDDToCheckpointDirectory(ReliableCheckpointRDD.scala:179)
        at org.apache.spark.rdd.ReliableRDDCheckpointData.doCheckpoint(ReliableRDDCheckpointData.scala:59)
        at org.apache.spark.rdd.RDDCheckpointData.checkpoint(RDDCheckpointData.scala:75)
        at org.apache.spark.rdd.RDD.$anonfun$doCheckpoint$1(RDD.scala:1801)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1791)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2118)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2137)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2156)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2181)
        at org.apache.spark.rdd.RDD.count(RDD.scala:1227)
        at org.apache.spark.sql.Dataset.$anonfun$checkpoint$1(Dataset.scala:689)
        at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3472)
        at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$4(SQLExecution.scala:100)
        at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
        at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
        at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3468)
        at org.apache.spark.sql.Dataset.checkpoint(Dataset.scala:680)
        at org.apache.spark.sql.Dataset.checkpoint(Dataset.scala:643)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Unknown Source)

The error doesn't give any specifics about why it failed. I suspect I've missed some Spark config, but I'm not sure what.

Rich

1 Answer


You get this error because either the checkpoint directory has not been created, or you don't have permission to write to it (your checkpoint directory is under the root directory, `/`).
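Before pointing Spark at the directory, it's worth confirming that the Python (driver) process can actually create and write to it. A minimal standard-library check; the path used here is just an example, not part of the original question:

```python
import os
import tempfile

# Example path: put the checkpoint dir somewhere known to be writable
# rather than directly under the filesystem root.
ckpt = os.path.join(tempfile.gettempdir(), "RddCheckPoint")

# makedirs with exist_ok tolerates re-runs; os.mkdir would raise
# FileExistsError if the directory is already there.
os.makedirs(ckpt, exist_ok=True)

# Confirm write permission on the directory before handing it to Spark.
print(os.access(ckpt, os.W_OK))  # True if permissions are fine
```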

import os

from pyspark.sql import SparkSession

# Create the checkpoint directory first; exist_ok avoids an error on re-runs.
os.makedirs("RddCheckPoint", exist_ok=True)

spark = SparkSession.builder.appName("PyTest").master("local[*]").getOrCreate()

spark.sparkContext.setCheckpointDir("RddCheckPoint")
df = spark.createDataFrame(["10", "11", "13"], "string").toDF("age")
df = df.checkpoint()  # checkpoint() returns a new DataFrame; keep a reference to it
ggeop
  • Thanks. I tried that and it still fails with the same error. Interestingly, the folder is created and it does have content: `N:\tmp\RddCheckPoint\0769c686-9b14-415d-92b6-1da98dc9a4dd\rdd-9` contains `.part-00000.crc` (12 bytes) etc. – Rich Jan 25 '20 at 17:13
  • As I see, it has content: a .crc file. What content were you expecting? It's a checkpoint, not a .write(); the data are for Spark's internal use, so you can't read them as a CSV. I ran exactly the code in my answer and it works perfectly. – ggeop Jan 25 '20 at 17:47
  • I wasn't expecting any particular content; I just meant that the presence of content suggests write access isn't the (remaining) issue, yet the exception and stack trace still get spat out. So if it can write to the directory, what would cause the error? Thanks – Rich Jan 27 '20 at 11:28
  • Try the following: create a NEW checkpoint directory (preferably not under the root directory) and run my code snippet. Maybe you have a problem with metadata, because I run this snippet without error. – ggeop Jan 27 '20 at 11:44
  • @Rich any news? – ggeop Jan 30 '20 at 06:56
  • Same error. I guess my installation must be misconfigured in some way... – Rich Feb 06 '20 at 13:24