
I have installed Tachyon and Spark according to the instructions:

http://tachyon-project.org/documentation/Running-Spark-on-Tachyon.html

However, as a newbie, I have no idea how to put the file "X" into the Tachyon file system, as the instructions assume:

$ ./spark-shell
$ val s = sc.textFile("tachyon-ft://stanbyHost:19998/X")
$ s.count()
$ s.saveAsTextFile("tachyon-ft://activeHost:19998/Y")

What I did was point to an existing file (which I found through the management UI):

scala> val s = sc.textFile("tachyon-ft://localhost:19998/root/default_tests_files/BasicFile_THROUGH")
s: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:21

When I ran count, I got the following error:

scala> s.count()
java.lang.NullPointerException: connectionString cannot be null

I assume my path was wrong. So two questions:

  1. How do I copy a file into Tachyon?

  2. What is the proper path format for its file system?

Sorry, very very newbie !!

UPDATE 1

I am not sure whether tachyon-ft://localhost:19998/root/default_tests_files/BasicFile_THROUGH is the correct path. I cannot fetch it via either the browser or wget.

This is what I saw in the file system browser:

[screenshot of the Tachyon file system browser]

dtolnay
HP.
  • Can you access the source file yourself via the given URL? – Ashalynd Oct 09 '15 at 00:17
  • How do I access the file in URL? I have updated the question. I guess if I do s.take(4) and it didn't work, that means the path is wrong or something. Because when I put a random string inside `sc.textFile`, it was the same error. – HP. Oct 10 '15 at 00:44
  • Which versions of Spark, Tachyon, Hadoop are you using? – mattinbits Oct 12 '15 at 07:45

1 Answer


I found the issue. I had not run this before reading the file:

sc.hadoopConfiguration.set("fs.tachyon.impl", "tachyon.hadoop.TFS")

After going through this exercise, http://ampcamp.berkeley.edu/5/exercises/tachyon.html#run-spark-on-tachyon, I found that the proper path is this:

val file = sc.textFile("tachyon://localhost:19998/LICENSE")

So my setup was fine after all. The documentation at http://tachyon-project.org/documentation/Running-Spark-on-Tachyon.html caused me a lot of confusion.
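
Putting the two pieces together, a minimal spark-shell sketch might look like the following. It assumes the Tachyon master is at localhost:19998 and that LICENSE already exists at the Tachyon root, as in the AMPCamp exercise. The name `tachyonRoot` and the output path `LICENSE_copy` are hypothetical, chosen only for illustration.

```scala
// Run inside spark-shell, where `sc` (the SparkContext) is already defined.

// Register the Tachyon file system class for the tachyon:// scheme.
sc.hadoopConfiguration.set("fs.tachyon.impl", "tachyon.hadoop.TFS")

// Assumption: Tachyon master at localhost:19998, with /LICENSE already stored in Tachyon.
val tachyonRoot = "tachyon://localhost:19998"
val file = sc.textFile(tachyonRoot + "/LICENSE")

// textFile is lazy, so the file is only read when an action such as count runs.
// In my case the error only showed up at this step.
println(file.count())

// Hypothetical output path: writes the RDD back into Tachyon
// as a directory of part files, not as a single file.
file.saveAsTextFile(tachyonRoot + "/LICENSE_copy")
```

The last line is the same `saveAsTextFile` call the documentation uses, so it is also one way to get data into Tachyon from within Spark.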

eliasah
HP.