
I am trying to display some content on the console from a Scalding script. When I run the same logic in the Scalding shell I get the desired output, but when I run the script I get this error: scripttest.scala:4: error: value dump is not a member of com.twitter.scalding.typed.TypedPipe[String]

The script is

    import com.twitter.scalding._
    class scripttest(args: Args) extends Job(args) {
      val hello = TypedPipe.from(TextLine("tutorial/data/hello.txt"))
      hello.dump
    }

When I ran the same logic in the shell, it ran successfully. The output in the shell: Hello world Goodbye world

Please explain why this occurs and how to print to the console from a Scalding script.

1 Answer

After looking closely at the documentation, you will see in section "2.6 REPL Reference", subsection "2.6.1 Enrichments available on TypedPipe/Grouped/CoGrouped objects" :

.dump: Print the contents to stdout (uses .toIterator)

hence dump is available only in the REPL.

I don't see a "scalding way" to write to the console, nor do I think it would make sense: you are running a pipeline, so the only "guaranteed milestone" is the end of the pipeline, when you can just write your results to a file, as is done in all the tutorial scripts.

If it's just a matter of printing "hello my job started", remember it's just a Scala file and use println (for more advanced logging, Logback is your friend).
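As a rough sketch of both suggestions, the script from the question could be changed along these lines. The class name `ScriptTest` and the output path `tutorial/data/hello-output.tsv` are assumptions made for illustration, and I have not run this exact file:

```scala
import com.twitter.scalding._

// Hypothetical job name and output path, chosen only for illustration.
class ScriptTest(args: Args) extends Job(args) {
  // Plain Scala: runs once while the job is being constructed,
  // not once per record flowing through the pipe.
  println("hello my job started")

  val hello = TypedPipe.from(TextLine("tutorial/data/hello.txt"))

  // .dump is a REPL-only enrichment, so write the pipe to a sink
  // and inspect the resulting file once the job has finished.
  hello.write(TypedTsv[String]("tutorial/data/hello-output.tsv"))
}
```

Once the job has finished, the lines of the pipe should be in the output file, which you can look at with `cat` or any editor.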

To run the script locally, after having cloned the repository:

> ./sbt assembly
> ./scripts/scald.rb --local MyScript.scala

The first line will run all tests and build the Scalding assembly jar. That jar is used by "scald.rb", the script in the second line that runs your Scalding script.

John K
  • I need to see the intermediate contents of a pipe to write test cases for this case. – Rahul Vatsa Mar 10 '16 at 12:35
  • You might be looking for "snapshots" in the REPL. They have a dedicated subsection in section "1.1.2 Scalding REPL" of the [documentation](https://media.readthedocs.org/pdf/scalding/readthedocs/scalding.pdf), or you can also look [here](https://github.com/twitter/scalding/wiki/Scalding-REPL). An alternative would be, in your script, to cut your pipeline after the point where you want to see the data (i.e. comment out the rest of the code) and `write` it to a file. – John K Mar 10 '16 at 12:44
  • Thanks, this is exactly what I was looking for. – Rahul Vatsa Mar 10 '16 at 12:54
  • Cool. If it answers your question feel free to accept it. Cheers – John K Mar 10 '16 at 13:39