I'm new to Cascading/Hadoop and am trying to run a simple example in local mode (i.e. in memory). The example just copies a file:

    import java.util.Properties;

    import cascading.flow.Flow;
    import cascading.flow.FlowConnector;
    import cascading.flow.FlowDef;
    import cascading.flow.local.LocalFlowConnector;
    import cascading.pipe.Pipe;
    import cascading.property.AppProps;
    import cascading.scheme.hadoop.TextLine;
    import cascading.tap.Tap;
    import cascading.tap.hadoop.Hfs;

    public class CascadingTest {

        public static void main(String[] args) {
            Properties properties = new Properties();

            AppProps.setApplicationJarClass( properties, CascadingTest.class );
            FlowConnector flowConnector = new LocalFlowConnector();

            // create the source tap
            Tap inTap = new Hfs( new TextLine(), "D:\\git_workspace\\Impatient\\part1\\data\\rain.txt" );

            // create the sink tap
            Tap outTap = new Hfs( new TextLine(), "D:\\git_workspace\\Impatient\\part1\\data\\out.txt" );

            // specify a pipe to connect the taps
            Pipe copyPipe = new Pipe( "copy" );

            // connect the taps, pipes, etc., into a flow
            FlowDef flowDef = FlowDef.flowDef()
                .addSource( copyPipe, inTap )
                .addTailSink( copyPipe, outTap );

            // run the flow
            Flow flow = flowConnector.connect( flowDef );
            flow.complete();
        }
    }

Here is the error I'm getting:

    09-25-12 11:30:38,114 INFO  - AppProps                     - using app.id: 9C82C76AC667FDAA2F6969A0DF3949C6
    Exception in thread "main" cascading.flow.planner.PlannerException: could not build flow from assembly: [java.util.Properties cannot be cast to org.apache.hadoop.mapred.JobConf]
        at cascading.flow.planner.FlowPlanner.handleExceptionDuringPlanning(FlowPlanner.java:515)
        at cascading.flow.local.planner.LocalPlanner.buildFlow(LocalPlanner.java:84)
        at cascading.flow.FlowConnector.connect(FlowConnector.java:454)
        at com.x.y.CascadingTest.main(CascadingTest.java:37)
    Caused by: java.lang.ClassCastException: java.util.Properties cannot be cast to org.apache.hadoop.mapred.JobConf
        at cascading.tap.hadoop.Hfs.sourceConfInit(Hfs.java:78)
        at cascading.flow.local.LocalFlowStep.initTaps(LocalFlowStep.java:77)
        at cascading.flow.local.LocalFlowStep.getInitializedConfig(LocalFlowStep.java:56)
        at cascading.flow.local.LocalFlowStep.createFlowStepJob(LocalFlowStep.java:135)
        at cascading.flow.local.LocalFlowStep.createFlowStepJob(LocalFlowStep.java:38)
        at cascading.flow.planner.BaseFlowStep.getFlowStepJob(BaseFlowStep.java:588)
        at cascading.flow.BaseFlow.initializeNewJobsMap(BaseFlow.java:1162)
        at cascading.flow.BaseFlow.initialize(BaseFlow.java:184)
        at cascading.flow.local.planner.LocalPlanner.buildFlow(LocalPlanner.java:78)
        ... 2 more

Clayton

3 Answers


Just to provide a bit more detail: you can't mix local and Hadoop classes in Cascading, because they assume different, incompatible environments. In your case, you're creating a local flow with Hadoop taps, and those taps expect a Hadoop JobConf rather than the Properties object used to configure local taps.

Your code will work if you use cascading.tap.local.FileTap instead of cascading.tap.hadoop.Hfs (and, to match it, cascading.scheme.local.TextLine instead of cascading.scheme.hadoop.TextLine).
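
To make the cause of the ClassCastException concrete, here is a minimal sketch that reproduces the same failure pattern without Cascading on the classpath. Everything in it is a hypothetical stand-in invented for illustration (FakeJobConf, SketchTap, HadoopStyleTap, LocalStyleTap, connectLocally); it only models the idea that a local-mode connector hands every tap a Properties object, while a Hadoop-mode tap casts its configuration to a Hadoop type. It is not how Cascading is actually implemented.

```java
import java.util.Properties;

public class ModeMismatchSketch {

    // Hypothetical stand-in for Hadoop's JobConf; not the real class.
    static class FakeJobConf {}

    // Hypothetical stand-in for a tap that is initialised with a config object.
    interface SketchTap {
        String init(Object config);
    }

    // Models a Hadoop-mode tap: it assumes the config is the Hadoop type.
    static class HadoopStyleTap implements SketchTap {
        public String init(Object config) {
            FakeJobConf conf = (FakeJobConf) config; // fails when given Properties
            return "hadoop tap initialised";
        }
    }

    // Models a local-mode tap: it assumes the config is java.util.Properties.
    static class LocalStyleTap implements SketchTap {
        public String init(Object config) {
            Properties props = (Properties) config;
            return "local tap initialised";
        }
    }

    // Models a local-mode connector: it always passes Properties to the tap.
    static String connectLocally(SketchTap tap) {
        try {
            return tap.init(new Properties());
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(connectLocally(new LocalStyleTap()));
        System.out.println(connectLocally(new HadoopStyleTap()));
    }
}
```

The second call fails for the same reason as the stack trace in the question: the tap and the connector come from different families and disagree about the configuration type.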

ericschwarzkopf

Welcome to Cascading -

I just answered on the Cascading user list, but in brief, the problem is a mix of local-mode and Hadoop-mode classes. This code uses LocalFlowConnector, but then uses Hfs taps.

When I revert to the classes used in the "Impatient" tutorial, it runs correctly: https://gist.github.com/3784194

Paco
  • Thanks. I am attempting to run this in local mode - could you show an example of what that would look like? – Clayton Sep 25 '12 at 20:41
  • Got some code in progress to illustrate local mode as a next part in the "Impatient" series. http://www.cascading.org/category/impatient/ – Paco Nov 18 '12 at 16:58
  • @Pacoid I have posted a question. http://stackoverflow.com/questions/15988091/getting-cascading-tap-hadoop-io-multiinputsplit-class-not-found-exception-while please have a look and answer. – Mohammad Adnan Apr 13 '13 at 13:05

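Yes, you need a tap that reads from the local file system rather than Hfs (Hadoop File System). Note that Lfs (Local File System) lives in cascading.tap.hadoop and is still a Hadoop-mode tap, so it pairs with HadoopFlowConnector; with LocalFlowConnector, use cascading.tap.local.FileTap.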

You can also test your code with JUnit test cases (using the cascading-unittest jar) in local mode, directly from Eclipse.

http://www.cascading.org/2012/08/07/cascading-for-the-impatient-part-6/