Questions tagged [cascading]

Cascading is a Query API, Query Planner, and Process Scheduler used for defining and executing complex, scale-free, and fault-tolerant data processing workflows on a Hadoop cluster.

Cascading is a thin Java library that sits on top of Hadoop's MapReduce layer and is executed from the command line like any other Hadoop application. It is not a new text-based query syntax (like Pig) or another complex system that must be installed on a cluster and maintained (like Hive), though Cascading is both complementary to and a valid alternative to either application.

Cascading lets the developer quickly assemble complex distributed data-processing applications without having to "think" in MapReduce, and schedules them efficiently based on their dependencies. Simple data-processing applications are supported as well, since complex applications tend to start simple.

Cascading is open source and dual-licensed under the GPL and OEM/commercial licenses. OEM/commercial licenses and developer support can be obtained through Concurrent, Inc.

Cascading has a strong community of users and contributors; see the Cascading modules page for related projects and extensions.

Cascading, its extensions, and related libraries are also hosted in the Conjars Maven repository maintained by Concurrent, Inc. The repository is open to the public.

Cascading application-stack overview: [image]

364 questions
6
votes
3 answers

Write to multiple outputs by key Scalding Hadoop, one MapReduce Job

How can you write to multiple outputs dependent on the key using Scalding (or Cascading) in a single MapReduce job? I could of course use .filter for all the possible keys, but that is a horrible hack that will fire up many jobs.
samthebest
  • 30,803
  • 25
  • 102
  • 142
6
votes
2 answers

Deleting Relationship Objects with Cascade in Core Data

I'm looking to perform some simple deletion with Core Data but need a bit of advice on this one, please. I have a model with Transaction, Name, Event and Date entities. The Transaction has a link to each of the other entities. In the app, when…
amitsbajaj
  • 1,304
  • 1
  • 24
  • 59
6
votes
1 answer

Scalding: How to retain the other field, after a groupBy('field){.size}?

So my input data has two fields/columns: id1 & id2, and my code is the following: TextLine(args("input")) .read .mapTo('line->('id1,'id2)) {line: String => val fields = line.split("\t") …
jeremy.ting
  • 155
  • 1
  • 1
  • 7
5
votes
1 answer

Hadoop: Split metadata size exceeded 10000000

When I run a Cascading job, I get an error: Split metadata size exceeded 10000000. I tried to increase the limit at the per-job level by passing the following on the command line: xxx.jar -D mapreduce.job.split.metainfo.maxsize=30000000 I also…
user2628641
  • 2,035
  • 4
  • 29
  • 45
5
votes
3 answers

CSS Not Taking Effect on Page

I'm a new web design student and I just learned about Cascading Style Sheets and how to link them externally; however, I'm encountering a problem. I have the file linked with no problem and the file directory is correct, but my changes…
Andrue
  • 688
  • 3
  • 11
  • 27
4
votes
2 answers

Is there any language syntax for cascading data other than style?

CSS can only set styling and is mainly used with HTML, but I think it should be possible to use the concept of selectors and cascading to apply values to XML attributes. Is there any standardized or proposed syntax for this kind of concept?
Thaina Yu
  • 1,372
  • 2
  • 16
  • 27
4
votes
1 answer

Shapely Polygon Union with Holes Result

I'm looking for an approach to merge polygons where the union produces holes between the polygons. I already tried the cascading union in Shapely, but it gives a polygon that covers the holes. Reference question: polygon union without holes
4
votes
1 answer

How to create a cascading drop-down (country and state list) in Angular 6

How can I create a cascading drop-down (country and state list) in Angular 6? I want a full list of countries and their states. If anyone knows how, please share your idea.
user10153722
4
votes
2 answers

How to visualize steps of a Scalding job

My Scalding job is translated into 9 MapReduce jobs (m/r jobs). It's not easy for me to understand which part of the code each m/r job represents. Is there anything that could help me understand my job better? //this has been copy&pasted from our…
Oleksii
  • 1,101
  • 7
  • 12
4
votes
2 answers

Is it possible to temporarily disable cascading for a Hibernate entity?

Given a Hibernate/JPA entity with cascading set to ALL for a related entity: @Entity public class Entity { @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "entity") private Set relatedEntities; } Is…
Steve Chambers
  • 37,270
  • 24
  • 156
  • 208
4
votes
1 answer

How to perform an operation one time only at the end of a Scalding job?

I read in the Scalding groupAll docs: /** * Group all tuples down to one reducer. * (due to cascading limitation). * This is probably only useful just before setting a tail such as Database * tail, so that only one reducer talks to…
Jas
  • 14,493
  • 27
  • 97
  • 148
4
votes
2 answers

Create Scalding Source like TextLine that combines multiple files into single mappers

We have many small files that need combining. In Scalding you can use TextLine to read files as text lines. The problem is we get 1 mapper per file, but we want to combine multiple files so that they are processed by 1 mapper. I understand we need…
samthebest
  • 30,803
  • 25
  • 102
  • 142
4
votes
1 answer

Programmatically determine Field names of Scalding/Cascading Pipe

I'm using Scalding to process records with many (> 22) fields. At the end of the process, I'd like to write out the final Pipe's field names to a file. I know this is possible as Mapper and Reducer logs show this information. I'd like to get this…
Ben Sidhom
  • 1,548
  • 16
  • 25
4
votes
1 answer

Hadoop: File Copy with Cascading 2.5.1 and Hadoop 2.2.0

I have recently set up a pseudo-distributed Hadoop 2.2.0 cluster on my Mac OS X machine following this guide. Then I tried the basic Cascading file copy with Cascading 2.5.1. However, when I compiled the project using Maven, I got the following…
David Williams
  • 8,388
  • 23
  • 83
  • 171
4
votes
1 answer

How do I pass arguments to an Oozie action using oozie.launcher.action.main.class?

Oozie has a config property called oozie.launcher.action.main.class where you can pass in the name of a "main class" for a map-reduce action (or a shell action), like so:
quux00
  • 13,679
  • 10
  • 57
  • 69