Highest Voted 'scalding' Questions

2

votes

2 answers

java.lang.NullPointerException when reading s3 with Hadoop (Scalding)

I get a strange NPE when trying to read from S3 with Scalding / Hadoop. The paths are 100% correct. I'm asking this question because it's surprisingly hard to Google, and every time I get this error I forget how I solved it. So posting on SO so I can Google…

asked Jun 26 '14 at 15:16

samthebest

30,803
25
102
142

2

votes

2 answers

Compress Output Scalding / Cascading TsvCompressed

People, including myself, have been having problems compressing the output of Scalding jobs. After googling I get the odd whiff of an answer in some obscure forum somewhere, but nothing suitable for people's copy-and-paste needs. I would like an…

scala hadoop compression cascading scalding

asked May 29 '14 at 17:42

samthebest

30,803
25
102
142

2

votes

3 answers

scalding how to map on all fields with '* keyword?

I want to apply an operation to all fields of my Pipe. I saw on https://github.com/twitter/scalding/wiki/Fields-based-API-Reference that "You can use '* (here and elsewhere) to mean all fields." but somehow I have not managed to make it work. Would…

scala scalding

asked Mar 12 '14 at 17:56

Mr Renard

43
4

2

votes

3 answers

Scalding Sample WordCount local mode

I am trying to run the Scalding sample word count example. I have followed the steps at this GitHub link: https://github.com/twitter/scalding/wiki/Getting-Started But I am getting a ClassNotFoundException. Below is my stack trace: [cloudera@localhost…

scala twitter hadoop noclassdeffounderror scalding

asked Aug 21 '13 at 23:54

neham

341
5
18

2

votes

1 answer

Scalding MongoDB connector

I am using Scalding for ETL implementation and I am looking for a simple way to forward Scalding output to MongoDB instead of HDFS. Any suggestions appreciated. Thanks.

mongodb scalding

asked Aug 13 '13 at 13:38

David Greenshtein

508
4
17

2

votes

1 answer

scalding compare consecutive records

Does anyone know how to compare consecutive records in Scalding when creating a schema? I am looking at tutorial 6 and suppose that I want to print the age of the person if data in record #2 is greater than record #1 (for all records) for…

scala enums scalding

asked Jun 16 '13 at 05:01

CruncherBigData

1,112
3
14
34

2

votes

2 answers

How does scalding pass user functions to remote MapReduce nodes

When working with Scalding, you have the ability to provide a function. I was wondering how Scalding passes these functions to the remote map/reduce tasks? Is this using something in Scala or something generic that can be done with anonymous…

java scala scalding

asked Mar 07 '13 at 17:19

ekaqu

2,038
3
24
38

2

votes

2 answers

Calculate sums of even/odd pairs on Hadoop?

I want to create a parallel scanLeft (computes prefix sums for an associative operator) function for Hadoop (Scalding in particular; see below for how this is done). Given a sequence of numbers in an HDFS file (one per line) I want to calculate a new…

scala hadoop functional-programming cascading scalding

asked Jan 04 '13 at 20:46

John Salvatier

3,077
4
26
31

2

votes

5 answers

How to implement OR join in hadoop(scalding/cascading)

It is easy to join datasets by a single key simply by sending the join field as a reducer key. But joining records by several keys, where at least one should be the same, is not that easy for me. Example: I have logs and I want to group them by user…

scala join hadoop cascading scalding

asked Sep 24 '12 at 22:13

yura

14,489
21
77
126

2

votes

1 answer

Reading from HBase with scalding

I'm very new to Cascading/Scalding, and cannot figure out how to read data from HBase. I have a table in HBase, where the hand history of poker games is stored (in a very straightforward manner: id -> hand, serialized with ProtoBuf). The job below…

cascading scalding

asked Aug 08 '12 at 05:16

Vasil Remeniuk

20,519
6
71
81

1

vote

1 answer

How to mock a TextLine for Scalding using the type safe API?

I am trying to mock a TextLine for a Scalding job, but the offset appears to be getting mixed in with the line, whether I express the offset explicitly or implicitly. Here is my job: package changed import com.twitter.scalding._ import…

mocking cascading scalding

asked Jul 25 '18 at 20:36

Ellen Spertus

6,576
9
50
101

1

vote

0 answers

Scalding Execution Monad - What is it & how to use it

I am working on Big Data technologies using MR based on Java. But recently my company has moved to the Scalding framework. I am not able to get my head around the Scalding Execution Monad: what it is and how it works. Cannot find much material on it on…

scala bigdata scalding

asked Feb 09 '18 at 10:20

learner4life

17
3

1

vote

0 answers

scald.rb results in error (could not find or load main class)

I am trying to run the tutorial files from https://github.com/twitter/scalding/tree/develop/tutorial. I cloned the 0.17.x branch and current develop branch and haven't had much success with either. I have also already run "sbt update" and "sbt…

ruby scala sbt scalding

asked Oct 26 '17 at 18:42

DBD

11
4

1

vote

1 answer

How do I use HyperLogLogMonoid from Algebird to carry out arbitrary intersections and unions

I'd like to aggregate a bunch of values that belong to a particular category into an HLL data structure so I can carry out intersections and unions later and count the resulting cardinality of such computations. I was able to get to the point where I…

scala scalding

asked Nov 07 '16 at 03:41

harshsinghal

3,720
8
35
32

1

vote

2 answers

How to override setup and cleanup methods in spark map function

Suppose there is the following MapReduce job. Mapper: setup() initializes some state; map() adds data to the state, no output; cleanup() outputs the state to the context. Reducer: aggregates all states into one output. How could such a job be implemented in Spark?…

scala apache-spark scalding

asked Oct 09 '16 at 19:29

Julias

5,752
17
59
84

Questions tagged [scalding]