Questions tagged [rhadoop]

RHadoop combines R and Hadoop to manage and analyze data with Hadoop.

RHadoop is a collection of three R packages that allow users to manage and analyze data with Hadoop. The packages have been implemented and tested on Cloudera's distribution of Hadoop (CDH3 and CDH4) and R 2.15.0. The packages have also been tested with Revolution R 4.3, 5.0, and 6.0. For rmr see Compatibility.

Source: GitHub: Revolution Analytics (RHadoop)

112 questions
0 votes, 0 answers

can't install rmr2 for Rhadoop

I'm having trouble installing rmr2. I'm following these instructions: https://github.com/RevolutionAnalytics/RHadoop/wiki/user%3Ermr%3EHome where rmr2 installation is step 4. I've already installed packages: install.packages(c("rJava", "Rcpp",…
0 votes, 2 answers

String character in RHDFS output

The hdfs.write() command in rhdfs creates a file with a leading non-unicode character. The documentation doesn't describe the file type being written. Steps to recreate. 1. Open R and initialize rhdfs > ofile = hdfs.file("brian.txt", "w") >…
Brian Dolan
0 votes, 1 answer

Error when running wordcount R example code on Hadoop

R wordcount example code: library(rmr2) map <- function(k,lines) { words.list <- strsplit(lines, '\\s') words <- unlist(words.list) return( keyval(words, 1) ) } reduce <- function(word, counts) { keyval(word,…
Jacky
0 votes, 1 answer

rhdfs library doesn't work

I'm trying to use Hadoop with R in a Cloudera VM. I load the rhdfs library into R and that goes fine, but when I try to execute hdfs.init(), it doesn't work and gives me the following error: > hdfs.init() 14/12/10 05:48:20 ERROR…
ntrax
0 votes, 1 answer

Hortonworks Data Platform 2.1 (sandbox) unable to complete a very simple RHadoop job

I have installed the rhdfs and rmr2 packages on top of Hortonworks Data Platform 2.1 (sandbox) on a 64-bit single-node VM with 8 GB RAM allocated. When I tried to run the following very simple RHadoop job, it would take forever but never be able to…
john smith
0 votes, 1 answer

HDFS temp directory in rmr.options

I'm new to Hadoop, so excuse me if the question is stupid. I have a local single-node cluster. I'm trying to execute a simple MapReduce job in RHadoop and I get this message: > wordcount('/data/complete_works_of_shakespeare.txt') Error creating temp…
0 votes, 1 answer

How to input HDFS file into R mapreduce for processing and get the result into HDFS file

I have a question similar to the one at the link below on Stack Overflow: R+Hadoop: How to read CSV file from HDFS and execute mapreduce? I am trying to read a file from location "/somnath/logreg_data/ds1.10.csv" in HDFS, reduce its number of columns from 10 to 5…
somnathchakrabarti
0 votes, 2 answers

RHadoop reduce job failed

I am following the RHadoop tutorial, https://github.com/RevolutionAnalytics/rmr2/blob/master/docs/tutorial.md and running the second example, but I am getting errors which I can't resolve. The code is as follows: groups =…
0 votes, 1 answer

Rhadoop with Elasticsearch-hadoop

I am using Hadoop with an Elasticsearch database (no HDFS). Do you know if RHadoop and elasticsearch-hadoop can work together? If not, do you know how I could run analytics for my project?
Charletg
0 votes, 0 answers

Why does my RHadoop Installation cause errors in dyn.load?

I need to install RHadoop on my Ubuntu operating system. When I install the rmr2 package I get this error: library(rmr2) Loading required package: Rcpp Loading required package: RJSONIO Loading required package: digest Loading required…
user3550366
0 votes, 1 answer

Virtual machines containing RHadoop and the hadoop-streaming.jar

Getting a local test instance of Hadoop looks like a bit of a bear to configure, after consulting the following very clear, but still very complicated…
Mittenchops
0 votes, 1 answer

RHadoop - java.lang.RuntimeException: Error in configuring object

Thanks for considering this question. I am new to RHadoop. I have installed a Hadoop 2.3.0 single-node cluster on a Windows 7 64-bit machine. I could successfully run map-reduce examples such as pi and wordcount. Subsequently I successfully…
0 votes, 0 answers

RHadoop Stream Job Fail with Apache Oozie

I'm really just looking to pick the community's brain for some leads in figuring out what is going on with the issue I'm having. I'm writing a MR job with RHadoop (rmr2, v3.0.0) and things are great -- IO with HDFS, mapping, reducing. No problems. …
0 votes, 2 answers

Is "Converting to.dfs argument to keyval with a NULL key" usually a fatal warning for failed map tasks with RHadoop?

I have written several RHadoop programs that work even though they return warnings such as: Converting to.dfs argument to keyval with a NULL key when inputting data with to.dfs. However, some programs fail fatally with no warnings other…
dataquerent
0 votes, 0 answers

install R mapreduce rmr2 on centos

I am having a problem installing rmr2 on CentOS: the yum command installs R version 2.15 by default, and rmr2 depends on the Rcpp library, which is only compatible with R 3.0 or later. This is my OS version: $ uname -or 2.6.18-348.16.1.el5 GNU/Linux $…
bigData