Questions tagged [rhadoop]

RHadoop is a combination of R and Hadoop for managing and analyzing data with Hadoop.

RHadoop is a collection of three R packages that allow users to manage and analyze data with Hadoop. The packages have been implemented and tested on Cloudera's distribution of Hadoop (CDH3 and CDH4) and R 2.15.0. The packages have also been tested with Revolution R 4.3, 5.0, and 6.0. For rmr, see Compatibility.

Source: Github: Revolution Analytics (RHadoop)

112 questions
1 vote, 1 answer

Install / Configure RevolutionAnalytics / RHadoop on Windows 7 & Hortonworks Sandbox

I have installed VMware Player and the Hortonworks Sandbox for Hadoop. Now I need help configuring and running RHadoop on it. I need to work with R and Hadoop. Please help. Thanks in advance.
0 votes, 1 answer

Redirect sh file output to a file using Runtime.getRuntime().exec

I have tried similar suggestions from Stack Overflow, but the issue persists. I am executing the following command from Java: public static void main(String[] args) throws Exception { try { String line; //String[]…
user7220859
0 votes, 1 answer

org.apache.hadoop.security.AccessControlException: /user/rstudio (is not a directory)

I get this error when I try this command in R: > f = hdfs.file('./foo.data', 'r'). When I run # hdfs dfs -ls /user/ I get: Found 1 items -rw-r--r-- 3 rstudio supergroup 3974305 2019-11-09 19:06 /user/rstudio. And when I try to create the…
Henrique Andrade
0 votes, 1 answer

Using an R function in the reduce phase

I'm trying to find the correlation coefficient of a data frame, and it works perfectly. Is there a problem with finding the correlation coefficient of a data frame using cor(), or is it fine to use this code for large data? cc = function(input, output = NULL){ …
0 votes, 0 answers

Rhadoop - wordcount output is coming but not in a readable format

I followed the wordcount program from the question "Rhadoop - wordcount using rmr". I am getting output, but it is not in a readable format. I want key-value pairs in my output. How do I get that? What modifications should I make to the…
0 votes, 0 answers

R Hadoop counting

I'm new to R, and I have a problem with MapReduce in rmr2. I have a file to read of the following kind, where each row contains a date and some words (A, B, C, ...): 2016-05-10, A, B, C, A, R, E, F, E 2016-05-18, A, B, F, E, E 2016-06-01, A, B, K, T, T, E, G, E, A,…
GIULIO
0 votes, 0 answers

Rhadoop mapreduce for multiple input files

I'm building a mapreduce program in R that extracts the relevant features from a set of features in a dataset using a genetic algorithm. I need to pass many files as input to my mapreduce job. The code below is my mapreduce program, but it works…
Rania
0 votes, 1 answer

How to install Rhadoop on R 3.3.2?

I tried the following mentioned by Jinith: How to install RHadoop packages (Rmr, Rhdfs, Rhbase)? But I got this exception: "Installing package into ‘/home/user/R/x86_64-pc-linux-gnu-library/3.2’ (as ‘lib’ is unspecified) Warning: invalid package…
mate
0 votes, 1 answer

How can I run normal R functions for Hadoop remotely on SQL Server?

How can I run normal R code on a SQL Server without using the Microsoft rx functions? I think the compute context "RxInSqlServer" isn't the right one, but I couldn't find good information about the other compute context options. Is this possible…
user43348044
0 votes, 2 answers

Error running "hdfs.put()" in RHadoop

I am using RHadoop for my project on sentiment analysis. When I try to run hdfs.put() I receive the following error: Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, : …
Anna
0 votes, 0 answers

Having issues with RHADOOP?

I have checked the question "Rhadoop - wordcount using rmr" and tried the answer on my side, but it is causing a lot of issues. Here is the code: …
Jaffer Wilson
0 votes, 0 answers

JSON as input in a mapreduce

I have a JSON file that contains fields such as machine_id, category, and ... Category contains the states of machines, such as "alarm" and "failure". I would simply like to see how many times each machine_id has been reported, using rmr2. For example, if I have the…
Hossein
0 votes, 0 answers

RHadoop Map reduce job failed with the below error

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320) at…
0 votes, 1 answer

RHadoop - RStudio - Install arulesViz library

I'm trying to install the arulesViz library using RStudio on a Cloudera machine. To do that, I'm executing: install.packages("arulesViz", type = "source") But I'm getting the following error: ERROR: configuration failed for package ‘curl’ * removing…
Pedro Alves
0 votes, 0 answers

R Hadoop Memory issue

I am trying to run a distributed implementation of k-means clustering on Hadoop with rmr2 (on a single-machine cluster with Hadoop 2.6.0-cdh5.4.2 in pseudo-distributed mode). As long as the data file size (on HDFS) is small (around 1000 data points)…
Sandipan Dey