Questions tagged [rhadoop]

RHadoop is a combination of R and Hadoop for managing and analyzing data with Hadoop.

RHadoop is a collection of three R packages that allow users to manage and analyze data with Hadoop. The packages have been implemented and tested on Cloudera's distribution of Hadoop (CDH3 and CDH4) and R 2.15.0. The packages have also been tested with Revolution R 4.3, 5.0, and 6.0. For rmr, see Compatibility.

Source: GitHub: Revolution Analytics (RHadoop)

112 questions
1
vote
2 answers

Write R data frame to Hadoop Hive

I want to write a data frame in R to a new table in Hadoop Hive. I'm using sqlSave() from the RODBC package as shown below. The table structure is created in Hadoop, but I get an error before any data is inserted into the table. The error message is…
elle_248
1
vote
0 answers

SI model in Rhadoop

I want to measure the diffusion of information on my graph using the SI model. I define a set of initially infected nodes. I based my own version on this code: Susceptible-Infected model for network diffusion. But when I run my code in…
Sasa88
1
vote
0 answers

Hadoop streaming command fails to work in R

I have installed Hadoop 2.7.2 on Ubuntu 16.04, and I have also installed RStudio and RHadoop (rmr2, rhdfs, rhbase) on a single-node cluster. The RHadoop packages are installed in this directory: "/home/hduser/R/x86_64-pc-linux-gnu-library/3.2/". However,…
Amir
1
vote
0 answers

Getting Data in and out of Rhipe [R + Hadoop]

I was trying out the rhipe and RHadoop [rmr, rhdfs, rhbase, etc.] series of packages. In both packages [rhipe and rmr] I can ingest and read data stored in CSV or text files. Both of them more or less support the creation of new file formats, but I…
Indranil Gayen
1
vote
1 answer

Error in as(x, class(k)) : no method or default for coercing “NULL” to “data.frame”

I am currently facing the error mentioned below, which is related to NULL values being coerced to a data frame. The data set does contain nulls; however, I have tried both the is.na() and is.null() functions to replace the null values with something else.…
1
vote
0 answers

R external libraries with RHadoop rmr2

I have this scenario: a Hadoop client node (R and rmr2 installed); a Hadoop cluster (R and rmr2 installed on all nodes); no administrator privileges on the cluster for installing external libraries. This question is similar to Temporarily installing R…
user2558672
1
vote
1 answer

java.lang.UnsupportedClassVersionError Unsupported major.minor version 51.0 rhdfs

I know this has to do with the difference between the Java versions at compile time and runtime; however, I think I have set all the environment variables properly, so I don't really know what is still causing this issue. $ java -version java…
angerhang
1
vote
0 answers

How can we save an avro file with a json schema using RHadoop (rmr2)?

The sample implementation of the Avro output format using make.output.format uses "bytes" as the schema. Instead, I want to specify a JSON schema for the Avro file. I could not find how to do this. I guess there would be some backend.parameters which will…
Kumar Deepak
1
vote
2 answers

How do I Install RHadoop package rhdfs from Github using Devtools

How do I install RHadoop from GitHub using devtools? I basically want to install rhdfs from https://github.com/RevolutionAnalytics/rhdfs, but this does not work. I tried the following: >…
Ajay Ohri
1
vote
0 answers

"hadoop streaming failed with error code 5"

I have created a multi-node Hadoop cluster using my two laptops and have successfully tested it. After that, I installed RHadoop on top of the Hadoop environment. All the necessary packages are installed and the path variables are set. Then, trying to…
DatamineR
1
vote
1 answer

Connecting RStudio with Remote R machine

I have RStudio installed on my Windows machine and R installed on one of the nodes of a Hadoop cluster. I want to connect RStudio to that slave machine and run my R script there. I have all the R packages required for Hadoop integration installed…
Shashi
1
vote
2 answers

R Mapreduce library 'rmr2' shows a warning message when loaded

Why is the R MapReduce library 'rmr2' generating a warning message? I have installed the 'rmr2' library to execute MapReduce programs in R. But when library(rmr2) is called in R, it generates the following warning message: Please review your hadoop…
User456898
1
vote
2 answers

R is not connecting to HDFS

Why is R not connecting to Hadoop? I am using R to connect to HDFS using the 'rhdfs' package. The 'rJava' package is installed and the rhdfs package is loaded. The HADOOP_CMD environment variable is set in R…
User456898
1
vote
1 answer

Getting error while running map reduce jobs in R

I just started integrating RHadoop. I have integrated RStudio Server with Hadoop, but I am getting an error while running MapReduce jobs when I run the following lines of code: library(rmr2) a <- to.dfs(seq(from=1, to=500, by=3),…
Akshaykumar Maldhure
1
vote
1 answer

Sorting Data using RHadoop

I'm pretty new to Hadoop and RHadoop. I was trying to sort data in a MapReduce structure using RHadoop, but I can't sort the data. The code is given below. Can anybody please help me find out where I'm making the mistake? The reason for trying this…
Beta