Questions tagged [revolution-r]

Revolution R is production-grade analytics software built upon the powerful open source R statistics language.

84 questions
1
vote
1 answer

Revolution/rxMerge and duplication of rows

I am trying to merge two xdf files after subsetting a long table that has duplicate ids based on a variable. Assume I have two columns: id and type. I subset the original xdf table based on, say, type = 'type1', and get the first xdf file. I subset the…
1
vote
2 answers

RevoScaleR for SQL Server to fetch data from views

Hi, I am using the RevoScaleR package from Revolution Analytics, and I find it quite odd that the functions available for SQL Server objects are very limited. For example, RxSqlServerData does not support querying from a view. I have a view…
Bg1850
  • 3,032
  • 2
  • 16
  • 30
1
vote
1 answer

R: lapply equivalent for RevoScaleR/Revolution Enterprise?

I have Revolution Enterprise and want to run 2 simple but computationally intensive operations on each of 121k files in a directory, outputting to new files. I was hoping to use some RevoScaleR function that chunked/parallel processed the data similarly…
1
vote
1 answer

Create summary stats using Revolution R/ScaleR

I'm new to ScaleR/RevoR. I have an .xdf data set that has 400+ 'parts', and each part has 70000 numerical values, so the data set is quite large (>40 million rows). I'd like to use RevoR to give me the median and mode for each 'part'. I can get the …
PaulBeales
  • 475
  • 1
  • 8
  • 21
1
vote
2 answers

Error in rxImport: Expected 8

I am trying to read a file with 35,000,000 rows and 105 columns in R and decided to use Revolution R Enterprise 7.4, with this code: input <- RxTextData(data, isFixedFormat = F,delimiter = "\t") s <- rxImport(inData = input,outFile =…
narteaga
  • 147
  • 2
  • 12
1
vote
1 answer

R- set working directory to hdfs

I need to create some data frames from very large data sets in R. Is there a way to change my working directory so that R objects that I create are saved into hdfs? I don't have enough space under /home to save these large data frames, but I need…
Laura
  • 320
  • 1
  • 4
  • 12
1
vote
2 answers

rxDataStep using lagged values

In SAS it's possible to go through a dataset and use lagged values. The way I would do it is to use a function that does a "lag", but this presumably would produce a wrong value at the beginning of a chunk. For example, if a chunk starts at row…
grad student
  • 107
  • 7
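The chunk-boundary problem described in this question can be illustrated in plain base R. This is only a sketch: the chunk sizes, the `chunks` list, and the `carry` variable are hypothetical stand-ins, and nothing here uses the RevoScaleR API. It shows that lagging each chunk on its own loses the value at every chunk start, while carrying the last value of the previous chunk forward restores it.

```r
# Hypothetical data split into three chunks of sizes 4, 3 and 3.
x <- 1:10
chunks <- split(x, rep(1:3, times = c(4, 3, 3)))

# Naive approach: lag each chunk independently.
# The first element of every chunk becomes NA, not only the very first row.
naive <- unlist(
  lapply(chunks, function(ch) c(NA, head(ch, -1))),
  use.names = FALSE
)

# Carry-over approach: remember the last value of the previous chunk
# and use it as the lagged value for the first row of the next chunk.
carry <- NA
carried <- integer(0)
for (ch in chunks) {
  carried <- c(carried, c(carry, head(ch, -1)))
  carry <- tail(ch, 1)
}

print(naive)
print(carried)
```

Whether a chunked transform can keep such state between chunks depends on the tool being used, so treat the loop above as a description of the idea, not as a recipe for rxDataStep.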
1
vote
1 answer

Changing the column positions of an xdf dataset

I would like to know if there is a way to reorder the column positions of an xdf dataset. For example, if I have an xdf dataset with columns [,a],[,c],[,b], I would like to reorder the columns to [,a],[,b],[,c] without having to create a dataframe,…
Ruser
  • 245
  • 3
  • 11
1
vote
1 answer

Set delimiter for rxImport

I am currently trying to use rxImport: library(RevoScaleR) dataDir<-"D:/NYSE/" mycsv <- file.path(dataDir, "TAQ_TNQ_OPR_DER_ALL_01M_20130403_01.txt") output<-file.path(dataDir, "data.xdf") rxImport(inData = mycsv, outFile = output, overwrite =…
firstever
  • 49
  • 1
  • 6
1
vote
2 answers

rxDataStep transforms argument fails on user-defined functions

For example: require(RevoScaleR) # Create a data frame set.seed(100) myData = data.frame(x = 1:100, y = rep(c("a", "b", "c", "d"), 25), z = rnorm(100), w = runif(100)) # Create a multi-block .xdf file from the data…
zkurtz
  • 3,230
  • 7
  • 28
  • 64
1
vote
0 answers

How to group an XDF file and concatenate the strings in the columns?

I am new to RevoScaleR and would like to know how I can group an XDF file and concatenate the values of a column within each group. For example, if I have data like

A B
1 one
1 two
2 three
2 four

I would like the following output:

A B
1 one,two
2 three,four
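For reference, the grouping described in this question can be sketched in base R on an in-memory data frame. This is an assumption-laden illustration: the question is about an XDF file that may not fit in memory, and the code below does not use RevoScaleR at all. It only shows the intended group-and-concatenate result on the toy data.

```r
# Toy data from the question, held in memory as a data frame
# (assumption: the real data lives in an XDF file).
df <- data.frame(
  A = c(1, 1, 2, 2),
  B = c("one", "two", "three", "four"),
  stringsAsFactors = FALSE
)

# Group by A and join the B values of each group with commas.
# Extra arguments to aggregate() (here collapse) are passed on to paste().
result <- aggregate(B ~ A, data = df, FUN = paste, collapse = ",")

print(result)
```

For data that fits in memory after reading the needed columns, the same call could be applied to the resulting data frame; for larger data a chunk-aware approach would be needed.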
1
vote
2 answers

integrating hadoop, revo-scaleR and hive

I have a requirement to fetch data from Hive tables into a csv file and use it in RevoScaleR. Currently we pull the data from Hive, manually put it into a file, and use it in the Unix file system for ad hoc analysis; however, the requirement is to…
user3914559
  • 25
  • 1
  • 7
1
vote
1 answer

environment not behaving as expected after using transformEnvir in RevoScaleR function

I have a function where I'm reading an xdf file using rxXdfToDataFrame and using a variable in my expression for rowSelection. If I don't pass transformEnvir=environment(), the variable is not found. My problem is that after calling the function…
user3747260
  • 465
  • 1
  • 5
  • 14
1
vote
0 answers

Using rxPredict with rxLogit in Revolution R

I used the rxLogit function in Revolution R (package RevoScaleR) to fit a logistic regression model on data that has many categorical variables (for example STATE: IL, FL, OH, CA, TX, ...) and a couple of numeric variables. When I am trying to score a data…
Sid
  • 251
  • 2
  • 4
  • 17
1
vote
3 answers

How to cut a variable into 20 equal segments (for example) for several columns in a dataset in R

I know how to do it for a single variable: we can use equal.count() or a combination of quantile() and cut(). Does anyone know an aggregate function to do this for 100 columns at the same time? I know I can write a loop, but it is slow. Is there a…
zhifff
  • 199
  • 1
  • 4
  • 15
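The quantile() and cut() combination mentioned in this question can be applied to many columns at once with lapply(). The sketch below is a minimal base R illustration under stated assumptions: the data frame, the helper name `cut_equal`, and the use of 5 columns in place of 100 are all hypothetical. It assumes continuous data; heavily tied values can produce duplicate quantile breaks, which cut() rejects.

```r
set.seed(1)

# Hypothetical data: 5 numeric columns stand in for the 100 in the question.
df <- as.data.frame(matrix(rnorm(1000 * 5), ncol = 5))

n_bins <- 20

# Cut one numeric vector into n groups at its own quantiles,
# returning integer bin codes 1..n instead of interval labels.
cut_equal <- function(x, n = n_bins) {
  breaks <- quantile(x, probs = seq(0, 1, length.out = n + 1), na.rm = TRUE)
  cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

# Apply the same binning to every column and rebuild a data frame.
binned <- as.data.frame(lapply(df, cut_equal))

str(binned)
```

lapply() is still a loop over columns internally, so this is mainly a tidier way to write it and is not guaranteed to be faster than an explicit for loop.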