Questions tagged [multidplyr]

multidplyr is an R package by Hadley Wickham that enables parallel processing on partitioned data.frames. This tag should not be used for dplyr-only questions.

It complements his popular dplyr package and is part of the extended tidyverse ecosystem of packages.

51 questions

0 answers

Can't convert an environment to function error when using multidplyr

This is an example of a multidplyr call in my code, which I run on my institute's cluster: #create data set.seed(1) library(dplyr) df <- do.call(rbind,lapply(1:100,function(i){ id.df <-…

asked Oct 19 '18 at 22:15

dan

1 answer

Run breakpoint (lm) detection in parallel in R

I am running about 80,000 time series breakpoint detection calculations in R. The time series differ so much that I cannot apply ARIMA models, so I fit a linear model per time series, then extract the breakpoints and use…

r dplyr breakpoints doparallel multidplyr

asked Oct 07 '18 at 18:36

Jonathan

1 answer

multidplyr error with pmap_dfr: Error: Element 5 is not a vector (environment)

[This is also reported on the multidplyr GitHub page] I'm trying to use multidplyr_0.0.0.9000 with dplyr_0.7.4.9000 and pmap_dfr from purrr_0.2.4.9000. The following code (without using multidplyr) works fine: grid1 = as_tibble(expand.grid(m1 =…

r dplyr purrr multidplyr

asked Nov 02 '17 at 00:21

kartik_subbarao

1 answer

Error with rep using multidplyr: cannot find function "n"

I'm trying to expand a dataframe based on the value of a column, using parallel cores with multidplyr (using dplyr). Since uncount() does not work with multidplyr, I am using the default rep function, but I get an error. Below is an MWE, where I…

dplyr multidplyr

asked Aug 17 '23 at 17:17

luchonacho

1 answer

How to merge two data frames by row, with matching columns placed side by side (df1$x next to df2$y)?

I have two dataframes with the same column and row names. I would like to merge them by row, but the columns need to sit side by side, as in df$x and df$y. What I have tried so far does not give the required output. merge(df.test1, df.test2, by.x = "V1", by.y =…

r dplyr multidplyr

asked Feb 23 '21 at 21:48

RKK

1 answer

Merge multiple tables of different lengths into a single table in R

I am using plumber to build an API. I have multiple sub-tables that are all connected by their primary key (study_id), and I want to merge them on that single primary key to form one table. Some tables have different lengths. for…

r api plumber rjson multidplyr

asked Dec 27 '20 at 19:28

Aman Vishwakarma

1 answer

R multidplyr: workaround for summarise_at?

I want to use multidplyr, but it does not yet support summarise_at. I have hundreds if not thousands, so summarise_at is necessary, but unfortunately it is not available in multidplyr. I am looking for an alternative to work around…

r dplyr multidplyr

asked Jul 25 '20 at 13:12

Choc_waffles

0 answers

How do you deal with errors in partition?

I am attempting to partition my dataset such that all members of a group are sent to the same core. I am following online tutorials verbatim, but there seems to be an issue. The error is: Error in partition(., group, cluster = clust) : unused…

r parallel-processing dplyr multidplyr

asked Dec 12 '19 at 20:53

Dominic Naimool

1 answer

Multiply columns in different dataframes

I am writing code to analyze a set of data with dplyr. Here is how my table_1 looks: 1 A B C 2 5 2 3 3 9 4 1 4 6 3 8 5 3 7 3 And my table_2 looks like this: 1 D E F 2 2 9 3 Based on table 1 column "A", if A>6, I would like to create a…

r dplyr multidplyr

asked Jul 25 '19 at 03:10

Bomber Gay

0 answers

checkpoint cannot find multidplyr in R Markdown

I'm trying to create an R Markdown document in which I will be running multidplyr. To ensure reproducibility I decided to use the checkpoint library. MWE: --- title: "A great title" author: "A great author" date: "February 19, 2019" output:…

r r-markdown tidyverse checkpoint multidplyr

asked Feb 19 '19 at 17:11

Baraliuh

2 answers

Vectorizing with multidplyr does not render the correct output

I tried to parallelize ape::dist_topo(), a function to compute distances between unrooted trees. Normally the function works like this (reprex: 4 random trees with 5 leaves each): library(tidyverse) #…

r dplyr tidyverse multidplyr

asked Jun 08 '18 at 15:00

abichat

2 answers

Grouping dataframe in 12 groups with same column values

I have a large dataset with about 15 columns and more than 3 million rows. Because the dataset is so big, I would like to use multidplyr on it. Because of the nature of the data, I cannot simply split my data frame into 12 parts. Let's say that there…

r multithreading dataframe multidplyr

asked Sep 18 '17 at 13:25

Ravonrip

0 answers

Groupwise identification of peaks using the findpeaks function from the pracma package on a moving average: getting error "missing value where TRUE/FALSE needed"

Reproducible data as shown below: library(pracma);library(zoo) library(dplyr);library(tidyverse) Tag<- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5,…

r dplyr multidplyr

asked Aug 11 '17 at 07:28

Harvey

1 answer

Restructuring and formatting data frame columns

dfin <-
ID  SEQ  GRP  C1  C2  C3  T1  T2  T3
1   1    1    0   5   8   0   1   2
1   2    1    5   10  15  5   6   7
2   1    2    20  25  30  0   1   2
C1 is the concentration (CONC) at T1 (TIME) and so on. This…

r data.table dplyr multidplyr

asked Aug 03 '17 at 00:39

daragh

1 answer

multidplyr: trial custom function

I'm trying to learn to run a custom function through multidplyr::do() on a cluster. Consider this simple self contained example. For example's sake, I'm trying to apply my custom function myWxTest to each common_dest (destinations with more than 50…

r parallel-processing dplyr multidplyr

asked Apr 24 '17 at 22:26

user189035
