
I have a large dataset (around 5 million observations). Each observation records the revenue from a specific event, broken down by sub-event type (the "Type" column). A small replication of the data is below:

Event_ID = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3)
Type = c("A","B","C","D","E","A","B","C","D","E","A","B","C","D")
Revenue1 = c(24,9,51,7,22,15,86,66,0,57,44,93,34,37)
Revenue2 = c(16,93,96,44,67,73,12,65,81,22,39,94,41,30)
z = data.frame(Event_ID, Type, Revenue1, Revenue2)

I would like to use GPU cores to run a function that I wrote. I have never attempted GPU processing, so I am at a complete loss as to how to begin. The actual function takes a really long time to run; I am showing a very simple version of it below:

Total_Revenue = function(data) {
  full_list = list()
  event_list = unique(data[, 'Event_ID'])
  for (event in event_list) {
    new_data = list()
    # subset the data down to a single event
    event_data = data[which(data$Event_ID == event), ]
    for (i in 1:nrow(event_data)) {
      event_data[i, 'Total_Rev'] = event_data[i, 'Revenue1'] + event_data[i, 'Revenue2']
      new_data = rbind(new_data, event_data[i, ])
    }
    full_list = rbind(full_list, new_data)
  }
  return(full_list)
}

Total = Total_Revenue(data=z)
print(Total)

This simplified version of the function proceeds as follows:

a) Break up the dataset into subsets such that each subset contains only one unique event.

b) Within each subset, loop over the observations and compute Revenue1 + Revenue2.

c) Store the subsets and, at the end, return the combined dataset.

Since I have no prior experience, I was looking at some of the R packages. I found the gpuR package and installed it, but I am having difficulty understanding how to use it. Another issue is that my coding background is weak; I am self-taught and have only been programming for about a year.

Any help/leads will be highly appreciated. I am open to using any alternate packages as well. Please let me know if I missed anything.

P.S. I also took a snapshot of my system using the following command:

str(gpuInfo())

I am attaching the output for your reference:

(screenshot of the str(gpuInfo()) output omitted)

P.P.S. Please note that my actual function is considerably longer and more complicated, and it takes a long time to run, which is why I want to use GPU processing here.

Prometheus

1 Answer


GPU programming is no silver bullet. It works well only for certain problems. That's why the gpuR package provides GPU-backed vectors and matrices, allowing linear algebra operations to be carried out on the GPU. This won't help you if your problem is not a linear algebra problem. However, note that many problems can be formulated in such a way.
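To give you an idea of what gpuR is designed for, here is a minimal sketch (assuming a working OpenCL driver and a supported device; matrix sizes are arbitrary) that moves a matrix multiplication onto the GPU:

```r
library(gpuR)

set.seed(42)
A <- matrix(rnorm(1000^2), nrow = 1000)
B <- matrix(rnorm(1000^2), nrow = 1000)

# copy the data into GPU-backed matrices ("float" is usually much faster
# than "double" on consumer GPUs)
gpuA <- gpuMatrix(A, type = "float")
gpuB <- gpuMatrix(B, type = "float")

# the multiplication itself runs on the GPU
gpuC <- gpuA %*% gpuB

# copy the result back into an ordinary R matrix
C <- as.matrix(gpuC)
```

Note that the host-to-device copies are not free, so this only pays off for large matrices; for small data the transfer overhead dominates.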

We cannot tell if your problem falls into this category, since you have (probably) over-simplified your code:

> print(Total)
   Event_ID Type Revenue1 Revenue2 Total_Rev
1         1    A       24       16        40
2         1    B        9       93       102
3         1    C       51       96       147
4         1    D        7       44        51
5         1    E       22       67        89
6         2    A       15       73        88
7         2    B       86       12        98
8         2    C       66       65       131
9         2    D        0       81        81
10        2    E       57       22        79
11        3    A       44       39        83
12        3    B       93       94       187
13        3    C       34       41        75
14        3    D       37       30        67

Since Total_Rev is just the sum of Revenue1 and Revenue2, you could have done this more easily:

> z$Total_Rev <- z$Revenue1 + z$Revenue2
> z
   Event_ID Type Revenue1 Revenue2 Total_Rev
1         1    A       24       16        40
2         1    B        9       93       102
3         1    C       51       96       147
4         1    D        7       44        51
5         1    E       22       67        89
6         2    A       15       73        88
7         2    B       86       12        98
8         2    C       66       65       131
9         2    D        0       81        81
10        2    E       57       22        79
11        3    A       44       39        83
12        3    B       93       94       187
13        3    C       34       41        75
14        3    D       37       30        67

This is a simple form of vectorization, which helps you get rid of (some) for loops. And since your outer for loop iterates over the different Event_ID values, it might also make sense to look into grouping and aggregation techniques. These can be done with base R, with the data.table package, with tidyverse/dplyr, and possibly other tools. I am using the latter approach, since I find its syntax the most newbie-friendly. However, data.table might be the right tool for you if you have large datasets. So here is a very simple aggregation that computes the average per Event_ID:

Event_ID = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3)
Type = c("A","B","C","D","E","A","B","C","D","E","A","B","C","D")
Revenue1 = c(24,9,51,7,22,15,86,66,0,57,44,93,34,37)
Revenue2 = c(16,93,96,44,67,73,12,65,81,22,39,94,41,30)
z = data.frame(Event_ID, Type, Revenue1, Revenue2)

library(dplyr)
z %>%
  mutate(Total_Rev = Revenue1 + Revenue2) %>%
  group_by(Event_ID) %>%
  summarise(average = mean(Total_Rev))
#> # A tibble: 3 x 2
#>   Event_ID average
#>      <dbl>   <dbl>
#> 1        1    85.8
#> 2        2    95.4
#> 3        3   103
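For comparison, the same aggregation written with data.table (a sketch, assuming the package is installed), which tends to scale better to millions of rows:

```r
library(data.table)

# one-time conversion; setDT(z) would do it in place without copying
dt <- as.data.table(z)

# add the row sum as a new column by reference (vectorized, no loop)
dt[, Total_Rev := Revenue1 + Revenue2]

# grouped aggregation in a single pass
dt[, .(average = mean(Total_Rev)), by = Event_ID]
#>    Event_ID average
#> 1:        1    85.8
#> 2:        2    95.4
#> 3:        3   103.0
```

The `:=` assignment modifies the table by reference rather than copying it, which matters when the data no longer fits comfortably in memory twice.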
Ralf Stubner
  • Thanks for the comment. I admit that I do not understand GPU processing well (or, you could say, at all). Unfortunately, my problem is not a simple linear algebra operation. Thank you anyway. Is there an R package or some other way that I can break up a big operation and run it on individual GPU cores? Unfortunately I could not find any example other than matrix multiplications (which is not what I would like to do). You are correct that I might have over-simplified my problem. My apologies. I wanted to understand how to use the GPU for running an operation on subsets of a big dataset. – Prometheus Jan 21 '19 at 16:25
  • @Prometheus Please note that the main efficiency gain from using the GPU comes when all cores act in lock-step, that is, when they execute exactly the same instructions, just with different data. I am not sure your use case has that form. From what you write it sounds more like an aggregation problem. That can be efficiently solved using `data.table`, see e.g. https://stackoverflow.com/a/53366310/8416610. – Ralf Stubner Jan 21 '19 at 16:33
  • Yes, you are right. That is why I made sure that the "operation" for each event is run one at a time. I believe this is called an embarrassingly parallel process. The "operation" is not a simple aggregation as I have shown here, though; it was simplified for illustrative purposes. Thanks a lot. – Prometheus Jan 21 '19 at 16:44