
This post is related to my earlier question How to define a Python Class which uses R code, but called from rTorch?.

I came across the torch package in R (https://torch.mlverse.org/docs/index.html), which allows one to define a dataset class. However, I also need to be able to define a model class, like class MyModelClass(torch.nn.Module) in Python. Is this possible with the torch package in R?

When I tried to do it with reticulate it did not work - there were conflicts like

  ImportError: /User/homes/mreichstein/miniconda3/envs/r-torch/lib/python3.6/site-packages/torch/lib/libtorch_python.so: undefined symbol: _ZTINSt6thread6_StateE

It also would not make much sense, since torch isn't wrapping Python.

But that loses a lot of the flexibility that rTorch has (but see my problem in the post linked above). Thanks for any help! Markus

MR_MPI-BGC

1 Answer


You can do that directly using R's torch package, which seems quite comprehensive, at least for the basic tasks.

Neural networks

Here is how to create an nn_sequential model (the equivalent of torch.nn.Sequential):

library(torch)

# example layer sizes (pick values that match your data)
D_in  <- 10   # number of input features
H     <- 32   # hidden layer size
D_out <- 1    # number of outputs

model <- nn_sequential(
    nn_linear(D_in, H),
    nn_relu(),
    nn_linear(H, D_out)
)
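
Once built, the model can be called like a function on a tensor. A minimal usage sketch, where the input shape simply matches the example dimensions defined above:

x <- torch_randn(64, D_in)   # a batch of 64 random examples
y_pred <- model(x)           # forward pass; returns a 64 x D_out tensor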

Below is a custom nn_module (a.k.a. torch.nn.Module) which is a simple dense (torch.nn.Linear) layer (source):

library(torch)

# a custom module implementing a single dense (fully connected) layer
dense <- nn_module(
  classname = "dense",
  # the initialize function runs whenever we instantiate the model
  initialize = function(in_features, out_features) {
    
    # just for you to see when this function is called
    cat("Calling initialize!") 
    
    # we use nn_parameter to indicate that those tensors are special
    # and should be treated as parameters by `nn_module`.
    self$w <- nn_parameter(torch_randn(in_features, out_features))
    self$b <- nn_parameter(torch_zeros(out_features))
    
  },
  # this function is called whenever we call our model on input.
  forward = function(x) {
    cat("Calling forward!")
    torch_mm(x, self$w) + self$b
  }
)

model <- dense(3, 1)
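
Calling the instantiated model on a tensor runs forward(). A minimal usage sketch, with shapes matching the in_features = 3, out_features = 1 chosen above:

x <- torch_randn(10, 3)   # 10 examples with 3 features each
y <- model(x)             # prints "Calling forward!" and returns a 10 x 1 tensor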

Another example, this time using nn_linear (torch.nn.Linear) layers to create a neural network (source):

two_layer_net <- nn_module(
   "two_layer_net",
   initialize = function(D_in, H, D_out) {
      self$linear1 <- nn_linear(D_in, H)
      self$linear2 <- nn_linear(H, D_out)
   },
   forward = function(x) {
      x %>% 
         self$linear1() %>% 
         nnf_relu() %>% 
         self$linear2()
   }
)
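
For completeness, here is a minimal training sketch around this module; the dimensions, random data, learning rate and number of epochs are made up for illustration:

net       <- two_layer_net(D_in = 8, H = 16, D_out = 1)
optimizer <- optim_sgd(net$parameters, lr = 0.01)

x <- torch_randn(100, 8)   # fake inputs
y <- torch_randn(100, 1)   # fake targets

for (epoch in 1:5) {
  optimizer$zero_grad()
  loss <- nnf_mse_loss(net(x), y)   # mean squared error on the forward pass
  loss$backward()
  optimizer$step()
}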

There are also other resources, such as the getting-started article on control flow and weight sharing (https://torch.mlverse.org/docs/articles/getting-started/control-flow-and-weight-sharing.html).

Other

Looking at the reference, it seems most of the layers are already provided (I didn't notice transformer layers at a quick glance, but this is minor).

As far as I can tell, the basic building blocks for neural networks, their training, etc. are in place (even JIT, so sharing models between languages should be possible).
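
As a rough sketch of the JIT part (assuming the jit_trace()/jit_save() API documented in the torch R package), a traced module can be saved and then loaded from Python:

net    <- two_layer_net(D_in = 8, H = 16, D_out = 1)
traced <- jit_trace(net, torch_randn(1, 8))   # trace with an example input
jit_save(traced, "two_layer_net.pt")
# in Python: torch.jit.load("two_layer_net.pt")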

Szymon Maszke
  • Great - how could I be so blind.... Yet, is it true that no "general" Python class can be created, e.g. derived from another superclass, if it is not implemented in torch? Say e.g. an EarlyStopping class or so. – MR_MPI-BGC Dec 30 '20 at 06:58
  • @MR_MPI-BGC I'm not sure what you mean in this case. There is no EarlyStopping in PyTorch (there is in PyTorch Lightning or the like, which does not exist for R). If you ask whether we can use it, we would need `reticulate` and it probably wouldn't work (as it requires `LightningModule`, which is based on `nn.Module`). If you ask whether we can code that ourselves, yeah, sure, why not? It's just control flow checking the validation loss, similar to what is shared [here](https://torch.mlverse.org/docs/articles/getting-started/control-flow-and-weight-sharing.html); see the sketch after these comments. – Szymon Maszke Dec 30 '20 at 12:21
  • @MR_MPI-BGC Also you can code anything you can code in Python (it might be easier or harder, faster or slower) as both languages are Turing complete. – Szymon Maszke Dec 30 '20 at 12:22
  • Thanks, yes I can code that in R too (I did....), but I was thinking of using as much of the PyTorch ecosystem in Python as possible (e.g. Lightning, but possibly also other classes ==> leveraging all the parallelization etc. (?) ), and as much of R as possible for data prep and plotting. Using Python classes directly with `reticulate` also helps sharing with colleagues working in Python.... There are always tradeoffs.... – MR_MPI-BGC Dec 31 '20 at 06:13
  • You can do preprocessing and cleaning in R, save the data, load it with Python and train a neural network on it. No need for any tradeoffs. – Szymon Maszke Dec 31 '20 at 12:05
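
To illustrate the point from the comments, an early-stopping check is just control flow around a validation loss. A hand-rolled sketch, where compute_val_loss() is a hypothetical helper returning the current validation loss:

best_loss <- Inf
patience  <- 5   # how many epochs without improvement to tolerate
wait      <- 0

for (epoch in 1:100) {
  # ... training step for this epoch ...
  val_loss <- compute_val_loss(net)   # hypothetical helper
  if (val_loss < best_loss) {
    best_loss <- val_loss
    wait <- 0
  } else {
    wait <- wait + 1
    if (wait >= patience) {
      cat("Early stopping at epoch", epoch, "\n")
      break
    }
  }
}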