
I found that we can reconstruct an external pointer from a memory address. In this example I take a pointer from a data.table object and rebuild it:

# devtools::install_github("randy3k/xptr")
iris_dt <- data.table::as.data.table(iris)
ptr1 <- attr(iris_dt, ".internal.selfref")
ptr1
#> <pointer: 0x13c00d4e0>
typeof(ptr1)
#> [1] "externalptr"

address <- xptr::xptr_address(ptr1)
address
#> [1] "0x13c00d4e0"
ptr2 <- xptr::new_xptr(address)
identical(ptr1, ptr2)
#> [1] TRUE

Obviously xptr::new_xptr("0x13c00d4e0") is not stable between sessions. I am aware that the above does not allocate memory but merely defines a binding; this is fine for my use case.

I want to do the same with environments:

e <- new.env()
e
#> <environment: 0x10b5bf038>

env("0x10b5bf038") # I want this "env" function
#> <environment: 0x10b5bf038>

I doubt base R can do it, so I'm open to packaged options and C magic.

Unrequired reading addressing the X/Y comment

I need this for the {constructive} package. Say I want to explore the asNamespace("stats")$.__NAMESPACE__.$DLLs object; the way it prints is not very helpful:

asNamespace("stats")$.__NAMESPACE__.$DLLs
#> $stats
#> DLL name: stats
#> Filename: /opt/R/4.2.1-arm64/Resources/library/stats/libs/stats.so
#> Dynamic lookup: FALSE

dput() is often ugly and brittle. Additionally, the code it produces here is not syntactic, so I cannot be sure that the object is accurately described.

dput(asNamespace("stats")$.__NAMESPACE__.$DLLs)
#> list(stats = structure(list(name = "stats", path = "/opt/R/4.2.1-arm64/Resources/library/stats/libs/stats.so", 
#>     dynamicLookup = FALSE, handle = <pointer: 0x2011ce960>, info = <pointer: 0x6000021f00c0>), class = "DLLInfo"))

str() does somewhat better but is not ideal in the general case:

str(asNamespace("stats")$.__NAMESPACE__.$DLLs)
#> List of 1
#>  $ stats:List of 5
#>   ..$ name         : chr "stats"
#>   ..$ path         : chr "/opt/R/4.2.1-arm64/Resources/library/stats/libs/stats.so"
#>   ..$ dynamicLookup: logi FALSE
#>   ..$ handle       :Class 'DLLHandle' <externalptr> 
#>   ..$ info         :Class 'DLLInfoReference' <externalptr> 
#>   ..- attr(*, "class")= chr "DLLInfo"

{constructive} guarantees that it outputs code that reproduces the object, and it now works for objects containing pointers.

constructive::construct(asNamespace("stats")$.__NAMESPACE__.$DLLs)
#> list(
#>   stats = list(
#>     name = "stats",
#>     path = "/opt/R/4.2.1-arm64/Resources/library/stats/libs/stats.so",
#>     dynamicLookup = FALSE,
#>     handle = constructive::external_pointer("0x2051d6960") |>
#>       structure(class = "DLLHandle"),
#>     info = constructive::external_pointer("0x600002970de0") |>
#>       structure(class = "DLLInfoReference")
#>   ) |>
#>     structure(class = "DLLInfo")
#> )

I have other ways in the package to handle environments, such as building equivalent environments from lists. But I want to integrate an alternative that uses only the memory address, because it would print more nicely and is enough for some use cases.

moodymudskipper
  • On Windows 10, `identical(ptr1, ptr2)` returns `FALSE` in the first example because `ptr1` is `` and `address` is `000002641DF1EC20`. It probably won't help for your issue but you might find it relevant – bretauv Mar 29 '23 at 09:24
  • The point of external pointers is to point to things that aren't R objects, with the ability to have them managed by R's garbage collection. What you are doing is asking to have R allocate an object at a particular location; that's not possible. – user2554330 Mar 29 '23 at 09:42
  • I'm not asking to allocate objects, I want a binding. Hopefully my edit clarifies it. – moodymudskipper Mar 29 '23 at 09:48
  • This smells like an X/Y problem. What do you need this *for*? Lookup with environments as keys? – Konrad Rudolph Mar 29 '23 at 09:56
  • @KonradRudolph hopefully my edit addresses how it is not an X/Y problem – moodymudskipper Mar 29 '23 at 11:00
  • @moodymudskipper Perfect, thanks! But if you are open to suggestions, if the point of ‘constructive’ is to be able to reconstruct R objects it might be better to “destructure” environments as if they were lists, rather than printing pointer addresses. Of course this requires care: if somebody refers to a package/module (;-)) namespace you don’t want to destructure that but instead refer to e.g. `asNamespace("stats")`, etc. – Konrad Rudolph Mar 29 '23 at 11:30
  • Yes, we have special casing for namespaces, .GlobalEnv etc. We even have options to place it below the right parent etc., but it gets messy/verbose quickly: if a list contains the same environment twice, for instance, we might not want to define it twice, but then we can't have one neat call and we get further from the dput() equivalence. You can find some more on this here: https://github.com/cynkra/constructive/blob/main/R/environment.R . Having an answer to this question would help me have a simple output that always works (in a given session). – moodymudskipper Mar 29 '23 at 11:55
  • @moodymudskipper I see, that's good. For recursively nested environments (or just repeated environments), have a look at the `refhook` argument of [`serialize`/`deserialize`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/serialize.html). Maybe you could reuse the idea for this package. For instance, you could have a function `constructive::ref_env("x")` which constructs an environment based on the symbolic name `x`, and your `constructive::construct` call could define these environment references out-of-line. Of course having a simple way based on pointer address would still help. – Konrad Rudolph Mar 29 '23 at 12:11

1 Answer

You'll want to do a bit more work to make this safe and portable, if a safe and portable version exists. Note that the integer type uintptr_t and the corresponding format specifier macro SCNxPTR are optional in C99. See the advice in Writing R Extensions (WRE) under "Writing portable packages".

/* objectFromAddress.c */

#include <inttypes.h> /* uintptr_t, SCNxPTR */
#include <stdio.h> /* sscanf */
#include <Rinternals.h> /* SEXP, etc. */

SEXP objectFromAddress(SEXP a) {
    uintptr_t p = 0;
    
    if (TYPEOF(a) != STRSXP || XLENGTH(a) != 1 ||
        (a = STRING_ELT(a, 0)) == NA_STRING ||
        sscanf(CHAR(a), "%" SCNxPTR, &p) != 1)
        error("'a' is not a formatted unsigned hexadecimal integer");
    
    return (SEXP) p;
}

tools::Rcmd(c("SHLIB", "objectFromAddress.c"))
using C compiler: ‘Apple clang version 13.0.0 (clang-1300.0.29.30)’
using SDK: ‘MacOSX12.1.sdk’
clang -I"/usr/local/lib/R/include" -DNDEBUG   -I/opt/R/arm64/include -I/usr/local/include    -fPIC  -Wall -g -O2 -pedantic -mmacosx-version-min=11.0 -arch arm64 -falign-functions=64 -Wno-error=implicit-function-declaration -flto=thin -c objectFromAddress.c -o objectFromAddress.o
clang -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -Wall -g -O2 -pedantic -mmacosx-version-min=11.0 -arch arm64 -falign-functions=64 -Wno-error=implicit-function-declaration -flto=thin -fPIC -Wl,-mllvm,-threads=4 -L/usr/local/lib/R/lib -L/opt/R/arm64/lib -L/usr/local/lib -o objectFromAddress.so objectFromAddress.o -L/usr/local/lib/R/lib -lR -Wl,-framework -Wl,CoreFoundation
dyn.load("objectFromAddress.so")
(e <- new.env())
<environment: 0x112811b30>
identical(.Call("objectFromAddress", "112811b30"), e)
[1] TRUE
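
The parsing step in objectFromAddress can be tried outside R. Below is a minimal standalone sketch, assuming a platform where uintptr_t and the PRIxPTR/SCNxPTR macros exist; the names parse_address and roundtrip_ok are made up for illustration and are not part of R or of the code above. Unlike the sscanf call above, it also rejects trailing characters after the number.

```c
/* parse_address.c: standalone sketch, no R headers needed */
#include <assert.h>   /* assert */
#include <inttypes.h> /* uintptr_t, SCNxPTR, PRIxPTR */
#include <stdio.h>    /* sscanf, snprintf */

/* Both the type and its format macros are optional in C99,
   so fail at compile time if they are missing. */
#if !defined(UINTPTR_MAX) || !defined(SCNxPTR) || !defined(PRIxPTR)
# error "uintptr_t or its format macros are unavailable on this platform"
#endif

/* Parse a hexadecimal address such as "0x10b5bf038" or "10b5bf038".
   Returns 1 on success and stores the value in *out, 0 otherwise.
   The trailing %c only matches if junk follows the number. */
int parse_address(const char *s, uintptr_t *out)
{
    uintptr_t p = 0;
    char junk;

    assert(s != NULL && out != NULL);
    if (sscanf(s, "%" SCNxPTR "%c", &p, &junk) != 1)
        return 0;
    *out = p;
    return 1;
}

/* Format the address of an object, parse it back, and check that the
   reconstructed pointer compares equal to the original one. */
int roundtrip_ok(void *obj)
{
    char buf[2 + 2 * sizeof(uintptr_t) + 1]; /* "0x" + hex digits + NUL */
    uintptr_t p = 0;

    snprintf(buf, sizeof buf, "0x%" PRIxPTR, (uintptr_t) obj);
    return parse_address(buf, &p) && (void *) p == obj;
}
```

For example, parse_address("0x1f", &p) stores 31 in p, while parse_address("1fzz", &p) returns 0. Neither function says anything about whether the address points to a live object; that remains the caller's responsibility.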

Questions you'll want to ask:

  • What happens if the memory at the address is freed by the garbage collector before the call to objectFromAddress?
  • What happens if the supplied address is not the address of a SEXPREC?

In practice, you (the maintainer) would need to guarantee from R that this function is not called in either of those cases.
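
One way to provide that guarantee is to accept only addresses that were recorded earlier, while the object was known to be alive. The sketch below shows the idea in standalone C with a fixed-size table; registry_add, registry_remove, registry_lookup and REGISTRY_CAPACITY are hypothetical names, not an existing API. In a package, the registered objects would additionally have to be kept alive (for example with R_PreserveObject) for as long as they stay in the table, which this sketch does not do.

```c
/* address_registry.c: standalone sketch, no R headers needed */
#include <assert.h> /* assert */
#include <stddef.h> /* size_t, NULL */
#include <stdint.h> /* uintptr_t */

#define REGISTRY_CAPACITY 64 /* arbitrary size chosen for illustration */

static uintptr_t registry[REGISTRY_CAPACITY];
static size_t registry_size = 0;

/* Record the address of a live object.
   Returns 1 on success, 0 if obj is NULL or the table is full. */
int registry_add(const void *obj)
{
    assert(registry_size <= REGISTRY_CAPACITY);
    if (obj == NULL || registry_size == REGISTRY_CAPACITY)
        return 0;
    registry[registry_size++] = (uintptr_t) obj;
    return 1;
}

/* Forget an address; call this before the object is released.
   Returns 1 if the address was present, 0 otherwise. */
int registry_remove(const void *obj)
{
    size_t i;

    for (i = 0; i < registry_size; ++i) {
        if (registry[i] == (uintptr_t) obj) {
            /* Overwrite the slot with the last entry, then shrink. */
            registry[i] = registry[--registry_size];
            return 1;
        }
    }
    return 0;
}

/* Return the pointer only if the address was registered, else NULL. */
void *registry_lookup(uintptr_t address)
{
    size_t i;

    for (i = 0; i < registry_size; ++i)
        if (registry[i] == address)
            return (void *) address;
    return NULL;
}
```

With such a table, an unknown or stale address yields NULL instead of a pointer to arbitrary memory, at the cost of having to register every object whose address may later be looked up.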

Mikael Jagan