This answer draws together all the information I've learnt related to the suite of questions I asked, for the benefit of anyone else who comes across a similar issue.
It turns out to be easier to present the answers in reverse to the order in which the questions were asked.
The aspect which really resolved my specific problem was provided by @Edzer Pebesma - who has my thanks.
John Chambers answered a more specific follow on question I asked on the r-devel mailing list, and that really helped fill in the gaps in my understanding of why the warning message was being given - I am also very grateful for his guidance and assistance.
Any errors that remain in the explanation below are entirely my own!
The test4 version of the toy package referenced in this question and answer used to be found in --> folder SpatialLinesNULL of branch SpatialLinesNULL in a github repo at http://github.com/Geoff99/Examples/tree/SpatialLinesNULL <-- Other versions of the test sequence can be constructed by deleting a line or two from the NAMESPACE and DESCRIPTION files.
Update 6 May 2017 - test4 version code moved to a gist at https://gist.github.com/Geoff99/29be25bce4cd4c918921bf68769c6a39
Question 3 (after test4)
Is the warning from test3 and this subsequent error something I can resolve myself (if so how)? or do I need to ask the maintainer of the rgeos package for help (eg ask them to export SpatialLinesNULL)?
The warning from test3 was
class "SpatialLinesNULL" is defined (with package slot ‘rgeos’) but no metadata object found to revise subclass information---not exported? Making a copy in package ‘minweSpatialNULL’
and the error message from test4 was
d> devtools::document()
Updating minweSpatialNULL documentation
Loading minweSpatialNULL
Error: class "SpatialLinesNULL" is not exported by 'namespace:rgeos'
The best solution to this was to approach a maintainer of the `rgeos` package and ask them to export the missing `SpatialLinesNULL` class that is mentioned in the warning and error messages. Edzer Pebesma answered this for me, and also made the necessary updates to the `rgeos` package, for which I am very grateful. After installing the upgraded version of package `rgeos` I could simply add `importClassesFrom(rgeos,SpatialLinesNULL)` to the NAMESPACE file of my toy package, and the warning went away.
If you are facing a similar situation with a different package, that is the strategy I would recommend.
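For reference, here is a minimal sketch of how the same import might be declared in a package that generates its NAMESPACE with roxygen2. The use of roxygen2 and the file name are assumptions for illustration; editing NAMESPACE by hand, as described above, works just as well. It only helps once the installed `rgeos` actually exports the class.

```r
# R/imports.R in the toy package (hypothetical file name).
# roxygen2 is expected to turn the tags below into NAMESPACE directives
# equivalent to:
#   importClassesFrom(sp, Spatial, SpatialLines)
#   importClassesFrom(rgeos, SpatialLinesNULL)
#' @importClassesFrom sp Spatial SpatialLines
#' @importClassesFrom rgeos SpatialLinesNULL
NULL
```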
I suspect that there is a way I could have resolved this myself (essentially by manually making a copy of the missing metadata about the `SpatialLinesNULL` class in the namespace environment of my package, which is basically what the `methods` package does itself after it issues the warning) --- but it is such a messy hack that I will refrain from explaining how. If you really need to do this at some stage in the future, the answer to question 1 (below) collects the information I needed to understand to figure out how to do the hack.
Question 2 (after test3)
I am presuming that whatever the reason is for my first warning message, is also the reason for this similar message.
Am I on the right track?
The warning from test3 was
class "SpatialLinesNULL" is defined (with package slot ‘rgeos’) but no metadata object found to revise subclass information---not exported? Making a copy in package ‘minweSpatialNULL’
and the first warning from test1 was
class "Spatial" is defined (with package slot ‘sp’) but no metadata object found to revise subclass information---not exported? Making a copy in package ‘minweSpatialNULL’
The answer was yes, both messages appear for the same reason.
The warning messages are both being generated by a function called `.findOrCopyClass`, which is an internal (ie non-exported) function in the `methods` package. To see the code for this function it is necessary to use the `:::` operator. Type `methods:::.findOrCopyClass` into an R console. (This requires the `methods` package to be loaded, which it almost always is). `.findOrCopyClass` issues this warning when it cannot find the metadata which defines a class, and so it has to copy the metadata.
See below for what the metadata is, why it needs to be included in the namespace for my toy package, and (my best guess) at why this warning is issued.
`.findOrCopyClass` is called by `setIs`, which is an exported function from the `methods` package. In turn `setIs` is called (several times) by the `methods` function `setClass`, which I call in my toy package example. Type `?setIs` and `?setClass` for more information about those functions. Or just type `setIs` or `setClass` in an R console if you want to see their source code.
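As a small sketch of the two ways of getting at source code mentioned above: exported functions can be inspected by name, while internal ones need a lookup in the namespace. The lookup below uses `get0()` rather than `:::` so that it degrades gracefully; whether `.findOrCopyClass` still exists in your version of R is an assumption I have not checked for every release.

```r
library(methods)

# Exported functions can be inspected directly by name.
print(head(deparse(setIs), 5))

# Internal functions live in the namespace environment of the package.
# get0() returns NULL instead of failing if the function is not there.
internal_fn <- get0(".findOrCopyClass", envir = asNamespace("methods"),
                    inherits = FALSE)
if (is.null(internal_fn)) {
  message(".findOrCopyClass not found in this version of methods")
} else {
  print(args(internal_fn))
}
```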
Question 1 (after test1)
Why does R need the ~~(virtual?)~~ superclass Spatial anyway~~, and why only when I have the 2nd generation MyClass2 defined in my code~~?
Clarification added : I had thought that the R packaging namespace mechanism would have taken care of any need to find and access antecedents / superclasses once I had `@importClassesFrom`ed the class I actually use.
Apologies and corrections first.
The part of the original question that asked:
and why only when I have the 2nd generation MyClass2 defined in my code?
was just plain wrong. Despite my checking before I posted the question, I must have unwittingly still had some hidden information cached in my `globalenv()`. The underlying issue exists even if I only have `MyClass1` defined in my toy package.
`Spatial` is not a virtual class. `SpatialLinesNULL` is a virtual class. Whether or not a superclass is virtual has nothing to do with this issue.
The issue does have a lot to do with
- where a superclass is defined (in my toy example, `Spatial` comes from `sp` and `SpatialLinesNULL` comes from `rgeos`)
- whether the superclass definition is exported from that package (in the toy example, `sp` and `rgeos`)
- and hence whether and how R can find the superclass definitions when it is checking and installing a new package (eg `minweSpatialNULL`) which uses a class (eg `SpatialLines`) whose parents include superclasses (eg `Spatial`, `SpatialLinesNULL`).
What the missing metadata warning messages are about
The following is an annotated walk through parts of the documentation for the `methods` package. It took me a while to grasp, but that is to do with my low knowledge base when I began. The documentation is very useful, and if you get this far, I would recommend reading it. I found the entries from `?setClass`, `?Classes`, `?getClassDef`, `?classMetaName` and `?setIs` particularly valuable.
What `setClass` does
From the `?setClass` documentation (with emphasis added by me) :
Create a class definition, specifying the representation (the slots) and/or the classes contained in this one (the superclasses), plus other optional details. As a side effect, the class definition is stored in the specified environment. A generator function is returned as the value of setClass(), suitable for creating objects from the class if the class is not virtual.
The warning message issued by `methods:::.findOrCopyClass` is related to the creation and storage of the class definition. The class definition is (or is stored in?) a metadata object.
What a metadata object is
From the `?Classes` documentation :
Class definitions are objects that contain the formal definition of a class of R objects, ... and
When a class is defined, an object is stored that contains the information about that class. The object, known as the metadata defining the class, is not stored under the name of the class (to allow programmers to write generating functions of that name), but under a specially constructed name. To examine the class definition, call `getClass`. The information in the metadata object includes:
- Slots ...
- SuperClasses ...
- The information about the relation between a class and a particular superclass is encoded as an object of class SClassExtension. A list of such objects for the superclasses (and sometimes for the subclasses) is included in the metadata object defining the class
- Prototype ...
The `methods` package is all about creating and managing these metadata objects (for S4 classes, and as best as it can, for the older S3 classes as well).
The metadata object for a new class (eg `MyClass1`) includes information about any slots in its definition (since `MyClass1` specifies `contains = c('SpatialLines')` it inherits all the slots from the `SpatialLines` class, defined in package `sp`).
Importantly, the metadata object for a class must also contain information about any superclasses - otherwise how could the new class ever inherit anything from its parents, ie the superclasses ? And one way and another, that information about superclasses must reach all the way up the inheritance tree. In the example, `MyClass1` has a distance one superclass `SpatialLines`, and hence inherits all the more distant ancestors of `SpatialLines`, whoever they may be.
- I think I read somewhere, but have mislaid the link, that the metadata object can be in two states - incomplete or complete. In the incomplete state, the list of superclasses for a new class (eg `MyClass1`) may not (yet) have been filled in all the way up the inheritance chain, whereas by the time a class definition is used, R has to have traversed the inheritance chain, and filled in the complete genealogy for the class. When a package is checked or installed, the inheritance chain must be complete - if it has not been completed before, it has to be during the `install.packages` step.
How to look at a metadata object
Use `getClass` or `getClassDef` to see the metadata object for a given class.
The documentation `?getClass` or `?getClassDef` explains which environments R searches for metadata objects. A key thing to note is that a package must at least be loaded before metadata objects from classes defined in the package can be found.
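If you do not have `sp` and `rgeos` to hand, the same inspection can be tried on a throwaway class family. This is only a sketch: the `Demo*` class names are made up for illustration, and the classes are defined at top level (in the global environment) rather than in a package.

```r
library(methods)

# Hypothetical three-generation family, defined at top level.
setClass("DemoParent", representation(id = "integer"))
setClass("DemoChild", contains = "DemoParent")
setClass("DemoGrandchild", contains = "DemoChild")

# getClass() returns the metadata object for the class.
def <- getClass("DemoGrandchild")

# The superclasses are recorded in the 'contains' slot, one
# SClassExtension object per ancestor, each with its distance.
superclass_names <- names(def@contains)
parent_distance <- def@contains[["DemoParent"]]@distance

# The parent's metadata object records its known subclasses in turn.
subclass_names <- names(getClass("DemoParent")@subclasses)

print(def)
```

Printing `def` gives the same kind of "Slots / Extends / Known Subclasses" listing as the output from the toy package.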
Here are some examples from my toy package (`minweSpatialNULL`) after it has been loaded with version 0.3-9 of `rgeos`, which does export the `SpatialLinesNULL` class. Notice that:
- `MyClass1` knows all its parent superclasses (the usual situation), and
- `SpatialLinesNULL` (which is a Class Union defined in the `rgeos` package) knows almost all its children subclasses, even though `rgeos` itself knows nothing at all about `MyClass1` or `MyClass2`!
d> getClass('MyClass1')
Class "MyClass1" [package "minweSpatialNULL"]
Slots:
Name: lines bbox proj4string
Class: list matrix CRS
Extends:
Class "SpatialLines", directly
Class "Spatial", by class "SpatialLines", distance 2
Class "SpatialLinesNULL", by class "SpatialLines", distance 2
Known Subclasses: "MyClass2"
d> getClass('SpatialLinesNULL')
Extended class definition ( "ClassUnionRepresentation" )
Virtual Class "SpatialLinesNULL" [package "rgeos"]
No Slots, prototype of class "NULL"
Known Subclasses:
Class "NULL", directly
Class "SpatialLines", directly
Class ".NULL", by class "NULL", distance 2, with explicit coerce
Class "SpatialLinesDataFrame", by class "SpatialLines", distance 2
Class "MyClass2", by class "MyClass1", distance 3
d>
Actually it is slightly curious that `SpatialLinesNULL` knows about `MyClass2` because it is a subclass of `MyClass1`, but does not appear to mention `MyClass1` itself directly. I haven't figured out why yet.
Why the list of superclasses can change
I was puzzled for a while as to why the list of superclasses of my toy `MyClass1` class changed when I made no change to my package other than adding `rgeos` to the `Imports` directive in the toy package's DESCRIPTION file.
The reason is that the methods package is very clever. As well as defining subclasses, it is possible to insert parent superclasses into the family tree (inheritance chain) of a class, after that class has been defined.
A simple way is to use `setClassUnion` (type `?setClassUnion` for an explanation). That is what `rgeos` does when it defines `SpatialLinesNULL` - it creates a new parent for `SpatialLines`. And that is why as soon as I added the `rgeos` entry in my toy package DESCRIPTION file, the class `MyClass1` suddenly acquired another superclass as well. In an interactive setting something similar would presumably have happened, depending on whether or not I had loaded (or attached and loaded) the `rgeos` package.
There is a more complicated way to add ancestors after the fact as well - see `?setIs`.
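The effect of `setClassUnion` can be seen in miniature with made-up classes. In this sketch `DemoLines` stands in for `SpatialLines` and `DemoLinesOrNULL` for `SpatialLinesNULL`; the names are hypothetical, and everything is defined at top level rather than across two packages, so no warning is involved.

```r
library(methods)

# A class with no superclasses to begin with.
setClass("DemoLines", representation(coords = "list"))
before <- names(getClass("DemoLines")@contains)

# Defining a class union inserts a new (virtual) parent above the
# existing class, after that class was defined.
setClassUnion("DemoLinesOrNULL", c("DemoLines", "NULL"))
after <- names(getClass("DemoLines")@contains)

# A class defined afterwards inherits the inserted parent as well,
# just as MyClass1 picked up SpatialLinesNULL via SpatialLines.
setClass("DemoMyLines", contains = "DemoLines")
inherited <- extends("DemoMyLines", "DemoLinesOrNULL")

print(before)
print(after)
print(inherited)
```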
The names of metadata objects
As with all other objects, R finds metadata objects via their name. If an object is defined in a package, the name of the object lives in
- the namespace environment of the package (eg `environment: namespace:minweSpatialNULL`), or
- the imports environment of the package (eg `environment: imports:minweSpatialNULL`).
From the `?Classes` documentation again :
When a class is defined, an object is stored that contains the information about that class. The object, known as the metadata defining the class, is not stored under the name of the class (to allow programmers to write generating functions of that name), but under a specially constructed name.
You can find the specially constructed name using the `classMetaName` function which the `methods` package provides - see `?methods::classMetaName` for details.
An example of the name of a metadata object, taken from my toy package, is :
d> classMetaName('MyClass1')
[1] ".__C__MyClass1"
Because of the leading `.` the name of this metadata object (ie `.__C__MyClass1`) is usually hidden when you `ls` the namespace environment of the package - but you can see it by using the `all.names = TRUE` argument to `ls`.
d> # Recall that MyClass1 is the name I chose for the generator function
d> # returned by setClass when I defined the class MyClass1
d> env_toy_package <- environment(MyClass1)
d> ls(env_toy_package, all.names=TRUE)
[1] ".__C__MyClass1" ".__C__MyClass2" ".__DEVTOOLS__"
[4] ".__NAMESPACE__." ".__S3MethodsTable__." ".packageName"
[7] "MyClass1" "MyClass2"
d>
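The same check can be reproduced without a package. In this sketch the class `DemoHidden` is hypothetical and is defined at top level, so its metadata object lands in the global environment instead of a package namespace.

```r
library(methods)

# Define a throwaway class at top level.
setClass("DemoHidden", representation(x = "numeric"))

# The specially constructed name under which the metadata is stored.
meta_name <- classMetaName("DemoHidden")

# Hidden from a plain ls() because of the leading dot ...
visible <- meta_name %in% ls(globalenv())
# ... but present once all.names = TRUE is used.
hidden <- meta_name %in% ls(globalenv(), all.names = TRUE)

# The object bound to that name is the class definition itself.
meta_obj <- get(meta_name, envir = globalenv())

print(meta_name)
```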
And if you look in the `parent.env` of the package namespace environment (ie the `imports:minweSpatialNULL` environment), you find the hidden names of the superclasses of `MyClass1`, which is what `.findOrCopyClass` was looking for so they could be put there!
d> parent.env(env_toy_package)
<environment: 0x0000000008df59d8>
attr(,"name")
[1] "imports:minweSpatialNULL"
d> ls(parent.env(env_toy_package), all.names = TRUE)
[1] ".__C__Spatial" ".__C__SpatialLines" ".__C__SpatialLinesNULL"
[4] "library.dynam.unload" "SpatialLines" "system.file"
d>
Finally, what the `.findOrCopyClass` warning is about
When `.findOrCopyClass` issues a warning such as :
class "Spatial" is defined (with package slot ‘sp’) but no metadata object found to revise subclass information---not exported? Making a copy in package ‘minweSpatialNULL’
the function is letting me know that it has been unable to find the hidden name `.__C__Spatial` in the namespaces it has searched and revise the metadata in the object to which that name is bound. It knows where the hidden name should be --- it must come from package `sp`, because the metadata object from the `SpatialLines` class itself says where its parent superclass (`Spatial`) lives. The full specification of an ancestor superclass in a metadata object includes a `package` attribute containing the name of the package where the ancestor superclass was defined, just in case two existing loaded packages just happen to have chosen the same name for a class!
That is why `importClassesFrom`ing the missing superclass works - it brings the missing hidden names into the `imports:minweSpatialNULL` environment of the `minweSpatialNULL` toy package, from which spot `.findOrCopyClass` can find them, and update them if necessary.
Why `.findOrCopyClass` emits a warning
I wondered why `.findOrCopyClass` told me it couldn't find the metadata object, when the very next thing it said was that it was making a copy in my package's namespace! I posed that question on the r-devel mailing list, and John Chambers kindly answered it for me (emphases added by me):
the purpose of finding the class definition is to update the entry for the new relationship, as the warning message suggests. That requires that the namespace holding the definition be writable.
In the case of subclass information, the original namespace is very likely to be locked, if it's not the package currently being loaded. Copying the definition in order to update subclass information seems the only reasonable choice, and no warning message is needed.
A revised version will omit this message.
My remaining conceptual problem related to the difference between what happens to the name of an object (eg a function or a class (to be precise, a function object or a class metadata object)) and what happens to the object itself. Or in other terms, whether a namespace (possibly locked) needs to be updated, or the (state of the) object itself is what needs to be updated.
It turns out that `.findOrCopyClass` is not complaining that it cannot find the (non-exported) superclass, but rather that it cannot find it in a place where the superclass's value can be changed.
What locked means
The `?base::bindenv` documentation says :
"The namespace environments of packages with namespaces are locked when loaded."
and locking an environment
"prevents adding or removing variable bindings from the environment. Changing the value of a variable is still possible unless the binding has been locked"
But a bit of experimentation reveals that as well as package namespace environments being locked, the bindings in the `imports:` environment of a package are locked. Locked bindings means that :
"The value of a locked binding cannot be changed".
Hence since the superclass metadata object comes from another package (eg `rgeos`), its binding in that other package's namespace is locked and hence its value cannot be changed. Or in terms of the example, when the `methods` package wants to add `MyClass1` and `MyClass2` to the list of subclasses that `SpatialLinesNULL` 'owns', it finds it cannot because the binding of `SpatialLinesNULL` (or to be precise, the binding of the hidden name `.__C__SpatialLinesNULL`) is locked. Hence the message about needing to make a copy!
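The difference between a locked environment and a locked binding is easy to try out on a scratch environment. This is a minimal sketch using only base R; the environment `e` and the variables in it are made up for the demonstration.

```r
e <- new.env()
assign("x", 1, envir = e)

# Locking the environment stops bindings being added or removed ...
lockEnvironment(e)
add_result <- tryCatch({
  assign("y", 2, envir = e)
  "added"
}, error = function(err) "blocked")

# ... but the value of an existing binding can still be changed.
assign("x", 10, envir = e)

# Locking the binding itself stops the value being changed too.
lockBinding("x", e)
change_result <- tryCatch({
  assign("x", 99, envir = e)
  "changed"
}, error = function(err) "blocked")

# A loaded package namespace is a locked environment.
ns_locked <- environmentIsLocked(asNamespace("stats"))

print(c(add = add_result, change = change_result))
```

`bindingIsLocked()` can be used in the same way to check an individual name, such as a hidden `.__C__` name, in a namespace or imports environment.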
An alternative to copying might be to temporarily unlock the binding in the namespace of the other package, update the object, and relock the binding - but since I have only just learnt about locking environments and bindings, I have no idea what the consequences of that might be. I'll leave well enough alone.
OK, how to hack a solution if the required superclass is not exported
Caveat I haven't tested this much, and it's a bad idea anyway, but if you are really really desperate to get the warning message to go away ...
Step1. Look at the warning message and find the (text) name of the missing superclass, and the package it comes from.
Step2. Use `classMetaName` to find the hidden and mangled name of the missing unexported superclass. For superclass `MissingSuperclass` it will probably be `.__C__MissingSuperclass`
Step3. Use the `:::` approach to access the metadata object (ie `donorpackage:::.__C__MissingSuperclass`) and give it the appropriate name (ie `.__C__MissingSuperclass`) in the appropriate environment belonging to your package, before calling `setClass` to create your own class.
As I've already said, this is a really bad idea, since it would be enormously easy to shoot yourself in the foot, and why bother anyway, since that's essentially what `.findOrCopyClass` and `setIs` seem to do after the warning message has been sent anyway.
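For completeness, the copying step might be sketched like this. It is an untested-in-anger illustration, not a recommendation: `copy_class_metadata` is a hypothetical helper, and in the demonstration the global environment merely stands in for the donor package while a fresh environment stands in for your package's namespace. In a real package the donor would be something like `asNamespace("donorpackage")`, and the copy would have to happen before your own `setClass` call.

```r
library(methods)

# Hypothetical helper: copy a class's metadata object from one
# environment to another under its specially constructed name.
copy_class_metadata <- function(class_name, donor_env, target_env) {
  meta_name <- classMetaName(class_name)
  meta_obj <- get(meta_name, envir = donor_env, inherits = FALSE)
  assign(meta_name, meta_obj, envir = target_env)
  invisible(meta_name)
}

# Stand-ins for the demonstration (assumptions, not a real package setup):
# the global environment plays the donor, 'target' plays your namespace.
setClass("DemoDonorClass", representation(x = "numeric"))
target <- new.env()
copied_name <- copy_class_metadata("DemoDonorClass", globalenv(), target)

print(ls(target, all.names = TRUE))
```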
The end
If you've read this far, I hope this has helped you! I've written it at such length mostly as a tutorial for a future me :-)