
So, let's say I have a class with a contravariant type parameter:

  trait Storage[-T] {
    def list(path: String): Seq[String]
    def getObject[R <: T](path: String): Future[R]
  }

The idea of the type parameter is to constrain the implementation to the upper boundary of types that it can return. So, Storage[Any] can read anything, while Storage[avro.SpecificRecord] can read avro records, but not other classes:

def storage: Storage[avro.SpecificRecord] 
storage.getObject[MyAvroDoc]("foo") // works
storage.getObject[String]("bar") // fails
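
For anyone who wants to try this out, here is a minimal self-contained sketch of the idea. The names `Record`, `UserDoc` and `InMemoryStorage` are made up for illustration, and the `ClassTag` parameter is an assumption standing in for whatever runtime type evidence a real implementation would use.

```scala
import scala.concurrent.Future
import scala.reflect.ClassTag

object ContravariantStorageSketch {
  // T is the upper bound of everything this storage is able to return.
  trait Storage[-T] {
    def list(path: String): Seq[String]
    def getObject[R <: T](path: String)(implicit tag: ClassTag[R]): Future[R]
  }

  // Hypothetical record hierarchy standing in for avro.SpecificRecord.
  trait Record
  final case class UserDoc(name: String) extends Record

  // Hypothetical in-memory implementation, for illustration only.
  class InMemoryStorage[T](data: Map[String, T]) extends Storage[T] {
    def list(path: String): Seq[String] =
      data.keys.filter(_.startsWith(path)).toSeq.sorted

    // Succeeds only if the stored value really is an R at run time.
    def getObject[R <: T](path: String)(implicit tag: ClassTag[R]): Future[R] =
      data.get(path).flatMap(v => tag.unapply(v)) match {
        case Some(value) => Future.successful(value)
        case None        => Future.failed(new NoSuchElementException(s"no matching object at $path"))
      }
  }

  val anyStorage: Storage[Any] =
    new InMemoryStorage[Any](Map("users/ann" -> UserDoc("ann"), "notes/1" -> "hello"))

  // Contravariance: a storage that can read anything can stand in
  // for a storage that only promises to read Records.
  val recordStorage: Storage[Record] = anyStorage

  val ann: Future[UserDoc] = recordStorage.getObject[UserDoc]("users/ann") // compiles
  // recordStorage.getObject[String]("notes/1") // rejected: String is not a subtype of Record
}
```

The commented-out call is the compile-time check the type parameter is meant to provide: the value is physically present in the underlying map, but the static type `Storage[Record]` does not allow asking for a `String`.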

Now, I have a utility class that can be used to iterate through objects in a given location:

class StorageIterator[+T](
  val storage: Storage[_ >: T], 
  location: String
)(filter: String => Boolean) extends AbstractIterator[Future[T]] {
  val it = storage.list(location).iterator.filter(filter)
  def hasNext = it.hasNext
  def next() = storage.getObject[T](it.next())
}

This works, but sometimes I need to access the underlying storage from the iterator downstream, to read another type of object from an aux location:

def storage: Storage[avro.SpecificRecord]
val iter = new StorageIterator[MyAvroDoc](storage, "foo")(_ => true)
iter.storage.getObject[AuxAvroDoc](aux)

This does not work, of course, because the type parameter of storage is a wildcard, and there is no proof that it can be used to read AuxAvroDoc.

I try to fix it like this:

class StorageIterator2[P, +T <: P](storage: Storage[P])
  extends StorageIterator[T](storage)

This works, but now I have to specify two type params when creating it, and that sucks :( I tried to work around it by adding a method to the Storage itself:

trait Storage[-T] {
  ... 
  def iterate[R <: T](path: String) = 
    new StorageIterator2[T, R](this, path)
}

But this doesn't compile because it puts T into an invariant position :( And if I make P contravariant, then StorageIterator2[-P, +T <: P] fails, because it thinks that P occurs in covariant position in type P of value T.

This last error I don't understand. Why exactly cannot P be contravariant here? If this position is really covariant (why is it?) then why does it allow me to specify an invariant parameter there?

Finally, does anyone have an idea how I can work around this? Basically, the idea is to be able to

  1. Do storage.iterate[MyAvroDoc] without having to give it the upper boundary again, and
  2. Do iterator.storage.getObject[AnotherAvroDoc] without having to cast the storage to prove that it can read this type of object.

Any ideas are appreciated.

Dima

1 Answer


StorageIterator2[-P, +T <: P] fails because it is nonsensical. Suppose you have a StorageIterator2[Foo, Bar] with Bar <: Foo. Because it is contravariant in the first parameter, it is also a StorageIterator2[Nothing, Bar]. But Nothing has no subtypes other than itself, so Bar <: Nothing cannot hold, yet that is exactly what the bound requires of any StorageIterator2[Nothing, Bar]. Therefore, such a StorageIterator2 cannot exist.
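
The same argument can be sketched in code. This is only an illustration: `Foo`, `Bar` and `Iter2` are hypothetical stand-ins (`Iter2` plays the role of StorageIterator2 with its constructor arguments stripped away), and the rejected declarations are left as comments so the block still compiles.

```scala
object VarianceSketch {
  class Foo
  class Bar extends Foo

  // Invariant P: accepted. The bound T <: P is checked wherever the type is used.
  class Iter2[P, +T <: P](val label: String)

  val narrow: Iter2[Foo, Bar] = new Iter2[Foo, Bar]("bar-iterator")

  // Covariance in T is fine: Foo still satisfies the bound Foo <: Foo.
  val widened: Iter2[Foo, Foo] = narrow

  // If P were contravariant, the following upcast would have to be legal too,
  // but Iter2[Nothing, Bar] violates the bound Bar <: Nothing:
  // val impossible: Iter2[Nothing, Bar] = narrow

  // This is why the compiler refuses the declaration itself:
  // class Bad[-P, +T <: P]
  //   error: contravariant type P occurs in covariant position in type <: P of type T
}
```

With an invariant P the first parameter can never be moved to another type, so the bound established at construction time stays valid for the lifetime of the value.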

The root problem is that Storage should not be contravariant. Think of what a Storage[T] is, in terms of the contract it gives to its users. A Storage[T] is an object that you give paths to, and will output Ts. It makes perfect sense for Storage to be covariant: something that knows how to output Strings, for example, is also outputting Anys, so it makes logical sense that Storage[String] <: Storage[Any]. You say that it should be the other way around, that a Storage[T] should know how to output any subtype of T, but how would that work? What if someone adds a subtype to T after the fact? T can even be final and still have this problem, because of singleton types. That is unnecessarily complicated, and is reflected in your problem. That is, Storage should be

trait Storage[+T] {
  def list(path: String): Seq[String]
  def get(path: String): T
}

This does not open you up to the example mistake you gave in your question:

val x: Storage[avro.SpecificRecord] = ???
x.get(???): avro.SpecificRecord // ok
x.get(???): String // nope
(x: Storage[String]).get(???) // nope

Now, your issue is that you can't do something like storage.getObject[T] and have the cast be implicit. You can instead use a match:

storage.get(path) match {
  case value: CorrectType => ...
  case _ => // Oops, something else. Error?
}

a plain asInstanceOf (undesirable), or you can add a helper method to Storage, like the one you had before:

// No `U <: T` bound here: T is covariant, so it cannot be the upper bound of a method type parameter.
def getCast[U](path: String)(implicit tag: ClassTag[U]): Option[U] = tag.unapply(get(path))
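
A runnable sketch of such a helper follows, under a few assumptions. `Animal`, `Dog`, `Cat`, `MapStorage` and `StorageOps` are hypothetical names. The helper is written as an extension method in an implicit class rather than as a member of the trait, because a `U <: T` bound on a method declared inside a covariant `Storage[+T]` puts T in a contravariant position and is rejected by the compiler, whereas outside the trait the bound is allowed.

```scala
import scala.reflect.ClassTag

object CastHelperSketch {
  trait Storage[+T] {
    def list(path: String): Seq[String]
    def get(path: String): T
  }

  // Extension method: the bound U <: T is legal here because T is an
  // ordinary (invariant) type parameter of StorageOps.
  implicit class StorageOps[T](val storage: Storage[T]) {
    def getCast[U <: T](path: String)(implicit tag: ClassTag[U]): Option[U] =
      tag.unapply(storage.get(path)) // Some(value) if it is a U at run time, else None
  }

  // Hypothetical domain types and in-memory storage, for illustration only.
  sealed trait Animal
  final case class Dog(name: String) extends Animal
  final case class Cat(name: String) extends Animal

  class MapStorage[T](data: Map[String, T]) extends Storage[T] {
    def list(path: String): Seq[String] = data.keys.filter(_.startsWith(path)).toSeq.sorted
    def get(path: String): T = data(path)
  }

  val pets: Storage[Animal] =
    new MapStorage[Animal](Map("pets/rex" -> Dog("rex"), "pets/tom" -> Cat("tom")))

  val rex: Option[Dog] = pets.getCast[Dog]("pets/rex")
  // pets.getCast[String]("pets/rex") // does not compile: String is not a subtype of Animal
}
```

Asking for a type outside the hierarchy is a compile error, while asking for the wrong subtype (a `Dog` where a `Cat` is stored) yields `None` and leaves the decision to the caller.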
HTNW
  • "A Storage[T] is an object that you give paths to, and will output Ts." This is NOT true. `T` is NOT the type of the object `Storage` returns, but a common supertype of _all_ objects it can possibly return (granted, they are all Ts in a way, but that's a technicality, not a conceptual characteristic). It needs to be contravariant, because `Storage[Any]` can be used like `Storage[Foo]`. – Dima Aug 18 '17 at 11:10
  • "You say that it should be the other way around, that a Storage[T] should know how to output any subtype of T, but how would that work?" I showed one example of it in the question. For example, if `T` is a base class for avro structs, then `Storage` can read any avro doc. It could be a `ThriftStruct` to read thrift, or `Message` for protobuf, a `Product` for csv, etc. It does not make sense to have a separate `Storage` class for every avro doc in the application, and that's what would have to happen if it was covariant. – Dima Aug 18 '17 at 11:12
  • Casting and `match`'ing could be a workaround, but that's such a java way to do things. Avoiding this kind of hacks is exactly the purpose of having type variance in the first place. The idea is to be able to declare at compile time: "this object can read any avro doc and nothing else", not to generate errors at run time, when it runs into something it is not supposed to. – Dima Aug 18 '17 at 11:18
  • One more thing to note: `StorageIterator2[-P, T <: P]` fails, but `StorageIterator[T, -P >: T]` actually WORKS, even though it is actually the same thing :) – Dima Aug 18 '17 at 20:06
  • You cannot *implement* a `Storage[-T]` such that it has a method `getObject[R <: T](...): Future[R]`. How would you *create* such a method? It needs to output an `R` for any subtype `R` of `T`, but it doesn't know what `R` is. It needs to read a value, and the only thing it can possibly know is "the thing I just read is a `T`." It can't tell that it's an `R`, and it can't test for it because `R` is an unknown. And it makes *sense* that you cannot create a `Storage[Sup]` and have it be a `Storage[_ <: Sup]`. All I have to do is create a new subclass of `Sup` it doesn't know and it breaks. – HTNW Aug 19 '17 at 00:10
  • Further, `match`ing is not a workaround: it lets you handle the case where you read something that you needed to be a `Sub1` but it was actually a `Sub2`. With your current design the `Storage` is the one that decides how it fails, but `match`, which is a very Scala/FP thing, shifts the responsibility to the caller, where it should be. – HTNW Aug 19 '17 at 00:12
  • You are mistaken. I certainly CAN implement `Storage[-T]` like that, and I HAVE implemented it exactly that way. It has been working for me for a while :) It knows what `R` is from `: Manifest` (I have just omitted it from signature here, because it is irrelevant). – Dima Aug 19 '17 at 12:26
  • As for `match` being a workaround, what I mean by that is that it gives up the compile time type safety. If my `Storage` only knows how to read strings, and I do `storage.getObject[Int]` I want the compiler to tell me I am doing something wrong, without having to wait till it crashes at run time. – Dima Aug 19 '17 at 12:28