I have a base-class message:

message Animal {
  optional string name = 1;
  optional int32 age = 2;
}

and a subclass that extends Animal:

message Dog {
  optional string breed = 1;
}

So when building a Dog message, I should be able to set all the fields of Animal. I know the roundabout way of doing it (declaring all the Animal fields again in the Dog message), but is there a simple and effective way to do this with Protocol Buffers? I have also read about extensions, and as I understand it they only add new fields to an already existing message, so they should not be mistaken for a way to achieve inheritance.

Is it possible to achieve the simple design above using Protocol Buffers extensions?

Aarish Ramesh

1 Answer

There are a few different ways to accomplish "inheritance" in Protocol Buffers. Which one you want depends on your use case.

Option 1: Subclass contains superclass

message Animal {
  optional string name = 1;
  optional int32 age = 2;  
}

message Dog {
  required Animal animal = 1;
  optional string breed = 2;
}

Here, Dog contains an Animal, thus contains all the information of Animal.

This approach works if you do not need to support down-casting. That is, you never have to say "Is this Animal a Dog?" So, anything which might need to access the fields of Dog needs to take a Dog as its input, not an Animal. For many use cases, this is fine.

Option 2: Superclass contains all subclasses

message Animal {
  optional string name = 1;
  optional int32 age = 2;

  // Exactly one of these should be filled in, depending on the species.
  optional Dog dog = 100;
  optional Cat cat = 101;
  optional Axolotl axolotl = 102;
  // ...
}

In this approach, given an Animal, you can figure out which animal it is and access the species-specific information. That is, you can down-cast.

This works well if you have a fixed list of "subclasses". Just list all of them, and document that only one of the fields should be filled in. If there are a lot of subclasses, you might want to add an enum field to indicate which one is present, so that you don't have to individually check has_dog(), has_cat(), has_axolotl(), ...
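
As a rough illustration of the enum idea, here is a minimal sketch of option 2 with a discriminator field. The enum name Species, its values, the field name species, the field number 3, and the stub Cat and Axolotl messages are all assumptions made for this example; none of them come from the original schema.

```protobuf
syntax = "proto2";

// Stub subclass messages so the sketch compiles on its own.
// Only Dog's "breed" field appears in the original question.
message Dog {
  optional string breed = 1;
}
message Cat {}
message Axolotl {}

message Animal {
  // Hypothetical discriminator: says which species field below is set.
  enum Species {
    UNKNOWN = 0;
    DOG = 1;
    CAT = 2;
    AXOLOTL = 3;
  }

  optional string name = 1;
  optional int32 age = 2;
  optional Species species = 3;  // Assumed field name and number.

  // Exactly one of these should be filled in, matching "species".
  optional Dog dog = 100;
  optional Cat cat = 101;
  optional Axolotl axolotl = 102;
}
```

With a field like this, a reader can switch on the species value instead of probing each has_...() accessor in turn. The schema itself does not enforce that species agrees with the field that is actually set, so that rule still has to be documented and checked by the code that builds the message.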

Option 3: Extensions

message Animal {
  optional string name = 1;
  optional int32 age = 2;
  extensions 100 to max;  // Should contain exactly one "species" extension.
}

message Dog {
  optional string breed = 1;
}

extend Animal {
  optional Dog animal_dog = 100;
  // (The number must be unique among all Animal extensions.)
}

This option is actually semantically identical to option #2! The only difference is that instead of declaring lots of optional fields inside Animal, you are declaring them as extensions. Each extension effectively adds a new field to Animal, but you can declare them in other files, so you don't have to have one central list, and other people can add new extensions without editing your code. Since each extension behaves just like a regular field, other than having somewhat-weird syntax for declaring and accessing it, everything behaves the same as with option #2. (In fact, in the example here, the wire encoding would even be the same, since we used 100 as the extension number, and in option 2 we used 100 as the field number.)

This is the trick to understanding extensions. Lots of people get confused because they try to equate "extend" to inheritance in object-oriented languages. Don't do that! Just remember that extensions behave just like fields, and that options 2 and 3 here are effectively the same. It's not inheritance... but it can solve the same problems.
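
To make the "declare them in other files" point concrete, here is a minimal sketch of a second species added from a separate file. The file names cat.proto and animal.proto, the Cat message, its indoor field, and the extension number 101 are all hypothetical; the sketch only assumes that the Animal definition from option 3 lives in the imported file.

```protobuf
// cat.proto: a hypothetical file maintained separately from Animal.
syntax = "proto2";

// Assumes the option 3 definition of Animal is in animal.proto
// (the file name is illustrative).
import "animal.proto";

message Cat {
  optional bool indoor = 1;  // Hypothetical species-specific field.
}

extend Animal {
  // 101 is an assumed free number; it must not collide with any other
  // Animal extension, such as animal_dog = 100.
  optional Cat animal_cat = 101;
}
```

Nothing in the file that defines Animal has to change for this to work, which is the practical difference from option 2.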

Kenton Varda
  • It is probably just saying the same thing as your existing explanation, but it is also worth noting that option 2 (/3) can satisfy the LSP, and can cope with new subclasses appearing between versions - I suspect both of these are just direct results of "can support downcasting", though. – Marc Gravell Jan 31 '14 at 08:09
  • @Kenton: Downcasting is exactly what I want, so option 2 will do. Thanks. – Aarish Ramesh Jan 31 '14 at 09:03
  • @MarcGravell: I don't follow. What is LSP? – Aarish Ramesh Jan 31 '14 at 09:05
  • @aarish http://en.wikipedia.org/wiki/Liskov_substitution_principle; basically, where the code expects an `Animal`, you can give it a `Dog` and it will work; in this case "giving it a Dog" means "giving it an Animal that has a non-null Dog component", unless your protobuf implementation has actual inheritance mapping using the implementation Kenton describes (and I know of at least one that does ;p) – Marc Gravell Jan 31 '14 at 09:10
  • @MarcGravell: got it:) – Aarish Ramesh Jan 31 '14 at 09:40
  • I would argue that all the options support LSP. In option 1, you simply have to add `.animal()` when you want to use a `Dog` as an `Animal`, but that's just syntax, not something that would affect the design or layout of your code. – Kenton Varda Jan 31 '14 at 23:15
  • Downcasting can be achieved in any of the ways described above, but is up-casting possible by any chance? Please answer this question: http://stackoverflow.com/questions/21496803/upcasting-the-downcasted-object-back-in-protobuffers – Aarish Ramesh Feb 01 '14 at 10:21
  • @KentonVarda: Regarding the second option, do you mean checking whether an Animal is a Dog by looking up the field number/name through Animal's field descriptor, or is there another way of downcasting? – Aarish Ramesh Feb 01 '14 at 10:27
  • @aarish - With option 2, to check if an `Animal` is a `Dog`, simply call `animal.hasDog()`. For option 3, you would have to say `animal.hasExtension(animalDog)`. You don't need to use descriptors. – Kenton Varda Feb 02 '14 at 05:29