14

In Google's Protocol Buffer API for Java, they use these nice Builders that create an object (see here):

Person john =
  Person.newBuilder()
    .setId(1234)
    .setName("John Doe")
    .setEmail("jdoe@example.com")
    .addPhone(
      Person.PhoneNumber.newBuilder()
        .setNumber("555-4321")
        .setType(Person.PhoneType.HOME))
    .build();

But the corresponding C++ API does not use such Builders (see here).

The C++ and the Java API are supposed to be doing the same thing, so I'm wondering why they didn't use builders in C++ as well. Are there language reasons behind that, i.e. is it not idiomatic, or is it frowned upon in C++? Or is it just the personal preference of the person who wrote the C++ version of Protocol Buffers?

Frank
  • 2
    I think it's likely the personal preference of the C++ implementer. Builders are not (in my experience, at least) frowned upon in C++ code, and in fact, I use them all over the place where an object may have a) many parameters or (more likely) b) many optional parameters. – moswald Feb 26 '10 at 16:33
  • One thing you failed to note in your question is that the Person class is immutable. – BD at Rivenhill Feb 04 '11 at 16:51

6 Answers

8

The proper way to implement something like that in C++ would use setters that return a reference to *this.

#include <string>

class Person {
  std::string name;
public:
  Person &setName(std::string const &s) { name = s; return *this; }
  Person &addPhone(PhoneNumber const &n);
};

The class could be used like this, assuming similarly defined PhoneNumber:

Person p = Person()
  .setName("foo")
  .addPhone(PhoneNumber()
    .setNumber("123-4567"));

If a separate builder class is wanted, then that can be done too. Such builders should be allocated on the stack, of course.
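A minimal sketch of what such a separate builder class might look like, assuming a `Person` that has no default constructor and no setters. The nested `Builder`, the `build()` validity check, and the `makeJohn` helper are illustrative names, not part of any Protocol Buffers API.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical immutable Person: no setters, and no default constructor
// because every Person needs an id.
class Person {
public:
  class Builder;
  int id() const { return id_; }
  const std::string &name() const { return name_; }

private:
  // Only the nested Builder can reach this constructor.
  Person(int id, std::string name) : id_(id), name_(std::move(name)) {}
  int id_;
  std::string name_;
};

class Person::Builder {
public:
  Builder &setId(int id) { id_ = id; hasId_ = true; return *this; }
  Builder &setName(std::string name) { name_ = std::move(name); return *this; }

  // Validates the collected arguments, then constructs the Person.
  Person build() const {
    if (!hasId_) throw std::invalid_argument("Person requires an id");
    return Person(id_, name_);
  }

private:
  int id_ = 0;
  bool hasId_ = false;
  std::string name_;
};

// The builder is a temporary on the stack; it is destroyed at the end of
// the full expression, after build() has returned the Person.
inline Person makeJohn() {
  return Person::Builder().setId(1234).setName("John Doe").build();
}
```

Because the builder collects the arguments first, `Person` itself never exists in a half-initialized state.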

hrnt
  • 1
    Note that this requires a default-constructed `Person`. If each `Person` needs an `id`, no such ctor may exist. A builder can solve the problem by collecting the arguments before creating the object. – MSalters Feb 19 '10 at 11:51
  • @MSalters Indeed, in those cases you should use the same idiom with the builder class (and the .build() member function that returns the Person object could check the validity of the object before construction). – hrnt Feb 19 '10 at 19:02
  • Your answer is missing one major point that the OP forgot to mention: the Java code is using the builder pattern here because the Person class is defined to be immutable and thus has no setter methods. – BD at Rivenhill Feb 04 '11 at 16:50
  • 1
    That defeats the whole purpose of the pattern: we don't want to have setters, that's why we do a static builder class in Java. – Rob Feb 01 '13 at 05:42
  • @Rob: The good thing about C++ is that, instead of creating an immutable class, you can make the variable p immutable by declaring it const, which effectively removes all setter methods for that specific object. However, I'd probably still go for an explicit builder class, in order to have a single function where one can check the validity of a certain parameter combination. – MikeMB May 24 '15 at 01:14
  • Yeah, I understand const semantics. I agree with your conclusion. Pretty builder-obsessed. – Rob May 24 '15 at 01:16
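The point about `const` in the comments above can be sketched as follows. This assumes the `Person` class from the answer, with a hypothetical `getName()` accessor added so the result can be observed.

```cpp
#include <string>

class Person {
  std::string name;
public:
  Person &setName(std::string const &s) { name = s; return *this; }
  // Hypothetical accessor, added only for this illustration.
  std::string const &getName() const { return name; }
};

inline std::string constDemo() {
  // The setter runs on the temporary before p is initialized from it.
  const Person p = Person().setName("foo");
  // p.setName("bar");  // would not compile: setName is not a const member
  return p.getName();
}
```

The class stays mutable, but this particular object cannot be modified after construction.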
4

I would go with the "not idiomatic", although I have seen examples of such fluent-interface styles in C++ code.

It may be because there are a number of ways to tackle the same underlying problem. Usually, the problem being solved here is that of named arguments (or rather the lack of them). An arguably more C++-like solution to this problem might be Boost's Parameter library.

philsquared
2

The difference is partially idiomatic, but is also the result of the C++ library being more heavily optimized.

One thing you failed to note in your question is that the Java classes emitted by protoc are immutable and thus must have constructors with (potentially) very long argument lists and no setter methods. The immutable pattern is used commonly in Java to avoid complexity related to multi-threading (at the expense of performance) and the builder pattern is used to avoid the pain of squinting at large constructor invocations and needing to have all the values available at the same point in the code.

The C++ classes emitted by protoc are not immutable and are designed so that the objects can be reused over multiple message receptions (see the "Optimization Tips" section on the C++ Basics Page); they are thus harder and more dangerous to use, but more efficient.

It is certainly the case that the two implementations could have been written in the same style, but the developers seemed to feel that ease of use was more important for Java and performance was more important for C++, perhaps mirroring the usage patterns for these languages at Google.

BD at Rivenhill
1

Your claim that "the C++ and the Java API are supposed to be doing the same thing" is unfounded. They're not documented to do the same things. Each output language can create a different interpretation of the structure described in the .proto file. The advantage of that is that what you get in each language is idiomatic for that language. It minimizes the feeling that you're, say, "writing Java in C++." That would definitely be how I'd feel if there were a separate builder class for each message class.

For an integer field foo, the C++ output from protoc will include a method void set_foo(int32 value) in the class for the given message.

The Java output will instead generate two classes. One directly represents the message, but only has getters for the field. The other class is the builder class and only has setters for the field.

The Python output is different still. The class generated will include a field that you can manipulate directly. I expect the plug-ins for C, Haskell, and Ruby are also quite different. As long as they can all represent a structure that can be translated to equivalent bits on the wire, they've done their jobs. Remember these are "protocol buffers," not "API buffers."

The source for the C++ plug-in is provided with the protoc distribution. If you want to change the return type for the set_foo function, you're welcome to do so. I normally avoid responses that amount to, "It's open source, so anyone can modify it" because it's not usually helpful to recommend that someone learn an entirely new project well enough to make major changes just to solve a problem. However, I don't expect it would be very hard in this case. The hardest part would be finding the section of code that generates setters for fields. Once you find that, making the change you need will probably be straightforward. Change the return type, and add a return *this statement to the end of the generated code. You should then be able to write code in the style given in Hrnt's answer.
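As a rough illustration of the effect of that change, here is a hand-written stand-in for a generated message class. It is not actual protoc output; the class and its `set_id`/`set_name` fields are assumptions chosen to mirror the question's `Person` example.

```cpp
#include <cstdint>
#include <string>

// Hand-written stand-in for a protoc-generated message class; the real
// generated code is considerably larger. The only change sketched here is
// that each setter returns a reference to the message instead of void.
class Person {
public:
  Person &set_id(std::int32_t value) { id_ = value; return *this; }
  Person &set_name(const std::string &value) { name_ = value; return *this; }

  std::int32_t id() const { return id_; }
  const std::string &name() const { return name_; }

private:
  std::int32_t id_ = 0;
  std::string name_;
};

inline Person makePerson() {
  Person p;
  p.set_id(1234).set_name("John Doe");  // chaining works with the new return type
  return p;
}
```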

Community
Rob Kennedy
1

To follow up on my comment...

#include <string>

struct Person
{
   int id;
   std::string name;

   struct Builder
   {
      int id;
      std::string name;
      Builder &setId(int id_)
      {
         id = id_;
         return *this;
      }
      Builder &setName(std::string name_)
      {
         name = name_;
         return *this;
      }
   };

   static Builder build(/* insert mandatory values here */)
   {
      return Builder(/* and then use mandatory values here */)/* or here: .setId(val) */;
   }

   Person(const Builder &builder)
      : id(builder.id), name(builder.name)
   {
   }
};

void Foo()
{
   Person p = Person::build().setId(2).setName("Derek Jeter");
}

This ends up getting compiled into roughly the same assembly as the equivalent code:

struct Person
{
   int id;
   std::string name;
};

Person p;
p.id = 2;
p.name = "Derek Jeter";

moswald
0

In C++ you have to explicitly manage memory, which would probably make the idiom more painful to use - either build() has to call the destructor for the builder, or else you have to keep it around to delete it after constructing the Person object. Either is a little scary to me.

Douglas Leeder
  • 6
    Couldn't you get around this by keeping everything on the stack? – cobbal Feb 19 '10 at 07:51
  • 4
    or using smart pointers (which amounts to the same thing, in a way) – philsquared Feb 19 '10 at 08:11
  • 6
    Just not true - temporary objects in C++ are trivial. They are destroyed at the end of the full expression, which is after the build. And with templates, creating such builders would be trivial, as you can create a generic one - doesn't need specialization. I.e. `Person p = Builder<Person>()(&Person::id, 1234)(&Person::name, "John Doe");` – MSalters Feb 19 '10 at 11:49
  • @MSalters boy that syntax is ugly, but I like your point about the temp objects. – Rob Feb 01 '13 at 05:43
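One possible reading of the generic builder suggested in the comments, sketched with pointers to members. The `Builder` template, its `operator()`, and the conversion operator are assumptions made for illustration; the comment does not spell out an implementation.

```cpp
#include <string>
#include <utility>

// Hypothetical generic builder: works for any default-constructible type T
// whose data members are accessible.
template <typename T>
class Builder {
public:
  // Assigns value to the member designated by the pointer-to-member.
  template <typename M, typename V>
  Builder &operator()(M T::*member, V &&value) {
    object_.*member = std::forward<V>(value);
    return *this;
  }

  // Lets the finished builder convert to a copy of the assembled object.
  operator T() const { return object_; }

private:
  T object_{};
};

struct Person {
  int id;
  std::string name;
};

inline Person makeJohn() {
  // The temporary builder lives until the end of the return expression.
  return Builder<Person>()(&Person::id, 1234)(&Person::name, "John Doe");
}
```

No per-class builder is needed here, at the cost of requiring accessible data members and giving up any validity check.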