
There are several questions about field presence tracking for scalar fields in protobuf 3, but I didn't find any that recommends a general default approach.

It can be useful to track field presence in a protobuf message. I see at least these cases where it helps:

  1. API usage simplicity. In this case we provide sane default values for missing fields and let the user set only what's actually needed. This is the same way default function parameters work in many programming languages.
  2. API evolution. If we introduce a new field that effectively deprecates or replaces an old field, we can just add the new one to the same message and use whichever one is set.
  3. Using the same messages for getting/setting entities and for partial updates, aka patches, of the entities.

Now, protobuf 3, unlike protobuf 2, does not track the presence of scalar fields by default. In other words, while protobuf 2 scalar fields had explicit presence, protobuf 3 scalar fields have no presence unless you opt in. E.g., with double foo = 1; in a message, you can't distinguish whether it was set to zero explicitly or not set at all. But there are multiple ways to overcome this: optional, oneof, wrappers, FieldMask, and maybe others.

So, there are two main approaches to using scalar fields:

  1. Use normal scalar fields with no presence tracking by default, and use fields with explicit presence tracking (e.g. optional) only where needed. Generally this means that, if we want all the use cases above to work, we can skip presence tracking when zero is not in the valid domain of a field and use explicit presence tracking otherwise. The problem with this approach, as I see it, arises when zero is not in the valid domain at first but later becomes a valid value for the field.
  2. Use some approach which allows explicit presence tracking of scalar fields by default. E.g. instead of using double foo = 1; use optional double foo = 1; or google.protobuf.DoubleValue foo = 1; by default.
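To make the two approaches concrete, here is a minimal sketch. The package, message, and field names are hypothetical and only serve to illustrate the difference.

```proto
syntax = "proto3";

package example;  // hypothetical package name

import "google/protobuf/wrappers.proto";

// Approach 1: no presence by default, explicit presence only where needed.
message SettingsImplicit {
  // Assumes 0 is outside the valid domain, so 0 can double as "not set".
  double timeout_seconds = 1;
  // Assumes 0 is a valid value, so presence is tracked explicitly.
  optional double offset = 2;
}

// Approach 2: explicit presence for every scalar field.
message SettingsExplicit {
  optional double timeout_seconds = 1;
  // The same idea expressed with a wrapper message instead of optional.
  google.protobuf.DoubleValue offset = 2;
}
```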

What approach would you recommend by default? Also, what specific approach (optional, wrappers from google.protobuf, etc) would you recommend for explicit presence tracking of scalar fields? With the introduction of optional in protobuf 3, is there any reason to use other means? And why? Or maybe there is something which I overlooked.

Alex Che

1 Answer


How you track field presence is a significant design decision in protocol buffer (protobuf) messages, especially as the API evolves. As you have correctly identified, both approaches (no presence tracking by default with explicit presence only where needed, or explicit presence by default) have their own trade-offs.

Here is a brief comparison of various options and their pros/cons:

1. Non-Presence Tracking by Default:

This approach optimizes for compact serialized messages and simple handling code. However, it assumes that the zero value of every field is either a meaningful default or outside the field's valid domain, and that this won't change. Keep in mind that in proto3 a field without presence that holds its default value is not serialized at all, so the receiver cannot tell an explicit zero from an omitted field. This approach is more suitable when you control your API clients and can ensure they interpret default values the same way you do.

2. Presence Tracking by Default:

This approach is more flexible because a zero value never has to double as "not set", which allows more precise communication of field values between API clients and the server. It can also help API evolution: the server can tell whether a client actually sent a newer field or simply predates it. The costs are extra presence checks in message handling code and, with wrapper types, slightly larger serialized messages. This approach is more suitable for public APIs where you need to minimize the potential for breaking changes.

In terms of specific approaches for presence tracking in protobuf3:

  • Optional Keyword: The optional keyword gives a proto3 scalar field explicit presence, effectively restoring a feature of protobuf 2. It was introduced as an experimental feature in protobuf 3.12 (behind the --experimental_allow_proto3_optional flag of protoc) and has been enabled by default since 3.15. An optional field uses the same wire encoding as a plain scalar; the only difference is that an explicitly set default value is serialized. The cost is that your code has to check for the presence of these fields.

  • Wrapper Types: Wrapper types (like google.protobuf.DoubleValue) are messages that hold a single scalar value. Message fields always have presence, so an unset wrapper can be told apart from a wrapper holding zero. Wrappers predate the optional keyword and map naturally to nullable values in JSON and in languages that distinguish primitives from objects (like Java). However, each set wrapper adds the tag and length bytes of a nested message, and the handling code is more complex.

  • Oneof: A oneof with a single member also gives that member explicit presence, but the primary use of oneof is representing "union" types, where only one field out of a set can be set at any given time. Using oneof purely for presence tracking adds boilerplate; the optional keyword covers the same need more directly (protoc represents an optional field internally as a synthetic single-member oneof).

  • FieldMask: A FieldMask is typically used for update operations where you need to specify which fields to update. This is a more advanced feature and might be overkill for simple presence tracking. It's more appropriate when you need to represent a subset of fields to be updated in an API request.
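The in-message options above can be sketched side by side as follows. This is only an illustration: the package, message, and field names are made up, and a real schema would normally pick one mechanism rather than mix all of them.

```proto
syntax = "proto3";

package example;  // hypothetical package name

import "google/protobuf/wrappers.proto";

message Item {
  // No presence: after parsing, 0 and "not set" are indistinguishable.
  double weight = 1;

  // Explicit presence via the optional keyword. Generated code gains a
  // presence check, e.g. has_price() in C++ or HasField("price") in Python.
  optional double price = 2;

  // Explicit presence via a wrapper message: an unset field is simply
  // a missing submessage.
  google.protobuf.DoubleValue discount = 3;

  // Explicit presence via a single-member oneof.
  oneof rating_present {
    double rating = 4;
  }
}
```

With this schema, a sender that sets price to 0 still puts the field on the wire, while weight set to 0 is omitted entirely.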

To summarize, the choice between these options depends largely on your use case, the specific requirements of your API, the expected evolution of your API, and the language you are working in. Generally, using the optional keyword or wrapper types would be the most common approach for presence tracking. If your use case is more specific, like updating a subset of fields, then you might want to consider using FieldMask.
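For the partial-update case, a FieldMask usually travels next to the entity in the request rather than inside it. A minimal sketch, assuming hypothetical Item and UpdateItemRequest messages:

```proto
syntax = "proto3";

package example;  // hypothetical package name

import "google/protobuf/field_mask.proto";

message Item {
  string name = 1;
  double price = 2;
}

message UpdateItemRequest {
  // New values; the server is expected to apply only the fields
  // named in update_mask.
  Item item = 1;
  // Paths of the fields to overwrite, e.g. "price".
  google.protobuf.FieldMask update_mask = 2;
}
```

Here a request with price left at 0 and "price" listed in update_mask unambiguously means "set the price to zero", even though price itself has no presence.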

Lastly, regardless of the approach you take, it's essential to clearly document your API's behavior, especially around default values and field presence, so that API clients know what to expect.

tomasantunes