.proto file. This allows other .proto files to add to your message definition by defining the types of some or all of the fields with those field numbers.
The problem is that map is a pointer, so the [] operator does not work on it directly. The pointer needs to be dereferenced first. *map[key] also does not work, because [] binds more tightly than *, so the compiler applies [] first and then the *. The following does work:
(*map)[key] = val;
Alternatively, you could do: auto& map = *test.mutable_map1(); and then map[key] would work.
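The two workarounds can be tried without protobuf installed. The sketch below is only an illustration: `Test` is a hypothetical stand-in struct, not a generated protobuf class, and its `mutable_map1()` merely mimics an accessor that returns a pointer to a map.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a generated message: mutable_map1() returns a
// pointer to the underlying map, which is the situation described above.
struct Test {
    std::map<std::string, int> map1;
    std::map<std::string, int>* mutable_map1() { return &map1; }
};

// Option 1: dereference the pointer explicitly before indexing.
inline void set_via_pointer(Test& test, const std::string& key, int val) {
    std::map<std::string, int>* map = test.mutable_map1();
    (*map)[key] = val;  // parentheses make the dereference happen before []
}

// Option 2: bind a reference once, after which plain [] works.
inline void set_via_reference(Test& test, const std::string& key, int val) {
    auto& map = *test.mutable_map1();
    map[key] = val;
}
```

Both helpers modify the same underlying map; the reference form is usually easier to read when several entries are written in a row.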
Protobuf has many advantages for serialization that go beyond what XML offers. It lets you write a simpler description than XML does. Even for small messages, once several nested messages are required, XML becomes difficult for human eyes to read.
Another advantage is size: because the Protobuf format is compact, files can be up to 10 times smaller than their XML equivalents. The biggest benefit is speed, which can be up to 100 times faster than standard XML serialization, thanks to its optimized binary encoding. In addition to size and speed, Protobuf has a compiler that processes a .proto file and generates code for multiple supported languages, unlike the traditional approach where the same structure must be written out by hand in multiple source files.
- bool ParseFromString(const string& data): parses a message from the given string.
- bool SerializeToOstream(ostream* output) const: writes the message to the given C++ ostream.
- bool ParseFromIstream(istream* input): parses a message from the given C++ istream.
- use uint32 if the value cannot be negative
- use sint32 if the value is pretty much as likely to be negative as not (for some fuzzy definition of "as likely to be")
- use int32 if the value could be negative, but that's much less likely than the value being positive (for example, if the application sometimes uses -1 to indicate an error or 'unknown' value and this is a relatively uncommon situation)
As you saw in the previous section, all the protocol buffer types associated with wire type 0 are encoded as varints. However, there is an important difference between the signed int types (sint32 and sint64) and the "standard" int types (int32 and int64) when it comes to encoding negative numbers. If you use int32 or int64 as the type for a negative number, the resulting varint is always ten bytes long – it is, effectively, treated like a very large unsigned integer. If you use one of the signed types, the resulting varint uses ZigZag encoding, which is much more efficient.
ZigZag encoding maps signed integers to unsigned integers so that numbers with a small absolute value (for instance, -1) have a small varint encoded value too. It does this in a way that "zig-zags" back and forth through the positive and negative integers, so that -1 is encoded as 1, 1 is encoded as 2, -2 is encoded as 3, and so on.
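To make the size difference concrete, here is a minimal C++ sketch of the ZigZag mapping and of varint length. It does not use the protobuf library; the function names (zigzag_encode32, zigzag_decode32, varint_size, int32_wire_size, sint32_wire_size) are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// ZigZag mapping for 32-bit values: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
inline std::uint32_t zigzag_encode32(std::int32_t n) {
    const std::uint32_t sign_mask = (n < 0) ? 0xFFFFFFFFu : 0u;
    return (static_cast<std::uint32_t>(n) << 1) ^ sign_mask;
}

// Inverse mapping: the lowest bit carries the sign.
inline std::int32_t zigzag_decode32(std::uint32_t z) {
    return static_cast<std::int32_t>((z >> 1) ^ (0u - (z & 1u)));
}

// Number of bytes a varint needs: 7 payload bits per byte.
inline std::size_t varint_size(std::uint64_t value) {
    std::size_t bytes = 1;
    while (value >= 0x80) {
        value >>= 7;
        ++bytes;
    }
    return bytes;
}

// int32 payload size: a negative value is sign-extended to 64 bits first,
// which is why it always takes ten bytes.
inline std::size_t int32_wire_size(std::int32_t n) {
    return varint_size(static_cast<std::uint64_t>(static_cast<std::int64_t>(n)));
}

// sint32 payload size: the value is ZigZag-encoded before the varint step.
inline std::size_t sint32_wire_size(std::int32_t n) {
    return varint_size(zigzag_encode32(n));
}
```

With these helpers, -1 stored as int32 needs ten payload bytes while the same value stored as sint32 needs one, which is the trade-off behind the type-selection advice above.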
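The three bytes above split into two parts: 08 96 01. The first part (08) holds the field number of the message member (a = 1) and its wire type (Varint); the second part (96 01) is the actual value of a, 150.
Several concepts are involved here:
Varint: a variable-length integer encoding; the smaller the value, the fewer bytes it uses.
Field number and type: a protocol buffer message is a series of key-value pairs. The binary form of the message uses the field number as the key.
Each key in the message stream is a varint computed as (field_number << 3) | wire_type, i.e. the low three bits store the wire type.
The first byte above is 08, which is 0000 1000 in binary. The first bit of every varint byte is the MSB (continuation bit); when it is set, more bytes follow. Dropping the MSB leaves
000 1000
The low three bits give the wire type, which is 0, meaning Varint; shifting right by three bits yields the field number 1 (the a = 1 set in the message).
Next, decode the value 150. Note that a varint stores its least significant 7-bit group first (little-endian group order), so the groups must be reversed:
96 01 = 1001 0110 0000 0001 → 000 0001 ++ 001 0110 (drop the msb and reverse the groups of 7 bits) → 10010110 → 128 + 16 + 4 + 2 = 150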
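The manual decoding of 08 96 01 can be reproduced with a small C++ sketch. It assumes well-formed input and handles only a single varint-typed field; read_varint, DecodedField and decode_varint_field are names invented for this illustration, not part of the protobuf library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads one varint starting at buf[pos] and advances pos past it.
// Sketch only: assumes well-formed input, so there are no bounds checks.
inline std::uint64_t read_varint(const std::vector<std::uint8_t>& buf, std::size_t& pos) {
    std::uint64_t result = 0;
    int shift = 0;
    while (true) {
        const std::uint8_t byte = buf[pos++];
        // Low 7 bits are payload; the least significant group comes first.
        result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) break;  // MSB clear: this was the last byte
        shift += 7;
    }
    return result;
}

struct DecodedField {
    std::uint32_t field_number;
    std::uint32_t wire_type;
    std::uint64_t value;
};

// Decodes a key followed by a varint value, e.g. the bytes 08 96 01.
inline DecodedField decode_varint_field(const std::vector<std::uint8_t>& buf) {
    std::size_t pos = 0;
    const std::uint64_t key = read_varint(buf, pos);
    DecodedField field;
    field.wire_type = static_cast<std::uint32_t>(key & 0x7);   // low three bits
    field.field_number = static_cast<std::uint32_t>(key >> 3); // remaining bits
    field.value = read_varint(buf, pos);
    return field;
}
```

Calling decode_varint_field({0x08, 0x96, 0x01}) yields field number 1, wire type 0 and value 150, matching the step-by-step calculation.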
Builder from a Descriptor. A Descriptor has no type information as to the proto (or builder) class that it needs to create, because all Descriptor instances are of the same class (it's final).
If all you have is a Descriptor, your if/else is roughly as good as you can get. (I say roughly because you could do it with a map or a switch instead; but it's basically the same.)
Message prototype = Foo.getDefaultInstance(); // Or Bar.getDefaultInstance().
From a Message you can get both a builder and the descriptor:
Message.Builder builder = prototype.newBuilderForType();
Descriptor descriptor = prototype.getDescriptorForType();
http://giorgio.azzinna.ro/2017/07/extending-protobuf-dynamic-messages/
Message is an abstract interface, but whenever you call protoc the generated classes will subclass it, hence the frequent indirect usage.
Descriptor
Descriptor, as the name suggests, describes messages.
Again, think of protoc: once it effectively parses the .proto files, it will create a Descriptor for each message.
With this in mind, it should be clear when we need Descriptors or Messages. When dealing with actual objects filled with data, Message can be used (hand in hand with reflection).
When message definitions are unknown at compile-time, and should be generated at run-time, Descriptor does the job.
https://codeburst.io/using-dynamic-messages-in-protocol-buffers-in-scala-9fda4f0efcb3
https://pinkiepractices.com/posts/protobuf-field-masks/
The FieldMaskUtil.merge method will apply a field mask to a message for us. merge takes in a field mask, a source message, and a destination message builder. It sets fields in the destination builder, according to the field mask and source.
public FetchItemResponse fetchItem(FetchItemRequest request) {
    Item item = // fetch item as before
    // merge writes into a builder, so the destination must be an Item.Builder
    Item.Builder filteredItem = Item.newBuilder();
    FieldMaskUtil.merge(request.getFieldMask(), item, filteredItem);
    return FetchItemResponse.newBuilder()
        .setItem(filteredItem)
        .build();
}
https://developers.google.com/protocol-buffers/docs/proto3#scalar
https://developers.google.com/protocol-buffers/docs/proto
import "myproject/other_protos.proto";
- Don't change the field numbers for any existing fields.
- Any new fields that you add should be optional or repeated. This means that any messages serialized by code using your "old" message format can be parsed by your new generated code, as they won't be missing any required elements. You should set up sensible default values for these elements so that new code can properly interact with messages generated by old code. Similarly, messages created by your new code can be parsed by your old code: old binaries simply ignore the new field when parsing. However, the unknown fields are not discarded, and if the message is later serialized, the unknown fields are serialized along with it – so if the message is passed on to new code, the new fields are still available.
- Non-required fields can be removed, as long as the field number is not used again in your updated message type. You may want to rename the field instead, perhaps adding the prefix "OBSOLETE_", or make the field number reserved, so that future users of your .proto can't accidentally reuse the number.
- A non-required field can be converted to an extension and vice versa, as long as the type and number stay the same.
- optional is compatible with repeated. Given serialized data of a repeated field as input, clients that expect this field to be optional will take the last input value if it's a primitive type field or merge all input elements if it's a message type field.
repeated fields of scalar numeric types aren't encoded as efficiently as they could be. New code should use the special option [packed=true] to get a more efficient encoding. For example:
repeated int32 samples = 4 [packed=true];
.proto, including data corruption, privacy bugs, and so on. One way to make sure this doesn't happen is to specify that the field numbers (and/or names, which can also cause issues for JSON serialization) of your deleted fields are reserved. The protocol buffer compiler will complain if any future users try to use these field identifiers.
message Foo {
  reserved 2, 15, 9 to 11;
  reserved "foo", "bar";
}
Oneof
https://blog.bazel.build/2017/02/27/protocol-buffers.html
- proto_library is a language-agnostic rule that describes relations between .proto files.
- java_proto_library, java_lite_proto_library and cc_proto_library are rules that "attach" to proto_library and generate language-specific bindings.