> But Protobuf is not a splittable format, which means it is not Big Data friendly, unlike Parquet, Avro and CSV.
What? When I worked at Google, concatenating serialized protobuf strings was a common way to combine protobufs; they are absolutely splittable (see the concatenation sketch below). People might not know it, but there is a reason they are designed the way they are: it is to handle big data, as you say.
If you mean you can't split a single protobuf, sure, but you don't need to do that; that is like trying to split a single CSV line, which doesn't mean CSV isn't splittable.
Edit: Maybe some of that tooling for working with protobufs is internal-only or not part of the external protobuf packages, though.
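For what it's worth, the concatenation behaviour described above is documented protobuf semantics: parsing the concatenation of two serialized messages of the same type merges them, with scalar fields taking the last value seen and repeated fields appended. A minimal sketch in Python, assuming a hypothetical generated message `Record` with a `repeated int32 values` field (not something from this thread):

```python
# Minimal sketch of protobuf's concatenation/merge property.
# `record_pb2.Record` is a hypothetical generated message with `repeated int32 values`.
from record_pb2 import Record

a = Record(values=[1, 2])
b = Record(values=[3, 4])

# Parsing the concatenation of two serializations merges the messages:
# scalar fields keep the last value seen, repeated fields are concatenated.
merged = Record()
merged.ParseFromString(a.SerializeToString() + b.SerializeToString())
assert list(merged.values) == [1, 2, 3, 4]
```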
I laughed. We don't call it "design" when somebody writes a poor varint serializer for in-house use, then discovers it doesn't handle negative numbers and floats that well and slaps a couple of hotfixes on top that make a protocol specification file necessary.
Protobuf does not have built-in delimiters or sync markers between records, which makes it impossible to start reading from an arbitrary point in the middle of a Protobuf-encoded file and correctly interpret the data. That is what makes Protobuf a non-splittable format.
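The distinction here is the Hadoop/Spark sense of "splittable": a worker handed an arbitrary byte range of a file must be able to find the next record boundary on its own. Even when protobuf records are written with a length prefix (a common workaround for streams), there is no sync marker to resynchronize on, so a reader has to start at byte 0 and walk every prefix. A rough sketch, reusing the hypothetical `Record` message from the example above and an assumed 4-byte length prefix (protobuf itself does not mandate any particular framing):

```python
# Length-prefixed protobuf framing: readable sequentially, but not splittable,
# because an arbitrary byte offset gives no way to locate the next record boundary.
import struct
from record_pb2 import Record  # hypothetical generated message, as above

def write_records(path, records):
    with open(path, "wb") as f:
        for rec in records:
            payload = rec.SerializeToString()
            f.write(struct.pack("<I", len(payload)))  # 4-byte little-endian length
            f.write(payload)

def read_records(path):
    # Must start at the beginning of the file and walk prefix by prefix;
    # there is no marker to seek to from the middle of the file.
    with open(path, "rb") as f:
        while prefix := f.read(4):
            (size,) = struct.unpack("<I", prefix)
            rec = Record()
            rec.ParseFromString(f.read(size))
            yield rec
```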