- Language: C++
- Created: 08/26/2014
- Last updated: 10/07/2024
- License: Other
- autowiki
- Software Version: u-0.0.1Basic
- Generated from: Commit 11e0b5
- Generated on: 10/07/2024
protobuf
Protocol Buffers (protobuf) is a data serialization format and library developed by Google. It allows developers to define structured data schemas and generate code for efficiently serializing and deserializing data across different programming languages and platforms. This repository contains the implementation of Protocol Buffers for multiple programming languages, including C++, Java, Python, C#, Objective-C, PHP, Ruby, and Rust.
The core functionality of Protocol Buffers is implemented in the …/protobuf directory. This includes:
- A compiler that generates language-specific code from protocol buffer definition files, which define the structure of messages. The compiler is implemented in …/compiler.
- Binary encoding and decoding of messages, implemented in …/io.
- Utility functions for common operations like JSON conversion and message comparison, found in …/util.
The Protocol Buffers library uses several key design choices and algorithms:
- Messages are defined using a language-agnostic schema format.
- The library generates optimized, language-specific code for serialization and deserialization.
- A compact binary wire format is used for serialized data, with variable-length encoding for integers.
- An arena-based memory allocation strategy is employed for efficient memory management, particularly in C++.
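The variable-length integer encoding mentioned above can be illustrated with a small standalone sketch. It follows the base-128 varint layout of the protobuf wire format (seven payload bits per byte, least-significant group first, continuation bit in the high position). The function names AppendVarint and ReadVarint are hypothetical helpers chosen for this example; they are not the library's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends `value` to `out` as a base-128 varint: seven payload bits per byte,
// least-significant group first, high bit set on every byte except the last.
// (Hypothetical helper for illustration, not part of the protobuf library.)
void AppendVarint(std::uint64_t value, std::vector<std::uint8_t>& out) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>(value | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Reads one varint starting at `pos`. On success, stores the decoded value,
// advances `pos` past it, and returns true. Returns false if the input is
// truncated or the varint is longer than 10 bytes.
bool ReadVarint(const std::vector<std::uint8_t>& in, std::size_t& pos,
                std::uint64_t& value) {
  std::uint64_t result = 0;
  std::size_t i = pos;
  for (int shift = 0; shift < 64; shift += 7) {
    if (i >= in.size()) return false;  // ran out of bytes
    const std::uint8_t byte = in[i++];
    result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {  // continuation bit clear: last byte
      value = result;
      pos = i;
      return true;
    }
  }
  return false;  // no terminating byte within 10 bytes
}
```

With this scheme, values below 128 occupy a single byte, and 300 is encoded as the two bytes 0xAC 0x02.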
For each supported programming language, the repository contains a separate implementation:
- C++: The primary implementation, found in …/protobuf.
- Java: Located in java, with core functionality in …/protobuf.
- Python: Implemented in …/protobuf.
- C#: Found in …/Google.Protobuf.
- Objective-C: Located in objectivec.
- PHP: Implemented in …/Protobuf.
- Ruby: Found in ruby, with implementations for both MRI and JRuby.
- Rust: Located in rust.
The repository also includes:
- A conformance test suite in conformance to ensure consistent behavior across different language implementations.
- Benchmarking tools in benchmarks for measuring and comparing performance.
- The UPB (micro protobuf) library in upb, a lightweight C implementation designed for resource-constrained environments.
For more details on specific components, refer to the relevant sections in this wiki, such as Compiler for information on the code generation process, or Core Serialization and Deserialization for details on the binary format and encoding/decoding algorithms.
Compiler
References: src/google/protobuf/compiler
The CommandLineInterface class serves as the main entry point for the Protocol Buffer compiler, handling command-line arguments and orchestrating the code generation process. It allows registration of language-specific code generators through the RegisterGenerator method, enabling extensibility for different target languages.
Language-Specific Code Generators
References: src/google/protobuf/compiler/cpp, src/google/protobuf/compiler/java, src/google/protobuf/compiler/python, src/google/protobuf/compiler/csharp, src/google/protobuf/compiler/objectivec, src/google/protobuf/compiler/php, src/google/protobuf/compiler/ruby, src/google/protobuf/compiler/rust
The CppGenerator class in …/generator.h is the main entry point for C++ code generation. It implements the Generate() method to produce C++ source and header files from Protocol Buffer definitions.
Core Compiler Components
References: src/google/protobuf/compiler
The Parser class in …/parser.cc is responsible for parsing .proto files and generating FileDescriptorProto objects. It handles:
Objective-C Code Generation Enhancements
References: src/google/protobuf/compiler/objectivec/generator.cc, src/google/protobuf/compiler/objectivec/generator.h
In the Objective-C implementation of Protocol Buffers, the ObjectiveCGenerator class is central to the code generation process. It extends the CodeGenerator interface to produce Objective-C code from Protocol Buffer definitions. The ObjectiveCGenerator is defined in …/generator.h and its functionality is implemented in …/generator.cc.
Compiler Tools and Utilities
References: src/google/protobuf/compiler/cpp/tools
The …/tools directory contains utilities for analyzing and optimizing protobuf definitions during compilation:
Core Serialization and Deserialization
References: src/google/protobuf/io, src/google/protobuf/util
The CodedInputStream and CodedOutputStream classes in …/coded_stream.h handle reading and writing of protocol buffer messages.
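As background for what such stream classes read and write, every field in the protobuf wire format is preceded by a tag that packs the field number and a three-bit wire type into one integer, which is then written as a varint. The sketch below shows only that packing. The names MakeTag, TagFieldNumber, and TagWireType are hypothetical helpers for this example, not the signatures declared in coded_stream.h.

```cpp
#include <cstdint>

// Wire types defined by the protobuf encoding.
enum WireType : std::uint32_t {
  kVarint = 0,
  kFixed64 = 1,
  kLengthDelimited = 2,
  kStartGroup = 3,
  kEndGroup = 4,
  kFixed32 = 5,
};

// Packs a field number and wire type into a tag: the low three bits hold the
// wire type, the remaining bits hold the field number.
// (Hypothetical helper names, used here only for illustration.)
constexpr std::uint32_t MakeTag(std::uint32_t field_number, WireType type) {
  return (field_number << 3) | static_cast<std::uint32_t>(type);
}

// Extracts the field number from a tag.
constexpr std::uint32_t TagFieldNumber(std::uint32_t tag) { return tag >> 3; }

// Extracts the wire type from a tag.
constexpr WireType TagWireType(std::uint32_t tag) {
  return static_cast<WireType>(tag & 0x7);
}
```

For example, field 1 with a varint payload yields the tag 0x08, and field 2 with a length-delimited payload yields 0x12.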
Message Comparison
References: src/google/protobuf/util/field_comparator.cc, src/google/protobuf/util/message_differencer.cc
In …/field_comparator.cc, the FieldComparator interface and its SimpleFieldComparator implementation provide the foundation for comparing fields within Protobuf messages. The SimpleFieldComparator offers different modes to handle floating-point numbers, such as EXACT and APPROXIMATE, allowing for comparisons with tolerances set by SetDefaultFractionAndMargin() and SetFractionAndMargin(). This is particularly important for fields where exact matches are not feasible due to the nature of floating-point arithmetic.
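To make the idea of a fraction and a margin concrete, the sketch below shows one common way to combine a relative tolerance with an absolute one. It is an illustration under stated assumptions: the function name ApproximatelyEqual is hypothetical, and the exact formula and edge-case handling used by the comparator in field_comparator.cc may differ.

```cpp
#include <algorithm>
#include <cmath>

// Returns true when x and y differ by at most `margin` (absolute tolerance)
// or by at most `fraction` of the larger magnitude (relative tolerance).
// (Hypothetical helper; an assumed formulation, not the library's code.)
bool ApproximatelyEqual(double x, double y, double fraction, double margin) {
  if (x == y) return true;  // also covers equal infinities
  if (!std::isfinite(x) || !std::isfinite(y)) return false;  // NaN, mixed inf
  const double diff = std::fabs(x - y);
  const double scale = std::max(std::fabs(x), std::fabs(y));
  return diff <= std::max(margin, fraction * scale);
}
```

The margin matters for values near zero, where a purely relative tolerance shrinks to nothing, while the fraction matters for large values, where a fixed margin would be too strict.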
Java Implementation
References: java/core/src/main/java/com/google/protobuf
The Java implementation of Protocol Buffers centers around the GeneratedMessageV3 class, which provides a base for generated message types. This class handles common functionality such as message initialization, serialization, comparison, and hashing.
Deprecated GeneratedMessageV3 Features
References: java/core/src/main/java/com/google/protobuf/GeneratedMessageV3.java
The GeneratedMessageV3 class and its nested classes provide a compatibility layer for older generated code in the Protobuf Java core library. Marked for deprecation, these classes include methods and features that are maintained for backward compatibility but are scheduled for removal in the next major version update.
CodedOutputStream Functionality
References: java/core/src/main/java/com/google/protobuf/CodedOutputStream.java
The CodedOutputStream class is central to the Protocol Buffers encoding system, handling the serialization of primitive values and complex message structures to various output formats. It abstracts the complexity of binary encoding, offering a suite of methods tailored for different data types.
Message Extensions Handling
References: java/core/src/main/java/com/google/protobuf/GeneratedMessage.java
In …/GeneratedMessage.java, the handling of protobuf message extensions is facilitated through the ExtendableMessage and ExtendableBuilder classes. These classes are designed to support extensions, which are additional fields that can be added to a protobuf message outside of its original definition. The ExtendableMessage class extends GeneratedMessage, providing the capability to work with extensions in a type-safe manner.
GeneratedMessageLite Enhancements
References: java/core/src/main/java/com/google/protobuf/GeneratedMessageLite.java
The GeneratedMessageLite class serves as a foundational element for protocol buffer messages, offering a suite of methods for parsing, serializing, and handling message data. It is designed to be a more resource-efficient counterpart to GeneratedMessage, suitable for environments where minimizing memory footprint is crucial.
TextFormat Handling and Field Reporting
References: java/core/src/main/java/com/google/protobuf/TextFormat.java
The TextFormat class in …/TextFormat.java lets developers parse and print Protocol Buffers messages in a human-readable text format. It comprises two inner classes, Printer and Parser, which handle formatting and parsing respectively.
Recursion Limit Handling in Decoders
References: java/core/src/main/java/com/google/protobuf/ArrayDecoders.java, java/core/src/main/java/com/google/protobuf/CodedInputStream.java, java/core/src/main/java/com/google/protobuf/UnknownFieldSchema.java
In the Protocol Buffers Java implementation, handling recursion limits is vital to prevent stack overflow errors when parsing deeply nested messages. The recursion limit functionality is implemented in the CodedInputStream and ArrayDecoders classes.
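The general technique can be sketched with a toy recursive parser that carries a nesting budget and refuses to descend once the budget is spent. Everything in this example is an assumption made for illustration: the toy format (a one-byte length followed by that many bytes of nested elements) is not the protobuf wire format, and ParseMessage and ParseWithLimit are hypothetical names rather than methods of the Java classes above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy format (assumed for illustration only): a message is a sequence of
// elements, and each element is a one-byte length N followed by N bytes that
// themselves form a nested message.
//
// `recursion_budget` is the number of further nesting levels still allowed.
// Returns false on truncated input or when nesting exceeds the budget.
bool ParseMessage(const std::vector<std::uint8_t>& buf, std::size_t begin,
                  std::size_t end, int recursion_budget) {
  std::size_t pos = begin;
  while (pos < end) {
    const std::size_t len = buf[pos++];
    if (len > end - pos) return false;        // element overruns its parent
    if (recursion_budget <= 0) return false;  // nested too deeply: reject
    if (!ParseMessage(buf, pos, pos + len, recursion_budget - 1)) return false;
    pos += len;
  }
  return true;
}

// Parses the whole buffer, allowing at most `recursion_limit` nesting levels
// below the top-level message.
bool ParseWithLimit(const std::vector<std::uint8_t>& buf, int recursion_limit) {
  return ParseMessage(buf, 0, buf.size(), recursion_limit);
}
```

Because the budget is checked before each descent, hostile input with extreme nesting is rejected with an error instead of exhausting the call stack.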
Message Schema Enhancements
References: java/core/src/main/java/com/google/protobuf/MessageSchema.java
The MessageSchema class, located at …/MessageSchema.java, implements Protobuf message operations in Java, with a focus on field data storage, schema creation, and message comparison and merging.
Python Implementation
References: python/google/protobuf/internal
The Python implementation of Protocol Buffers is primarily contained in the …/internal directory. The core functionality is implemented through several key components:
Message Handling
References: python/google/protobuf/internal/message_test.py
Protocol Buffer messages in Python are handled through a set of core operations:
Oneof Field Handling
References: python/google/protobuf/internal/message_test.py
Oneof fields in Protocol Buffers allow only one field from a group to be set at a time. The Python implementation handles these fields through several key operations:
Reflection Capabilities
References: python/google/protobuf/internal/reflection_cpp_test.py
The ReflectionTest class in …/reflection_cpp_test.py tests the reflection capabilities of the Protobuf library in Python. It focuses on:
Objective-C Implementation
References: objectivec
The Objective-C implementation of Protocol Buffers is centered around the GPBMessage class, which serves as the base for all generated protocol buffer messages. This class, defined in …/GPBMessage.h, provides methods for parsing, serializing, and manipulating protocol buffer messages.
Message Descriptors
References: objectivec/GPBDescriptor_PackagePrivate.h
The GPBDescriptor class is the central component for representing protocol buffer message descriptors in Objective-C. It provides methods for allocating descriptors and managing their lifecycle. Key features include:
Message Implementation
References: objectivec/GPBMessage.m
The GPBMessage class serves as the foundation for all generated protocol buffer messages in Objective-C. Key aspects of its implementation include:
Utility Functions
References: objectivec/GPBUtilities.m
The GPBUtilities.m file provides essential utility functions for the Objective-C implementation of Protocol Buffers. Key functionalities include:
Testing
References: objectivec/Tests
The testing framework for the Objective-C implementation is located in …/Tests. It includes a comprehensive suite of unit tests covering various aspects of the Protocol Buffers library functionality.
PHP Implementation
References: php/src
The PHP implementation of Protocol Buffers is organized into several key classes within the Google\Protobuf namespace. The DescriptorPool class manages descriptor information for Protobuf classes, providing methods to retrieve Descriptor and EnumDescriptor objects. It uses a singleton pattern with the getGeneratedPool() method returning a shared instance.
DescriptorPool
References: php/src/Google/Protobuf/Internal/DescriptorPool.php
The DescriptorPool class serves as a central repository for managing Protocol Buffer descriptors in PHP. It employs a singleton pattern, accessible via getGeneratedPool(), to ensure a single instance throughout the application.
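The singleton pattern described here can be illustrated in C++, the language used for examples in this wiki. This is a minimal sketch of the pattern only, not a port of the PHP class: the Descriptor struct and the Add and Find methods are hypothetical stand-ins, and only the idea of a shared pool returned by a static accessor comes from the description above.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a message descriptor.
struct Descriptor {
  std::string full_name;
};

// Illustrative pool with a single shared instance.
class DescriptorPool {
 public:
  // Returns the one shared pool, constructed on first use. Initialization of
  // a function-local static is thread-safe since C++11.
  static DescriptorPool& GetGeneratedPool() {
    static DescriptorPool pool;
    return pool;
  }

  // Copying is disabled so that no second instance can be created.
  DescriptorPool(const DescriptorPool&) = delete;
  DescriptorPool& operator=(const DescriptorPool&) = delete;

  // Registers a descriptor under its fully qualified name (hypothetical).
  void Add(const std::string& full_name) {
    descriptors_[full_name] = Descriptor{full_name};
  }

  // Looks up a descriptor by name; returns nullptr if it is not registered.
  const Descriptor* Find(const std::string& full_name) const {
    const auto it = descriptors_.find(full_name);
    return it == descriptors_.end() ? nullptr : &it->second;
  }

 private:
  DescriptorPool() = default;
  std::map<std::string, Descriptor> descriptors_;
};
```

Every caller of GetGeneratedPool() sees the same registrations, which is what lets generated code register its descriptors once and have them visible application-wide.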
Message Serialization and Deserialization
References: php/src/Google/Protobuf/Internal
Message serialization and deserialization in PHP is primarily handled by the CodedInputStream and CodedOutputStream classes. These classes work in conjunction with GPBWire and GPBWireType to manage the wire format of Protocol Buffer messages.
Type Handling
References: php/src/Google/Protobuf/Internal
Protocol buffer types in PHP are handled through a combination of classes and utility functions. The GPBType class defines constants for various scalar types (e.g., INT32, UINT32, SINT32, FIXED32, SFIXED32, INT64, UINT64, SINT64, FIXED64, SFIXED64, FLOAT, DOUBLE, BOOL, STRING, BYTES) and message types (MESSAGE, GROUP).
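Among these scalar types, SINT32 and SINT64 differ from INT32 and INT64 on the wire: the protobuf encoding maps signed values to unsigned ones with ZigZag encoding before writing them as varints, so that small negative numbers stay short. The C++ sketch below shows the 32-bit mapping. The function names are hypothetical, and the code is not taken from the PHP implementation.

```cpp
#include <cstdint>

// Maps a signed 32-bit value to an unsigned one so that values of small
// magnitude, positive or negative, become small unsigned numbers:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
// Relies on arithmetic right shift of negative values, which mainstream
// compilers provide (and C++20 guarantees).
constexpr std::uint32_t ZigZagEncode32(std::int32_t n) {
  return (static_cast<std::uint32_t>(n) << 1) ^
         static_cast<std::uint32_t>(n >> 31);
}

// Inverse of ZigZagEncode32.
constexpr std::int32_t ZigZagDecode32(std::uint32_t n) {
  return static_cast<std::int32_t>((n >> 1) ^ (~(n & 1) + 1));
}
```

Without this mapping, a negative INT32 value is sign-extended and always occupies ten bytes as a varint, whereas the ZigZag form of -1 fits in one byte.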
Reflection and Metadata
References: php/src/Google/Protobuf/Internal
The DescriptorPool class manages descriptors for protocol buffer messages, enums, and other elements. It provides methods for:
PHP-specific Utilities
References: php/src/Google/Protobuf/Internal
The GPBUtil class provides utility functions for type checking, conversion, and manipulation of Protocol Buffer data types in PHP. Key functionalities include:
Well-Known Types
References: php/src/Google/Protobuf
The Google\Protobuf namespace contains implementations of well-known types, providing standardized representations for common data structures:
Ruby Implementation
References: ruby/ext, ruby/lib, ruby/src
The Ruby implementation of Protocol Buffers is divided into two main components: a native C extension and a pure Ruby layer.
Ruby Extension Implementation
References: ruby/ext/google/protobuf_c
The Ruby extension implementation for Protocol Buffers is primarily contained in the …/protobuf_c directory. Key components include:
Ruby Library Implementation
References: ruby/lib/google/protobuf
The pure Ruby implementation of Protocol Buffers is organized into several modules and classes within the …/protobuf directory. Key components include:
Ruby Source Files
References: ruby/src
The …/src directory supports the Ruby implementation of Protocol Buffers, but it does not appear to hold Ruby source files or configuration scripts. Instead, the directory structure suggests a Java-related component:
Rust Implementation
References: rust
The Rust implementation of Protocol Buffers is centered around the CodeGen struct in …/lib.rs. This struct manages the code generation process, allowing users to specify input Protobuf files, output directories, and paths to the protoc compiler and its plugins.
Rust C++ Interoperability
References: rust/cpp.rs, rust/cpp_kernel/message.cc
Interfacing between Rust and C++ within the Protocol Buffers ecosystem is facilitated through a series of type aliases, structs, and external functions defined in …/cpp.rs. This file establishes a runtime environment in Rust that is ABI-compatible with C++ implementations of Protocol Buffers, enabling seamless message handling operations.
Rust Message Handling
References: rust/cpp.rs
In the Rust implementation of Protocol Buffers, the …/lib.rs and …/message.rs files provide the infrastructure for handling protobuf messages, including creation, manipulation, and comparison.
Rust Repeated Fields and Maps
References: rust/repeated.rs, rust/upb.rs
Repeated fields and maps in the Rust implementation of Protocol Buffers are handled primarily through the RepeatedView, RepeatedMut, and Repeated structs for repeated fields, and the Map, MapMut, and InnerMap structs for maps.
Rust Proxied Types
References: rust/cpp.rs
The Proxied trait forms the foundation of the proxy type system, providing a View type for shared, immutable access to underlying Protobuf field data. This addresses limitations of plain Rust references, such as specific memory layout requirements and the inability to store extra data on references.
Rust Type Conversions
References: rust/cpp.rs
In …/cpp.rs, the CppTypeConversions trait establishes a framework for translating Rust types to their C++ equivalents, a critical aspect of the Rust-C++ interoperability layer within the Protocol Buffers implementation. This trait encapsulates the conversion logic necessary for Rust types to be used in the context of C++ Protobuf messages, particularly when dealing with repeated fields and maps.
Rust Serialization Utilities
References: rust/cpp.rs
In …/cpp.rs, the Rust runtime for Protocol Buffers leverages the C++ kernel to provide serialization and deserialization utilities. The file introduces the SerializedData struct, which encapsulates serialized Protobuf wire format data. This struct offers methods to construct and access serialized data, as well as to convert it to a Vec<u8>, facilitating the handling of serialized message data in Rust.
Rust Map Handling
References: rust/cpp_kernel/map.cc
In the C++ kernel of the Rust implementation, the …/map.cc file provides a map data structure tailored for Protobuf usage, particularly for storing MessageLite objects. The map supports various key types and offers a suite of operations for managing key-value pairs.
Rust Code Generation
References: rust/protobuf_codegen/src/lib.rs
The CodeGen struct in …/lib.rs manages the Rust code generation process for Protocol Buffers. Key features include:
Rust Build System Integration
References: rust/protobuf_codegen/example/build.rs
The Rust build system integration for Protocol Buffers code generation is implemented in the …/build.rs file. This file utilizes the protobuf_codegen crate to generate Rust code from protobuf definition files during the build process. Key aspects of the integration include:
Rust Proto Macro
References: rust/proto_macro.rs
The proto! macro enables the use of Rust struct initialization syntax for creating Protobuf messages. It supports nested messages, array literals, and map literals, providing a concise way to initialize complex message structures.
Rust Crate Distribution
References: rust/cargo_test.sh
The distribution of Rust crates for Protobuf is managed through a Bash script that automates the process of packaging, extracting, and testing the crates. The script, located at …/cargo_test.sh, performs the following key operations:
UPB Library
References: upb
The UPB (micro protobuf) library, located in upb, provides a lightweight C implementation of Protocol Buffers. It offers efficient data serialization and deserialization capabilities through several key components:
Code Generation for C
References: upb_generator/c/generator.cc, upb_generator/c/names.cc, upb_generator/c/names.h, upb_generator/c/names_internal.cc, upb_generator/c/names_internal.h
In …/generator.cc, the code generation for C involves creating the necessary C representations from Protobuf definitions. This process is tailored to accommodate various stages of development through the concept of "bootstrap" stages, which influence the generated code's complexity and dependencies.
Code Generation Interface
References: upb_generator/reflection/generator.cc
In …/generator.cc, the interface functions orchestrate the generation of C/C++ header and source files that are essential for the Protobuf definition pool and accessor functions. The process begins with the parsing of command-line options, which influence the generation of export declarations in the output files. The dllexport_decl option is a notable example, guiding the creation of DLL-compatible code.
Fast Decode Table
References: upb_generator/minitable/fasttable.cc, upb_generator/minitable/fasttable.h
The FastDecodeTable() function in …/fasttable.cc generates an optimized table for efficient Protobuf message parsing. Key aspects of the implementation include:
Byte Size Calculation
References: upb/wire/byte_size.c, upb/wire/byte_size.h
The upb_ByteSize() function calculates the byte size of a Protobuf message. It takes a upb_Message* and a upb_MiniTable* as input and returns the size as a size_t. The calculation process involves:
Mini-Table Generation
References: upb_generator/minitable/generator.cc, upb_generator/minitable/main.cc, upb_generator/minitable/names.cc, upb_generator/minitable/names.h, upb_generator/minitable/names_internal.cc, upb_generator/minitable/names_internal.h
Mini-table generation is handled primarily by the WriteMiniTableHeader(), WriteMiniTableSource(), and WriteMiniTableMultipleSources() functions in …/generator.cc. These functions generate the necessary C code for representing Protocol Buffer messages, enums, and extensions in a compact format.
Common Utilities
References: upb_generator/common/names.cc, upb_generator/common/names.h, upb_generator/common.cc, upb_generator/common.h
The UPB library provides a set of common utility functions for name mangling and identifier generation, primarily used in code generation tasks. These utilities are implemented in the …/common directory.
Conformance Testing
References: conformance
The ConformanceTestSuite class in …/conformance_test.h manages the execution of conformance tests. It provides:
Conformance Test Suite
References: conformance/binary_json_conformance_suite.cc
The BinaryAndJsonConformanceSuite class in …/binary_json_conformance_suite.cc implements a comprehensive test suite for Protocol Buffer binary and JSON formats. Key features include:
Failure Lists
References: conformance/failure_list_php_c.txt, conformance/failure_list_ruby.txt, conformance/failure_list_jruby_ffi.txt, upb/conformance/conformance_upb_failures.txt
Failure lists document expected failures in conformance tests for different language implementations of Protocol Buffers. These lists are maintained in separate files for PHP, Ruby, JRuby FFI, and the UPB library.
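One way a test runner can use such a list is to compare each test's actual result against it, so that only deviations from the documented state are reported. The sketch below illustrates that idea under stated assumptions: the Outcome enum, the Classify function, and the treatment of a listed test that passes are hypothetical choices for this example and are not taken from the conformance runner's source.

```cpp
#include <set>
#include <string>

// Possible classifications of a single test result (hypothetical names).
enum class Outcome {
  kPass,               // passed and not on the failure list
  kExpectedFailure,    // failed and documented on the failure list
  kUnexpectedFailure,  // failed but not documented: a regression
  kUnexpectedPass,     // passed although listed: the list is stale
};

// Classifies one result against the set of test names on the failure list.
Outcome Classify(const std::set<std::string>& failure_list,
                 const std::string& test_name, bool passed) {
  const bool listed = failure_list.count(test_name) > 0;
  if (passed) return listed ? Outcome::kUnexpectedPass : Outcome::kPass;
  return listed ? Outcome::kExpectedFailure : Outcome::kUnexpectedFailure;
}
```

Treating an unexpected pass as notable keeps the lists honest: once an implementation fixes a bug, the stale entry is flagged and can be removed.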
Benchmarking
References: benchmarks
The benchmarking tools in the benchmarks directory provide a comprehensive suite for measuring Protocol Buffers performance. Key components include: