Language: C++
Created: 08/26/2014
Last updated: 10/07/2024
License: Other

autowiki
Software Version: u-0.0.1Basic
Generated from: Commit 11e0b5
Generated on: 10/07/2024

protobuf

Protocol Buffers (protobuf) is a data serialization format and library developed by Google. It allows developers to define structured data schemas and generate code for efficiently serializing and deserializing data across different programming languages and platforms. This repository contains the implementation of Protocol Buffers for multiple programming languages, including C++, Java, Python, C#, Objective-C, PHP, Ruby, and Rust.

The core functionality of Protocol Buffers is implemented in the …/protobuf directory. This includes:

  • A compiler that generates language-specific code from protocol buffer definition files, which define the structure of messages. The compiler is implemented in …/compiler.
  • Binary encoding and decoding of messages, implemented in …/io.
  • Utility functions for common operations like JSON conversion and message comparison, found in …/util.

The Protocol Buffers library uses several key design choices and algorithms:

  • Messages are defined using a language-agnostic schema format.
  • The library generates optimized, language-specific code for serialization and deserialization.
  • A compact binary wire format is used for serialized data, with variable-length encoding for integers.
  • An arena-based memory allocation strategy is employed for efficient memory management, particularly in C++.
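
The variable-length integer encoding mentioned in the list above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the library's own code; the function names `encode_varint` and `decode_varint` are hypothetical, and only non-negative integers are handled.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits of what is left
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> tuple:
    """Decode one varint starting at pos; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


print(encode_varint(300).hex())  # ac02
```

Small values take a single byte, which is why the format stays compact for typical field numbers and lengths.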

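For each supported programming language, the repository contains a separate implementation. These are covered in the language-specific sections of this wiki: Java Implementation, Python Implementation, Objective-C Implementation, PHP Implementation, Ruby Implementation, and Rust Implementation.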

The repository also includes:

  • A conformance test suite in the conformance directory that checks for consistent behavior across the language implementations.
  • Benchmarking tools in the benchmarks directory for measuring and comparing performance.
  • The UPB (micro protobuf) library in the upb directory, a lightweight C implementation designed for resource-constrained environments.

For more details on specific components, refer to the relevant sections in this wiki, such as Compiler for information on the code generation process, or Core Serialization and Deserialization for details on the binary format and encoding/decoding algorithms.

Compiler

References: src/google/protobuf/compiler

The CommandLineInterface class serves as the main entry point for the Protocol Buffer compiler, handling command-line arguments and orchestrating the code generation process. It allows registration of language-specific code generators through the RegisterGenerator method, enabling extensibility for different target languages.
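
The registration idea can be sketched as a small registry that maps an output flag to a generator. This is an illustrative Python analogy, not the C++ API: `ToyCommandLine`, `register_generator`, `run`, and the `--toy_out` flag are all hypothetical names, and the generators here are plain functions.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a code generator: takes a .proto file name and
# returns the generated source text.
Generator = Callable[[str], str]


class ToyCommandLine:
    """Illustrative flag-to-generator registry (not the real protoc interface)."""

    def __init__(self) -> None:
        self._generators: Dict[str, Generator] = {}

    def register_generator(self, flag: str, generator: Generator) -> None:
        # Associate an output flag such as "--toy_out" with a generator.
        self._generators[flag] = generator

    def run(self, args: List[str]) -> Dict[str, str]:
        """Dispatch every '--flag=out_dir' argument to its generator for each input file."""
        flags = [a for a in args if a.startswith("--")]
        protos = [a for a in args if not a.startswith("--")]
        outputs: Dict[str, str] = {}
        for flag_arg in flags:
            flag, _, out_dir = flag_arg.partition("=")
            if flag not in self._generators:
                raise ValueError("unknown flag: " + flag)
            for proto in protos:
                outputs[out_dir + "/" + proto + ".gen"] = self._generators[flag](proto)
        return outputs


cli = ToyCommandLine()
cli.register_generator("--toy_out", lambda proto: "// generated from " + proto)
result = cli.run(["--toy_out=gen", "a.proto"])
```

Keeping the registry separate from the generators is what lets new target languages be added without changing the argument-handling code.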

Language-Specific Code Generators

References: src/google/protobuf/compiler/cpp, src/google/protobuf/compiler/java, src/google/protobuf/compiler/python, src/google/protobuf/compiler/csharp, src/google/protobuf/compiler/objectivec, src/google/protobuf/compiler/php, src/google/protobuf/compiler/ruby, src/google/protobuf/compiler/rust

Figure: Architecture Diagram for Language-Specific Code Generators

The CppGenerator class in …/generator.h is the main entry point for C++ code generation. It implements the Generate() method to produce C++ source and header files from Protocol Buffer definitions.

Core Compiler Components

References: src/google/protobuf/compiler

Figure: Architecture Diagram for Core Compiler Components

The Parser class in …/parser.cc parses .proto files and generates FileDescriptorProto objects.

Objective-C Code Generation Enhancements

References: src/google/protobuf/compiler/objectivec/generator.cc, src/google/protobuf/compiler/objectivec/generator.h

In the Objective-C implementation of Protocol Buffers, the ObjectiveCGenerator class is central to the code generation process. It extends the CodeGenerator interface to produce Objective-C code from Protocol Buffer definitions. The ObjectiveCGenerator is defined in …/generator.h and its functionality is implemented in …/generator.cc.

Compiler Tools and Utilities

References: src/google/protobuf/compiler/cpp/tools

Figure: Architecture Diagram for Compiler Tools and Utilities

The …/tools directory contains utilities for analyzing and optimizing protobuf definitions during compilation.

Core Serialization and Deserialization

References: src/google/protobuf/io, src/google/protobuf/util

Figure: Architecture Diagram for Core Serialization and Deserialization

The CodedInputStream and CodedOutputStream classes in …/coded_stream.h handle reading and writing of protocol buffer messages.
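
To make the reading side concrete, here is a minimal Python sketch of decoding tagged fields from the binary wire format. It is not the CodedInputStream API: `read_varint` and `parse_fields` are hypothetical helper names, and only the VARINT (0) and LEN (2) wire types are handled.

```python
def read_varint(buf: bytes, pos: int) -> tuple:
    """Read one base-128 varint at pos; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_fields(buf: bytes) -> list:
    """Return a list of (field_number, wire_type, value) tuples."""
    pos = 0
    fields = []
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x7  # tag = (number << 3) | type
        if wire_type == 0:    # VARINT: the value is itself a varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # LEN: a varint length followed by that many bytes
            length, pos = read_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("truncated length-delimited field")
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError("wire type not handled in this sketch: %d" % wire_type)
        fields.append((field_number, wire_type, value))
    return fields


print(parse_fields(bytes.fromhex("089601")))  # [(1, 0, 150)]
```

A length-delimited payload may itself be a nested message, in which case the same loop would be applied to the extracted bytes.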

Message Comparison

References: src/google/protobuf/util/field_comparator.cc, src/google/protobuf/util/message_differencer.cc

Figure: Architecture Diagram for Message Comparison

In …/field_comparator.cc, the FieldComparator interface and its SimpleFieldComparator implementation provide the foundation for comparing fields within Protobuf messages. The SimpleFieldComparator offers different modes to handle floating-point numbers, such as EXACT and APPROXIMATE, allowing for comparisons with tolerances set by SetDefaultFractionAndMargin() and SetFractionAndMargin(). This is particularly important for fields where exact matches are not feasible due to the nature of floating-point arithmetic.
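
A fraction-and-margin tolerance check of the kind described above might look like the following Python sketch. The exact rule the library applies is an assumption here: two values are treated as equal when their difference is within an absolute margin or within a fraction of the larger magnitude. The function name `within_fraction_or_margin` is illustrative.

```python
import math


def within_fraction_or_margin(x: float, y: float, fraction: float, margin: float) -> bool:
    """Approximate equality: absolute margin OR relative fraction (assumed rule)."""
    if math.isnan(x) or math.isnan(y):
        return False      # NaN never compares equal in this sketch
    if x == y:
        return True       # also covers equal infinities
    if math.isinf(x) or math.isinf(y):
        return False
    diff = abs(x - y)
    return diff <= margin or diff <= fraction * max(abs(x), abs(y))


# Exact comparison fails on accumulated rounding error; a small margin does not.
print(0.1 + 0.2 == 0.3)                                       # False
print(within_fraction_or_margin(0.1 + 0.2, 0.3, 0.0, 1e-9))   # True
```

The margin handles values near zero, where a relative fraction becomes too strict, while the fraction scales the tolerance for large magnitudes.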

Java Implementation

References: java/core/src/main/java/com/google/protobuf

The Java implementation of Protocol Buffers centers around the GeneratedMessageV3 class, which provides a base for generated message types. This class handles common functionality such as message initialization, serialization, comparison, and hashing.

Deprecated GeneratedMessageV3 Features

References: java/core/src/main/java/com/google/protobuf/GeneratedMessageV3.java

The GeneratedMessageV3 class and its nested classes provide a compatibility layer for older generated code in the Protobuf Java core library. Marked for deprecation, these classes include methods and features that are maintained for backward compatibility but are scheduled for removal in the next major version update.

CodedOutputStream Functionality

References: java/core/src/main/java/com/google/protobuf/CodedOutputStream.java

The CodedOutputStream class is central to the Protocol Buffers encoding system, handling the serialization of primitive values and complex message structures to various output formats. It abstracts the complexity of binary encoding, offering a suite of methods tailored for different data types.
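
Two of the primitive encodings involved can be sketched in a few lines of Python. This is an illustration of the wire-format rules rather than the Java class itself; the helper names are hypothetical. ZigZag encoding is used for the signed `sint32` type, and `fixed32` values are written as four little-endian bytes.

```python
import struct


def zigzag_encode32(n: int) -> int:
    """Map a signed 32-bit integer to unsigned so small magnitudes stay small."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def zigzag_decode32(z: int) -> int:
    """Inverse of zigzag_encode32."""
    return (z >> 1) ^ -(z & 1)


def encode_fixed32(value: int) -> bytes:
    """Write an unsigned 32-bit value as four little-endian bytes."""
    return struct.pack("<I", value)


for n in (0, -1, 1, -2):
    print(n, "->", zigzag_encode32(n))  # 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3
```

Without ZigZag, a small negative number would occupy the maximum varint length, because its two's-complement form has the high bits set.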

Message Extensions Handling

References: java/core/src/main/java/com/google/protobuf/GeneratedMessage.java

Figure: Architecture Diagram for Message Extensions Handling

In …/GeneratedMessage.java, the handling of protobuf message extensions is facilitated through the ExtendableMessage and ExtendableBuilder classes. These classes are designed to support extensions, which are additional fields that can be added to a protobuf message outside of its original definition. The ExtendableMessage class extends GeneratedMessage, providing the capability to work with extensions in a type-safe manner.

GeneratedMessageLite Enhancements

References: java/core/src/main/java/com/google/protobuf/GeneratedMessageLite.java

The GeneratedMessageLite class serves as a foundational element for protocol buffer messages, offering a suite of methods for parsing, serializing, and handling message data. It is designed to be a more resource-efficient counterpart to GeneratedMessage, suitable for environments where minimizing memory footprint is crucial.

TextFormat Handling and Field Reporting

References: java/core/src/main/java/com/google/protobuf/TextFormat.java

Figure: Architecture Diagram for TextFormat Handling and Field Reporting

The TextFormat class in …/TextFormat.java is the interface developers use to parse and print Protocol Buffers messages in a human-readable text format. It contains two nested classes: Printer, for formatting, and Parser, for parsing.

Recursion Limit Handling in Decoders

References: java/core/src/main/java/com/google/protobuf/ArrayDecoders.java, java/core/src/main/java/com/google/protobuf/CodedInputStream.java, java/core/src/main/java/com/google/protobuf/UnknownFieldSchema.java

Figure: Architecture Diagram for Recursion Limit Handling in Decoders

In the Protocol Buffers Java implementation, handling recursion limits is vital to prevent stack overflow errors when parsing deeply nested messages. The recursion limit functionality is implemented in the CodedInputStream and ArrayDecoders classes.
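
The general technique can be sketched as a depth counter that is checked before each nested message is entered. The Python below is a simplified model, not the Java code: nested dictionaries stand in for nested messages, and `ToyDecoder`, `RecursionLimitExceeded`, `nest`, and the default limit of 100 are illustrative choices.

```python
class RecursionLimitExceeded(Exception):
    """Raised when nesting goes deeper than the configured limit."""


class ToyDecoder:
    def __init__(self, recursion_limit: int = 100) -> None:
        self.recursion_limit = recursion_limit
        self._depth = 0  # number of messages currently being decoded

    def decode(self, message: dict) -> int:
        """Visit a nested message; return how many messages were visited."""
        if self._depth >= self.recursion_limit:
            raise RecursionLimitExceeded(
                "message nesting exceeds limit of %d" % self.recursion_limit)
        self._depth += 1
        try:
            count = 1
            for value in message.values():
                if isinstance(value, dict):  # a nested message
                    count += self.decode(value)
            return count
        finally:
            self._depth -= 1  # restore depth even if a nested decode failed


def nest(levels: int) -> dict:
    """Build an empty message wrapped in `levels` layers of nesting."""
    message = {}
    for _ in range(levels):
        message = {"child": message}
    return message


decoder = ToyDecoder(recursion_limit=3)
print(decoder.decode(nest(2)))  # 3
```

Failing with a dedicated error at a known depth is safer than relying on the runtime stack, since the point at which a stack overflows depends on the platform and on untrusted input.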

Message Schema Enhancements

References: java/core/src/main/java/com/google/protobuf/MessageSchema.java

The MessageSchema class, located at …/MessageSchema.java, implements Protobuf message operations in Java, with a focus on field data storage, schema creation, and message comparison and merging.

Python Implementation

References: python/google/protobuf/internal

The Python implementation of Protocol Buffers is primarily contained in the …/internal directory, which holds its core components.

Message Handling

References: python/google/protobuf/internal/message_test.py

Figure: Architecture Diagram for Message Handling

Protocol Buffer messages in Python are handled through a set of core operations.

Oneof Field Handling

References: python/google/protobuf/internal/message_test.py

Figure: Architecture Diagram for Oneof Field Handling

Oneof fields in Protocol Buffers allow at most one field from a group to be set at a time: setting one member clears whichever member was set before. The Python implementation handles these fields through several key operations.
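
The one-member-at-a-time behavior can be modeled with a small class. This is a self-contained sketch, not the generated-message API: `ToyOneof` and its methods are hypothetical, with `which()` playing a role similar to the `WhichOneof()` method on real Python messages.

```python
class ToyOneof:
    """Holds at most one of its named member fields at a time."""

    def __init__(self, *members: str) -> None:
        self._members = tuple(members)
        self._active = None  # name of the member currently set, if any
        self._value = None

    def _check(self, name: str) -> None:
        if name not in self._members:
            raise KeyError(name)

    def set(self, name: str, value) -> None:
        self._check(name)
        # Storing a single (name, value) pair replaces any member set earlier.
        self._active, self._value = name, value

    def get(self, name: str):
        self._check(name)
        return self._value if self._active == name else None

    def which(self):
        """Return the name of the member that is set, or None."""
        return self._active

    def clear(self) -> None:
        self._active, self._value = None, None


contact = ToyOneof("email", "phone")
contact.set("email", "a@example.com")
contact.set("phone", "555-0100")  # clears "email"
print(contact.which())            # phone
```

Storing one shared slot rather than one slot per member is what makes the mutual exclusion automatic.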

Reflection Capabilities

References: python/google/protobuf/internal/reflection_cpp_test.py

Figure: Architecture Diagram for Reflection Capabilities

The ReflectionTest class in …/reflection_cpp_test.py tests the reflection capabilities of the Protobuf library in Python.

Objective-C Implementation

References: objectivec

The Objective-C implementation of Protocol Buffers is centered around the GPBMessage class, which serves as the base for all generated protocol buffer messages. This class, defined in …/GPBMessage.h, provides methods for parsing, serializing, and manipulating protocol buffer messages.

Message Descriptors

References: objectivec/GPBDescriptor_PackagePrivate.h

Figure: Architecture Diagram for Message Descriptors

The GPBDescriptor class is the central component for representing protocol buffer message descriptors in Objective-C. It provides methods for allocating descriptors and managing their lifecycle.

Message Implementation

References: objectivec/GPBMessage.m

The GPBMessage class serves as the foundation for all generated protocol buffer messages in Objective-C.

Utility Functions

References: objectivec/GPBUtilities.m

The GPBUtilities.m file provides utility functions for the Objective-C implementation of Protocol Buffers.

Testing

References: objectivec/Tests

The testing framework for the Objective-C implementation is located in …/Tests. It includes a comprehensive suite of unit tests covering various aspects of the Protocol Buffers library functionality.

PHP Implementation

References: php/src

Figure: Architecture Diagram for PHP Implementation

The PHP implementation of Protocol Buffers is organized into several key classes within the Google\Protobuf namespace. The DescriptorPool class manages descriptor information for Protobuf classes, providing methods to retrieve Descriptor and EnumDescriptor objects. It uses a singleton pattern with the getGeneratedPool() method returning a shared instance.

DescriptorPool

References: php/src/Google/Protobuf/Internal/DescriptorPool.php

The DescriptorPool class serves as a central repository for managing Protocol Buffer descriptors in PHP. It employs a singleton pattern, accessible via getGeneratedPool(), to ensure a single instance throughout the application.
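
The shared-instance pattern described above can be sketched as follows. The example is written in Python for consistency with the other sketches in this wiki, not in PHP, and it is not the library's code: `ToyDescriptorPool`, `get_generated_pool`, `add_descriptor`, and `get_descriptor_by_name` are hypothetical names, with `get_generated_pool` standing in for `getGeneratedPool()`.

```python
class ToyDescriptorPool:
    """Illustrative registry of descriptors with one lazily created shared instance."""

    _generated_pool = None  # the shared instance, created on first use

    def __init__(self) -> None:
        self._descriptors = {}

    @classmethod
    def get_generated_pool(cls) -> "ToyDescriptorPool":
        if cls._generated_pool is None:
            cls._generated_pool = cls()
        return cls._generated_pool

    def add_descriptor(self, full_name: str, descriptor: dict) -> None:
        self._descriptors[full_name] = descriptor

    def get_descriptor_by_name(self, full_name: str):
        # Returns None when nothing is registered under that name.
        return self._descriptors.get(full_name)


# Generated code would register its types once; every later lookup sees them.
ToyDescriptorPool.get_generated_pool().add_descriptor(
    "example.Person", {"fields": ["name", "id"]})
person = ToyDescriptorPool.get_generated_pool().get_descriptor_by_name("example.Person")
```

Because every caller receives the same pool, descriptors registered by one generated file are visible to all others without passing the pool around explicitly.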

Message Serialization and Deserialization

References: php/src/Google/Protobuf/Internal

Figure: Architecture Diagram for Message Serialization and Deserialization

Message serialization and deserialization in PHP is primarily handled by the CodedInputStream and CodedOutputStream classes. These classes work in conjunction with GPBWire and GPBWireType to manage the wire format of Protocol Buffer messages.

Type Handling

References: php/src/Google/Protobuf/Internal

Figure: Architecture Diagram for Type Handling

Protocol buffer types in PHP are handled through a combination of classes and utility functions. The GPBType class defines constants for various scalar types (e.g., INT32, UINT32, SINT32, FIXED32, SFIXED32, INT64, UINT64, SINT64, FIXED64, SFIXED64, FLOAT, DOUBLE, BOOL, STRING, BYTES) and message types (MESSAGE, GROUP).

Reflection and Metadata

References: php/src/Google/Protobuf/Internal

Figure: Architecture Diagram for Reflection and Metadata

The DescriptorPool class manages descriptors for protocol buffer messages, enums, and other elements.

PHP-specific Utilities

References: php/src/Google/Protobuf/Internal

The GPBUtil class provides utility functions for type checking, conversion, and manipulation of Protocol Buffer data types in PHP.

Well-Known Types

References: php/src/Google/Protobuf

The Google\Protobuf namespace contains implementations of well-known types, providing standardized representations for common data structures.

Ruby Implementation

References: ruby/ext, ruby/lib, ruby/src

Figure: Architecture Diagram for Ruby Implementation

The Ruby implementation of Protocol Buffers is divided into two main components: a native C extension and a pure Ruby layer.

Ruby Extension Implementation

References: ruby/ext/google/protobuf_c

Figure: Architecture Diagram for Ruby Extension Implementation

The Ruby extension implementation for Protocol Buffers is primarily contained in the …/protobuf_c directory.

Ruby Library Implementation

References: ruby/lib/google/protobuf

Figure: Architecture Diagram for Ruby Library Implementation

The pure Ruby implementation of Protocol Buffers is organized into several modules and classes within the …/protobuf directory.

Ruby Source Files

References: ruby/src

Figure: Architecture Diagram for Ruby Source Files

The …/src directory sits alongside the Ruby extension and library code. Rather than Ruby source files or configuration scripts, its directory structure suggests a Java-related component.

Rust Implementation

References: rust

Figure: Architecture Diagram for Rust Implementation

The Rust implementation of Protocol Buffers is centered around the CodeGen struct in …/lib.rs. This struct manages the code generation process, allowing users to specify input Protobuf files, output directories, and paths to the protoc compiler and its plugins.

Rust C++ Interoperability

References: rust/cpp.rs, rust/cpp_kernel/message.cc

Figure: Architecture Diagram for Rust C++ Interoperability

Interfacing between Rust and C++ within the Protocol Buffers ecosystem is facilitated through a series of type aliases, structs, and external functions defined in …/cpp.rs. This file establishes a runtime environment in Rust that is ABI-compatible with C++ implementations of Protocol Buffers, enabling seamless message handling operations.

Rust Message Handling

References: rust/cpp.rs

Figure: Architecture Diagram for Rust Message Handling

In the Rust implementation of Protocol Buffers, the …/lib.rs and …/message.rs files provide the infrastructure for handling protobuf messages, including creation, manipulation, and comparison.

Rust Repeated Fields and Maps

References: rust/repeated.rs, rust/upb.rs

Figure: Architecture Diagram for Rust Repeated Fields and Maps

Repeated fields and maps in the Rust implementation of Protocol Buffers are handled primarily through the RepeatedView, RepeatedMut, and Repeated structs for repeated fields, and the Map, MapMut, and InnerMap structs for maps.

Rust Proxied Types

References: rust/cpp.rs

Figure: Architecture Diagram for Rust Proxied Types

The Proxied trait forms the foundation of the proxy type system, providing a View type for shared, immutable access to underlying Protobuf field data. This addresses limitations of plain Rust references, such as specific memory layout requirements and the inability to store extra data on references.

Rust Type Conversions

References: rust/cpp.rs

Figure: Architecture Diagram for Rust Type Conversions

In …/cpp.rs, the CppTypeConversions trait establishes a framework for translating Rust types to their C++ equivalents, a critical aspect of the Rust-C++ interoperability layer within the Protocol Buffers implementation. This trait encapsulates the conversion logic necessary for Rust types to be used in the context of C++ Protobuf messages, particularly when dealing with repeated fields and maps.

Rust Serialization Utilities

References: rust/cpp.rs

Figure: Architecture Diagram for Rust Serialization Utilities

In …/cpp.rs, the Rust runtime for Protocol Buffers leverages the C++ kernel to provide serialization and deserialization utilities. The file introduces the SerializedData struct, which encapsulates serialized Protobuf wire format data. This struct offers methods to construct and access serialized data, as well as to convert it to a Vec<u8>, facilitating the handling of serialized message data in Rust.

Rust Map Handling

References: rust/cpp_kernel/map.cc

Figure: Architecture Diagram for Rust Map Handling

In the C++ kernel of the Rust implementation, the …/map.cc file provides a map data structure tailored for Protobuf usage, particularly for storing MessageLite objects. The map supports various key types and offers a suite of operations for managing key-value pairs.

Rust Code Generation

References: rust/protobuf_codegen/src/lib.rs

Figure: Architecture Diagram for Rust Code Generation

The CodeGen struct in …/lib.rs manages the Rust code generation process for Protocol Buffers.

Rust Build System Integration

References: rust/protobuf_codegen/example/build.rs

Figure: Architecture Diagram for Rust Build System Integration

The Rust build system integration for Protocol Buffers code generation is implemented in the …/build.rs file. This file uses the protobuf_codegen crate to generate Rust code from protobuf definition files during the build process.

Rust Proto Macro

References: rust/proto_macro.rs

The proto! macro enables the use of Rust struct initialization syntax for creating Protobuf messages. It supports nested messages, array literals, and map literals, providing a concise way to initialize complex message structures.

Rust Crate Distribution

References: rust/cargo_test.sh

Figure: Architecture Diagram for Rust Crate Distribution

The distribution of Rust crates for Protobuf is managed through a Bash script, located at …/cargo_test.sh, that automates packaging, extracting, and testing the crates.

UPB Library

References: upb

The UPB (micro protobuf) library, located in the upb directory, provides a lightweight C implementation of Protocol Buffers with efficient data serialization and deserialization.

Code Generation for C

References: upb_generator/c/generator.cc, upb_generator/c/names.cc, upb_generator/c/names.h, upb_generator/c/names_internal.cc, upb_generator/c/names_internal.h

Figure: Architecture Diagram for Code Generation for C

In …/generator.cc, the code generation for C involves creating the necessary C representations from Protobuf definitions. This process is tailored to accommodate various stages of development through the concept of "bootstrap" stages, which influence the generated code's complexity and dependencies.

Code Generation Interface

References: upb_generator/reflection/generator.cc

Figure: Architecture Diagram for Code Generation Interface

In …/generator.cc, the interface functions orchestrate the generation of C/C++ header and source files that are essential for the Protobuf definition pool and accessor functions. The process begins with the parsing of command-line options, which influence the generation of export declarations in the output files. The dllexport_decl option is a notable example, guiding the creation of DLL-compatible code.

Fast Decode Table

References: upb_generator/minitable/fasttable.cc, upb_generator/minitable/fasttable.h

Figure: Architecture Diagram for Fast Decode Table

The FastDecodeTable() function in …/fasttable.cc generates an optimized table for efficient Protobuf message parsing.

Byte Size Calculation

References: upb/wire/byte_size.c, upb/wire/byte_size.h

Figure: Architecture Diagram for Byte Size Calculation

The upb_ByteSize() function calculates the byte size of a Protobuf message. It takes a upb_Message* and a upb_MiniTable* as input and returns the size as a size_t.
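
How a serialized size can be computed without producing the bytes is shown in the Python sketch below. It does not reproduce how upb_ByteSize() works internally; it only applies the wire-format rules to a hypothetical list of `(field_number, wire_type, value)` tuples, handling non-negative varints (wire type 0) and length-delimited values (wire type 2). All function names are illustrative.

```python
def varint_size(value: int) -> int:
    """Number of bytes a non-negative integer occupies as a varint."""
    size = 1
    while value >= 0x80:  # each byte carries 7 payload bits
        value >>= 7
        size += 1
    return size


def field_size(field_number: int, wire_type: int, value) -> int:
    """Encoded size of one field: tag plus payload."""
    tag_size = varint_size((field_number << 3) | wire_type)
    if wire_type == 0:    # VARINT payload
        return tag_size + varint_size(value)
    if wire_type == 2:    # LEN payload: length prefix plus the bytes themselves
        return tag_size + varint_size(len(value)) + len(value)
    raise ValueError("wire type not handled in this sketch: %d" % wire_type)


def byte_size(fields) -> int:
    """Total encoded size of a flat list of fields."""
    return sum(field_size(number, wire_type, value)
               for number, wire_type, value in fields)


print(byte_size([(1, 0, 150), (2, 2, b"testing")]))  # 12
```

Knowing the size up front lets a caller allocate an output buffer once, or write a length prefix before the nested message it describes.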

Mini-Table Generation

References: upb_generator/minitable/generator.cc, upb_generator/minitable/main.cc, upb_generator/minitable/names.cc, upb_generator/minitable/names.h, upb_generator/minitable/names_internal.cc, upb_generator/minitable/names_internal.h

Figure: Architecture Diagram for Mini-Table Generation

Mini-table generation is handled primarily by the WriteMiniTableHeader(), WriteMiniTableSource(), and WriteMiniTableMultipleSources() functions in …/generator.cc. These functions generate the necessary C code for representing Protocol Buffer messages, enums, and extensions in a compact format.

Common Utilities

References: upb_generator/common/names.cc, upb_generator/common/names.h, upb_generator/common.cc, upb_generator/common.h

Figure: Architecture Diagram for Common Utilities

The UPB library provides a set of common utility functions for name mangling and identifier generation, primarily used in code generation tasks. These utilities are implemented in the …/common directory.

Conformance Testing

References: conformance

Figure: Architecture Diagram for Conformance Testing

The ConformanceTestSuite class in …/conformance_test.h manages the execution of conformance tests.

Conformance Test Suite

References: conformance/binary_json_conformance_suite.cc

Figure: Architecture Diagram for Conformance Test Suite

The BinaryAndJsonConformanceSuite class in …/binary_json_conformance_suite.cc implements a comprehensive test suite for Protocol Buffer binary and JSON formats.

Failure Lists

References: conformance/failure_list_php_c.txt, conformance/failure_list_ruby.txt, conformance/failure_list_jruby_ffi.txt, upb/conformance/conformance_upb_failures.txt

Figure: Architecture Diagram for Failure Lists

Failure lists document expected failures in conformance tests for different language implementations of Protocol Buffers. These lists are maintained in separate files for PHP, Ruby, JRuby FFI, and the UPB library.
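
One way such a list could be used is to compare it with the failures observed in a test run. The Python sketch below assumes a file format of one test name per line with `#` starting a comment; that format, the function names, and the test names in the example are assumptions for illustration, not taken from the conformance runner.

```python
def load_failure_list(text: str) -> set:
    """Parse a failure list: one test name per line, '#' starts a comment."""
    names = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            names.add(name)
    return names


def classify(expected_failures: set, actual_failures: set) -> tuple:
    """Return (unexpected failures, tests that were expected to fail but passed)."""
    unexpected = sorted(actual_failures - expected_failures)
    unexpectedly_passing = sorted(expected_failures - actual_failures)
    return unexpected, unexpectedly_passing


# Hypothetical list contents and run results.
expected = load_failure_list("""
# Known gaps in this implementation
Example.JsonInput.FieldA   # tracked separately
Example.BinaryInput.FieldB
""")
actual = {"Example.JsonInput.FieldA", "Example.TextInput.FieldC"}
unexpected, now_passing = classify(expected, actual)
```

Reporting both directions keeps the list honest: new regressions are flagged, and entries for tests that have since been fixed can be removed.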

Benchmarking

References: benchmarks

Figure: Architecture Diagram for Benchmarking

The benchmarking tools in the benchmarks directory provide a comprehensive suite for measuring Protocol Buffers performance.
