LaurensVijnck/ProtoGen

Proto-to-BigQuery

Proto-to-BigQuery is a Protocol Buffers compiler plugin.

Automatic schema migration

Proto-to-BQ can be used to build workflows that support end-to-end automatic schema migration.

Proto-to-BQ extracts a BigQuery table schema from an annotated Protobuf file. The plugin also generates the associated mapper functions that convert a Protobuf message to a BigQuery table row. The resulting row can then be inserted into the BigQuery table, because it is guaranteed to satisfy the table schema.
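The relationship between an extracted schema and a mapper function can be sketched in a few lines of Python. This is an illustration only, not the code the plugin generates: `FieldSpec`, `extract_schema`, `to_row` and `EVENT_FIELDS` are hypothetical names, and the sketch ignores nested messages and any timestamp conversion. The schema entries follow BigQuery's JSON schema layout (`name`, `type`, `mode`, `description`).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical stand-in for one annotated Protobuf field."""
    proto_name: str
    bq_type: str
    required: bool = False
    repeated: bool = False
    alias: Optional[str] = None
    description: str = ""

    @property
    def column(self) -> str:
        # An alias, when present, replaces the Protobuf field name.
        return self.alias or self.proto_name

    @property
    def mode(self) -> str:
        if self.repeated:
            return "REPEATED"
        return "REQUIRED" if self.required else "NULLABLE"


def extract_schema(fields: List[FieldSpec]) -> List[Dict[str, str]]:
    """Build a BigQuery-style JSON schema from the field specs."""
    return [
        {"name": f.column, "type": f.bq_type, "mode": f.mode,
         "description": f.description}
        for f in fields
    ]


def to_row(message: Dict[str, Any], fields: List[FieldSpec]) -> Dict[str, Any]:
    """Map a message (modelled here as a plain dict) to a table row."""
    row: Dict[str, Any] = {}
    for f in fields:
        value = message.get(f.proto_name)
        if value is None:
            if f.required and not f.repeated:
                raise ValueError(f"missing required field: {f.proto_name}")
            value = [] if f.repeated else None
        row[f.column] = value
    return row


# Two fields loosely modelled on the Event message in this README (assumed types).
EVENT_FIELDS = [
    FieldSpec("tenant_id", "STRING", required=True),
    FieldSpec("epoch_timestamp_millis", "INT64", required=True, alias="event_time"),
]
```

Because the schema and the mapper are derived from the same field specs, a row returned by `to_row` uses exactly the column names listed by `extract_schema`, which is the property the plugin relies on.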

Real-time ingestion

The repository includes a generic pipeline template that uses Proto-to-BQ to produce a streaming BigQuery ingestion pipeline.

Infrastructure-as-Code

The repository includes the source code to create the BigQuery tables through Terraform. Terraform currently updates the BigQuery table schemas based on the output of the Proto-to-BQ plugin.
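One way to hand a schema to Terraform is as a JSON file, since the `google_bigquery_table` resource accepts its `schema` argument as a JSON string. The Python sketch below shows that hand-off under assumptions: `write_schema_file` and the file path are hypothetical, and the repository's actual layout and file format may differ.

```python
import json
from pathlib import Path
from typing import Dict, List


def write_schema_file(schema: List[Dict[str, str]], path: str) -> Path:
    """Serialize a BigQuery JSON schema to disk (hypothetical helper).

    Keys are sorted and the output is indented so that the file is
    byte-for-byte identical when the schema has not changed.
    """
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    text = json.dumps(schema, indent=2, sort_keys=True) + "\n"
    target.write_text(text, encoding="utf-8")
    return target


# Assumed example schema with two columns.
EXAMPLE_SCHEMA = [
    {"name": "tenant_id", "type": "STRING", "mode": "REQUIRED"},
    {"name": "event_time", "type": "TIMESTAMP", "mode": "REQUIRED"},
]
```

Writing the file deterministically means a Terraform plan only reports a schema change when the annotated Protobuf file actually changed.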

Supported features

Table-level features

  • Table name
  • Table description
  • Table partitioning
  • Range partitioning
  • Clustering

Attribute-level features

  • Aliases
  • Field description
  • Default values
  • Batch attribute*

Annotating a Protocol Buffers file

message Tag {
    string tag_name = 1;
    string tag_code = 2;
    string tag_namespace = 3;
}

message BatchEvent {
    repeated Tag tags = 1 [(description) = "Tags associated with the event"];
}

message Event {
    // BigQuery metadata
    option (bq_root_options).table_name = "event_table";
    option (bq_root_options).table_description = "ProtoToBQ generated table for events";
    option (bq_root_options).time_partitioning = true;
    option (bq_root_options).time_partitioning_expiration_ms = 15552000000;

    // Fields (the Client message type is assumed to be defined elsewhere)
    Client client = 1 [(required) = false, (description) = "Owner of the event"];
    repeated BatchEvent events = 2 [(required) = true, (batch_attribute) = true];
    optional int64 epoch_timestamp_millis = 3 [(required) = true, (time_partitioning_attribute) = true, (timestamp_attribute) = true, (alias) = "event_time"];
    optional string tenant_id = 4 [(required) = true, (clustering_attribute) = true];
}
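To make the table-level options in the Event message concrete, the Python sketch below builds a table definition shaped like a BigQuery REST API table resource. It is an approximation under assumptions, not the plugin's output: `table_definition` is a hypothetical helper, daily partitioning is assumed because the annotations do not name a granularity, and the partitioning column is assumed to be the aliased `event_time` column.

```python
from typing import Any, Dict, Optional, Sequence

MS_PER_DAY = 24 * 60 * 60 * 1000


def table_definition(
    table_name: str,
    description: str,
    partition_field: Optional[str] = None,
    partition_expiration_ms: Optional[int] = None,
    clustering_fields: Sequence[str] = (),
) -> Dict[str, Any]:
    """Build a dict resembling a BigQuery table resource (hypothetical helper)."""
    table: Dict[str, Any] = {
        "tableReference": {"tableId": table_name},
        "description": description,
    }
    if partition_field is not None:
        # Assumption: daily partitions; the annotations do not state a granularity.
        partitioning: Dict[str, str] = {"type": "DAY", "field": partition_field}
        if partition_expiration_ms is not None:
            # The REST API represents 64-bit integers as strings.
            partitioning["expirationMs"] = str(partition_expiration_ms)
        table["timePartitioning"] = partitioning
    if clustering_fields:
        table["clustering"] = {"fields": list(clustering_fields)}
    return table


# Values taken from the Event message above.
EVENT_TABLE = table_definition(
    "event_table",
    "ProtoToBQ generated table for events",
    partition_field="event_time",
    partition_expiration_ms=15_552_000_000,
    clustering_fields=["tenant_id"],
)

# 15552000000 ms corresponds to a partition expiration of 180 days.
EXPIRATION_DAYS = 15_552_000_000 // MS_PER_DAY
```

In this reading, `time_partitioning_expiration_ms = 15552000000` keeps each partition for 180 days, and `tenant_id` becomes the single clustering column.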
