
APIs stack and repositories structure

Leandro Lucarella edited this page Mar 28, 2024 · 7 revisions

Glossary

  • API: An "Application Programming Interface". In this context, it is the definition of how to interact with a certain service. Implementation-wise, it is a set of proto files defining a gRPC interface. This is programming-language agnostic.

  • Binding: Programming language-specific code that implements the API messages as structures and RPC calls as functions. Usually automatically generated by protoc (a command-line tool to generate language-specific code from proto files). Bindings can be used to implement an API service and/or client. We only generate public bindings for Python for now.

  • Service: A process (daemon, server) that implements an API. It is implemented using the bindings for the language the service is built in, for example Rust or Go.

  • Client: A library that can be used to connect to and interact with a service. Clients are programming-language-specific and use the bindings for that language underneath. A client provides a more convenient and idiomatic interface than the (automatically generated) bindings. For Python, it also converts gRPC streams to frequenz-channels-python channels.

  • Actor: In the context of APIs, an actor is a library that uses the client to implement the client-side business logic of the API. It usually has some background tasks talking to the service and keeping some state.

  • High-level interface: Another library layer that provides a more convenient and idiomatic way to interact with the actor by using simple function calls to send information to it, or channels to receive information from it. It is usually in charge of starting the actor and managing its life-cycle too.

  • Application: A process (daemon, server) that usually mixes several APIs' high-level interfaces (although it can also use only one) to implement business logic or "use cases", like peak shaving or FCR.
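The layering above can be sketched in plain Python. All names below are hypothetical: real bindings are generated by protoc and a real client would use frequenz-channels-python, but here an asyncio.Queue stands in for a channel so the sketch is self-contained and runnable:

```python
import asyncio
from dataclasses import dataclass


# --- Bindings layer (normally generated by protoc from the proto files) ---
@dataclass
class PowerRequest:  # hypothetical generated message
    component_id: int


@dataclass
class PowerSample:  # hypothetical generated message
    component_id: int
    watts: float


class MicrogridStub:  # hypothetical generated gRPC stub
    async def stream_power(self, request: PowerRequest):
        # A real stub would stream responses from the service; we fake two.
        for watts in (100.0, 150.0):
            yield PowerSample(request.component_id, watts)


# --- Client layer (hand-written, idiomatic wrapper over the bindings) ---
class MicrogridClient:
    def __init__(self, stub: MicrogridStub) -> None:
        self._stub = stub

    async def power_samples(self, component_id: int) -> asyncio.Queue:
        """Convert the gRPC stream into a channel-like queue."""
        queue: asyncio.Queue = asyncio.Queue()

        async def pump() -> None:
            async for sample in self._stub.stream_power(PowerRequest(component_id)):
                await queue.put(sample)
            await queue.put(None)  # end-of-stream marker

        asyncio.create_task(pump())
        return queue


# --- Actor layer (background logic consuming the client) ---
async def average_power(client: MicrogridClient, component_id: int) -> float:
    watts: list[float] = []
    queue = await client.power_samples(component_id)
    while (sample := await queue.get()) is not None:
        watts.append(sample.watts)
    return sum(watts) / len(watts)


async def main() -> float:
    return await average_power(MicrogridClient(MicrogridStub()), component_id=7)


print(asyncio.run(main()))  # → 125.0, the average of the two fake samples
```

A High-level interface would sit on top of the actor, hiding the queue behind simple function calls and managing the actor's life-cycle.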

Stack Diagram

(Stack diagram image)

Repository structure

Current state of affairs

Currently we have the following repository structure to represent the above stack:

  • frequenz-client-base-python: Python library to write async gRPC clients using channels.

  • frequenz-api-xxx: The API (protobuf definitions) and the Python Bindings. Sometimes it also contains the Python Client.

    Internal dependencies: frequenz-api-common. If it contains the Client too, then it also depends on frequenz-client-base-python.

  • frequenz-service-xxx: The Service implementation.

    Internal dependencies: frequenz-api-xxx.

  • frequenz-client-xxx-python: The Python Client implementation.

    Internal dependencies: frequenz-api-xxx, frequenz-client-base-python.

  • frequenz-actor-xxx: The Actor implementation.

    Internal dependencies: frequenz-api-xxx or frequenz-client-xxx if there is one, frequenz-channels-python, frequenz-sdk-python.

  • frequenz-sdk-python: The High-level interface implementation.

    Internal dependencies: frequenz-api-xxx or frequenz-client-xxx or frequenz-actor-xxx if there is one, frequenz-channels-python.

    So far we only have one High-level interface implementation, for frequenz-api-microgrid. And the Microgrid API doesn't have a client or actor repository yet; all of them live in the SDK.

API (protobuf) dependencies

All APIs' protobuf definitions depend on the frequenz-api-common repository, which hosts some common definitions shared by many APIs.

The frequenz-api-common repository depends on googleapis/api-common-protos, which hosts some core extensions to the base protobuf types. So all APIs depend on this repository too, at least indirectly.

Some APIs will also depend on googleapis/api-common-protos directly.

flowchart BT
    google[googleapis/api-common-protos]
    common[frequenz-api-common]
    api1[frequenz-api-xxx]
    api2[frequenz-api-yyy]
    apiN[frequenz-api-...]


    api1 ---> google
    common --> google

    style s opacity:0
    subgraph s[" "]
        api1 --> common
        api2 --> common
        apiN --> common
    end

    apiN -.->|sometimes| google

Future direction

We want to have a more consistent approach for mapping the stack layers to repositories. This is the bare minimum of splitting and homogenization we want to achieve:

  • frequenz-client-base-python: Python library to write async gRPC clients using channels.

  • frequenz-api-xxx: The API (protobuf definitions).

    Internal dependencies: frequenz-api-common.

  • frequenz-service-xxx: The Service implementation.

    Internal dependencies: frequenz-api-xxx.

  • frequenz-client-xxx-python: The Python Client implementation and the Python Bindings. It is still under discussion whether we want a separate repository for the Bindings, split from the Client.

    Internal dependencies: frequenz-api-xxx, frequenz-client-base-python.

  • frequenz-actor-xxx: The Actor implementation.

    Internal dependencies: frequenz-api-xxx or frequenz-client-xxx if there is one, frequenz-channels-python, frequenz-sdk-python.

  • frequenz-sdk-python: The High-level interface implementation.

Still in discussion

Further splitting

We will probably want to eventually separate the Bindings and the Client into independent repositories, as their versions and release cycles could differ.

We might also want to put the High-level interface into its own repository, so the SDK's dependencies on the different APIs can be made optional: if one wants to write an Application that only uses the Electricity Trading API (for example), it should not be necessary to pull in the dependencies of all other APIs. This would mean splitting the SDK into smaller reusable packages, as we did with frequenz-channels-python, like frequenz-actor-python, frequenz-quantity-python, etc.

See https://github.com/frequenz-floss/frequenz-sdk-python/discussions/854 for more details.

  • frequenz-api-bindings-xxx-python: The Python Bindings (only automatically generated files).

    Internal dependencies: frequenz-api-xxx.

  • frequenz-client-xxx-python: The Python Client implementation (only manually written code).

    Internal dependencies: frequenz-api-bindings-xxx-python, frequenz-client-base-python.

  • frequenz-actor-xxx: The Actor implementation.

    Internal dependencies: frequenz-client-xxx, frequenz-channels-python (frequenz-actor-python, frequenz-quantity-python, etc.).

  • frequenz-xxx-python: The High-level interface implementation.

    Internal dependencies: frequenz-actor-xxx, frequenz-channels-python (frequenz-actor-python, frequenz-quantity-python, etc.).

  • frequenz-sdk-python: Glue code to bring different APIs and common infrastructure (like configuration and logging management) together.

    Internal dependencies: frequenz-xxx-python, frequenz-yyy-python, ..., frequenz-channels-python, frequenz-actor-python, frequenz-quantity-python, etc.

Complete rewrite of the structure

When we initially started with the current structure, we were very SDK-centric. We saw the SDK mostly as a mono-repo containing all the clients for all APIs, the actors, and the high-level structure. This is what we have now with the Microgrid API. So, also in terms of Python module structure, most things live in frequenz.sdk.xxx.

It is becoming more evident that there would be an advantage to being more API-centric (or subsystem-centric), making the SDK just the glue where everything comes together.

Having too many repositories has also proven to be confusing. Splitting is necessary to keep repos manageable and to be able to control release cycles and introduce breaking changes independently, but the number of repositories should be kept to a minimum.

There is also the fact that using api for the protobuf definitions has been confusing: the term API is very broad, and it could also refer to the server/service or the client. So having a repo called frequenz-api-xxx causes a lot of confusion too, and it is not obvious that it contains the protobuf/gRPC definitions. The reason for having api in the name was that we didn't want to expose which underlying protocol the API uses, as we could also offer other interfaces, like REST or GraphQL, in the future. Still, it seems useful to know exactly which kind of specification a repo uses, and if we need another specification it should probably go in a different repo.

The api-common repository also caused a lot of confusion, as it spawned a client-common, which is even more confusing because there is really no client in it, and there is also client-base, which adds even more confusion.

To address all of those issues, another, much more radical proposal would be to have APIs at the top-level namespace inside frequenz instead, and to have a completely separate namespace for common and utility stuff:

  • frequenz-grpcclient-python (common utilities for implementing gRPC clients in Python, module: frequenz.grpcclient)
  • frequenz-common-proto-spec (only common .proto files, Protobuf package: frequenz.api.common)
  • frequenz-common-proto-bindings-python (only stuff generated from frequenz-common-proto-spec for Python, module: frequenz.api.common)
  • frequenz-common-proto-python (hand-written wrappers for frequenz-common-proto-bindings-python, module: frequenz.common.proto)

And for each API/subsystem (here xxx is an API, like dispatch, microgrid, reporting, etc.):

  • frequenz-xxx-grpc-spec (only .proto files, Protobuf package: frequenz.api.xxx)
  • frequenz-xxx-service (server implementation for the API, uses frequenz-xxx-grpc-spec)
  • frequenz-xxx-bindings-python (only stuff generated from frequenz-xxx-grpc-spec for Python, module: frequenz.api.xxx)
  • frequenz-xxx-client-python (Python client implementation, includes manually-written wrappers for frequenz-xxx-bindings-python and uses frequenz-grpcclient-python, module: frequenz.xxx.client)
  • frequenz-xxx-python (high-level interface, including the actor, uses frequenz-xxx-client-python, module: frequenz.xxx)

In general it might look like splitting the -bindings repos is a bit too fine-grained and that they should live directly in the -client repos. But because these repositories contain only generated code and strictly follow the -proto-spec/-grpc-spec repos, they can eventually be fully automated: each time there is a release of the spec, the bindings are regenerated and a release of the bindings repo is made automatically. In the long run this would make the -client repos less complex (as they won't have to deal with the protoc-generated code) and easier to maintain.
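As a sketch of that automation, the regeneration step could be as small as a CI job invoking grpc_tools.protoc (the standard Python gRPC code generator). The helper below only builds the command line; the repository layout and file paths are illustrative, not a committed convention:

```python
# Build the grpc_tools.protoc invocation a CI job could run on each release
# of the spec repo to regenerate the Python bindings (paths are illustrative).
def protoc_command(proto_dir: str, out_dir: str, proto_files: list[str]) -> list[str]:
    return [
        "python", "-m", "grpc_tools.protoc",
        f"-I{proto_dir}",                # where the .proto files live
        f"--python_out={out_dir}",       # generated message code
        f"--grpc_python_out={out_dir}",  # generated gRPC stubs
        *proto_files,
    ]


cmd = protoc_command("proto", "src", ["proto/frequenz/api/xxx/xxx.proto"])
print(" ".join(cmd))
```

The CI job would then run this command (with grpcio-tools installed), commit the generated files, and tag a release of the bindings repo.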

Then repo dependencies would look like:

flowchart BT
    grpcclient[frequenz-grpcclient-python]
    common-spec[frequenz-common-proto-spec]
    common-bind[frequenz-common-proto-bindings-python]
    common-wrap[frequenz-common-proto-python]
    xxx-spec[frequenz-xxx-grpc-spec]
    xxx-service[frequenz-xxx-service]
    xxx-bind[frequenz-xxx-bindings-python]
    xxx-client[frequenz-xxx-client-python]
    xxx[frequenz-xxx-python]

    xxx --> xxx-client
    xxx-client -.->|sometimes| common-wrap
    xxx-client --> xxx-bind
    xxx-client --> grpcclient
    xxx-bind --> xxx-spec
    xxx-service --> xxx-spec
    xxx-spec -.->|sometimes| common-spec
    common-wrap --> common-bind
    common-bind --> common-spec