Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Getting rid of duplicate state variables #116

Open
oxzi opened this issue Oct 25, 2023 · 4 comments
Open

Getting rid of duplicate state variables #116

oxzi opened this issue Oct 25, 2023 · 4 comments

Comments

@oxzi
Copy link
Member

oxzi commented Oct 25, 2023

In this issue I would like to point out the potential problems of the implicit state variables, which are created in the main function and propagated from there throughout the program, and propose a change by using a single state.

State Survey

Brief listing which variables are being generated on startup and being propagated to other components.

There is currently a logical coupling between the identified variables, all holding a central state.
However, there is no central union type and, furthermore, the variables (reference) are just being passed, even multiple times.

cmd/icinga-notifications-daemon/main.go

Those four variables are generated in the main function and are being passed to the listed functions and thereby throughout the whole codebase.

  • conf *config.ConfigFile
    • db := conf.Database.Open
    • listener.NewListener
  • logs *logging.Logging
    • db := conf.Database.Open
    • runtimeConfig := config.NewRuntimeConfig
    • listener.NewListener
  • db *icingadb.DB
    • runtimeConfig := config.NewRuntimeConfig
    • listener.NewListener
  • runtimeConfig *config.RuntimeConfig
    • listener.NewListener

config.RuntimeConfig

There is exactly one RuntimeConfig instance, generated by config.NewRuntimeConfig in the main.go file as listed above.

type RuntimeConfig struct {
    // ConfigSet is the current live config. It is embedded to allow direct access to its members.
    // Accessing it requires a lock that is obtained with RLock() and released with RUnlock().
    ConfigSet
    
    // pending contains changes to config objects that are to be applied to the embedded live config.
    pending ConfigSet
    
    logs   *logging.Logging
    logger *logging.Logger
    db     *icingadb.DB
    
    // mu is used to synchronize access to the live ConfigSet.
    mu sync.RWMutex
}

func NewRuntimeConfig(db *icingadb.DB, logs *logging.Logging) *RuntimeConfig {
    return &RuntimeConfig{db: db, logs: logs, logger: logs.GetChildLogger("runtime-updates")}
}

The RuntimeConfig already holds references to the other three state variables.

listener.Listener

type Listener struct {
	configFile    *config.ConfigFile
	db            *icingadb.DB
	logger        *logging.Logger
	runtimeConfig *config.RuntimeConfig

	logs *logging.Logging
	mux  http.ServeMux
}

func NewListener(db *icingadb.DB, configFile *config.ConfigFile, runtimeConfig *config.RuntimeConfig, logs *logging.Logging) *Listener {
	l := &Listener{
		configFile:    configFile,
		db:            db,
		logger:        logs.GetChildLogger("listener"),
		logs:          logs,
		runtimeConfig: runtimeConfig,
	}
	// . . .
}

With referencing all four state variables, the Listener keeps unnecessary duplicate references.
If the RuntimeConfig fields would be public, the Listener could already drop db and logs.

From there on, the variables will be passed on as the following:

  • The conf variable will be used to either
    • access the ConfigFile.Listen and .DebugPassword field or
    • get an *incident.Incident by passing all state arguments further.
      incident.GetCurrent(ctx, l.db, obj, l.logs.GetChildLogger("incident"), l.runtimeConfig, l.configFile, createIncident)
  • logs is only in use for the same Incident line.
  • db will be used to
    • create an *object.Object from an *event.Event - which itself also keeps a reference -,
      obj, err := object.FromEvent(ctx, l.db, &ev)
    • synchronize this very Event and
      if err := ev.Sync(ctx, tx, l.db, obj.ID); err != nil {
    • also get the current Incident as listed for the conf above.
  • Finally, runtimeConfig is in use
    • for the same Inicident as all other state variables and
    • to dump the config by accessing the inherited ConfigSet struct.
      _ = enc.Encode(&l.runtimeConfig.ConfigSet)

object.Object

Within an object.Object, the db reference is used to create statements.

stmt, _ := object.db.BuildUpsertStmt(&ObjectRow{})
// . . . 
stmt, _ := object.db.BuildInsertStmt(extraTag)

incident.Incident

An *incident.Incident will be created resp. fetched through the GetCurrent function referred multiple times in the listener.Listener section above.
This method stores multiple Incidents based on their *object.Object, each holding all state variables except logs.

While the logs variable is being passed into the incidents.GetCurrent function, it is used there to generate an logging.Logger to be stored for each Incident.

  • The conf variable is read for its .ChannelPluginDir and .Icingaweb2URL values, being passed to a *channel.Channel extracted from the runtimeConfig.
  • The db is also used for statement generation and IncidientRow synchronization.
    err := incidentRow.Sync(ctx, tx, i.db, i.incidentRowID != 0)
  • Finally, runtimeConfig is used to read and alter its stored values.

However, it does not seem like the state variables will be propagated further from this point on.

Suggested Change

First, minimize the amount of state variables to one single variable, holding all information.
As mentioned above, the config.RuntimeConfig can be altered accordingly.

type RuntimeConfig struct {
    // ConfigSet is the current live config. It is embedded to allow direct access to its members.
    // Accessing it requires a lock that is obtained with RLock() and released with RUnlock().
    ConfigSet
    
    // pending contains changes to config objects that are to be applied to the embedded live config.
    pending ConfigSet

    logger *logging.Logger
	
    Conf *ConfigFile
    Logs *logging.Logging
    Db   *icingadb.DB
    
    // mu is used to synchronize access to the live ConfigSet.
    mu sync.RWMutex
}

With this small change to the RuntimeConfig, it can be used as the single state variable in other places.

Next, the hundreds of references to the very same instance should be purged.
As the other packages are already referring to the config package, a dependency does already exist.
Thus, a unique state can be placed there, e.g., by following the singleton pattern.

Both generation and access to the RuntimeConfig might look like the following:

var runtimeConfig *RuntimeConfig

func GetRuntimeConfig() *RuntimeConfig {
	if runtimeConfig == nil {
		// Would be a nil pointer error anyway..
		panic("the singleton RuntimeConfig variable was not yet initialized")
	}
	
	return runtimeConfig
}

func NewRuntimeConfig(conf *ConfigFile, db *icingadb.DB, logs *logging.Logging) (*RuntimeConfig, error) {
	if runtimeConfig != nil {
		return nil, errors.New("a RuntimeConfig does already exist")
	}

	runtimeConfig = &RuntimeConfig{
		Conf:   conf,
		Db:     db,
		Logs:   logs,
		logger: logs.GetChildLogger("runtime-updates"),
	}
	return runtimeConfig, nil
}

This pattern just adds a bit of sugar around a simple global variable.
For testing, however, mocking the RuntimeConfig is still easy as it can simply be altered.

The effects might look, when only changing the listener.Listener for this example's sake, like this:

diff --git a/internal/listener/listener.go b/internal/listener/listener.go
index 98968a3..674a66d 100644
--- a/internal/listener/listener.go
+++ b/internal/listener/listener.go
@@ -10,7 +10,6 @@ import (
 	"github.com/icinga/icinga-notifications/internal/event"
 	"github.com/icinga/icinga-notifications/internal/incident"
 	"github.com/icinga/icinga-notifications/internal/object"
-	"github.com/icinga/icingadb/pkg/icingadb"
 	"github.com/icinga/icingadb/pkg/logging"
 	"go.uber.org/zap"
 	"net/http"
@@ -18,22 +17,14 @@ import (
 )
 
 type Listener struct {
-	configFile    *config.ConfigFile
-	db            *icingadb.DB
-	logger        *logging.Logger
-	runtimeConfig *config.RuntimeConfig
+	logger *logging.Logger
 
-	logs *logging.Logging
-	mux  http.ServeMux
+	mux http.ServeMux
 }
 
-func NewListener(db *icingadb.DB, configFile *config.ConfigFile, runtimeConfig *config.RuntimeConfig, logs *logging.Logging) *Listener {
+func NewListener() *Listener {
 	l := &Listener{
-		configFile:    configFile,
-		db:            db,
-		logger:        logs.GetChildLogger("listener"),
-		logs:          logs,
-		runtimeConfig: runtimeConfig,
+		logger: config.GetRuntimeConfig().Logs.GetChildLogger("listener"),
 	}
 	l.mux.HandleFunc("/process-event", l.ProcessEvent)
 	l.mux.HandleFunc("/dump-config", l.DumpConfig)
@@ -47,8 +38,8 @@ func (l *Listener) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
 }
 
 func (l *Listener) Run() error {
-	l.logger.Infof("Starting listener on http://%s", l.configFile.Listen)
-	return http.ListenAndServe(l.configFile.Listen, l)
+	l.logger.Infof("Starting listener on http://%s", config.GetRuntimeConfig().Conf.Listen)
+	return http.ListenAndServe(config.GetRuntimeConfig().Conf.Listen, l)
 }
 
 func (l *Listener) ProcessEvent(w http.ResponseWriter, req *http.Request) {
@@ -93,7 +84,7 @@ func (l *Listener) ProcessEvent(w http.ResponseWriter, req *http.Request) {
 	}
 
 	ctx := context.Background()
-	obj, err := object.FromEvent(ctx, l.db, &ev)
+	obj, err := object.FromEvent(ctx, config.GetRuntimeConfig().Db, &ev)
 	if err != nil {
 		l.logger.Errorw("Can't sync object", zap.Error(err))
 
@@ -102,7 +93,7 @@ func (l *Listener) ProcessEvent(w http.ResponseWriter, req *http.Request) {
 		return
 	}
 
-	tx, err := l.db.BeginTxx(ctx, nil)
+	tx, err := config.GetRuntimeConfig().Db.BeginTxx(ctx, nil)
 	if err != nil {
 		l.logger.Errorw("Can't start a db transaction", zap.Error(err))
 
@@ -112,7 +103,7 @@ func (l *Listener) ProcessEvent(w http.ResponseWriter, req *http.Request) {
 	}
 	defer func() { _ = tx.Rollback() }()
 
-	if err := ev.Sync(ctx, tx, l.db, obj.ID); err != nil {
+	if err := ev.Sync(ctx, tx, config.GetRuntimeConfig().Db, obj.ID); err != nil {
 		l.logger.Errorw("Failed to insert event and fetch its ID", zap.String("event", ev.String()), zap.Error(err))
 
 		w.WriteHeader(http.StatusInternalServerError)
@@ -121,7 +112,7 @@ func (l *Listener) ProcessEvent(w http.ResponseWriter, req *http.Request) {
 	}
 
 	createIncident := ev.Severity != event.SeverityNone && ev.Severity != event.SeverityOK
-	currentIncident, created, err := incident.GetCurrent(ctx, l.db, obj, l.logs.GetChildLogger("incident"), l.runtimeConfig, l.configFile, createIncident)
+	currentIncident, created, err := incident.GetCurrent(ctx, config.GetRuntimeConfig().Db, obj, config.GetRuntimeConfig().Logs.GetChildLogger("incident"), config.GetRuntimeConfig(), config.GetRuntimeConfig().Conf, createIncident)
 	if err != nil {
 		w.WriteHeader(http.StatusInternalServerError)
 		_, _ = fmt.Fprintln(w, err)
@@ -177,7 +168,7 @@ func (l *Listener) ProcessEvent(w http.ResponseWriter, req *http.Request) {
 // checkDebugPassword checks if the valid debug password was provided. If there is no password configured or the
 // supplied password is incorrect, it sends an error code and returns false. True is returned if access is allowed.
 func (l *Listener) checkDebugPassword(w http.ResponseWriter, r *http.Request) bool {
-	expectedPassword := l.configFile.DebugPassword
+	expectedPassword := config.GetRuntimeConfig().Conf.DebugPassword
 	if expectedPassword == "" {
 		w.WriteHeader(http.StatusForbidden)
 		_, _ = fmt.Fprintln(w, "config dump disables, no debug-password set in config")
@@ -211,7 +202,7 @@ func (l *Listener) DumpConfig(w http.ResponseWriter, r *http.Request) {
 
 	enc := json.NewEncoder(w)
 	enc.SetIndent("", "  ")
-	_ = enc.Encode(&l.runtimeConfig.ConfigSet)
+	_ = enc.Encode(&config.GetRuntimeConfig().ConfigSet)
 }
 
 func (l *Listener) DumpIncidents(w http.ResponseWriter, r *http.Request) {

In a nutshell, all state variables were purged from the struct (might still be passed to other, non updated types).
The config.GetRuntimeConfig() call makes it, imho, easier to signal resp. identify a state change from this package's or type's internal data to some common state.

Implications

While less references to the very same data flies around, still the same data is being modified.
Furthermore, it might be easier to spot the difference between "local" and "shared" data.

@julianbrost
Copy link
Collaborator

Why do you favor using RuntimeConfig over adding a new type just for that purpose? Is it more than it's already there and has all the members?

So far the purpose of RuntimeConfig is to be a copy of the config stored in the database and handle updates of that. For this, it embeds ConfigSet from which it gains a few members that aren't directly obvious from the struct definition and also has methods like for example RLock(). With the proposed change, I think it would be less obvious what belongs together, so I think I'd favor keeping a separate type for the in-memory representation of the config from the database.

In a nutshell, all state variables were purged from the struct (might still be passed to other, non updated types).

Would you keep them that way, i.e. it's passed as a function parameter for that specific function, keep it as a parameter? Or would you want to eliminate all passing around of these values in function parameters?

@oxzi
Copy link
Member Author

oxzi commented Oct 25, 2023

Why do you favor using RuntimeConfig over adding a new type just for that purpose? Is it more than it's already there and has all the members?

Mostly because it is already there and the name is somewhat fitting.

So far the purpose of RuntimeConfig is to be a copy of the config stored in the database and handle updates of that. For this, it embeds ConfigSet from which it gains a few members that aren't directly obvious from the struct definition and also has methods like for example RLock(). With the proposed change, I think it would be less obvious what belongs together, so I think I'd favor keeping a separate type for the in-memory representation of the config from the database.

I would agree with that.
Otherwise, two different things would coincide here, which actually have nothing to do with each other.
Pardon.

In a nutshell, all state variables were purged from the struct (might still be passed to other, non updated types).

Would you keep them that way, i.e. it's passed as a function parameter for that specific function, keep it as a parameter? Or would you want to eliminate all passing around of these values in function parameters?

No, I would strictly enforce this change in the whole code to always signal when the shared state is being used.
In this concrete case, I only did a partial implementation just for the listener.Listener - mostly by replacing the occurrences of the removed state variables with the singleton access method.

This side note was meant to say that in a final implementation the individual state variables are not passed through at redundant places.

-	obj, err := object.FromEvent(ctx, l.db, &ev)
+	obj, err := object.FromEvent(ctx, &ev)

For example, this line would more look like this, as object.FromEvent itself would get the database reference in its method scope.

@julianbrost
Copy link
Collaborator

I just remembered that the object and incident caches are already realized as global variables at the moment.

var (
cache = make(map[string]*Object)
cacheMu sync.Mutex
)

var (
currentIncidents = make(map[*object.Object]*Incident)
currentIncidentsMu sync.Mutex
)

Those should probably also be taken into account as well. So if we realize the suggested approach, this would probably mean moving them to a type with the functions that use those variables and then adding pointers to those instances as well. And my feeling is that by having a central place for all those variables that are used pretty much everywhere, at some point, we'll run into cyclic imports.

So I wouldn't rule out using individual variables near where the actual types are defined (maybe similar to for example DefaultServeMux in net/http).

Given that you already show some diffs, I presume you're trying some things out locally. Have you tried other approaches so far were you maybe ran into problems?

@oxzi
Copy link
Member Author

oxzi commented Oct 26, 2023

I just remembered that the object and incident caches are already realized as global variables at the moment.

var (
cache = make(map[string]*Object)
cacheMu sync.Mutex
)

var (
currentIncidents = make(map[*object.Object]*Incident)
currentIncidentsMu sync.Mutex
)

Those should probably also be taken into account as well. So if we realize the suggested approach, this would probably mean moving them to a type with the functions that use those variables and then adding pointers to those instances as well. And my feeling is that by having a central place for all those variables that are used pretty much everywhere, at some point, we'll run into cyclic imports.

So I wouldn't rule out using individual variables near where the actual types are defined (maybe similar to for example DefaultServeMux in net/http).

I would agree as those global variables are bound to their package's scope and are only being used in one resp. two functions.
They are not used from the outside and are more like C static variables and are not relevant for external callers.

Thus, and also considering cyclic imports, I would only introduce the new proposed behavior for those global or state variables which are being referenced from different packages and are (at least somewhat) derived from the configuration file.

Thanks for bringing up those global variables; I have just checked all other global variables, but found none which I would also extract for the reasons stated above.

Given that you already show some diffs, I presume you're trying some things out locally. Have you tried other approaches so far were you maybe ran into problems?

For one, I played with normal global variables, but didn't find the access signaled clearly enough there.
Imho, a getter method - although technically pointless - shows more clearly that it should also be accessed this way elsewhere.

Likewise, I toyed with a common struct, similar to the one proposed here, but which was passed through.
This makes the data binding more explicit, but the actual problem of the hundred references remains.
I tried to extend this idea by passing a context.Context with explicit state values, but this just felt unnecessary complicated.

Besides that, I looked at how other frameworks do it, e.g. viper.
There you have getter methods on the loaded configuration based on the value's key, which however ends up being similar, albeit more complex, to the suggestion made here.

Looking for other inspiration online, I often found a great focus on "properly" creating the state from one or many configs, but in the end you should draw the end of the owl yourself.
Especially large applications often outsourced the problem in such a way that State is external, e.g. in an etcd, and elsewhere a mixture of passing references and the singleton pattern proposal made here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants