Start using look-up tables instead of using complex conditionals? #335

Open
IndrajeetPatil opened this issue Dec 21, 2022 · 5 comments
@IndrajeetPatil
Member

After a certain point, a block of conditional statements can become quite difficult to read, maintain, and further extend.

To see what I mean, consider this example from {insight}:

transform_fun <- "exp"

if (transform_fun == "identity") {
  out <- list(transformation = function(x) x, inverse = function(x) x)
} else if (transform_fun == "log") {
  out <- list(transformation = log, inverse = exp)
} else if (transform_fun %in% c("log1p", "log(x+1)")) {
  out <- list(transformation = log1p, inverse = expm1)
} else if (transform_fun == "log10") {
  out <- list(transformation = log10, inverse = function(x) NA)
} else if (transform_fun == "log2") {
  out <- list(transformation = log2, inverse = function(x) NA)
} else if (transform_fun == "exp") {
  out <- list(transformation = exp, inverse = log)
} else if (transform_fun == "sqrt") {
  out <- list(transformation = sqrt, inverse = function(x) x^2)
} else if (transform_fun == "power") {
  out <- list(transformation = function(x) x^2, inverse = sqrt)
} else if (transform_fun == "expm1") {
  out <- list(transformation = expm1, inverse = log1p)
} else if (transform_fun == "log-log") {
  out <- list(
    transformation = function(x) log(log(x)),
    inverse = function(x) exp(exp(x))
  )
}

The alternative here is to create a look-up table, which is much easier to read and, importantly, to extend: we just need to add another row for every new transformation.

df <- tibble::tribble(
  ~transform_fun, ~out,
  "identity",     list(transformation = function(x) x, inverse = function(x) x),
  "log",          list(transformation = log, inverse = exp),
  "log1p",        list(transformation = log1p, inverse = expm1),
  "log(x+1)",     list(transformation = log1p, inverse = expm1),
  "log10",        list(transformation = log10, inverse = function(x) NA),
  "log2",         list(transformation = log2, inverse = function(x) NA),
  "exp",          list(transformation = exp, inverse = log),
  "sqrt",         list(transformation = sqrt, inverse = function(x) x^2),
  "power",        list(transformation = function(x) x^2, inverse = sqrt),
  "expm1",        list(transformation = expm1, inverse = log1p),
  "log-log",      list(transformation = function(x) log(log(x)), inverse = function(x) exp(exp(x)))
)

These two approaches, of course, yield the same result:

identical(
  out, 
  df$out[df$transform_fun == transform_fun][[1L]]
)
#> [1] TRUE

Created on 2022-12-21 with reprex v2.0.2

The only complication this introduces is making sure that this data frame is available at build time, which requires controlling the collation order (one can use #' @include to do this easily).
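To make the collation point concrete, here is a minimal sketch of how it might look with roxygen2. The file names and the objects `.transform_lookup` and `.transform_names` are hypothetical, chosen only for illustration; they are not taken from {insight}. The `@include` tag asks roxygen2 to list the named file earlier in the `Collate` field of DESCRIPTION.

```r
# --- R/transform_lookup.R (hypothetical file name) ---
# The look-up table is created once, at the top level of its own file.
.transform_lookup <- list(
  log = list(transformation = log, inverse = exp),
  exp = list(transformation = exp, inverse = log)
)

# --- R/transform_names.R (hypothetical file name) ---
# The tag below makes sure transform_lookup.R is sourced before this file.
#' @include transform_lookup.R
NULL

# Top-level code like this runs at build time, so the table must already exist.
.transform_names <- names(.transform_lookup)
```

If the table is only read inside function bodies (that is, at call time), the order of the files should not matter, so the tag is needed only when another file touches the table at the top level.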

Should we start using such look-up tables where relevant?

@IndrajeetPatil
Member Author

P.S. I am creating a tibble here, but, of course, we can use a vanilla data frame, or even a named vector if it does the trick.
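For reference, a rough sketch of the same idea with a vanilla data frame and a list column, using only base R. The name `lookup_df` is made up for this example, and only two of the transformations are included to keep it short.

```r
# One row per transformation; the key column is a plain character vector.
lookup_df <- data.frame(
  transform_fun = c("log", "exp"),
  stringsAsFactors = FALSE
)

# Assigning a list with one element per row creates a list column.
lookup_df$out <- list(
  list(transformation = log, inverse = exp),
  list(transformation = exp, inverse = log)
)

# match() returns the row index of the requested key.
transform_fun <- "exp"
out <- lookup_df$out[[match(transform_fun, lookup_df$transform_fun)]]
```

Here `out` holds the same pair of functions as the `"exp"` branch of the conditional block above.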

@mattansb
Member

Yes, but I personally prefer named vectors / lists:

out_list <- list(
  "identity" = list(transformation = function(x) x, inverse = function(x) x),
  "log" = list(transformation = log, inverse = exp),
  "log1p" = list(transformation = log1p, inverse = expm1),
  "log(x+1)" = list(transformation = log1p, inverse = expm1),
  "log10" = list(transformation = log10, inverse = function(x) NA),
  "log2" = list(transformation = log2, inverse = function(x) NA),
  "exp" = list(transformation = exp, inverse = log),
  "sqrt" = list(transformation = sqrt, inverse = function(x) x^2),
  "power" = list(transformation = function(x) x^2, inverse = sqrt),
  "expm1" = list(transformation = expm1, inverse = log1p),
  "log-log" = list(transformation = function(x) log(log(x)), inverse = function(x) exp(exp(x)))
)

transform_fun <- "exp"

out_list[[transform_fun]]
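One thing to keep in mind with the list approach: indexing a list with a name that is not there returns NULL instead of failing. A small wrapper can make that explicit. This is only a sketch; `get_transformation()` and `small_list` are hypothetical names, and the table is shortened to two entries.

```r
# Shortened version of the named list, just for this example.
small_list <- list(
  "log" = list(transformation = log, inverse = exp),
  "exp" = list(transformation = exp, inverse = log)
)

# Hypothetical helper: look up the key and fail loudly if it is missing.
get_transformation <- function(transform_fun, lookup = small_list) {
  out <- lookup[[transform_fun]]
  if (is.null(out)) {
    stop(sprintf("Unknown transformation: '%s'", transform_fun), call. = FALSE)
  }
  out
}

res <- get_transformation("exp")
```

Since `[[` matches names exactly by default, a key such as `"ex"` would not silently pick up `"exp"`.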

@IndrajeetPatil
Member Author

Me too 😅

typeof(data.frame())
#> [1] "list"

Created on 2022-12-21 with reprex v2.0.2

@bwiernik
Contributor

switch() is another option here, and one that I use a lot.

@mattansb
Member

And it has the added benefit of aliasing:

transform_fun <- "exp"

out <- switch(transform_fun,
  "identity" = list(transformation = function(x) x, inverse = function(x) x),
  "log" = list(transformation = log, inverse = exp),
  "log1p" = ,                                                                # ALIAS
  "log(x+1)" = list(transformation = log1p, inverse = expm1),
  "log10" = list(transformation = log10, inverse = function(x) NA),
  "log2" = list(transformation = log2, inverse = function(x) NA),
  "exp" = list(transformation = exp, inverse = log),
  "sqrt" = list(transformation = sqrt, inverse = function(x) x^2),
  "power" = list(transformation = function(x) x^2, inverse = sqrt),
  "expm1" = list(transformation = expm1, inverse = log1p),
  "log-log" = list(transformation = function(x) log(log(x)), inverse = function(x) exp(exp(x)))
)
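A related detail: when no name matches and there is no default, switch() returns NULL invisibly. An unnamed final argument acts as the default, so it can be used to raise an error for unknown keys. A minimal sketch, with a hypothetical wrapper name and only a few of the cases:

```r
# Hypothetical wrapper around switch(); the last, unnamed argument is the default.
get_transformation_switch <- function(transform_fun) {
  switch(transform_fun,
    "log1p" = , # alias: falls through to the next non-missing value
    "log(x+1)" = list(transformation = log1p, inverse = expm1),
    "exp" = list(transformation = exp, inverse = log),
    stop(sprintf("Unknown transformation: '%s'", transform_fun), call. = FALSE)
  )
}

alias_a <- get_transformation_switch("log1p")
alias_b <- get_transformation_switch("log(x+1)")
```

Both aliases resolve to the same list, and any other string now produces an error rather than a silent NULL.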
