Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent behavior between config:get* accessors during cache failures #4890

Open
chewbranca opened this issue Dec 2, 2023 · 1 comment · May be fixed by #4894
Open

Inconsistent behavior between config:get* accessors during cache failures #4890

chewbranca opened this issue Dec 2, 2023 · 1 comment · May be fixed by #4894

Comments

@chewbranca
Copy link
Contributor

Description

There are two subsets of config:get* accessors: 1) baseline get in functions config:get/{1,2,3}; 2) three wrapper functions around the baseline get for type coercion: get_integer/3, get_boolean/3, and get_float/3.

The bug here is that the baseline get accessor fails with an ets exception in the event the ets table does not exist, crashing the caller and not providing any config value, whereas using the secondary category of type coercion wrappers catches the exception out of the baseline get call and then returns the default value, rather than allowing the exception to bubble up.

The implication of this is that live config lookups can suddenly revert to default values in the event the proper set of values stored in the ets table becomes unavailable. This is obviously not good.

In practice, config is pretty reliable at this point, and I only stumbled upon it while debugging eunit tests where we're constantly spinning up and tearing down subsets of CouchDB applications, resulting in numerous instances of config lookup crashes during teardown when there were still live requests but the ets table did not exist. So I don't think this is commonly occurring today, but the type coercion functions landed a decade ago [1], and I've personally run exit(whereis(config)) on a number of occasions, so I know this happened a few times at least.

This issue is further complicated by the fact that the try ... catch error:badarg ... end logic in the typed accessors is at least triple overloaded. The core pattern for all three functions is the following, with s/integer/float/g and s/integer/boolean/g for get_integer/3 and get_boolean/3, respectively:

get_integer(Section, Key, Default) when is_integer(Default) ->
    try
        to_integer(get(Section, Key, Default))
    catch
        error:badarg ->
            Default
    end.

The three scenarios I'm aware of that can trigger the catch statement resulting in the Default value flowing through are:

  1. it's not the appropriate data type, so conversion functions like list_to_integer throw {error, badarg}.
  2. baseline get/{1,2,3} throws {error, badarg} when the ets table doesn't exist
  3. get/3 throws {error, badarg} itself when the ets:lookup succeeds with zero results and default is a bad data type [2]

Steps to Reproduce

This can be done independently of CouchDB with the raw config.beam file:

chewbranca@Russells-MacBook-Pro-2 db % erl -pa src/config/ebin
Erlang/OTP 25 [erts-13.2.2.3] [source] [64-bit] [smp:10:10] [ds:10:10:10] [async-threads:1]

Eshell V13.2.2.3  (abort with ^G)
1> config:get_boolean("foo", "bar", false).
false
2> config:get("foo", "bar", "true").
** exception error: bad argument
     in function  ets:lookup/2
        called as ets:lookup(config,{"foo","bar"})
        *** argument 1: the table identifier does not refer to an existing ETS table
     in call from config:get/3 (src/config.erl, line 161)
3>

Expected Behaviour

I don't think there's ever a scenario where we want the user provided settings in default.ini and local.ini to be ignored because the cache disappeared resulting in reverting to unexpected default values. I think we should always crash when the ets table doesn't exist for normal database operations.

I do however think we should wire up the eunit test suite such that we can toggle off use of the ini provide config values and force use of all default values in the code base. We could achieve this by way of something like meck'ing out config:get/3 as fun(_Key, _Section, Default) -> Default end.

On a positive note, it's fairly easy to induce that behavior by way of moving the call to get/3 in each of the three typed accessors outside of the try-catch block such that we allow lookup exceptions but catch data type exceptions.

The following diff:

diff --git a/src/config/src/config.erl b/src/config/src/config.erl
index 3ece5326b..30c666bb2 100644
--- a/src/config/src/config.erl
+++ b/src/config/src/config.erl
@@ -68,8 +68,9 @@ all() ->
     lists:sort(gen_server:call(?MODULE, all, infinity)).
 
 get_integer(Section, Key, Default) when is_integer(Default) ->
+    Val = get(Section, Key, Default),
     try
-        to_integer(get(Section, Key, Default))
+        to_integer(Val)
     catch
         error:badarg ->
             Default
@@ -91,8 +92,9 @@ to_integer(Bin) when is_binary(Bin) ->
     list_to_integer(binary_to_list(Bin)).
 
 get_float(Section, Key, Default) when is_float(Default) ->
+    Val = get(Section, Key, Default),
     try
-        to_float(get(Section, Key, Default))
+        to_float(Val)
     catch
         error:badarg ->
             Default
@@ -116,8 +118,9 @@ to_float(Bin) when is_binary(Bin) ->
     list_to_float(binary_to_list(Bin)).
 
 get_boolean(Section, Key, Default) when is_boolean(Default) ->
+    Val = get(Section, Key, Default),
     try
-        to_boolean(get(Section, Key, Default))
+        to_boolean(Val)
     catch
         error:badarg ->
             Default

gives the desired behavior:

chewbranca@Russells-MBP-2 couchdb % erl -pa src/config/ebin      
Erlang/OTP 25 [erts-13.2.2.3] [source] [64-bit] [smp:10:10] [ds:10:10:10] [async-threads:1]

Eshell V13.2.2.3  (abort with ^G)
1> config:get_boolean("foo", "bar", false).
** exception error: bad argument
     in function  ets:lookup/2
        called as ets:lookup(config,{"foo","bar"})
        *** argument 1: the table identifier does not refer to an existing ETS table
     in call from config:get/3 (src/config.erl, line 163)
     in call from config:get_boolean/3 (src/config.erl, line 121)
2> config:get("foo", "bar", "true").       
** exception error: bad argument
     in function  ets:lookup/2
        called as ets:lookup(config,{"foo","bar"})
        *** argument 1: the table identifier does not refer to an existing ETS table
     in call from config:get/3 (src/config.erl, line 163)
3> %                                                                                                                                                                                                                                                 

If folks think that's a reasonable approach I can move that over to a PR.

Additional Context

[1] apache/couchdb-config@6859d11
[2] https://github.com/apache/couchdb-config/blob/master/src/config.erl#L167

@rnewson
Copy link
Member

rnewson commented Dec 2, 2023

That's a neat fix for a real problem. PR please.

@chewbranca chewbranca linked a pull request Dec 4, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants