Skip to content
This repository has been archived by the owner on May 22, 2023. It is now read-only.
/ snabbkaffe Public archive

Collection of utilities for trace-based testing

License

Notifications You must be signed in to change notification settings

klarna/snabbkaffe

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Snabbkaffe

https://travis-ci.org/klarna/snabbkaffe.svg?branch=master

Introduction

This library provides functions for simple trace-based testing.

It works like this:

  1. Programmer manually instruments the code with trace points
  2. Testcases are split in two parts:
    • Run stage where the program runs and emits event trace
    • Check stage where trace is collected and validated against the spec(s)
  3. Trace points become ordinary log messages in the release build

This approach can be used in a component test involving an ensemble of interacting processes. It has a few nice properties:

  • Checks can be separated from the program execution
  • Checks are independent from each other and fully composable
  • Trace contains complete history of the process execution, thus making certain types of concurrency bugs, like livelocks, easier to detect

Usage

Instrumenting the code

Code instrumentation is done manually by inserting tp macros at the points of interest:

?tp(my_server_state_change, #{old_state => foo, next_state => bar})

The above line of code, when compiled in test mode, emits an event of kind my_server_state_change, with some additional data specified in the second argument. Any event has a kind, which is an atom identifying type of the event. The second argument must be a map, and it can contain any data.

In the release build this macro will become a hut log message with debug level. One can as well tune trace point log level:

?tp(notice, my_server_state_change, #{old_state => foo, next_state => bar})

Collecting trace

Trace collector process must be started before running the test. Full workflow looks like this:

ok = snabbkaffe:start_trace(),
Return = RunYourCode(),
Trace = snabbkaffe:collect_trace(Timeout),
snabbkaffe:stop(),
RunCheck1(Return, Trace),
RunCheck2(Return, Trace),
...
RunCheckN(Return, Trace).

Note that collect_trace function is destructive: it cleans event buffer of the collector process. Its argument Timeout specifies how long the call will wait after the last event is received. Setting this parameter to a non-zero value is useful when run stage starts some asynchronous tasks. By default Timeout is 0.

Checking traces

Trace is just a list of maps. Any standard function can work with it. Nonetheless, Snabbkaffe comes with a few useful macros for trace analysis.

Select events of certain kind:

?of_kind(foo, Trace)

Extract values of a certain event field(s) to a list:

[1, 2] = ?projection(foo, [#{foo => 1, quux => 1}, #{foo => 2}]),

[{1, 1}, {2, 2}] = ?projection([foo, bar], [#{foo => 1, bar => 1}, #{foo => 2, bar => 2}])

Check that events occur in certain order, and throw an exception otherwise:

?causality( #{?snk_kind := foo, foo := _A}
          , #{?snk_kind := bar, foo := _A}
          , Trace
          )

First argument of ?causality macro matches an event that is called the cause, and the second one matches so called effect. The above example checks events of kind bar occur only after corresponding events of kind foo.

This macro can be extended with a guard as well. Here guard checks that two events actually make up a pair:

?causality( #{?snk_kind := foo, foo := _A}
          , #{?snk_kind := bar, foo := _B}
          , _A + 1 =:= _B
          , Trace
          )

There is a version of the above macro that checks that all cause have an effect:

?strict_causality( #{?snk_kind := foo, foo := _A}
                 , #{?snk_kind := bar, foo := _A}
                 , Trace
                 )

Otherwise it works just like ?causality.

Both ?causality and ?strict_causality are actually based on a more powerful ?find_pairs macro that is invoked like this:

?find_pairs( Strict
           , MatchCause
           , MatchEffect
          [, Guard]
           , Trace
           )

where Strict is a boolean that determines whether events that matched as Effect may precede their cause.

It returns a list of tuples of type snabbkaffe:maybe_pair/0 that is defined like that:

-type maybe_pair() :: {pair, event(), event()}
                    | {singleton, event()}.

Gathering it all together

?check_trace is a convenience wrapper that starts the trace collector process, executes run stage, collects traces and then executes check stage:

?check_trace(begin
               RunStage
             end,
             fun(ReturnValue, Trace) ->
                CheckStage
             end).

There is an extended version of this macro that takes additional configuration as the first argument:

?check_trace(#{timeout => Timeout, bucket => Bucket},
             begin
               RunStage
             end,
             fun(ReturnValue, Trace) ->
                CheckStage
             end).

or:

?check_trace(Bucket,
             begin
               RunStage
             end,
             fun(ReturnValue, Trace) ->
                CheckStage
             end).

Bucket is a parameter used for benchmarking, more on that later.

Blocking execution of testcase until certain event is emitted

Even though philosophy of this library lies in separation of run and verify stages, sometimes the former needs to be somewhat aware of the events. For example, the testcase may need to wait for asynchronous initialization of some resource.

In this case ?block_until macro should be used. It allows the testcase to peek into the trace. Example usage:

?block_until(#{?snk_kind := Kind}, Timeout, BackInTime)

Note: it’s tempting to use this macro to check the result of some asynchronous action, like this:

{ok, Pid} = foo:async_init(),
{ok, Event} = ?block_until(#{?snk_kind := foo_init, pid := Pid}),
do_stuff(Pid)

However it’s not a good idea, because the event can be emitted before ?block_until has a chance to run. Use the following macro to avoid this race condition:

{{ok, Pid}, {ok, Event}} = ?wait_async_action( foo:async_init()
                                             , #{?snk_kind := foo_init, pid := Pid}
                                             ),
do_stuff(Pid)

Declarative fault injection

Any trace point can also be used to inject crashes into the system. This is extremely useful for testing fault-tolerance properties of the system and tuning the supervision trees. This is done using ?inject_crash macro, like in the below example:

FaultId = ?inject_crash( #{?snk_kind := some_kind, value := 42} % Pattern for matching trace points
                       , snabbkaffe_nemesis:always_crash()      % Fault scenario
                       , notmyday                               % Error reason
                       )

Running this command in the run stage of the testcase will ensure that every time the system tries to emit a trace event matching the pattern, the system will crash with a reason notmyday, and emit a trace event of kind snabbkaffe_crash.

First argument of the macro is a pattern that is used for matching trace events. Second argument is a “fault scenario”, that determines how often the system should fail. The following scenarios are implemented:

  • snabbkaffe_nemesis:always_crash() – always crash, emulates unrecoverable errors
  • snabbkaffe_nemesis:recover_after(N) – crash N times, and then proceed normally, emulates recoverable errors
  • snabbkaffe_nemesis:random_crash(P) – crash in a pseudo-random pattern with probability P, emulates an unreliable resource
  • snabbkaffe_nemesis:periodic_crash(Period, DutyCycle, Phase) – crash periodically, like this: [ok, ok, ok, crash, crash, ok, ok, ok, crash, crash|...]
    • Period is an integer that specifies period of the crash-recover cycle
    • DutyCycle is a float in [0..1] range, that specifies relative amount of time when the trace point is not crushing. (For example, 1 means the system doesn’t crash, and 0 means it always crashes)
    • Phase is a float in [0..2*math:pi()] range that allows to shift the phase of the periodic scenario

Finally, the third argument is a crash reason. It is optional, and defaults to the atom notmyday.

Please note that fault scenarios work independently for each trace point. E.g. if there are two trace point that both match the same fault injection pattern with recover_after scenario, they will recover at different times.

Later snabbkaffe_nemesis:fix_crash(FaultId) call can be used to delete the injected crash.

PropER integration

There are two useful macros for running snabbkaffe together with propER:

Config = [{proper, #{ numtests => 100
                    , timeout  => 5000
                    , max_size => 100
                    }}, ...],
?run_prop(Config, PROP)

Config parameter should be a proplist or a map, that (optionally) contains proper key. It can be used to pass different parameters to proper. Snabbkaffe will fall back to the default values (shown above) when parameter is absent.

PROP is a proper spec that looks something like this:

?FORALL({Ret, L}, {term(), list()},
   ?check_trace(
      begin
        RunStage
      end,
      fun(Return, Trace) ->
        CheckStage
      end))

There is another macro for the most common type of proper checks where property is a simple ?FORALL clause (like in the above example).

?forall_trace({Ret, L}, {term(), list()},
              begin
                RunStage
              end,
              fun(Return, Trace) ->
                CheckStage
              end)

It combines ?FORALL and ?run_prop.

Concuerror support

Snabbkaffe has (highly) experimental support for Concuerror. It requires recompiling this library with special options, so creating a special build profile is recommended. This can be done by adding the following code to the rebar.config:

{profiles,
 [ {concuerror,
    [ {overrides,
       [{add, [{erl_opts,
                [ {d, 'CONCUERROR'}
                , {d, 'HUT_IOFORMAT'}
                ]}]}]}
    ]}
 ]}.

Run concuerror with the following flags:

$(CONCUERROR) --treat_as_normal shutdown --treat_as_normal normal \
              -x code -x code_server -x error_handler \
              --pa $(BUILD_DIR)/concuerror+test/lib/snabbkaffe/ebin

P.S. Again, this feature is experimental, use at your own risk.

Benchmarking

Snabbkaffe automatically adds timestamps to the events, which makes it a very unscientific benchmarking library.

There is a family of functions for reporting metric data.

Report a scalar metric called my_metric1:

snabbkaffe:push_stat(my_metric1, 42),
snabbkaffe:push_stats(my_metric1, [42, 43, 42]),
%% Or even:
snabbkaffe:push_stats(my_metric1, [{pair, Event1, Event2}, {pair, Event3, Event4}, ...]),

Sometimes it is entertaining to see how metric value depends on the size of the input data:

snabbkaffe:push_stat(my_metric1, SizeOfData, 42),
snabbkaffe:push_stats(my_metric1, SizeOfData, [42, 43, 42])

Metrics can be reported by calling snabbkaffe:analyze_statistics/0 function that prints statistics for each reported metric, like in the above example:

-------------------------------
foo_bar statistics:
[{min,9.999999999999999e-6},
 {max,9.999999999999999e-6},
 {arithmetic_mean,1.000000000000002e-5},
 {geometric_mean,1.0000000000000123e-5},
 {harmonic_mean,9.999999999999997e-6},
 {median,9.999999999999999e-6},
 {variance,4.174340734454146e-40},
 {standard_deviation,2.0431203426264804e-20},
 {skewness,-0.9850375627355535},
 {kurtosis,-2.0199000000000003},
 {percentile,[{50,9.999999999999999e-6},
              {75,9.999999999999999e-6},
              {90,9.999999999999999e-6},
              {95,9.999999999999999e-6},
              {99,9.999999999999999e-6},
              {999,9.999999999999999e-6}]},
 {histogram,[{9.999999999999999e-6,100}]},
 {n,100}]

Statisitics of test
100.479087 ^                                                                     *
           |                                                               *
           |                                                        *
           |                                                  *
           |
           |                                            *
           |                                     *
           |
           |                               *
           |                         *
           |                   *
           |
           |            *
           |      *
         0 +--------------------------------------------------------------------->
           0                                                                  1100

         N    min         max        avg
       110 1.23984e+0 1.09774e+1 5.97581e+0
       209 1.10121e+1 2.08884e+1 1.60011e+1
       308 2.13004e+1 3.09071e+1 2.60224e+1
       407 3.10212e+1 4.09074e+1 3.59904e+1
       506 4.10095e+1 5.09904e+1 4.60456e+1
       605 5.11370e+1 6.08557e+1 5.60354e+1
       704 6.10493e+1 7.09071e+1 6.59642e+1
       803 7.11237e+1 8.07733e+1 7.59588e+1
       902 8.10944e+1 9.09766e+1 8.60179e+1
      1001 9.10459e+1 9.99404e+1 9.54548e+1
      1100 1.00004e+2 1.00939e+2 1.00479e+2

Note: ?run_prop does this automatically.