Continuous Certification of Non-Functional Properties Across System Changes

Existing certification schemes implement continuous verification techniques aimed at proving non-functional (e.g., security) properties of software systems over time. These schemes provide different re-certification techniques for managing the certificate life cycle, though their strong assumptions make them ineffective against modern service-based distributed systems. Re-certification techniques are in fact built on static system models, which do not properly represent the system evolution, and on static detection of system changes, which results in inaccurate planning of re-certification activities. In this paper, we propose a continuous certification scheme that departs from static certificate life cycle management and provides a dynamic approach built on the modeling of the system behavior, reducing the amount of unnecessary re-certification. The quality of the proposed scheme is experimentally evaluated using an ad hoc dataset built on publicly available datasets.

This repository contains the source code, input dataset, and detailed results of our experimental evaluation.

1. Overview

The code is written in Python 3 and tested in a macOS environment (virtualenv) with Python 3.10; dependencies are listed in requirements.txt.

The aim of our experimental evaluation is to compare our scheme with a scheme representing the state of the art, covering all the scenarios described in our paper. For this reason, our experimental data are based on the dataset available at https://doi.org/10.13012/B2IDB-6738796_V1. It measures the response time of a set of microservices along a given execution path in normal and anomalous conditions. Specifically, the creators of the dataset executed three distributed systems, well known in the literature, with and without injecting anomalies (for details, see the corresponding paper: https://www.usenix.org/conference/osdi20/presentation/qiu).
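For readers unfamiliar with the original dataset, the snippet below shows one way to inspect a response-time trace with pandas. The file path and the column name used here are purely hypothetical placeholders; the actual layout is described by the dataset creators at the DOI above.

```python
# Illustrative only: inspecting one response-time trace with pandas.
# The path and the column name "latency_ms" are hypothetical placeholders;
# consult the dataset documentation for the actual file layout.
import pandas as pd

trace = pd.read_csv("Input/system_a/normal/service_1.csv")
print(trace.describe())                                   # overall distribution
print(trace["latency_ms"].quantile([0.50, 0.95, 0.99]))   # tail latencies
```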

We defined our experimental settings by extracting normal and anomalous data from the above dataset, generating a dataset that includes environmental changes and code changes with and without impact on the behavior. Each data point of the dataset is also annotated with additional information (e.g., presence of critical components affected by the change). We then applied our scheme and the state-of-the-art scheme to the generated dataset.

Each row of the table below represents an experimental setting driving the generation of our datasets. The process of dataset generation and scheme application was repeated 10 times for each distributed system and experimental setting. The original paper and Supp1-Exp_Settings.pdf contain details on the meaning of each column.

| Name | $\Delta_b$ | $\Delta_c$ | $\Delta_c$ with cascading | non critical | minor | $n(comp)_b$ | $n(comp)_{\text{min}}$ | $n(comp)_{\text{maj}}$ |
|------|------------|------------|---------------------------|--------------|-------|-------------|------------------------|------------------------|
| P1.1 | $0.3\overline{3}$ | $0.3\overline{3}$ | $0.3\overline{3}$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ |
| P1.2 | $0.3\overline{3}$ | $0.3\overline{3}$ | $0.3\overline{3}$ | $0.5$ | $0.5$ | $0.5$ | $0.5$ | $0.5$ |
| P1.3 | $0.3\overline{3}$ | $0.3\overline{3}$ | $0.3\overline{3}$ | $0.75$ | $0.75$ | $0.75$ | $0.75$ | $0.75$ |
| P2.1 | $0.5$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ |
| P2.2 | $0.5$ | $0.25$ | $0.25$ | $0.5$ | $0.5$ | $0.5$ | $0.5$ | $0.5$ |
| P2.3 | $0.5$ | $0.25$ | $0.25$ | $0.75$ | $0.75$ | $0.75$ | $0.75$ | $0.75$ |
| P3.1 | $0.25$ | $0.5$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ |
| P3.2 | $0.25$ | $0.5$ | $0.25$ | $0.5$ | $0.5$ | $0.5$ | $0.5$ | $0.5$ |
| P3.3 | $0.25$ | $0.5$ | $0.25$ | $0.75$ | $0.75$ | $0.75$ | $0.75$ | $0.75$ |
| P4.1 | $0.25$ | $0.25$ | $0.5$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ | $0.25$ |
| P4.2 | $0.25$ | $0.25$ | $0.5$ | $0.5$ | $0.5$ | $0.5$ | $0.5$ | $0.5$ |
| P4.3 | $0.25$ | $0.25$ | $0.5$ | $0.75$ | $0.75$ | $0.75$ | $0.75$ | $0.75$ |
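As a purely illustrative sketch of how the probabilities in the table can drive dataset generation, the snippet below samples one change type per data point using the $\Delta$ values of setting P2.1. The labels and the interpretation of the columns as sampling weights are assumptions made for illustration; the actual generation logic lives in Code/dataset_generator.py and is described in Supp1-Exp_Settings.pdf.

```python
# Illustrative sketch only: sampling a change type per data point from the
# probabilities of setting P2.1 (delta_b = 0.5, delta_c = 0.25,
# delta_c with cascading = 0.25). Labels are hypothetical; the repository's
# actual generation code is in Code/dataset_generator.py.
import random

CHANGE_TYPES = ["environmental", "code", "code_cascading"]  # hypothetical labels
PROBS = [0.5, 0.25, 0.25]                                   # P2.1 from the table

def sample_change(rng: random.Random) -> str:
    return rng.choices(CHANGE_TYPES, weights=PROBS, k=1)[0]

rng = random.Random(42)
changes = [sample_change(rng) for _ in range(1000)]
print({c: changes.count(c) for c in CHANGE_TYPES})
```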

2. Organization

The repository is organized in the following directories:

  • Code: contains the Python code to run the experiments.
  • Input: contains the raw data (response times) of the microservices, taken from https://doi.org/10.13012/B2IDB-6738796_V1.
  • Output: contains the results of our experiments, including generated datasets, with results aggregated at different levels.
  • input.json: contains the actual input to run the code and reproduce our results.

2.1. Details: Code Organization

The code consists of the following files:

  • Code/base.py: contains utility classes and functions used in the rest of the code, including the code to train the system model based on isolation forest (see the illustrative sketch at the end of this subsection).
  • Code/const.py: contains constants.
  • Code/dataset_generator.py: contains the code to generate the experimental dataset starting from data in the provided input directory.
  • Code/entrypoint.py: contains the main entrypoint.
  • Code/exp_quality_scheme.py: contains the code running the core function that applies our scheme and the state of the art to a set of experimental settings and evaluates the results.
  • Code/exp_quality_scheme_support.py: contains support code, including the code to export data.
  • Code/situation.py: contains the code selecting the scenario that applies to each detected change.

Each file has its own set of tests executable with pytest.
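As a rough illustration of the kind of system model trained in Code/base.py, the sketch below fits an isolation forest on response times collected under normal conditions and scores new observations. It assumes scikit-learn and uses synthetic data; it is not the repository's actual implementation.

```python
# Minimal sketch of an isolation-forest system model over response times,
# assuming scikit-learn. Data and parameters are illustrative; the
# repository's actual model is implemented in Code/base.py.
import numpy as np
from sklearn.ensemble import IsolationForest

# toy training data: response times (ms) collected under normal conditions
normal_rt = np.random.normal(loc=120.0, scale=10.0, size=(500, 1))

model = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
model.fit(normal_rt)

# score new observations: predict() returns +1 (normal) or -1 (anomalous)
new_rt = np.array([[118.0], [310.0]])
print(model.predict(new_rt))            # e.g. [ 1 -1 ]
print(model.decision_function(new_rt))  # higher score = more normal
```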

2.2. Details: Input

The code requires two inputs: initial data from https://doi.org/10.13012/B2IDB-6738796_V1, and experimental settings as reported in the table above.

Initial data: the code works on the data as is, as long as each distributed system has its own directory. Experimental settings: these must be provided as a JSON file containing the different probabilities as well as the path to the input data. This repository contains the input data we used during our experiments; the paths therefore need to be adjusted to your local setup.
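As a purely illustrative sketch of what such a settings file could look like, the snippet below writes a JSON file with hypothetical field names; the authoritative schema is the input.json shipped with this repository.

```python
# Illustrative sketch of writing an experimental-settings file.
# All field names below are hypothetical; refer to the input.json shipped
# with the repository for the actual schema.
import json

settings = {
    "input_directory": "/path/to/Input",  # adjust to your local checkout
    "settings": [
        {"name": "P2.1", "delta_b": 0.5, "delta_c": 0.25,
         "delta_c_cascading": 0.25, "non_critical": 0.25, "minor": 0.25},
    ],
}

with open("my-input.json", "w") as fh:
    json.dump(settings, fh, indent=2)
```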

2.3. Details: Output

The code produces aggregated and detailed results in two formats: Excel and CSV. Both formats are always generated; there is no option to choose between them.

During execution, the code requires the base directory where output data should be placed. In the hope that they may be useful, we include all our data: our results as well as the generated datasets.

In particular, we generate two types of datasets:

More in detail, the generated output is contained in the following directories.

Individual file names are mostly self-explanatory: files whose name contains scheme hold data evaluating the two certification schemes, while those whose name contains model hold data evaluating the system model.

The data in our paper are mostly derived from Output/Aggregated_Datasets/by_config_scheme_stripped.xlsx for the certification schemes and Output/Aggregated_Datasets/by_dataset_model_stripped.xlsx for the system model.
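The aggregated spreadsheets can be inspected, for instance, with pandas; the snippet below assumes pandas and an Excel engine such as openpyxl are installed and is not part of the repository code.

```python
# Minimal sketch of loading the aggregated results for inspection,
# assuming pandas (with openpyxl) is available.
import pandas as pd

scheme = pd.read_excel("Output/Aggregated_Datasets/by_config_scheme_stripped.xlsx")
model = pd.read_excel("Output/Aggregated_Datasets/by_dataset_model_stripped.xlsx")
print(scheme.head())
print(model.head())
```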

Note: our experiments were executed with the option --include-strip-down True. This means that for each output file there is a stripped-down version reporting only the columns with averaged data. As a result, some files are essentially empty, because this filter removes all of their data. We nevertheless include these files, since our experiments were run with this option.

3. Example Execution

To reproduce our results, you need to:

  • modify input.json according to the path where you placed directory Input
  • run the following command, replacing path-to-input.json and path-to-output with the desired paths.
python entrypoint.py \
	--config-file-name path-to-input.json \
	--output-directory path-to-output \
	--include-strip-down True

Execution is parallelized; the experiments should complete in a few minutes.

4. Appendix of the Paper

We provide two supplements to our paper.

  • Supp1-Exp_Settings.pdf contains a detailed description of the experimental process extending its discussion in the paper.
  • Supp2-Walkthrough.pdf contains a detailed walkthrough of the certification scheme presented in the paper.
