This is a repository for Factorized Machine Self-Confidence (FaMSeC). FaMSeC is a "self-assessment" algorithmic assurance (see this paper), and is meant to affect how people interact with an autonomous system. The code here produced data that was used in this experiment, and also produced data for this paper. It is based around the idea of performing meta-analysis of a decision-making agent. This repository implements xQ and xO; other metrics have not yet been developed.

The overall goal of the code here is to: 1) generate "road-network" MDPs, and simulate the performance of different solvers on them; 2) calculate two of the FaMSeC metrics---"Solver Quality" (xQ), and "Outcome Assessment" (xO); 3) provide scripts for plotting different data and investigating the different properties of these metrics. This is an example road-network:
Note: This repository is basically taken from the `dev` branch of the `self_confidence` repository. I have removed unused files, added documentation, and made a few other changes like that. I did this because this code stands alone from the rest of the `self_confidence` repository. Doing this also makes a clean separation between the work here, including the MTurk experiment involving both xQ and xO, and the work done in the other repository, which had a narrower development focus.
The main code uses Julia `v0.6.3` with the following packages:

- `POMDPs.jl` (`v0.6.9`) and `POMDPToolbox` (`v0.2.8`) --- MDP and POMDP functionality
- `MetaGraphs.jl` (`v0.4.1`) --- for adding metadata to `LightGraphs.jl` graphs
- `JSON.jl` (`v0.17.2`) --- for reading/exporting `.json` files
- `PyPlot.jl` (`v2.6.3`) --- for creating plots with matplotlib (this requires matplotlib to be installed, but PyPlot takes care of this using an anaconda environment)
- `JLD` (`v0.8.3`)/`FileIO` (`v0.9.1`) --- for binary file storage
- `ProgressMeter` (`v0.5.6`) --- for showing progress bars when running
- `MicroLogging` (`v0.2.0`) --- nice logging utility
- `Distributions` (`v0.15.0`) --- for working with random variables and distributions
- `StatsBase` (`v0.23.1`) --- basic statistics stuff
- `TikzGraphs` (`v0.6.0`)/`TikzPictures` (`v1.2.0`) --- for plotting graphs using Tikz
- `DataStructures` (`v0.8.4`) --- adds data structures (linked lists, queues, etc.)
- `DataFrames` (`v0.11.7`)/`CSV` (`v0.2.5`) --- dealing with reading and writing data in tables
The code for selecting the data set to be used in the experiment uses Julia `v1.0.0` (yes, I know this is very unfortunate, but that is how it is, and making either version compatible with the other isn't trivial, so I'm leaving it as-is):

- `LatinHypercubeSampling` (`v1.2.0`) --- for sampling the xQ/xO space
- `JSON` (`v0.20.0`) --- reading/writing `.json` files
- `PyPlot` (`v2.7.0`) --- plotting
- `StatsBase` (`v0.27.0`) --- basic statistics stuff
- `DataFrames` (`v0.17.0`)/`CSV` (`v0.4.3`) --- dealing with reading and writing data in tables
- `TexTables` (`v0.1.0`) --- making a nice summary table for publishing
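
The package list above mentions Latin hypercube sampling over the xQ/xO space. For readers unfamiliar with the idea, the following is a minimal sketch in Julia 1.x using only the standard library. It does not call `LatinHypercubeSampling` and is not the repository's selection code; the function name `latin_hypercube` and its arguments are illustrative assumptions.

```julia
using Random

# Draw n points in the unit hypercube [0, 1]^d such that, along every axis,
# each of the n equal-width bins contains exactly one point.
# NOTE: illustrative stand-in, not the LatinHypercubeSampling package API.
function latin_hypercube(rng::AbstractRNG, n::Integer, d::Integer)
    pts = Matrix{Float64}(undef, n, d)
    for j in 1:d
        bins = randperm(rng, n)   # bin index assigned to each point on axis j
        for i in 1:n
            # place point i uniformly inside its bin ((bins[i]-1)/n, bins[i]/n]
            pts[i, j] = (bins[i] - rand(rng)) / n
        end
    end
    return pts
end

rng = MersenneTwister(42)
samples = latin_hypercube(rng, 5, 2)   # 5 points over a 2-D space, e.g. (xQ, xO)
```

Each row of `samples` is one point. Because every axis is stratified, the points cover the space more evenly than the same number of independent uniform draws would be expected to.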
When creating data "from scratch" this is the process that I followed. Read the instructions carefully first; the end of step 1 is critical:

1. Run `make_nets_and_data.jl`, after adding the `experiment_name` to the `experiment_utilities.jl` file and creating a corresponding experiment parameters file in the `experiment_params` folder, following the format of other files in that folder. I typically did this on Google Cloud Platform using something like a 64 processor machine and a few hundred workers in julia (via `julia -p <number_of_workers>`). This process results in `.jld` and `.csv` files in the `logs` directory. The `.jld` files contain the raw network definitions and simulations; the `.csv` files contain summary data used for creating the solver quality model.
   - NOTE: After running the code you need to shut down the REPL before starting it up again for step 2. This is due to an issue with the `.jld` library; it is unfortunate, but it exists. Failing to do this can ruin the data that you just made (possibly ruining hours of processing).

2. Run `make_experiment_data.jl`. This file can do three things:
   - Create the `.svg` figures for the experiment---this is done in the `imgs` folder by default.
   - Calculate xQ for each network; this will produce the SQ model if necessary (which will be necessary if you are starting from scratch)---this data is written into the relevant, existing, `.jld` file.
   - Export data to a `.json` file that can be used in the MTurk experiment---these files are saved in the `logs` folder by default.

   Any combination of these three tasks can be run at a time. All of them need to be run to have the complete set of data for an experiment, but you can skip one step or another if you are already happy with the data that has been produced (this came in handy when I was debugging the `.json` export).

3. Run `svg_resize.jl` to "square-up" the `.svg` figures. I typically do this like so: from the root project directory run `julia svg_resize.jl`.

4. Using Julia `v1.0.0`, run `make_experiment_dataset.jl` in the `experiment_analysis` folder---this helps select a small subset from a (fairly) large set of generated networks. It operates on the `.json` files created in step 2. This code produces two plots, and creates files that can be saved into the PsiTurk experiment directory.

   There is one pretty unfortunate part about this code: it runs on Julia `v1` instead of Julia `v0.6` like the previous code. This cannot be helped because of the necessary libraries. If it were simple I'd convert the other code from `v0.6`, but that isn't a priority right now.

   Having said that, you could select the dataset "by hand", but this code provides a principled way of doing that.
Note: Other `make_*` files are, or were at some point, self-sufficient but were wrapped into the above two files over time. The `plot_*` files are run to make specialty plots for different papers.
We used `X3` and `X4` in code because that was the original self-confidence notation. Later we changed the notation to `xQ` and `xO`, which is easier to follow, but we haven't replaced the old notation in the code yet; it hasn't been a priority.
Following are some high-level descriptions of the different files. Small `README.md` files are found within each directory to indicate their use.
- `calc_xq.jl`---calculates the solver quality `xQ` value, used by `make_experiment_data.jl`
- `experiment_utilities.jl`---some basic utilities used by `make_nets_and_data.jl`
- `hellinger_test.jl`---file for investigating the properties and behavior of the Hellinger metric
- `juliarc.jl`---file that adds the current path to the julia environment so local modules can be loaded with the `using` command
- `LICENSE`---MIT license file
- `logistic_tests`---file for investigating the properties of the general logistic function
- `make_experiment_data.jl`---After all networks have been created and simulations run, this code runs to make the data for the MTurk experiment. This means making the figures, calculating xQ (which has to be done after everything because we need to have the surrogate model), filtering out networks that are too dense or where the truck is too close to the exit, and simulating success/failures of deliveries on networks. This data is used for the MTurk experiment.
- `make_nets_and_data.jl`---This file is the main file to produce the simulation data. Here the networks are created and many simulations are run. I ran this file on Google Cloud Platform so I could utilize many parallel processes. Otherwise this might take a really long time.
- `make_roadnet_figs.jl`---Called by `make_experiment_data.jl` to produce the `.svg` figures of the different road networks
- `make_SQ_model.jl`---
- `make_table_corr_plot.jl`---file used to make a correlation plot of log files (found in the `logs` folder). This is to help identify which variables are interacting, to try and decide which variables to include in the surrogate model
- `network_library.jl`---file used to create different kinds of networks. The "original" road-network and a "medium" network are here, along with code to make the "random" networks that were used in the final experiments and some code to visualize a network. These aren't just "standard" networks; they specify things like exit nodes and reward structures that are used in the MDP.
- `plot_mcts_depth.jl`---make box plots based on MCTS depth of the different solvers. This made the figures in the numerical simulations report that is on arXiv
- `plot_root_comparison.jl`---Used to plot different fractional exponents when thinking about the $\alpha$ parameter of xQ
- `plot_rwd_dists.jl`---file used in `make_experiment_data.jl` to plot the `surprisesuccess` and `surprisefailure` figures.
- `prepend_preamble.tex`---this file is used by the `TikzGraphs` library in order to add special characters when making the road-network images. This enables the truck and motorcycle icons to be displayed
- `Roadnet_MDP.jl`, `roadnet_pursuer_generator_MDP`---definitions for the "roadnet" MDP, that is, an autonomous vehicle trying to reach a destination on a road-network while avoiding a pursuer.
- `road_net_visualize.jl`---make an animation of a simulated run.
- `self_confidence.jl`---code for calculating both `xO` (`X4`) and `xQ` (`X3`)
- `send_mail.jl`---used for sending email when `make_nets_and_data.jl` is done running on Google Cloud Platform.
- `SQ_investigation.jl`---I don't think this file is used, but I haven't taken the time to check.
- `svg_resize.jl`---processes images in the `imgs` folder to make their aspect ratio square, and saves the figures in the `imgs/squared` folder
- `test_mxnet.jl`---script to make sure mxnet is working alright
- `test_pomdp_parallel.jl`---script to test parallel pomdps
- `utilities.jl`---some useful functions that I used in places. I don't think all of them are still used, though, and some may not be finished. I think I abandoned them in the middle for a different approach (i.e. lhs).
- `visualize_medium_net.jl`---make plots of original and medium roadnets
- `X3_empirical.jl`---code for original development of X3
- `X3_test.jl`---code to produce more numerical simulations of X3 in action. The resulting figure was used in several papers, and shows two GPs that cross over in different locations. The value of X3 is compared at different locations and for different "global reward ranges".
- `make_experiment_dataset.jl`---make a plot of the total data set available, create a subset of the data to be used for the PsiTurk experiment
- `consistent_agent_utils.jl`---helper functions for `make_experiment_dataset.jl`
- `spread_sample.jl`---code for sub-sampling the full dataset; this is used by `make_experiment_dataset.jl`
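
Two of the investigation scripts listed above concern standard mathematical objects: the Hellinger metric and the general logistic function. For reference, here is a small sketch of their textbook definitions. It is not taken from `hellinger_test.jl` or `logistic_tests`; the function names, the discrete-distribution form of the Hellinger distance, and the parameter names and defaults of the logistic curve are assumptions made for illustration.

```julia
# Hellinger distance between two discrete probability distributions p and q,
# given as probability vectors over the same outcomes. The value lies in [0, 1]:
# 0 for identical distributions, 1 for distributions with disjoint support.
function hellinger(p::AbstractVector{<:Real}, q::AbstractVector{<:Real})
    length(p) == length(q) || throw(ArgumentError("p and q must have equal length"))
    return sqrt(sum((sqrt.(p) .- sqrt.(q)) .^ 2)) / sqrt(2)
end

# Generalized logistic (Richards) curve in its common textbook parameterization:
# A is the lower asymptote, K sets the upper asymptote (when C == 1),
# B is the growth rate, and nu, Q, C shape where and how fast the curve rises.
# With the defaults below it reduces to the standard logistic 1 / (1 + exp(-x)).
function generalized_logistic(x::Real; A=0.0, K=1.0, B=1.0, nu=1.0, Q=1.0, C=1.0)
    return A + (K - A) / (C + Q * exp(-B * x))^(1 / nu)
end

println(hellinger([0.5, 0.5], [0.5, 0.5]))   # prints 0.0
println(hellinger([1.0, 0.0], [0.0, 1.0]))   # prints 1.0
println(generalized_logistic(0.0))           # prints 0.5
```

Both functions avoid version-specific syntax, so they should run unchanged on Julia `v0.6` and `v1.0`.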