Cross-validation of Keras and scikit-learn models with the Faculty platform

facultyai/faculty-xval

faculty-xval

Cross validation of machine-learning models on the Faculty platform. At present, the package mainly provides a way to cross validate models in parallel using Faculty jobs. This functionality is exposed through the class:

faculty_xval.validation.JobsCrossValidator

Additional information can be found in the example notebooks provided; see the section Try out the examples below.

The package supports Keras and scikit-learn models. Custom models can be written to be compatible with faculty-xval, but no guarantee is given that the package handles them correctly, in particular because of issues concerning the randomisation of weights.
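To make the idea of parallel cross validation concrete, here is a minimal sketch that uses only the Python standard library. It does not use faculty-xval or its API: the names k_fold_indices, MajorityClassifier, evaluate_fold, and cross_validate are hypothetical, and a thread pool merely stands in for Faculty jobs. Each fold builds a fresh model instance, which is the kind of clean re-initialisation that the caveat about the randomisation of weights refers to.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def k_fold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for contiguous, unshuffled folds."""
    fold_sizes = [n_samples // n_splits] * n_splits
    for i in range(n_samples % n_splits):
        fold_sizes[i] += 1  # spread the remainder over the first folds
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


class MajorityClassifier:
    """Hypothetical toy model: predicts the most common training label."""

    def fit(self, features, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, features):
        return [self.label_ for _ in features]


def evaluate_fold(make_model, features, labels, train, test):
    """Train a fresh model on one split and return its test accuracy."""
    model = make_model()  # new instance per fold, so no state leaks across folds
    model.fit([features[i] for i in train], [labels[i] for i in train])
    predictions = model.predict([features[i] for i in test])
    correct = sum(p == labels[i] for p, i in zip(predictions, test))
    return correct / len(test)


def cross_validate(make_model, features, labels, n_splits):
    """Evaluate every fold in its own worker; return accuracies in fold order."""
    splits = list(k_fold_indices(len(labels), n_splits))
    # Assumption: a thread pool stands in for independent Faculty jobs.
    with ThreadPoolExecutor(max_workers=n_splits) as pool:
        futures = [
            pool.submit(evaluate_fold, make_model, features, labels, train, test)
            for train, test in splits
        ]
        return [future.result() for future in futures]


features = [[0], [1], [2], [3], [4], [5]]
labels = [1, 1, 1, 1, 1, 0]
scores = cross_validate(MajorityClassifier, features, labels, n_splits=3)
print(scores)
```

Passing the class itself (a factory) rather than a fitted instance is the design choice that matters here: every fold starts from an untrained model, so one fold's result cannot depend on another's training.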

Two sets of installation instructions are provided below:

  • If you would like to simply use faculty-xval, please follow the User installation instructions.
  • If you would like to develop faculty-xval further, please follow the Developer installation instructions.

User installation instructions

Create an environment

In your project on the Faculty platform, create an environment named faculty_xval. In the PYTHON section, select Python 3 and pip from the dropdown menus. Then, type faculty-xval in the text box, and click on the ADD button.

The environment installs the package faculty-xval, and should be applied to every server that you create; this includes both interactive servers and job servers, as explained next.

Create a job definition

Create a new job definition named cross_validation. In the COMMAND section, paste the following:

faculty_xval_jobs_xval $in_paths

Then, add a PARAMETER with the name in_paths, and ensure that the Make field mandatory box is checked.

Finally, under SERVER SETTINGS, add faculty_xval to the ENVIRONMENTS section.

For cross-validation jobs that are computationally intensive, we recommend using dedicated servers as opposed to running on shared infrastructure. To achieve this, click on Large and GPU servers under SERVER RESOURCES, and select an appropriate server type from the dropdown menu.

Remember to click SAVE when you are finished.

Developer installation instructions

Select a username

Before beginning the installation process, pick an appropriate username, such as foo. This does not necessarily need to match your Faculty platform username. In the following instructions, your selected username will be referred to as <USER_NAME>.

Clone the repository

Create the folder /project/<USER_NAME>. Then, run the commands:

cd /project/<USER_NAME>
git clone https://github.com/facultyai/faculty-xval.git

Create an environment

Next, create an environment in your project named faculty_xval_<USER_NAME>.

In this environment, under SCRIPTS, paste the following code into the BASH section, remembering to change the USER_NAME definition on the second line to your selected <USER_NAME>:

# Remember to change username!
USER_NAME=<USER_NAME>

# Install faculty-xval from local repository.
pip install /project/$USER_NAME/faculty-xval/

# Turn USER_NAME into an environment variable.
echo "export USER_NAME=$USER_NAME" > /etc/faculty_environment.d/app.sh
if [[ -d /etc/service/jupyter ]] ; then
  sudo sv restart jupyter
fi

This environment should be applied to every server that you create; this includes both 'normal' interactive servers and job servers, as explained next.

Create a job definition

Next, create a new job definition named cross_validation_<USER_NAME>. In the COMMAND section, paste the following:

faculty_xval_jobs_xval $in_paths

Then, add a PARAMETER with the name in_paths, and ensure that the Make field mandatory box is checked.

Finally, under SERVER SETTINGS, add faculty_xval_<USER_NAME> to the ENVIRONMENTS section.

For cross-validation jobs that are computationally intensive, we recommend using dedicated servers as opposed to running in the cluster. To achieve this, click on Large and GPU servers under SERVER RESOURCES, and select an appropriate server type from the dropdown menu.

Remember to click SAVE when you are finished.
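
As a small illustration of the naming convention in the developer instructions above, the sketch below reads the USER_NAME environment variable exported by the environment script and derives the repository path, environment name, and job definition name from it. The helper developer_names is hypothetical and is not part of faculty-xval; it only restates the names given in this section.

```python
import os


def developer_names(environ=os.environ):
    """Derive the per-developer names used in the developer instructions."""
    user_name = environ.get("USER_NAME")
    if not user_name:
        raise RuntimeError("USER_NAME is not set; apply the environment first.")
    return {
        "repository": f"/project/{user_name}/faculty-xval",
        "environment": f"faculty_xval_{user_name}",
        "job_definition": f"cross_validation_{user_name}",
    }


# An explicit mapping is passed here so the sketch runs on any machine;
# on a server with the environment applied, call developer_names() instead.
names = developer_names({"USER_NAME": "foo"})
print(names["job_definition"])
```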

Try out the examples

Please clone this repository. Examples of cross validation with faculty-xval for the different types of model are provided in the directories examples/keras and examples/sklearn. In each directory, the usage instructions are divided into two notebooks:

  • jobs_cross_validator_run.ipynb loads the data, instantiates the model, and starts a Faculty job that carries out the cross validation.
  • jobs_cross_validator_analyse.ipynb gathers the results from the cross validation, reloads the target data, and calculates the model accuracy over multiple train-test splits.

Note that the example notebooks must be run in the order just defined.
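
The analysis step can be pictured with the following standard-library sketch, which computes the accuracy of each train-test split and then summarises the scores across splits. The layout of split_results (a list of prediction and target pairs) is an assumption made for illustration and is not the format written by faculty-xval; the notebooks remain the reference for the real workflow.

```python
from statistics import mean, stdev


def accuracy(predictions, targets):
    """Return the fraction of predictions that match their targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    matches = sum(p == t for p, t in zip(predictions, targets))
    return matches / len(targets)


def summarise(split_results):
    """Compute per-split accuracy plus mean and sample standard deviation.

    At least two splits are required, because statistics.stdev is
    undefined for a single value.
    """
    scores = [accuracy(predictions, targets) for predictions, targets in split_results]
    return {"scores": scores, "mean": mean(scores), "std": stdev(scores)}


# Hypothetical (predictions, targets) pairs, one per train-test split.
split_results = [
    ([1, 0, 1, 1], [1, 0, 0, 1]),  # 3 of 4 correct
    ([0, 0, 1, 1], [0, 0, 1, 1]),  # 4 of 4 correct
    ([1, 1, 0, 0], [1, 0, 0, 1]),  # 2 of 4 correct
]
summary = summarise(split_results)
print(summary)
```

Reporting the spread alongside the mean indicates how sensitive the model is to the particular split, which is the main reason to evaluate over several splits rather than one.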
