Code uses different machine learning methods to predict the power conversion efficiency of organic photovoltaic devices with non-fullerene acceptors

marcosdelcueto/NonFullereneAcceptorPrediction

Non-Fullerene Acceptor Prediction

This repository contains the database and code for "Limitations of machine learning models when predicting compounds with completely new chemistries: possible improvements applied to the discovery of new non-fullerene acceptors" by Z-W Zhao, M del Cueto and A Troisi.

The code is based on our previous MLPhotovoltaics program; the main addition is support for LOO-extrapolation and LOGO-extrapolation. More details on these methods can be found in the manuscript. These two cross-validations are controlled with the following input keywords:

  • CV='groups' (LOO-extrapolation) or CV='logo' (LOGO-extrapolation)
  • acceptor_label_column: sets the name of the column that contains the acceptor labels
  • groups_acceptor_labels: assigns pairs whose acceptor has a specific label to a group
  • group_test: selects which of the previous groups is used as the test set; the remaining groups are used for training
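
To illustrate how groups_acceptor_labels and group_test relate to each other, here is a minimal sketch of a group-based train/test split. It is not the program's implementation: the function names (assign_groups, split_by_group, leave_one_group_out), the list-of-lists format for the groups, and the choice to leave unassigned pairs out of both sets are all assumptions made for this example. The precise definitions of LOO-extrapolation and LOGO-extrapolation are given in the manuscript.

```python
def assign_groups(acceptor_labels, groups_acceptor_labels):
    """Return, for each donor/acceptor pair, the index of the group that
    lists its acceptor label (None if no group lists that label)."""
    label_to_group = {}
    for group_id, labels in enumerate(groups_acceptor_labels):
        for label in labels:
            label_to_group[label] = group_id
    return [label_to_group.get(label) for label in acceptor_labels]


def split_by_group(row_groups, group_test):
    """Rows in group_test form the test set; rows in any other group
    form the training set. Unassigned rows (None) are left out
    (an assumption of this sketch)."""
    test_idx = [i for i, g in enumerate(row_groups) if g == group_test]
    train_idx = [i for i, g in enumerate(row_groups)
                 if g is not None and g != group_test]
    return train_idx, test_idx


def leave_one_group_out(row_groups):
    """Yield (group, train_idx, test_idx), holding out each group in turn."""
    for group in sorted({g for g in row_groups if g is not None}):
        train_idx, test_idx = split_by_group(row_groups, group)
        yield group, train_idx, test_idx


if __name__ == "__main__":
    # Hypothetical acceptor labels, one per donor/acceptor pair
    acceptor_labels = ["A", "B", "A", "C", "B"]
    # Hypothetical grouping: group 0 holds label A, group 1 holds B and C
    groups = [["A"], ["B", "C"]]
    row_groups = assign_groups(acceptor_labels, groups)
    for group, train_idx, test_idx in leave_one_group_out(row_groups):
        print(f"test group {group}: train={train_idx} test={test_idx}")
```

Holding out a whole group means that no acceptor with a test-set label appears in training, which is what distinguishes this kind of split from a random one.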

Prerequisites

The necessary packages, with the versions tested under Python 3.8.5, are listed in the file requirements.txt. They can be installed with pip:

pip3 install -r requirements.txt

Usage

All input parameters are specified in the file inputNonFullereneAcceptorPrediction.inp. The input options in this file are organized into the following groups:

  • Parallelization: only relevant when using the differential evolution algorithm to optimize hyperparameters
  • Verbose options: control how much information is printed to standard output and to the log file
  • Database options: select how many donor/acceptor pairs are used, as well as which descriptors are considered
  • Output prediction csv: prints the actual and predicted target property values of the test points
  • Machine Learning Algorithm options: select which ML algorithm is used, as well as the cross-validation method, hyperparameters, etc.

To execute the program, make sure that all necessary Python packages are installed and that all necessary files are present: the database (database.csv), the input file (inputNonFullereneAcceptorPrediction.inp) and the program (NonFullereneAcceptorPrediction.py). Then run:

python NonFullereneAcceptorPrediction.py
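
Before launching the program, the presence of the three required files can be checked with a short script. The following is a minimal sketch and is not part of the repository: the helper name missing_files is hypothetical, and it assumes that all three files sit in the same directory (by default, the current working directory).

```python
from pathlib import Path

# File names taken from the usage instructions above
REQUIRED_FILES = (
    "database.csv",
    "inputNonFullereneAcceptorPrediction.inp",
    "NonFullereneAcceptorPrediction.py",
)


def missing_files(directory=".", required=REQUIRED_FILES):
    """Return the names in `required` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```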

Examples

The folders reproduce_Table1 and reproduce_Table2 contain all the files needed to reproduce the main results of the manuscript. To do so, execute the bash scripts reproduce_Table1.sh and reproduce_Table2.sh inside their respective folders.


License

Authors: Zhi-Wen Zhao, Marcos del Cueto and Alessandro Troisi

Licensed under the MIT License
