sparsePKL - Sparse Pairwise Kernel Learning Software

sparsePKL is a pairwise kernel learning algorithm based on nonsmooth DC (difference of two convex functions) optimization. It learns sparse models for prediction on pairwise data (e.g. drug-target interactions) by using double regularization with both the L1-norm and the L0-pseudonorm. The nonsmooth DC optimization problem is solved using the limited memory bundle DC algorithm (LMB-DCA). In addition, sparsePKL uses pairwise Kronecker product kernels computed via the generalized vec trick to model interactions between drug and target features. The included loss functions for the pairwise kernel problem are:

squared loss,
squared epsilon-insensitive loss,
epsilon-insensitive squared loss,
epsilon-insensitive absolute loss,
absolute loss.
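
The pairwise Kronecker product kernel mentioned in the description never has to be formed explicitly. The sketch below illustrates the basic idea behind the vec trick with plain NumPy: it multiplies the implicit pairwise kernel by a coefficient vector using only the drug kernel and the target kernel. It is a simplified illustration, not the RLScore routine and not the refined algorithm from the references. The names `pairwise_kernel_matvec`, `K`, `G`, `drug_idx`, `target_idx` and `coef` are hypothetical.

```python
import numpy as np


def pairwise_kernel_matvec(K, G, drug_idx, target_idx, coef):
    """Multiply the implicit pairwise Kronecker kernel by coef.

    Pair i consists of drug drug_idx[i] and target target_idx[i].
    Entry (i, j) of the implicit kernel is
    K[drug_idx[i], drug_idx[j]] * G[target_idx[i], target_idx[j]].
    """
    n_drugs, n_targets = K.shape[0], G.shape[0]
    # Scatter the pair coefficients into a targets-by-drugs matrix.
    W = np.zeros((n_targets, n_drugs))
    np.add.at(W, (target_idx, drug_idx), coef)
    # G @ W @ K.T holds the product for every possible (target, drug) pair.
    full = G @ W @ K.T
    # Keep only the pairs that actually occur in the data.
    return full[target_idx, drug_idx]


# Toy example: 4 drugs, 5 targets, 6 observed pairs (assumed data).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((5, 3))
K, G = A @ A.T, B @ B.T
drug_idx = np.array([0, 1, 1, 3, 2, 0])
target_idx = np.array([4, 0, 2, 2, 1, 3])
coef = rng.standard_normal(6)
fast = pairwise_kernel_matvec(K, G, drug_idx, target_idx, coef)
```

This version still builds the full targets-by-drugs product, so it only pays off when the observed pairs cover a large share of all possible pairs. The generalized vec trick described in the references avoids that extra work.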

Files included

sparsepkl.py
- Main Python file. Includes RLScore calls.
pkl_utility.py
- Python utility programs.
sparsepkl.f95
- Main Fortran file for sparsePKL software.
lmbdca.f95
- LMB-DCA - the limited memory bundle DC algorithm.
solvedca.f95
- Limited memory bundle method for solving convex DCA-type of problems.
objfun.f95
- Computation of function and subgradient values for the different loss functions. The loss function is selected in sparsepkl.py.
initpkl.f95
- Initialization of parameters and variables in sparsePKL and LMB-DCA. Includes modules:
  - initpkl - Initialization of parameters for pairwise learning.
  - initlmbdca - Initialization of LMB-DCA.
parameters.f95
- Parameters for Fortran. Includes modules:
  - r_precision - Precision for reals,
  - param - Parameters,
  - exe_time - Execution time.
subpro.f95
- Subprograms for LMB-DCA and LMBM.
data.py
- Contains functions to load the example data sets. Data files are assumed to be in a folder named "data" located outside the current folder.
- Contains functions to create train-test-validation splits. Splits are created for every experimental setting S1-S4 (see the reference below).
Makefile
- Builds a shared library that allows sparsepkl (Fortran 95 code) to be called from Python. Uses f2py and Python 3.7, and requires a Fortran compiler (gfortran) to be installed.
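
The kind of splitting attributed to data.py above can be pictured with the following minimal sketch. It assumes the four experimental settings differ in whether the drug and the target of a test pair were seen during training. Which label S1-S4 belongs to which group is defined in the reference, so the groups are named descriptively here. The function `split_pairs` and all of its parameters are hypothetical and are not taken from data.py.

```python
import numpy as np


def split_pairs(drug_idx, target_idx, test_drugs, test_targets,
                in_sample_test_fraction=0.25, seed=0):
    """Partition pair indices by whether the drug/target is held out."""
    drug_idx = np.asarray(drug_idx)
    target_idx = np.asarray(target_idx)
    new_drug = np.isin(drug_idx, list(test_drugs))
    new_target = np.isin(target_idx, list(test_targets))

    # Pairs whose drug and target are both available for training are
    # shuffled and divided into a training part and an in-sample test part.
    known = np.flatnonzero(~new_drug & ~new_target)
    rng = np.random.default_rng(seed)
    rng.shuffle(known)
    n_test = int(round(in_sample_test_fraction * known.size))

    return {
        "train": np.sort(known[n_test:]),
        "test_known_drug_known_target": np.sort(known[:n_test]),
        "test_new_drug_known_target": np.flatnonzero(new_drug & ~new_target),
        "test_known_drug_new_target": np.flatnonzero(~new_drug & new_target),
        "test_new_drug_new_target": np.flatnonzero(new_drug & new_target),
    }


# Toy example: all 20 pairs of 4 drugs and 5 targets (assumed data).
d = np.repeat(np.arange(4), 5)
t = np.tile(np.arange(5), 4)
parts = split_pairs(d, t, test_drugs={3}, test_targets={4})
```

A validation set could be carved out in the same way by calling the function again on the training part. The sketch does not check that every drug and target in the in-sample test part still occurs in the training part.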

Installation and usage

The source uses f2py and Python 3.7, and requires a Fortran compiler (gfortran by default) and RLScore to be installed.

To use the code:

Select the data, the loss function, and the desired sparsity level in the sparsepkl.py file.
Run the Makefile (by typing "make") to build a shared library that allows sparsepkl (Fortran 95 code) to be called from Python.
Finally, type "python3.7 sparsepkl.py".
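
The first step asks for a loss function. As a rough guide to what the choices mean, the sketch below gives textbook per-sample definitions of four of the listed losses. These formulas are assumptions: the exact expressions and scaling used in objfun.f95 may differ, and the loss listed as "squared epsilon-insensitive loss" is left out because its formula is not given here. The function names and the default `eps` value are hypothetical.

```python
import numpy as np


def squared_loss(pred, y):
    """(pred - y)^2 for each sample."""
    return (np.asarray(pred) - np.asarray(y)) ** 2


def absolute_loss(pred, y):
    """|pred - y| for each sample."""
    return np.abs(np.asarray(pred) - np.asarray(y))


def eps_insensitive_absolute_loss(pred, y, eps=0.1):
    """max(0, |pred - y| - eps): errors smaller than eps cost nothing."""
    return np.maximum(0.0, absolute_loss(pred, y) - eps)


def eps_insensitive_squared_loss(pred, y, eps=0.1):
    """max(0, |pred - y| - eps)^2: as above, but squared outside the tube."""
    return eps_insensitive_absolute_loss(pred, y, eps) ** 2


# Toy predictions and labels (assumed data).
pred = np.array([1.0, 2.0, 3.0])
y = np.array([1.05, 2.5, 1.0])
per_sample = eps_insensitive_absolute_loss(pred, y, eps=0.1)
```

The epsilon-insensitive variants ignore residuals inside a band of width eps around the label, which makes them nonsmooth at the edge of the band. This is one reason a subgradient-based solver such as LMB-DCA is a natural fit.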

The algorithm returns a CSV file with performance measures (C-index and MSE) computed on the test set under the experimental settings S1-S4. The best results are selected using a separate validation set, with C-index as the selection criterion. In addition, separate CSV files with the predictions under each experimental setting S1-S4 are returned.
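
For readers unfamiliar with the two performance measures, the sketch below shows one common way to compute them with NumPy. It is an illustration under standard definitions, not the routine sparsePKL itself calls. The function names `mse` and `c_index` are hypothetical.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between labels and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def c_index(y_true, y_pred):
    """Concordance index: fraction of correctly ordered pairs.

    Only pairs with different true labels are compared. A tie in the
    predictions counts as half a correct ordering.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    i, j = np.triu_indices(len(y_true), k=1)
    dy = y_true[i] - y_true[j]
    dp = y_pred[i] - y_pred[j]
    comparable = dy != 0
    if not comparable.any():
        raise ValueError("all true labels are equal; C-index is undefined")
    # +1 for concordant, -1 for discordant, 0 for tied predictions.
    agreement = np.sign(dy[comparable]) * np.sign(dp[comparable])
    return float(np.mean((agreement + 1.0) / 2.0))


# Toy labels and predictions (assumed data).
score = c_index([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
error = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

This implementation compares all pairs at once and therefore needs memory quadratic in the number of samples. For large test sets a sorting-based implementation would be preferable.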

References:

sparsePKL and LMB-DCA:
- N. Karmitsa, K. Joki, A. Airola, T. Pahikkala, "Limited memory bundle DC algorithm for sparse pairwise kernel learning", 2023.
RLScore:
- T. Pahikkala, A. Airola, "RLScore: Regularized least-squares learners", Journal of Machine Learning Research, Vol. 17, No. 221, pp. 1-5, 2016.
LMBM:
- N. Haarala, K. Miettinen, M.M. Mäkelä, "Globally Convergent Limited Memory Bundle Method for Large-Scale Nonsmooth Optimization", Mathematical Programming, Vol. 109, No. 1, pp. 181-205, 2007.
- M. Haarala, K. Miettinen, M.M. Mäkelä, "New Limited Memory Bundle Method for Large-Scale Nonsmooth Optimization", Optimization Methods and Software, Vol. 19, No. 6, pp. 673-692, 2004.
Generalized vec trick and experimental settings:
- A. Airola, T. Pahikkala, "Fast Kronecker product kernel methods via generalized vec trick", IEEE Transactions on Neural Networks and Learning Systems, Vol. 29, pp. 3374-3387, 2018.
- M. Viljanen, A. Airola, T. Pahikkala, "Generalized vec trick for fast learning of pairwise kernel models", Machine Learning, Vol. 111, pp. 543-573, 2022.
Nonsmooth optimization:
- A. Bagirov, N. Karmitsa, M.M. Mäkelä, "Introduction to nonsmooth optimization: theory, practice and software", Springer, 2014.

Acknowledgements

The work was financially supported by the Research Council of Finland (Projects No. 345804 and 345805), led by Antti Airola and Tapio Pahikkala.
