nonconvexAG

Kai Yang

<kai.yang2 "at" mail.mcgill.ca>

License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)

This repository contains the simulation study code and results for my paper, Accelerated Gradient Methods for Sparse Statistical Learning with Nonconvex Penalties -- DOI link for the publication or arXiv link for the preprint.

  • All paper-related material is under this directory; all the code and outputs (both intermediate and final) can be found here.
  • All results were produced on Compute Canada. The job-submission bash scripts and the Slurm outputs record the compute-resource names; the seff outputs record the computing times.
  • All studies using GPUs were run on Compute Canada Nvidia A100 GPUs, which the Compute Canada Slurm outputs confirm to have CUDA compute capability 8.0 (cupy.cuda.Device.compute_capability returns the string '80', which stands for compute capability 8.0).
  • All Python scripts referred to in the bash job-submission scripts were generated from the Jupyter (IPython) notebooks (i.e., jupyter nbconvert *.ipynb --to python, with the resulting scripts moved to a separate sub-directory called dist).
  • Compute Canada produces slurm-[jobID].out files containing the outputs from running the scripts; I also created seff-[jobID].out files via seff [jobID] >> seff-[jobID].out to record and report the wall-clock times of the computing-time comparison jobs.
  • With identical simulation setups and the same $\epsilon$-convergence criteria, the seff files show that the computing-time simulations for the AG method finished within 20 minutes for SCAD- or MCP-penalized logistic models, whereas those for the coordinate descent method on the same models could not finish within the 7-day time limit imposed by the Compute Canada Narval cluster.
  • Again, all the above simulations were run on identical GPUs (Nvidia A100, CUDA compute capability 8.0).
  • To ensure a fair comparison, we implemented coordinate descent in Python/CuPy, following the state-of-the-art pseudo-code for the coordinate descent method (Breheny & Huang, 2011), and compared its computing time with AG.
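For reference, the two nonconvex penalties compared throughout (SCAD and MCP) have closed forms. The sketch below is illustrative only -- it is not the code used in the simulations, and the defaults a = 3.7 and gamma = 3 are conventional choices from the literature, not necessarily the paper's settings:

```python
import numpy as np

def scad_penalty(beta, lam=1.0, a=3.7):
    """SCAD penalty, evaluated elementwise: linear near zero,
    quadratic transition, then constant beyond a*lam."""
    b = np.abs(beta)
    return np.where(
        b <= lam,
        lam * b,
        np.where(
            b <= a * lam,
            (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1)),
            lam**2 * (a + 1) / 2,
        ),
    )

def mcp_penalty(beta, lam=1.0, gamma=3.0):
    """MCP penalty, evaluated elementwise: tapers linearly from
    lam*|b| down to a constant gamma*lam^2/2 beyond gamma*lam."""
    b = np.abs(beta)
    return np.where(
        b <= gamma * lam,
        lam * b - b**2 / (2 * gamma),
        gamma * lam**2 / 2,
    )
```

Both penalties flatten out for large coefficients, which is what reduces the estimation bias of the lasso but makes the objective nonconvex.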
| Model | Penalty | Comparison | Optimization Method | Output Data | Jupyter Notebook/R Code | Bash Script | slurm File | seff Output |
|---|---|---|---|---|---|---|---|---|
| Penalized Linear Models (LM) | SCAD | Signal Recovery Performance | coordinate descent (ncvreg); with strong rule | R_results_SCAD_signal_recovery.npy | ncvreg_LM_sim.R | LM.sh | slurm-10933899.out | |
| Penalized Linear Models (LM) | MCP | Signal Recovery Performance | coordinate descent (ncvreg); with strong rule | R_results_MCP_signal_recovery.npy | ncvreg_LM_sim.R | LM.sh | slurm-10933899.out | |
| Penalized Logistic Models | SCAD | Signal Recovery Performance | coordinate descent (ncvreg); with strong rule | R_results_SCAD_signal_recovery.npy | ncvreg_logistic_sim.R | logistic.sh | slurm-10933900.out | |
| Penalized Logistic Models | MCP | Signal Recovery Performance | coordinate descent (ncvreg); with strong rule | R_results_MCP_signal_recovery.npy | ncvreg_logistic_sim.R | logistic.sh | slurm-10933900.out | |
| Penalized Linear Models (LM) | SCAD | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | results_SCAD_signal_recovery.npy | task1.ipynb | task1.sh | slurm-10933901.out | |
| Penalized Linear Models (LM) | MCP | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | results_MCP_signal_recovery.npy | task1.ipynb | task1.sh | slurm-10933901.out | |
| Penalized Logistic Models | SCAD | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | results_SCAD_signal_recovery.npy | task2.ipynb | task2.sh | slurm-10933902.out | |
| Penalized Logistic Models | MCP | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | results_MCP_signal_recovery.npy | task2.ipynb | task2.sh | slurm-10933902.out | |
| Penalized Linear Models (LM) | SCAD | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | SCAD_sim_results.npy | task1speed.ipynb | task1speed.sh | slurm-10933903.out | seff-10933903.out |
| Penalized Linear Models (LM) | SCAD | GPU Computing Time | AG (proposed optimization hyperparameters), coordinate descent (coded in Python/CuPy) | SCAD_sim_results.npy | task1speed.ipynb | task1speed.sh | slurm-10933903.out | seff-10933903.out |
| Penalized Linear Models (LM) | MCP | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | MCP_sim_results.npy | task1speed.ipynb | task1speed.sh | slurm-10933903.out | seff-10933903.out |
| Penalized Linear Models (LM) | MCP | GPU Computing Time | AG (proposed optimization hyperparameters), coordinate descent (coded in Python/CuPy) | MCP_sim_results.npy | task1speed.ipynb | task1speed.sh | slurm-10933903.out | seff-10933903.out |
| Penalized Logistic Models | SCAD | GPU Computing Time | coordinate descent (coded in Python/CuPy) | | task2speed_SCAD_coord_time.ipynb | task2speed_SCAD_coord_time.sh | slurm-10933904.out | seff-10933904.out |
| Penalized Logistic Models | MCP | GPU Computing Time | coordinate descent (coded in Python/CuPy) | | task2speed_MCP_coord_time.ipynb | task2speed_MCP_coord_time.sh | slurm-10933905.out | seff-10933905.out |
| Penalized Logistic Models | SCAD | GPU Computing Time | AG (proposed optimization hyperparameters) | SCAD_sim_results_AG_time.npy | task2speed_SCAD_AG_time.ipynb | task2speed_SCAD_AG_time.sh | slurm-10933906.out | seff-10933906.out |
| Penalized Logistic Models | MCP | GPU Computing Time | AG (proposed optimization hyperparameters) | MCP_sim_results_AG_time.npy | task2speed_MCP_AG_time.ipynb | task2speed_MCP_AG_time.sh | slurm-10933907.out | seff-10933907.out |
| Penalized Logistic Models | SCAD | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | SCAD_sim_results.npy | task2speed_SCAD.ipynb | task2speed_SCAD.sh | slurm-10933908.out | seff-10933908.out |
| Penalized Logistic Models | MCP | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | MCP_sim_results.npy | task2speed_MCP.ipynb | task2speed_MCP.sh | slurm-10933909.out | seff-10933909.out |
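The coordinate descent baseline follows Breheny & Huang (2011), whose univariate coordinate updates are closed-form thresholding operators. A minimal NumPy sketch of those operators, assuming standardized covariates so each coordinate's quadratic coefficient is 1 (function and parameter names here are illustrative, not from the repository's code):

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam) = sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def scad_update(z, lam, a=3.7):
    """Univariate SCAD-penalized coordinate update (Breheny & Huang, 2011):
    soft-threshold near zero, scaled soft-threshold in the transition
    region, and the unpenalized solution z beyond a*lam."""
    az = abs(z)
    if az <= 2 * lam:
        return soft_threshold(z, lam)
    if az <= a * lam:
        return soft_threshold(z, a * lam / (a - 1)) / (1 - 1 / (a - 1))
    return z

def mcp_update(z, lam, gamma=3.0):
    """Univariate MCP-penalized coordinate update (firm thresholding):
    rescaled soft-threshold up to gamma*lam, then the unpenalized z."""
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1 - 1 / gamma)
    return z
```

Because large coordinates are returned unshrunk (the final branch of each update), these operators are unbiased for strong signals, which is the statistical motivation for preferring SCAD/MCP over the lasso's plain soft-thresholding.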

Bibliography

  • Breheny, P., & Huang, J. (2011). Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Annals of Applied Statistics, 5(1), 232-253. https://doi.org/10.1214/10-AOAS388
