Correction of turnover numbers in enzyme-constraint metabolic models

OS

Code was tested on Ubuntu 20.04.3 LTS and Windows 10

Dependencies

MATLAB (tested with 2020a/b)
COBRA toolbox 3.0
RAVEN toolbox 2.0
Gurobi solver (tested with version 9.1.1)

Setup

Set up the COBRA toolbox following the installation instructions
Set up the RAVEN toolbox following the installation instructions
Set up the Gurobi solver and connect it with the COBRA toolbox following the respective installation instructions

Run k_cat correction with PRESTO

To reproduce the results presented in the paper, run the following scripts:

S. cerevisiae: correct_kcats_yeast
E. coli: correct_kcats_ecoli

Generate raw and adapted ecModels

get_rawecmod() generates, among others, raw enzyme-constrained models (no k_cat adaptation, no manual modifications, no protein pool constraint) in the model folder of the respective GECKO path, e.g. GECKO_S_cerevisiae/model/ecYeast/rawYeast.mat
Call get_GKOmod() or get_GKOmod_ecoli() to generate all condition-specific adapted GECKO models using the experimentally determined protein content, growth rate, and uptake rates where available. The adapted models are also saved to the model folder.

To apply PRESTO to another model

Create an ecModel using the GECKO toolbox
Create the input data files described below

protAbcFile

This file contains the measured protein abundances in mmol/gDW. Give the protein UniProt IDs as row labels in the first column and the measurements of all conditions in the subsequent columns with condition IDs as column headers in the first row.
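
A minimal sketch of how such a file could be written from MATLAB, using only built-in table functions. The tab delimiter, the file name, and the abundance values are assumptions for illustration; the UniProt IDs are arbitrary examples, and the format expected by the PRESTO scripts should be checked against the files in the repository.

```matlab
% Illustrative placeholders only: example IDs and made-up abundances [mmol/gDW]
uniprotIDs = {'P00549'; 'P00560'; 'P00925'};
condIDs    = {'Condition_1', 'Condition_2'};
abundances = [1.2e-5 1.5e-5;
              3.4e-6 2.9e-6;
              8.0e-6 7.1e-6];

% Proteins as row labels, conditions as column headers
protTab = array2table(abundances, ...
    'VariableNames', condIDs, ...
    'RowNames', uniprotIDs);

% Assumption: tab-delimited text file; the file name is hypothetical
protAbcFile = fullfile(tempdir, 'protein_abundance_example.tsv');
writetable(protTab, protAbcFile, ...
    'FileType', 'text', 'Delimiter', '\t', 'WriteRowNames', true);

% Read the file back to confirm the layout (IDs in column 1, headers in row 1)
check = readtable(protAbcFile, ...
    'FileType', 'text', 'Delimiter', '\t', 'ReadRowNames', true);
```

Reading the file back is only a sanity check that the row labels and condition headers survive the round trip.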

growthDataFile

The first column of this file contains the IDs of exchange reactions, the second one contains the name or note associated with the ID, and the subsequent columns contain the measured growth rates [h^-1] and exchange fluxes [mmol gDW^-1 h^-1] for the same conditions as in protAbcFile.

Example:

exchangeRxn	note	Condition_1	Condition_2	...
growth	biomass	0.1	0.15	...
ex_glc	glucose	-10	-12	...
ex_CO2	CO2	3	4	...
...	...	...	...	...
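
The example table above could be generated and inspected as follows. This is a sketch that assumes a tab-delimited file and treats the row labelled growth as the growth-rate row; the file name and the variable names expGrowthExample and nutrExchExample are hypothetical and do not reflect how the PRESTO scripts parse the file internally.

```matlab
% Values copied from the example table above
exchangeRxn = {'growth'; 'ex_glc'; 'ex_CO2'};
note        = {'biomass'; 'glucose'; 'CO2'};
Condition_1 = [0.1; -10; 3];
Condition_2 = [0.15; -12; 4];
growthTab   = table(exchangeRxn, note, Condition_1, Condition_2);

% Assumption: tab-delimited text file; the file name is hypothetical
growthDataFile = fullfile(tempdir, 'growth_data_example.tsv');
writetable(growthTab, growthDataFile, 'FileType', 'text', 'Delimiter', '\t');

% Read back and separate growth rates from exchange fluxes
readBack = readtable(growthDataFile, 'FileType', 'text', 'Delimiter', '\t');
isGrowth = strcmp(readBack.exchangeRxn, 'growth');
expGrowthExample = readBack{isGrowth, 3:end};   % growth rates [h^-1], 1 x #conditions
nutrExchExample  = readBack{~isGrowth, 3:end};  % fluxes [mmol gDW^-1 h^-1]
```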

ptotFile

This file has two columns. The first one contains the condition IDs and the second one contains the total protein content in g gDW^-1. The first row contains the column headers.
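
Because all three data files must refer to the same conditions, a quick consistency check can catch mismatched IDs early. The sketch below writes a two-column file and compares its condition IDs with those used elsewhere; the column names, file name, delimiter, and protein contents are illustrative assumptions.

```matlab
% Illustrative placeholders: condition IDs and total protein content [g gDW^-1]
condID  = {'Condition_1'; 'Condition_2'};
Ptot    = [0.45; 0.50];
ptotTab = table(condID, Ptot);

% Assumption: tab-delimited text file; the file name is hypothetical
ptotFile = fullfile(tempdir, 'ptot_example.tsv');
writetable(ptotTab, ptotFile, 'FileType', 'text', 'Delimiter', '\t');
ptotBack = readtable(ptotFile, 'FileType', 'text', 'Delimiter', '\t');

% Condition IDs as they would appear in the headers of the other input files
otherCondIDs = {'Condition_1', 'Condition_2'};

% IDs present in the other files but missing from the ptot file
missingConds = setdiff(otherCondIDs, ptotBack.condID);
if ~isempty(missingConds)
    warning('Conditions without total protein content: %s', ...
        strjoin(missingConds, ', '));
end
```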

maxKcatFile

This file contains the reference k_cat values and has five columns, which contain (1) EC number (2) substrate (3) lineage of the organism (4) k_cat [s^-1] (5) "*" (see also GECKO/databases/max_KCAT.txt).

mwFile (optional)

This file contains the molecular weights of all proteins in the model. If the file name is empty, the file does not exist, or the dimensions of the protMW model field and the enzymes field do not match, the file will be generated using the UniProt API.

modelFile

Path to the ecModel generated using GECKO (without pool constraint).

batchModelFile

Path to the "batch model", which contains the protein pool constraint.

Adjust the parameters and input file names in the configuration script:

Parameter	Explanation
orgName	name of the organism
orgBasename	model basename
modelFile	file of the ecModel as Matlab workspace (.mat)
cobraSolver	preferred solver for linear optimization problems (default: gurobi)
runParallel	whether the correction should be run on multiple threads (default: true)
ncpu	number of threads (default: 20)
epsilon	upper limit for fold-change of k_cat values
lambda	weight for the minimization of absolute difference between measured and predicted growth rate(s)
theta	upper limit for the difference between measured and predicted growth rate(s)
GAM	value of growth-associated maintenance (put NaN if unknown; it will then be fitted using all provided conditions)
f	mass fraction of all proteins accounted for by the model (see GECKO publication)
f_n	mass fraction of unmeasured proteins in the ecModel (for inclusion of unmeasured proteins; put NaN if unknown, will be fitted for each condition separately) (see GECKO publication)
sigma	average saturation of enzymes in the model (see GECKO publication; can be fitted using GECKO sigmaFitter)
nIter	number of iterations for k-fold cross-validation
geckoDir	path to the organism-specific GECKO directory
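
For orientation, the parameters in the table could be set as plain variable assignments, as sketched below. Only the defaults stated in the table (gurobi, true, 20) come from this README; every other value is an illustrative placeholder, the use of rawYeast.mat as modelFile is an assumption, and the actual configuration script in the repository may be organized differently and may expect additional variables such as the input file names.

```matlab
% Sketch of a configuration; placeholders are marked and must be replaced
orgName     = 'Saccharomyces cerevisiae';  % placeholder
orgBasename = 'yeast';                     % placeholder
geckoDir    = 'GECKO_S_cerevisiae';        % organism-specific GECKO directory
% Assumption: the raw ecModel path mentioned earlier in this README
modelFile   = fullfile(geckoDir, 'model', 'ecYeast', 'rawYeast.mat');

cobraSolver = 'gurobi';   % default stated in the table
runParallel = true;       % default stated in the table
ncpu        = 20;         % default stated in the table

epsilon = 1e6;    % placeholder: upper limit for fold change of k_cat values
lambda  = 1e-7;   % placeholder: replace with the value found by cross-validation
theta   = 0.5;    % placeholder: upper limit for the growth-rate difference
GAM     = NaN;    % unknown, so it is fitted using all provided conditions
f       = 0.5;    % placeholder: mass fraction of proteins covered by the model
f_n     = NaN;    % unknown, so it is fitted for each condition separately
sigma   = 0.5;    % placeholder: average enzyme saturation
nIter   = 3;      % placeholder: iterations of k-fold cross-validation
```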

Run the updated configuration script from the previous step
Run cvLambdaFitting to estimate the optimal weighting parameter $\lambda$:

[relErr,errVar,sumsDelta,objVal,avJD,corrKcatProts] = cvLambdaFitting(...
    model,...                        % GECKO ecModel
    expGrowth,...                    % experimental growth rates for all conditions
    PTot,...                         % total protein contents for all conditions
    E,...                            % enzyme abundance matrix (#model proteins x #conditions)
    lambdaParams,...                 % array of lambda parameters to be explored
    nutrExch,...                     % nutrient exchange rates
    'kfold', kfold,...               % (optional) number of folds for cross-validation
    'nIter', nIter,...               % (optional) number of iterations for k-fold cross-validation
    'epsilon', epsilon,...           % (optional) maximum allowed fold change of k_cat values
    'theta', theta,...               % (optional) maximum allowed relative error
    'runParallel', runParallel,...   % (optional) whether to run the cross-validation on multiple workers
    'enzMetPfx', enzMetPfx,...       % (optional) prefix for protein metabolites (by default prot_ as added by GECKO)
    'enzRxnPfx', enzRxnPfx,...       % (optional) prefix for protein draw reactions (by default prot_ as added by GECKO)
    'enzBlackList', enzBlackList,... % (optional) list of protein IDs that should be excluded from the correction
    'K', K,...                       % (optional) maximum allowed k_cat value after correction
    'negCorrFlag', negCorrFlag,...   % (optional) if true, a second step is added, which attempts to find negative corrections for k_cat values
    'GAM', GAM,...                   % (optional) growth-associated maintenance
    'f', f,...                       % (optional) f factor for protein pool (see GECKO paper or description above)
    'sigma', sigma...                % (optional) sigma factor for protein pool (see GECKO paper or description above)
    );
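
The array lambdaParams and the selection of a final value could look like the sketch below. The grid range is a placeholder, and mockRelErr is a made-up vector standing in for one mean relative error per lambda value; the actual shape and meaning of the outputs of cvLambdaFitting should be taken from the function's documentation, and choosing the minimum error is only one possible selection rule.

```matlab
% Placeholder grid: seven lambda values on a logarithmic scale
lambdaParams = 10.^(-3:3);

% Made-up errors for illustration only, one value per lambda in the grid
mockRelErr = [0.9 0.5 0.2 0.1 0.15 0.4 0.8];

% Pick the lambda with the smallest mock error
[minErr, idx] = min(mockRelErr);
lambda = lambdaParams(idx);

fprintf('Selected lambda = %g (mock relative error %.2f)\n', lambda, minErr);
```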

Adjust the ecModel to the experimental conditions using adjBaseModel:

adj_models = adjBaseModel(...
    model,...       % GECKO ecModel
    P,...           % total protein contents for all conditions
    nutrExch,...    % nutrient exchange rates
    GAM...          % growth associated maintenance
    );

Run PRESTO to obtain k_cat corrections:

[solution,corr_models,relError,changeTab,LP] = PRESTO(...
    adj_models,...                   % enzyme-constraint metabolic model(s)
    expGrowth,...                    % experimental growth rates for all conditions
    E,...                            % enzyme abundance matrix (#model proteins x #conditions)
    'lambda', lambda,...             % (optional) weighting parameter lambda
    'epsilon', epsilon,...           % (optional) maximum allowed fold change of k_cat values
    'theta', theta,...               % (optional) maximum allowed relative error
    'enzBlackList', enzBlackList,... % (optional) list of protein IDs that should be excluded from the correction
    'enzMetPfx', enzMetPfx,...       % (optional) prefix for protein metabolites (by default prot_ as added by GECKO)
    'enzRxnPfx', enzRxnPfx,...       % (optional) prefix for protein draw reactions (by default prot_ as added by GECKO)
    'negCorrFlag', negCorrFlag...    % (optional) if true, a second step is added, which attempts to find negative corrections for k_cat values
    );

Reference

Wendering, P., Arend, M., Razaghi-Moghadam, Z. et al. Data integration across conditions improves turnover number estimates and metabolic predictions. Nat Commun 14, 1485 (2023). https://doi.org/10.1038/s41467-023-37151-2
