feat: keep protein usage reaction draw from protein pool when proteomics is integrated #375

Open
2 tasks done
edkerk opened this issue May 8, 2024 · 1 comment

Comments

@edkerk
Member

edkerk commented May 8, 2024

Any comments on the following suggested change?

Currently, the enzyme usage reactions can be defined in two ways, depending on whether proteomics data is integrated.

| Model content | Without proteomics | With proteomics |
| --- | --- | --- |
| Protein usage rxn | prot_Q99312[c] <= prot_pool[c] | prot_Q99312[c] <= |
| LB of protein usage rxn | -1000 | Measured Q99312 concentration, as taken from model.ec.concs, or potentially flexibilized by flexibilizeEnzConcs. Example = -0.0416 |
| Protein pool exchange rxn | prot_pool[c] <= | prot_pool[c] <= |
| LB of protein pool exchange rxn | Total enzyme content, as defined by Ptot * sigma * f. Example = -125 | Non-measured enzyme content, as calculated by updateProtPool. Example = -95.915 |

A problem that I have encountered with this approach is that the new lower bound of the protein pool exchange reaction might be too strict. The model can no longer be solved unless some proteins are flexibilized by a large amount, and sometimes even that does not resolve the problem. There are several reasons why this bound can be too strict:

  • The calculation in updateProtPool assumes that the f-factor (fraction of protein being enzymes) is the same for both the measured- and unmeasured-protein fraction.
  • Even when the f-factor is first calculated, it is based only on the measured-protein fraction (if this data is available), which might be somewhat biased, but at that stage the bias is compensated for by the fitting of the sigma-factor.
  • In addition, to avoid over-constraining individual proteins based on noisy proteomics data, loadProtData by default adds one or more standard deviations to the protein measurements. As a consequence, the sum of measured protein concentrations Pmeas is substantially higher, so the unmeasured protein fraction Ptot-Pmeas is systematically lower than it should be.
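The effect described in the last point can be illustrated with a small arithmetic sketch. It assumes that the remaining pool is computed as (Ptot - Pmeas) * f * sigma, which matches the description in this issue but is not copied from the updateProtPool source. The function name `unmeasured_pool` and all numbers are hypothetical.

```python
def unmeasured_pool(ptot, f, sigma, measured):
    """Pool left for unmeasured enzymes, in the same units as ptot.

    Assumed formula (not taken from the GECKO source):
    (Ptot - Pmeas) * f * sigma, where Pmeas is the sum of the
    measured protein concentrations.
    """
    pmeas = sum(measured)
    return (ptot - pmeas) * f * sigma


# Hypothetical values, chosen only for illustration.
ptot, f, sigma = 0.5, 0.5, 0.5
means = [0.02, 0.03, 0.05]   # measured mean concentrations
stds = [0.01, 0.01, 0.02]    # their standard deviations

# Pool when Pmeas is the sum of the plain means.
pool_from_means = unmeasured_pool(ptot, f, sigma, means)

# Pool when one standard deviation is added to every measurement,
# mimicking the default padding described for loadProtData.
padded = [m + s for m, s in zip(means, stds)]
pool_from_padded = unmeasured_pool(ptot, f, sigma, padded)

print(pool_from_means, pool_from_padded)
```

With these numbers the padded measurements leave a smaller pool for the unmeasured enzymes, even though the real protein content of the cell has not changed.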

As an alternative, there is actually no good reason why the enzyme usage reaction has to change when proteomics data is integrated, apart from its lower bound. The new approach suggested below would prevent the issues raised above, and would keep the same lower bound for the protein pool exchange reaction that was fitted earlier in the model generation pipeline to give realistic growth predictions. New suggestion:

| Model content | Without proteomics | With proteomics |
| --- | --- | --- |
| Protein usage rxn | prot_Q99312[c] <= prot_pool[c] | prot_Q99312[c] <= prot_pool[c] |
| LB of protein usage rxn | -1000 | Measured Q99312 concentration, as taken from model.ec.concs, or potentially flexibilized by flexibilizeEnzConcs. Example = -0.0416 |
| Protein pool exchange rxn | prot_pool[c] <= | prot_pool[c] <= |
| LB of protein pool exchange rxn | Total enzyme content, as defined by Ptot * sigma * f. Example = -125 | Total enzyme content, as defined by Ptot * sigma * f. Example = -125 |
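To make the difference between the two schemes concrete, here is a toy feasibility check. It is a simplified sketch, not GECKO code: bounds are written as positive magnitudes (the model stores them as negative lower bounds), the function names and the unmeasured enzyme IDs are hypothetical, and the required usages are invented. The pool sizes 125 and 95.915 and the Q99312 bound 0.0416 are the example values from the tables above.

```python
def within_measurements(demand, measured_conc):
    """Each measured enzyme may not be used above its measured concentration."""
    return all(demand.get(e, 0.0) <= c for e, c in measured_conc.items())


def feasible_current(demand, measured_conc, reduced_pool):
    """Current scheme: only unmeasured enzymes draw from a reduced pool."""
    unmeasured = sum(v for e, v in demand.items() if e not in measured_conc)
    return within_measurements(demand, measured_conc) and unmeasured <= reduced_pool


def feasible_proposed(demand, measured_conc, full_pool):
    """Proposed scheme: every enzyme draws from the full, fitted pool."""
    total = sum(demand.values())
    return within_measurements(demand, measured_conc) and total <= full_pool


# Hypothetical enzyme usage required to reach some growth rate.
demand = {"Q99312": 0.03, "unmeasured_1": 60.0, "unmeasured_2": 40.0}
measured_conc = {"Q99312": 0.0416}

current_ok = feasible_current(demand, measured_conc, reduced_pool=95.915)
proposed_ok = feasible_proposed(demand, measured_conc, full_pool=125.0)
print(current_ok, proposed_ok)  # False True
```

In this example the unmeasured enzymes need more than the reduced pool allows, so the current scheme is infeasible, while the proposed scheme still respects the measured bound on Q99312 and stays within the fitted total pool.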

I hereby confirm that:

  • The new feature is not already in the main branch of the repository.
  • A similar issue does not already exist.
@Yu-sysbio
Collaborator

The new suggestion looks very nice!
