Update media definitions, document and extend db #16

Closed
67 of 72 tasks
famosab opened this issue Jun 28, 2022 · 27 comments · Fixed by #80 or #119
Labels
enhancement New feature or request

Comments

@famosab
Member

famosab commented Jun 28, 2022

I will collect the ToDos from this thread here. @GwennyGit, maybe you can mark what you have already done and on which branch :D

  • remove CGXII and replace by CGXlab but rename to CGXII
  • remove LB and M9 without oxygen
  • write function to simulate without oxygen -> Add boolean to enable anaerobic growth simulation to all relevant functions

  • Re-evaluate all media in database:
  • document all available media in the documentation
    • Create RST file(s) for the media definitions
    • LB [7]
    • M9 [6]
    • CGXII [4]
    • SNM3 [1]
    • SMM
    • RPMI [3]
    • CasA [5]
    • Blood [9] -> $\textcolor{red}{\text{Check where the Blood medium came from originally again!}}$
    • Urine MP-AU [10]
    • dGMM [8]
    • SLM [12] -> Artificial Sebum, Sweat & Basal medium belong to it!
    • TSB
  • Document all available subsets in the documentation
    • Create RST file(s) for the subset definitions
    • CasA
    • protAA
    • artSe
  • Add page 'How to obtain an in silico medium from a laboratory medium' to documentation
  • Add Blood [2] [9]
  • Add Urine [2] -> Add MP-AU [10]
  • Add defined gut microbiota medium (dGMM) [8]
  • Add SLM components [12]:
    • Add Artificial Sweat to medium
    • Add Artificial Sebum to subset
    • Add Basal medium (Name from paper, Renamed to: BMS23 for Basal Medium Swaney 2023)

  • Add compound names to:
    • RPMI
    • SNM3
  • Add subsets:
    • CasA - Contains only Casamino acids
    • AA - Contains all proteinogenic amino acids
    • DiReM - Metabolites for the dissipation reactions
      • already in: 'Water [H2O]', 'ATP [Adenosine triphosphate]','ADP','Hydrogen [H(+)]','Phosphate [PO4(3-)]',
        'CDP','GTP [Guanosine triphosphate]', 'GDP [Guanosine diphosphate]', 'UDP [Uridine 5-diphosphate]',
        'Nicotinamide adenine dinucleotide [NAD]', 'Nicotinamide adenine dinucleotide phosphate [NADP]',
        'Flavin adenine dinucleotide oxidized [FAD]', 'FMN [Flavin Mononucleotide]', 'D-Glucose',
        '2-Oxoglutarate [Oxoglutaric acid]', 'Ammonia'
      • missing:
        • CTP
        • UTP
        • ITP
        • IDP
        • NADH
        • NADPH
        • FADH2
        • FMNH2
        • Q8H2
        • Q8
        • MQL8
        • MQN8
        • 2DMMQL8
        • 2DMMQ8
        • ACCOA
        • AC
        • CoA
        • DUPLICATE: Acetate and Acetate [Acetic acid] : remove the former + replace in subset, ....
      • 🔴 NOTE: these reactions are all about proton exchange, which might lead to problems since we usually ignore differences in protons. In these cases it might be important to watch out for them. Maybe also use this as a set to skip during duplicate removal, as that step also compares proton differences.
  • Check in generate_insert_query for strings in value_string

Feature request for maintenance

  • Add function to update tables/specific table entries automatically

  • More database identifiers -> Check BiGG+VMH overlaps
  • Add superoxide ❗
  • Add a definition for Tryptic Soy broth (TSB)
  • Add a definition for Brain Heart Infusion (BHI)

[1]
Krismer, Bernhard; Liebeke, Manuel; Janek, Daniela; Nega, Mulugeta; Rautenberg, Maren; Hornig, Gabriele et al. (2014): Nutrient Limitation Governs Staphylococcus aureus Metabolism and Niche Adaptation in the Human Nose. In: PLOS Pathogens 10 (1), e1003862. DOI: 10.1371/journal.ppat.1003862.
[2]
Ding T, Case KA, Omolo MA, Reiland HA, Metz ZP, Diao X, Baumler DJ. Predicting Essential Metabolic Genome Content of Niche-Specific Enterobacterial Human Pathogens during Simulation of Host Environments. PLoS One. 2016 Feb 17;11(2):e0149423. doi: 10.1371/journal.pone.0149423. PMID: 26885654; PMCID: PMC4757543.
[3]
https://www.thermofisher.com/de/de/home/technical-resources/media-formulation.114.html
[4]
Unthan, Simon, et al. "Beyond growth rate 0.6: What drives Corynebacterium glutamicum to higher growth rates in defined medium." Biotechnology and bioengineering 111.2 (2014): 359-371., Preparation protocol
[5]
Richard A. Nolan (1971) Amino Acids and Growth Factors in Vitamin-Free Casamino Acids, Mycologia, 63:6, 1231-1234, DOI: 10.1080/00275514.1971.12019223
[6]
https://www.sigmaaldrich.com/DE/de/product/sigma/m6030, Preparation protocol
[7]
Machado, Daniel, et al. "Fast automated reconstruction of genome-scale metabolic models for microbial species and communities." Nucleic acids research 46.15 (2018): 7542-7553.
https://carveme.readthedocs.io/en/latest/advanced.html#media-database
[8]
Tramontano, M., Andrejev, S., Pruteanu, M. et al. Nutritional preferences of human gut bacteria reveal their metabolic idiosyncrasies. Nat Microbiol 3, 514–522 (2018). https://doi.org/10.1038/s41564-018-0123-9
[9]
Nantia Leonidou, Alina Renz, Reihaneh Mostolizadeh, and Andreas Dräger. New workflow predicts drug targets against sars-cov-2 via metabolic changes in infected cells. PLOS Computational Biology, 19(3):1–32, 03 2023. URL: https://doi.org/10.1371/journal.pcbi.1010903, doi:10.1371/journal.pcbi.1010903.
[10]
Sarigul, N., Korkmaz, F. & Kurultak, İ. A New Artificial Urine Protocol to Better Imitate Human Urine. Sci Rep 9, 20159 (2019). https://doi.org/10.1038/s41598-019-56693-4
[11]
Oh, Y. K., Palsson, B. O., Park, S. M., Schilling, C. H., & Mahadevan, R. (2007). Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. Journal of Biological Chemistry, 282(39), 28791-28799. https://doi.org/10.1074/jbc.M703759200
[12]
Swaney MH, Nelsen A, Sandstrom S, Kalan LR. Sweat and Sebum Preferences of the Human Skin Microbiota. Microbiol Spectr. 2023 Feb 14;11(1):e0418022. doi: 10.1128/spectrum.04180-22. Epub 2023 Jan 5. PMID: 36602383; PMCID: PMC9927561

@famosab
Member Author

famosab commented Jun 28, 2022

I added Casamino Acids to the media definition based on an article from 1971 - Amino Acids and Growth Factors in Vitamin-Free Casamino Acids.

@famosab famosab added the enhancement New feature or request label Jul 22, 2022
famosab added a commit that referenced this issue Oct 21, 2022
@famosab
Member Author

famosab commented Feb 1, 2023

@GwennyGit maybe you can add the gut medium as soon as you get to that. But we might rethink how we note the media definitions. At the moment it is just a big csv file which works but entering it into the database we use for sboann might be more elegant?

@GwennyGit
Collaborator

GwennyGit commented Feb 1, 2023

The idea of using the database sounds good. However, if someone wants to use the media definitions in another program, for example the gapfill function of CarveMe, it is easier to transform the CSV file into the required format. Additionally, it might be easier for users to add other media via the CSV file. @famosab What do you think? 🤔

@famosab
Member Author

famosab commented Feb 16, 2023

I think we should move the existing media definitions into the database as well. You mentioned somewhere that access via pandas should be possible. If that works for a user that just installs refineGEMs via pip, it would be great! Maybe we can implement a function which exports the database entries to a csv medium definition. The functionality for a possible user would still be the same since they could just use a local csv as well.
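
A minimal sketch of what such an export could look like, using only the Python standard library. The table name `medium_composition`, its columns, and the function name `export_medium_to_csv` are assumptions made for illustration; they are not the actual refineGEMs schema or API.

```python
import csv
import io
import sqlite3


def export_medium_to_csv(con, medium, out):
    """Write all rows of one medium as CSV to the file-like object `out`.

    Returns the number of compounds written.
    """
    rows = con.execute(
        "SELECT bigg_id, name FROM medium_composition "
        "WHERE medium = ? ORDER BY bigg_id",
        (medium,),
    ).fetchall()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["BiGG", "name"])  # header row (hypothetical column names)
    writer.writerows(rows)
    return len(rows)


# Hypothetical in-memory database standing in for the packaged database file.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE medium_composition (medium TEXT, bigg_id TEXT, name TEXT)")
con.executemany(
    "INSERT INTO medium_composition VALUES (?, ?, ?)",
    [
        ("M9", "glc__D", "D-Glucose"),
        ("M9", "nh4", "Ammonium"),
        ("LB", "o2", "Oxygen"),
    ],
)

buffer = io.StringIO()
n_written = export_medium_to_csv(con, "M9", buffer)
csv_text = buffer.getvalue()
```

Writing to a file-like object keeps the sketch independent of where the CSV ends up; a user could pass an open file instead of the `StringIO` buffer.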

@GwennyGit
Collaborator

GwennyGit commented Feb 16, 2023

Yes, I mentioned that in issue #49, and I am currently working on that task.

@famosab
Member Author

famosab commented Mar 3, 2023

At the moment we have both CGXII and CGXlab in the database. I would advise removing CGXII and replacing it with the composition of CGXlab, since that is the composition used for the manuscript we will publish soon, whereas CGXII is just a file I got a while ago that is not verified by laboratory use. We could also remove LB and M9 without oxygen, or write a small function to allow for anaerobic simulation on any of the media.

@GwennyGit
Collaborator

GwennyGit commented Mar 3, 2023

Removing CGXII and only keeping CGXlab of these two media is a good idea. However, I think it would also be good to describe all media in the documentation so that users know what each medium is and when to use which one.


Yeah, creating a small function to allow for anaerobic simulation on any media sounds like a great idea. Then this simulation part would not only be restricted to M9 and LB.
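
The idea could be sketched roughly as follows. This is only an illustration under simplifying assumptions: a medium is represented as a plain list of BiGG IDs, and the function name `set_anaerobic` and its parameters are hypothetical, not part of refineGEMs.

```python
def set_anaerobic(medium, anaerobic=True, oxygen_id="o2"):
    """Return a copy of the medium's compound list, without oxygen if requested."""
    if not anaerobic:
        return list(medium)
    return [compound for compound in medium if compound != oxygen_id]


# Illustrative subset of an M9-like medium, given as BiGG IDs.
m9 = ["glc__D", "nh4", "pi", "o2", "h2o"]
m9_anaerobic = set_anaerobic(m9)
```

Because the function returns a new list, the stored aerobic definition stays untouched and separate oxygen-free copies of LB and M9 would no longer be needed.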

GwennyGit referenced this issue Mar 3, 2023
load_medium_from_db now loads the table for a specified medium from the database data.db. The table is returned as a pandas DataFrame and contains the medium composition.
@famosab famosab changed the title Add data to use in refinement and growth simulation Update media definitions, document and extend db Mar 4, 2023
GwennyGit added a commit that referenced this issue Mar 10, 2023
@GwennyGit
Collaborator

So far I have only created the basic set-up for the media definition pages. Thus, the pages still need to be filled with content.

@GwennyGit
Collaborator

GwennyGit commented Mar 23, 2023

Re-evaluation of SNM3
The composition of SNM3 within refineGEMs was compared to another composition used within the draeger-lab group as well as to the original wet lab composition described in 'Nutrient Limitation Governs Staphylococcus aureus Metabolism and Niche Adaptation in the Human Nose' [1]. Additionally, all compound assignments to the BiGG database were checked and the names added with the following pattern: BiGG ID [Name in wet lab description].

From the comparison the following differences were found:

  1. For Cyanocobalamine, which is Vitamin B12, the SNM3 definition in refineGEMs listed adocbl, which is the BiGG ID for Adenosylcobalamine. As its chemical formula is not a good match for Cyanocobalamine, this compound was removed from the definition.
    -> Replaced adocbl with cbl1 (Cob(I)alamine), cbl2 (Cob(II)alamine) and b12 (Vitamin B12). The first two BiGG IDs were chosen because their chemical formulas are highly similar to that of Cyanocobalamine and because they were already included in the SNM3 definition from the draeger-lab group. b12 was added because Cyanocobalamine is Vitamin B12.
  2. In the draeger-lab group both fe2 and fe3 are contained in the SNM3 definition. However, the refineGEMs definition only contained fe2. Thus, fe3 was added.

In conclusion, after discussing with @famosab, we decided to add all analogues of each compound for all media. Hence the addition of all similar compounds for Cyanocobalamine and Iron (Fe).
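
As a rough illustration of this decision, the sketch below extends a compound set with all members of an analogue group as soon as one member is present. The two groups are taken from the cases discussed in this comment; the dictionary layout and the function name `add_analogues` are assumptions for illustration only.

```python
# Analogue groups from the SNM3 re-evaluation (layout is hypothetical).
ANALOGUES = {
    "cobalamin": {"b12", "cbl1", "cbl2"},
    "iron": {"fe2", "fe3"},
}


def add_analogues(compounds, analogue_groups=ANALOGUES):
    """Return the compound set extended by all analogues of its members."""
    extended = set(compounds)
    for group in analogue_groups.values():
        if extended & group:  # at least one member of the group is present
            extended |= group
    return extended


snm3_subset = {"fe2", "cbl1", "glc__D"}
snm3_extended = add_analogues(snm3_subset)
```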


[1]
Krismer, Bernhard; Liebeke, Manuel; Janek, Daniela; Nega, Mulugeta; Rautenberg, Maren; Hornig, Gabriele et al. (2014): Nutrient Limitation Governs Staphylococcus aureus Metabolism and Niche Adaptation in the Human Nose. In: PLOS Pathogens 10 (1), e1003862. DOI: 10.1371/journal.ppat.1003862.

GwennyGit added a commit that referenced this issue Mar 23, 2023
GwennyGit added a commit that referenced this issue Mar 23, 2023
- Centered all tables & figures
- Added library for references
@famosab
Member Author

famosab commented Mar 24, 2023

Re-evaluation of RPMI
The composition of the in silico RPMI medium was compared to the provider reference. This comparison yielded the following points:

  • tyr__L is chosen for L-Tyrosine disodium salt dihydrate
  • inost is Myo-Inositol, but RPMI contains I-Inositol. The two differ, but since there is no BiGG ID for I-Inositol yet, inost was kept
  • hco3 was added since Sodium Bicarbonate is contained
  • 4hpro_LT was added since L-Hydroxyproline is contained
  • b12 was added for Vitamin B12; cbl1 was kept and clb1 was removed since it is not a correct BiGG ID
  • nac was removed since the medium only contains Niacinamide, which is covered by ncam
  • Phenol Red has no BiGG ID so far; the corresponding KEGG ID is C12600, so it was added without a BiGG ID
  • h was added since L-Cysteine HCl is contained

@famosab
Member Author

famosab commented Mar 24, 2023

Re-evaluation of M9

The M9 composition is based on the provider reference for the minimal salts:

  • KH2PO4: k, h, pi
  • NaCl: na1, cl
  • Na2HPO4: na1, h, pi
  • NH4Cl: nh4, cl

Plus necessary additives as described here and here:

  • MgSO4: mg2, so4
  • CaCl2: ca2, cl
  • Glucose: glc__D

In addition, o2 and h2o are present by default.
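
The decomposition above can be written down as a small mapping and flattened into the list of BiGG IDs that make up the medium. The salt-to-ID assignments are copied from this comment; the variable and function names are hypothetical.

```python
# Salt or additive -> BiGG IDs, as listed in the comment above.
M9_COMPONENTS = {
    "KH2PO4": ["k", "h", "pi"],
    "NaCl": ["na1", "cl"],
    "Na2HPO4": ["na1", "h", "pi"],
    "NH4Cl": ["nh4", "cl"],
    "MgSO4": ["mg2", "so4"],
    "CaCl2": ["ca2", "cl"],
    "Glucose": ["glc__D"],
}
DEFAULTS = ["o2", "h2o"]  # present by default


def flatten_medium(components, defaults=()):
    """Collect every BiGG ID once, keeping the order of first appearance."""
    seen = []
    for bigg_ids in components.values():
        for bigg_id in bigg_ids:
            if bigg_id not in seen:
                seen.append(bigg_id)
    for bigg_id in defaults:
        if bigg_id not in seen:
            seen.append(bigg_id)
    return seen


m9_ids = flatten_medium(M9_COMPONENTS, DEFAULTS)
```

Ions shared by several salts, such as na1, cl, h and pi, end up in the medium only once.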

famosab added a commit that referenced this issue Mar 24, 2023
famosab added a commit that referenced this issue Mar 24, 2023
famosab added a commit that referenced this issue Mar 24, 2023
@GwennyGit
Collaborator

GwennyGit commented Mar 26, 2023

Addition of the defined Gut Microbiota Medium (dGMM) to the database
To get all relevant BiGG IDs for the salts the following table was used:

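| Ion | Abundance | BiGG ID | BiGG name |
| --- | --- | --- | --- |
| Fe(II) | 1 | fe2 | Fe2+ |
| SO4 | 6 | so4 | Sulfate |
| Zn | 1 | zn2 | Zinc |
| Co | 1 | cobalt2 | Co2+ |
| NO3 | 1 | no3 | Nitrate |
| Al | 1 | - | - |
| K | 2 | k | Potassium |
| Na | 5 | na1 | Sodium |
| SeO3 | 1 | slnt | Selenite |
| WO4 | 1 | tungs | Tungstate |
| Ni | 1 | ni2 | Nickel |
| Cl | 3 | cl | Chloride |
| Ca | 1 | ca2 | Calcium |
| Cu | 1 | cu2 | Copper |
| Mn | 1 | mn2 | Manganese |
| Mg | 1 | mg2 | Magnesium |
| HCO3 | 1 | hco3 | Bicarbonate |
| MoO4 | 1 | mobd | Molybdate |
| H | 1 | h | Hydrogen |
| PO4 | 1 | pi | Phosphate |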

Additionally, water was added to the definition as most salts were added with water in the laboratory version of GMM. For Resazurin, boric acid (H3BO3), Aluminium (Al), dihydrogen phosphate (H2PO4) and EDTA no BiGG IDs were found. However, dihydrogen phosphate could be separated into hydrogen (H) and phosphate (PO4) for which BiGG IDs exist. This was changed in commit (Needs to be committed❗).
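
A small sketch of how such a mapping step could keep track of components without a BiGG ID and apply the split of dihydrogen phosphate. Only a few entries of the table are reproduced, and the names `DGMM_MAPPING`, `SPLITS` and `resolve` are hypothetical.

```python
# Component -> BiGG ID; None marks components for which no BiGG ID was found.
DGMM_MAPPING = {
    "Fe(II)": "fe2",
    "SeO3": "slnt",
    "WO4": "tungs",
    "MoO4": "mobd",
    "Al": None,
    "Resazurin": None,
    "H3BO3": None,
    "EDTA": None,
}

# Components that are split into ions with existing BiGG IDs.
SPLITS = {"H2PO4": ["h", "pi"]}


def resolve(components):
    """Return (mapped BiGG IDs, components without a BiGG ID)."""
    mapped, unmapped = [], []
    for component in components:
        if component in SPLITS:
            mapped.extend(SPLITS[component])
        elif DGMM_MAPPING.get(component) is not None:
            mapped.append(DGMM_MAPPING[component])
        else:
            # Covers both explicit None entries and unknown components.
            unmapped.append(component)
    return mapped, unmapped


mapped, unmapped = resolve(["Fe(II)", "Al", "H2PO4", "EDTA", "MoO4"])
```

Returning the unmapped components explicitly makes it easy to list them in the documentation instead of dropping them silently.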

GwennyGit added a commit that referenced this issue Mar 26, 2023
GwennyGit added a commit that referenced this issue Mar 26, 2023
@cb-Hades cb-Hades mentioned this issue Feb 6, 2024
40 tasks
GwennyGit added a commit that referenced this issue Feb 13, 2024
GwennyGit added a commit that referenced this issue Feb 13, 2024
GwennyGit added a commit that referenced this issue Feb 19, 2024
GwennyGit added a commit that referenced this issue Feb 20, 2024
GwennyGit added a commit that referenced this issue Feb 20, 2024
Updated these files with information on subsets
cb-Hades added a commit that referenced this issue Feb 20, 2024
GwennyGit added a commit that referenced this issue Feb 20, 2024
cb-Hades added a commit that referenced this issue Feb 20, 2024
GwennyGit added a commit that referenced this issue Feb 20, 2024
cb-Hades added a commit that referenced this issue Feb 20, 2024
GwennyGit added a commit that referenced this issue Feb 20, 2024
- Addition of DiReM
- Minor fixes for incorrect substance names
cb-Hades added a commit that referenced this issue Feb 20, 2024
GwennyGit added a commit that referenced this issue Feb 20, 2024
GwennyGit added a commit that referenced this issue Feb 21, 2024
- Added better description of boolean 'update_entries'
- Changed FLOAT_REGEX.match to FLOAT_REGEX.fullmatch
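
For context, the difference between the two calls can be shown with a small example. The pattern below is a hypothetical stand-in; the actual FLOAT_REGEX in the repository may differ.

```python
import re

# Hypothetical float pattern, used only to illustrate match vs. fullmatch.
FLOAT_REGEX = re.compile(r"[+-]?\d+(\.\d+)?")

# match() only anchors at the start, so trailing characters are accepted.
prefix_hit = FLOAT_REGEX.match("12.5abc")

# fullmatch() requires the whole string to fit the pattern.
full_hit = FLOAT_REGEX.fullmatch("12.5abc")
clean_hit = FLOAT_REGEX.fullmatch("12.5")
```

With `match`, a value such as "12.5abc" would be treated as a float because its prefix fits the pattern; `fullmatch` rejects it.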
GwennyGit added a commit that referenced this issue Feb 21, 2024
cb-Hades added a commit that referenced this issue Apr 3, 2024
@GwennyGit GwennyGit linked a pull request Apr 12, 2024 that will close this issue
cb-Hades added a commit that referenced this issue Apr 30, 2024
@cb-Hades
Collaborator

cb-Hades commented Jun 3, 2024

Database restructuring finished, namespace issues moved to #36 and ideas for new media have been added to #123

@cb-Hades cb-Hades closed this as completed Jun 3, 2024
3 participants