Commit

Added comment for CV as mentioned in the issue #63
ChristinaSchmidt1 committed Sep 15, 2023
1 parent 56b84bd commit b762ddb
Showing 1 changed file with 15 additions and 5 deletions.
20 changes: 15 additions & 5 deletions vignettes/Standard Metabolomics.Rmd
@@ -146,6 +146,7 @@ The parameter `MVI` refers to Missing Value Imputation (MVI) and if `MVI = TRUE`
Lastly, the function `Preprocessing()` performs outlier detection and adds a column "Outliers" into the DF, which can be used to remove outliers. The parameter `HotellinsConfidence` can be used to choose the confidence interval that should be used for the Hotelling's T2 outlier test [@Hotelling1931].\
\
Since our example data contains pool samples, we will do `Pool_Estimation()` before applying the `Preprocessing()` function. This is important, since one should remove the features (=metabolites) that are too variable prior to performing any data transformations such as TIC as part of the `Preprocessing()` function.\
+It is worth mentioning that the Coefficient of variation (CV) is calculated by dividing the standard deviation (SD) by the mean. Hence CV depends on the SD, which in turn works for normally distributed data.\
```{r}
#Get the Pool data
PoolData <- MetaProViz::toy_data(data="Standard") %>%
@@ -159,7 +160,7 @@ MetaProViz::Pool_Estimation(Input_data = PoolData,
Input_SettingsFile = NULL,
Input_SettingsInfo = NULL,
Unstable_feature_remove = FALSE,
-Therhold_cv = 1)
+Threshold_cv = 1)
```
```{r, echo=FALSE}
# Check how our data looks like:
@@ -169,7 +170,7 @@ Pool_Estimation_result[1:5,]%>%
#kableExtra::scroll_box(width = "100%", height = "200px")
```
\
-The results from the `Pool_Estimation()` is a table that has the Coefficient of variation (CV). If there is a high variability, one should consider to remove those features from the data. For the example data nothing needs to be removed. If you have used internal standard in your experiment you should specifically check their CV as this would indicate technical issues (here valine-d8 and hippuric acid-d5).\
+The results from the `Pool_Estimation()` is a table that has the CV. If there is a high variability, one should consider to remove those features from the data. For the example data nothing needs to be removed. If you have used internal standard in your experiment you should specifically check their CV as this would indicate technical issues (here valine-d8 and hippuric acid-d5).\

```{r, eval=FALSE}
#Test out QC plots:
@@ -277,9 +278,11 @@ As input we will use the pre-processed data we have generated using the `Prepro
`1.` If all values of the replicates of one condition are NA/0 for a feature (=metabolite): Log2FC= Inf/-Inf and the statistics will be NA\
`2.` If some values of the replicates of one condition are NA/0 for a feature (=metabolite): Log2FC= positive or negative value, but the statistics will be NA\
\
+Discuss the potential usage of data transformation prior to DMA (e.g. log transformation)--> Is this even possible to use in the DMA as we do Log2FC of log transformed data?
+\
In the example data we have four different cell lines, healthy (HK2) and cancer (ccRCC: 786-M1A, 786-M2A and 786-O) and hence we can perform multiple different comparisons. The results are automatically saved and returned into the global environment. If parameter Plot=TRUE, an overview Volcano plot is generated and saved.\
```{r}
-`DMA_786-O_vs_HK2` <- MetaProViz::DMA(Input_data=Intra_Preprocessed[,-c(1:4)], #we need to remove columns that do not include metabolite measurements
+DMA_786O_vs_HK2 <- MetaProViz::DMA(Input_data=Intra_Preprocessed[,-c(1:4)], #we need to remove columns that do not include metabolite measurements
Input_SettingsFile=Intra_Preprocessed[,c(1:2)],#only maintain the information about condition and replicates
Input_SettingsInfo = c(conditions="Conditions", numerator="786-O", denominator = "HK2"),
STAT_pval ="t.test",#ProDA is another test! --> estimates missing information
@@ -289,8 +292,10 @@ In the example data we have four different cell lines, healthy (HK2) and cancer
CoRe=FALSE,
Plot = TRUE)
+`DMA_786-O_vs_HK2`<- DMA_786O_vs_HK2[["DMA_Results"]]#Get the DMA table
#Perform the other comparisons:
-`DMA_786-M1A_vs_HK2` <- MetaProViz::DMA(Input_data=Intra_Preprocessed[,-c(1:4)],
+DMA_786M1A_vs_HK2 <- MetaProViz::DMA(Input_data=Intra_Preprocessed[,-c(1:4)],
Input_SettingsFile=Intra_Preprocessed[,c(1:2)],
Input_SettingsInfo = c(conditions="Conditions", numerator="786-M1A", denominator = "HK2"),
STAT_pval ="t.test",
@@ -300,7 +305,9 @@ In the example data we have four different cell lines, healthy (HK2) and cancer
CoRe=FALSE,
Plot = TRUE)
-`DMA_786-M2A_vs_HK2` <- MetaProViz::DMA(Input_data=Intra_Preprocessed[,-c(1:4)],
+`DMA_786-M1A_vs_HK2` <- DMA_786M1A_vs_HK2[["DMA_Results"]]#Get the DMA table
+DMA_786M2A_vs_HK2 <- MetaProViz::DMA(Input_data=Intra_Preprocessed[,-c(1:4)],
Input_SettingsFile=Intra_Preprocessed[,c(1:2)],
Input_SettingsInfo = c(conditions="Conditions", numerator="786-M2A", denominator = "HK2"),
STAT_pval ="t.test",
@@ -310,6 +317,9 @@ In the example data we have four different cell lines, healthy (HK2) and cancer
CoRe=FALSE,
Plot = TRUE)
+`DMA_786-M2A_vs_HK2` <- DMA_786M2A_vs_HK2[["DMA_Results"]]#Get the DMA table
```
```{r, echo=FALSE}
# Check how our data looks like:
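
As a reading aid for the CV sentence added to the vignette in this commit, the definition can be written out as a formula. This is only a sketch under stated assumptions: the symbols (x_ij for the value of feature j in pool sample i, n for the number of pool samples) are notation introduced here, the use of the sample standard deviation with n - 1 is an assumption, and the diff does not say whether `Pool_Estimation()` reports the CV as a fraction or as a percentage.

```latex
% Assumed notation: x_{ij} = value of feature j in pool sample i, n = number of pool samples
\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij},
\qquad
s_j = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_{ij}-\bar{x}_j\right)^{2}},
\qquad
\mathrm{CV}_j = \frac{s_j}{\bar{x}_j}
```

For a hypothetical feature with a mean of 2.0 and a standard deviation of 0.5 across the pool samples, this gives a CV of 0.5 / 2.0 = 0.25.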
