Skip to content

Latest commit

 

History

History
137 lines (108 loc) · 4.04 KB

seminar03c.md

File metadata and controls

137 lines (108 loc) · 4.04 KB

seminar03c

Ali
Monday, January 26, 2015
Choose 100 random samples to explore

#install.packages("hexbin")
library(lattice)
library(hexbin)
prDat <- read.table("GSE4051_data.txt")
#str(prDat, max.level = 0)
prDes <- readRDS("GSE4051_design.rds")
#str(prDes)
set.seed(1)
(yo <- sample(1:nrow(prDat), size = 100))
##   [1]  7952 11145 17156 27198  6040 26902 28287 19786 18837  1850  6167
##  [12]  5286 20568 11499 23046 14899 21481 29690 11375 23269 27975  6350
##  [23] 19503  3758  7997 11555   401 11442 26023 10184 14424 17938 14766
##  [34]  5571 24751 19997 23759  3229 21647 12302 24554 19353 23416 16540
##  [45] 15842 23605   698 14271 21897 20713 14281 25749 13098  7319  2113
##  [56]  2974  9455 15504 19788 12161 27285  8776 13721  9934 19452  7711
##  [67] 14301 22899  2518 26155 10132 25081 10358  9972 14231 26654 25821
##  [78] 11650 23220 28694 12983 21282 11947  9717 22611  6054 21237  3634
##  [89]  7331  4280  7156  1760 19177 26162 23255 23803 13592 12242 24206
## [100] 18058
hDat <- prDat[yo, ]
#str(hDat)
hDat <- as.matrix(t(hDat))
rownames(hDat) <- with(prDes,
                       paste(devStage, gType, sidChar, sep="_"))
#str(hDat)
set.seed(3)
(yo <- sample(1:ncol(prDat), size = 10))
##  [1]  7 31 15 12 22 21  5 10 18 19
pairDat <- subset(prDat, select = yo)
#str(pairDat)

plot pairwise gene expression for 10 out of the 100 samples

#png("2by2.png")
pairs(pairDat,
panel = function(...) smoothScatter(..., add=TRUE))

#dev.off()

and for all 39 columns

(yo <- sample(1:ncol(prDat), size = 20))
##  [1] 20 39 38 21 31 29  4 23 28  9  7  1 33  3  6 19 14 36 12 16
pairDat <- subset(prDat, select = yo)
#png("2by2_39.png")
pairs(pairDat,
      panel = function(...) smoothScatter(..., add=TRUE))

#dev.off()

It was pretty slow, even for 10! very slow for 20, and this gave a totally uninformative result. Probably the most samples this can be done for is ~10. Heatmaps seem to be much better for getting a quick look at the big picture, but you can't identify things like tightly correlating genes, like we see for some of the samples here (e.g. samples_4 and sample_10)

library(RColorBrewer)
(yo <- sample(1:nrow(prDat), size = 100))
##   [1] 11356 11180  5100 13575  7739 10070 26637  6047 17342  6217  8427
##  [12] 23540  5180 17086 12552  8012  1432  3098  9400 23964  6864  6375
##  [23] 26249 29724 25265 27244 14102  6716  3825  8369 24418  1724 24019
##  [34]  3123 22934  9119 23012 16173 10839  2769 22724 22755 27014 28898
##  [45] 15409 16432  4896  4922 23513 22459 23448 19564 11305   257 28560
##  [56] 25070  6380 14788 19018 27532   352  7993 13018 24790 26029  7503
##  [67]  9693  9151  5507 20318 22898 20381  6249 21271 18084 10174  1230
##  [78] 12002  2362  9336  9710  2341  4487  4553 27280 20973 24505 19876
##  [89]  6456 13828 21405 24549 29863  5970 17947 12833 19034 12539 26266
## [100]  9574
hDat <- prDat[yo, ]
#str(hDat)
hDat <- as.matrix(t(hDat))
rownames(hDat) <- with(prDes,
                       paste(devStage, gType, sidChar, sep="_"))
#str(hDat)
jBuPuFun <- colorRampPalette(brewer.pal(n = 9, "BuPu"))
#png("heatmap.png")
heatmap(hDat,  Colv = NA, Rowv = NA, scale="none", margins = c(5, 8),
        col = jBuPuFun(256))

#dev.off()

Overall the gene expression looks pretty similar between the KO and wt genotypes. I suppose this is as expected, because Nrl probably only influences the expression of a handful of genes, maybe not even in the subset of 100 that I chose here.

There are a few samples that have overall less intense expression signatures compared to the others (e.g. sample 16), which is probably some kind of experimental artifact..

From here I will cherry pick one or two genes that look like they're different between wt and KO (based on a smaller heatmap, which is easier to look at)

Will update soon....