Skip to content

Latest commit

 

History

History
195 lines (189 loc) · 8.77 KB

TODO.org

File metadata and controls

195 lines (189 loc) · 8.77 KB

#+STARTUP lognoteredeadline

Geoms

geom_ideogram

Make this an actual geom

Make scale_fill_giemsa for coloring by cytoband

Arch in the end instead of just rectangle to make more pretty.

Highlight function.

multiple geoms should be supported.

Layout

layout_margin

add layout_margin to insert a plot as row of a track.

arrange all plots with a single legend.

Crazy idea: what about putting the plots into a single column of a DataFrame, with extra factors for grouping them? I think we talked about a similar (but different) idea with the plot templates in visnab. Then, have a geom_plot that will draw plots (with the center X/Y coming from those grouping factors) so that they do not overlap. Faceting would need to be supported, but the faceting modes currently in ggplot might not work so well, because every plot needs to be the same size. This would end up wasting space. Might want a special class for this, like PlotFrame. Then have an autoplot method for it. I am not sure if it would always be possible/desirable to unify the legends.

Manhattan plot, with highlighted gene.

Statistics

stat_views

generate grouping variable by view intersection

shift to coordinates relative to some origin (default: start)

stat_relative_position

Question: is this really a coordinate system or a stat? My understanding is that a coordinate system changes the data->pixel mapping but it does not change the data itself. So coord_truncate does not change the coordinates (as labeled in the axis), it just squishes stuff together in the plot. In this case though, we need the X axis to be the same across many different origins/tracks, so the coordinates need to be transformed through a ‘stat’. Right?

need a transformation method for GRanges in biovizBase.

Faceting

support facet = gr ~ . where gr is just GRanges

Theme

theme_pack_panels to pack facetted plots and make it more compact.

Scales

scale_fill_fold_change, blue to white to red

global way to make sure color is not NA when plot rectangles?

switch label to right

X tick labels

It needs to be more like Gviz. If the tick labels are over 1mb, use mb as the unit, else, use kb, unless less than 1kb, then use bp. Those long numbers are tough to read.

remove stepping label

resolution to figure out the stepping “buffer”

unequal transformation in circular view!

Autoplotting

autoplot,Matrix

label by row names and align by column names of matrix

consider row names and label them automatically

autoplot,TxDb

gap.geom need to be supported and use direction(arrow).

x lab should be a right default

smart parsing for names. eg. gene_id(tx_id)

autoplot,ExpressionSet

think about heatmap with phenotpe plot as margin?

autoplot,SummarizedExperiment

Focus should be on multivariate (multiple sample) plots, like ExpressionSet. This would include parallel coordinate plots and scatterplot matrices. If those plots are by-row, i.e., the variables correspond to ranges, then the data-linked-to-ranges plots would work. If the variables are the samples, the pcp/splom could be a margin plot, where each track shows something for each sample in genomic context. Or in the case of the splom, we could use one triangle for the traditional scatterplot and the other triangle would be something else incorporating range information.

As a first step, we could just make this method behave just like autoplot,ExpressionSet. Then come up with clever ways of incorporating the range information.

Make equivalent to autoplot,ExpressionSet

Support data-linked-to-range plots

Facet by sample in linked plots and incorporate splom/pcp in margin?

autoplot,VCF

just to make it to work again.

autoplot,Seqinfo

Grabs cytoband information automatically

put data in ggplot() first

Protein space

map() idea, data granges and exon granges, linked plot

There is a similarity, I think, between the ideogram and this idea. The ideogram is drawn over the entire chromosome but then somehow it knows to draw a red rectangle around the region being plotted below. That currently works for only a single range, but it could be extended for multiple ranges. Those ranges would be assumed to be directly adjacent in the bottom track, and lines would be drawn from the rectangle sides down to the breakpoints. I think visnab did this line drawing for the ideogram (single range only).

We might need a new geom, maybe called geom_splice, that delegates to another geom (geom_ideogram, geom_alignment, etc) and then draws lines from sub-regions of the global space down to adjacent, spliced regions. The bottom end points of those lines would somehow depend on the coordinate system, while the top end points somehow use the global coordinates. For the linear coordinate system, the lines simply go to the X axis limits. We would then have a coord_splice that does the necessary removal of gaps, with the structure stored in a GRanges. coord_truncate_gaps is really just a special case of coord_splice, where the exons have been (invisibly) extended a little. So maybe we could replace that with coord_splice and add a parameter for the buffer width.

For protein space though, it sort of no longer makes sense to speak in genomic coordinates. Instead, we have protein coordinates that start at 1, so that requires a ‘stat’ transformation similar to that in stat_views. So sometimes we want a coord_splice, other times a stat_splice, depending on whether we still want global, genomic coordinates on the X axis. They should share a lot of code.

This sounds a bit involved, but I think it’s really important for biological plotting.

Can parse data from uniprot automatically and it’s easy actually.

What sort of data would we parse? This is probably the domain of some other package.

Documentation

vignettes

update and check manual to make sure it’s the latest.

bioc2012

a geom/stat method for associate stat with geom automatically, vice versha.

for example, boxplot geom with stat aggregate automatically

a factorized general theme for every object

so fix the autoplot,GRanges use scale_x_sequnit.

IRanges 0 width and 1 width

xlim problem, override this problem

getIdeogram should be built in with autoplot, for example,

when use autoplot,seqinfo, and with cytoband = TRUE, and provide genome names, need to download that automatically.

global setting

arches link region to region.

Todo from Michael’s email

When I pass “which = list(tx_id = …)”

to autoplot,TxDb, it shows me the region containing that tx_id, instead of just showing that exact transcript. I think that’s a little surprising. Any reason why you do this?

autoplot,BamFile

ignores the ‘which’ argument when method = “estimate”. Btw, I fixed a bug in the coverage estimation when there were no reads on a chromosome. It is also debatable as to whether we want to use method = “estimate” by default. People do not know that it is an estimate.

autoplot,BamFile

does not use the kb/Mb/etc labels. The X axis label is just “Genomic Position” when method = “raw”.

The kb/Mb/etc labels

need to use special formatting, otherwise the trailing zeros are dropped off, like 120.768, 120.77 [missing zero], 120.772.

Same thing goes for autoplot,BSgenome:

it is using old scales and labels. And it would be nice to get rid of the “seqs” label over the legend. And I’m not sure if we even need the legend when using the text geom. It’s kind of weird looking.