TSS/ promoter regions #4

Open
ivirshup opened this issue Apr 24, 2023 · 2 comments · May be fixed by #34

@ivirshup

Description of feature

Retrieve transcription start sites for genes for use with ATAC data.

This could be similar to the GenomicFeatures::promoters function (description available in this vignette) which:

The promoters function computes a GRanges object that spans the promoter region around the transcription start site for the transcripts in a TxDb object. The upstream and downstream arguments define the number of bases upstream and downstream from the transcription start site that make up the promoter region.

This could be done using bioframe.expand (though that is currently not strand-aware: open2c/bioframe#144). This could instead be done with ibis with: ifelse
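For illustration, here is a minimal sketch of the strand-aware logic in plain Python, not in bioframe or ibis. The `Transcript` type and `promoter` helper are hypothetical names, coordinates are assumed to be 1-based and inclusive as in GRanges, and the defaults `upstream=2000` and `downstream=200` are assumed to match those of `promoters`. The TSS is taken as the transcript start on the plus strand and the transcript end on the minus strand.

```python
from typing import NamedTuple, Tuple


class Transcript(NamedTuple):
    """Hypothetical transcript record; coordinates are 1-based, inclusive."""
    tx_id: str
    seqname: str
    start: int
    end: int
    strand: str  # "+" or "-"


def promoter(tx: Transcript, upstream: int = 2000, downstream: int = 200) -> Tuple[int, int]:
    """Return (start, end) of the promoter region around the TSS."""
    if tx.strand == "-":
        # On the minus strand the TSS is the transcript end, and
        # "upstream" points towards larger coordinates.
        tss = tx.end
        return tss - downstream + 1, tss + upstream
    # On the plus strand the TSS is the transcript start.
    tss = tx.start
    return tss - upstream, tss + downstream - 1


tx = Transcript("ENST00000456328", "1", 11869, 14409, "+")
print(promoter(tx))  # (9869, 12068)
```

The same branch on strand is what an ifelse expression would have to encode in a table-based implementation.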

@ivirshup ivirshup added enhancement New feature or request P1☕️ Medium priority labels Apr 24, 2023
@emdann

emdann commented Apr 28, 2023

  • Add option to subset only to canonical transcripts (as indicated in genes table), and add gene ID

@emdann emdann self-assigned this Apr 28, 2023
@emdann

emdann commented Apr 28, 2023

Notes on filtering behaviour: in Bioconductor, filtering is applied to the transcripts before the promoter regions are defined.

Example: if the promoter is within the filtered range but the transcript is not, then the promoter is not returned. With any filtering, the transcript overlaps the range, so its promoter is returned:

> transcripts(EnsDb.Hsapiens.v86, filter=GRangesFilter(GRanges('1:9000-12000'), type='any'))
GRanges object with 1 range and 6 metadata columns:
                  seqnames      ranges strand |           tx_id           tx_biotype tx_cds_seq_start tx_cds_seq_end         gene_id         tx_name
                     <Rle>   <IRanges>  <Rle> |     <character>          <character>        <integer>      <integer>     <character>     <character>
  ENST00000456328        1 11869-14409      + | ENST00000456328 processed_transcript             <NA>           <NA> ENSG00000223972 ENST00000456328
> promoters(EnsDb.Hsapiens.v86, filter=GRangesFilter(GRanges('1:9000-12000'), type='any'))
GRanges object with 1 range and 6 metadata columns:
                  seqnames     ranges strand |           tx_id           tx_biotype tx_cds_seq_start tx_cds_seq_end         gene_id         tx_name
                     <Rle>  <IRanges>  <Rle> |     <character>          <character>        <integer>      <integer>     <character>     <character>
  ENST00000456328        1 9869-12068      + | ENST00000456328 processed_transcript             <NA>           <NA> ENSG00000223972 ENST00000456328
  -------
  seqinfo: 1 sequence from GRCh38 genome

With within filtering:

> transcripts(EnsDb.Hsapiens.v86, filter=GRangesFilter(GRanges('1:9000-12000'), type='within'))
GRanges object with 0 ranges and 6 metadata columns:
   seqnames    ranges strand |       tx_id  tx_biotype tx_cds_seq_start tx_cds_seq_end     gene_id     tx_name
      <Rle> <IRanges>  <Rle> | <character> <character>        <integer>      <integer> <character> <character>
  -------
  seqinfo: no sequences
> promoters(EnsDb.Hsapiens.v86, filter=GRangesFilter(GRanges('1:9000-12000'), type='within'))
GRanges object with 0 ranges and 6 metadata columns:
   seqnames    ranges strand |       tx_id  tx_biotype tx_cds_seq_start tx_cds_seq_end     gene_id     tx_name
      <Rle> <IRanges>  <Rle> | <character> <character>        <integer>      <integer> <character> <character>
  -------
  seqinfo: no sequences
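The behaviour above can be sketched in plain Python: the range filter is applied to the transcripts first, and promoters are computed only for the transcripts that survive. This is an illustrative sketch, not the Bioconductor implementation; `Transcript`, `passes_filter` and `promoters` are hypothetical names, only plus-strand and minus-strand 1-based inclusive coordinates are handled, and the `upstream=2000`, `downstream=200` defaults are assumed.

```python
from typing import List, NamedTuple, Tuple


class Transcript(NamedTuple):
    """Hypothetical transcript record; coordinates are 1-based, inclusive."""
    tx_id: str
    seqname: str
    start: int
    end: int
    strand: str  # "+" or "-"


def passes_filter(tx: Transcript, region: Tuple[str, int, int], overlap: str = "any") -> bool:
    """Check a transcript (not its promoter) against a (seqname, start, end) region."""
    seqname, start, end = region
    if tx.seqname != seqname:
        return False
    if overlap == "within":
        # The whole transcript must lie inside the region.
        return start <= tx.start and tx.end <= end
    # "any": at least one base of the transcript overlaps the region.
    return tx.start <= end and tx.end >= start


def promoters(
    transcripts: List[Transcript],
    region: Tuple[str, int, int],
    overlap: str = "any",
    upstream: int = 2000,
    downstream: int = 200,
) -> List[Tuple[str, int, int]]:
    """Filter transcripts first, then derive promoter regions from the survivors."""
    result = []
    for tx in transcripts:
        if not passes_filter(tx, region, overlap):
            continue
        if tx.strand == "-":
            tss = tx.end
            result.append((tx.tx_id, tss - downstream + 1, tss + upstream))
        else:
            tss = tx.start
            result.append((tx.tx_id, tss - upstream, tss + downstream - 1))
    return result


txs = [Transcript("ENST00000456328", "1", 11869, 14409, "+")]
region = ("1", 9000, 12000)
print(promoters(txs, region, overlap="any"))     # [('ENST00000456328', 9869, 12068)]
print(promoters(txs, region, overlap="within"))  # []
```

Filtering on the promoter coordinates instead would be a different design choice and could return a different set of regions for the same query.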

@emdann emdann linked a pull request Apr 28, 2023 that will close this issue
1 task
@ivirshup ivirshup added this to the 0.1.0 milestone May 15, 2023
@ivirshup ivirshup modified the milestones: 0.1.0, 0.2.0 Sep 20, 2023