Unable to load Annotations on my own data #58

Open
madeluis opened this issue Feb 11, 2021 · 13 comments · Fixed by #66

Comments


madeluis commented Feb 11, 2021

Hello,

I have been trying to visualize some tomato data using MoMI-G. I have successfully loaded and visualized all my files (xg and pcf) except for the annotation file. So the circos plot, the feature table, and the SequenceTubeMap are working properly, but the "LinearView:Annotations" section simply shows "No Data".

I have tried loading the annotation as a gff file, as a bed file, and as a bigbed file, and none of those seem to work. I have also tried to include the annotation file under the "reference" section in the config.yaml file, and I later tried moving it under the static_files section, but again, none of those are working. Here is my current config.yaml file (here I have the annotation in two spots to show the two spots where I've tried including it):

bin:
  vg: "vg"
  vg_tmp: "vg"
  graphviz: "dot"
  fa22bit: "faToTwobit"
  bigbed: "bedToBigBed"
reference:
  chroms: "static/tomatoCHR.json"
  data:
    - name: "SL4"
      features:
        - name: "gene_annotation"
          url: "static/ITAG4.0_gene_models.bb"
          chr_prefix: "SL4.0ch"
data:
  - name: "tomato_BGV006336"
    desc: "2021-01-25"
    chr_prefix: ""
    ref_id: "SL4"
    source:
      xg: "static/data.xg"
      csv: "static/clean_BGV006336.tomato_graphFile.pcf"
    features: []
    static_files:
      - name: 'Genes'
        url: 'static/ITAG4.0_gene_models.bb'
        viz: 'bigbed'

Also, the annotations in your current demo page are not working on my end either. Is there any chance that these two issues are related?

I would really appreciate any input on this issue. Thanks in advance!!

Contributor

6br commented Feb 14, 2021

Thank you for using MoMI-G.

Currently, Linear View: Annotation displays the genomic features listed under reference.data.features in config.yaml, and it works only if the input is in gff3 or bed format. The bigbed format is not supported in the reference.data.features section.

https://momi-g.readthedocs.io/en/latest/configure_file.html#example

reference:
  chroms: "static/GRCh.json"
  data:
    - name: "hg19"
      features:
        - name: 'gene_annotation'
          url: "test/gencode.v25.basic.annotation.Y.bed"
          chr_prefix: "chr"
    - name: "hg38"
      features:
        - name: 'gene_annotation'
          url: "test/gencode.v25.basic.annotation.Y.bed"
          chr_prefix: "chr"

If you wish to use the GFF format, all header lines need to be removed in advance with grep -v "#".

reference:
  chroms: "static/GRCh.json"
  data:
    - name: "hg19"
      features:
        - name: 'gene_annotation'
          url: 'test/gencode.v26.chr_patch_hapl_scaff.basic.annotation.wo.header.head.gff3'
          chr_prefix: "chr"
    - name: "hg38"
      features:
        - name: 'gene_annotation'
          url: 'test/gencode.v26.chr_patch_hapl_scaff.basic.annotation.wo.header.head.gff3'
          chr_prefix: "chr"
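The header-stripping step can also be done without grep. The following is a minimal TypeScript sketch, assuming that GFF3 header and directive lines are exactly those that begin with `#`. `stripGffHeaders` is a hypothetical helper and not part of MoMI-G, and the sample lines are made up. Note that `grep -v "#"` drops every line containing a `#` anywhere, whereas this sketch keeps feature lines that merely contain a `#` in their attributes.

```typescript
// Hypothetical helper (not part of MoMI-G): drop GFF3 header/comment lines.
// Assumption: a header or directive line is any line whose first character is '#'.
function stripGffHeaders(gffText: string): string {
  return gffText
    .split(/\r?\n/)
    .filter((line) => line.charAt(0) !== "#")
    .join("\n");
}

// Usage with made-up, illustrative data (two header lines, two feature lines).
const exampleGff = [
  "##gff-version 3",
  "# made-up comment line",
  "SL4.0ch01\texample\tgene\t100\t200\t.\t+\t.\tID=gene1;Note=see#1",
  "SL4.0ch01\texample\tmRNA\t100\t200\t.\t+\t.\tID=mrna1;Parent=gene1",
].join("\n");

// Only the two feature lines remain, including the one with '#' in its attributes.
const cleanedGff = stripGffHeaders(exampleGff);
```

The cleaned text can then be written to the file referenced by `url` under reference.data.features.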

Also, the demo page is currently running without annotations for technical reasons.
Sorry that this was not documented.

Contributor

6br commented Feb 17, 2021

The "No Data" problem on the demo page is now fixed. Thank you for reporting it.

Author

madeluis commented Feb 18, 2021

Hello,

Thanks so much for your reply.
I pulled the newest commit of the repository (in case your fix to the demo page was made from the source code and it was affecting me as well), and I changed the file that I am using for the annotations. I tried using a bed file and later I tried using a gff file without headers, as you indicated.

Here is my config.yaml file:

bin:
  vg: "vg"
  vg_tmp: "vg"
  graphviz: "dot"
  fa22bit: "faToTwobit"
  bigbed: "bedToBigBed"
reference:
  chroms: "static/tomatoCHR.json"
  data:
    - name: "SL4"
      features:
        - name: "gene_annotation"
          url: "static/ITAG4.0_gene_models_noHeader.gff"
          chr_prefix: "SL4.0ch"
data:
  - name: "tomato_BGV006336"
    desc: "2021-01-25"
    chr_prefix: ""
    ref_id: "SL4"
    source:
      xg: "static/data.xg"
      csv: "static/merged.tomato_graphFile.pcf"
    features: []
    static_files: []

Unfortunately, this hasn't resolved the issue.

In addition, I continue to have a problem with the "Linear View: Annotation" in the demo page: when I first load the page, the annotations appear correctly for the locus that is displayed by default. However, if I select any other locus, it shows "No Data".

Please, let me know if you have any further insights into this issue.

Thanks again!

Contributor

6br commented Feb 19, 2021

For config.yaml:

Could you try adding the .bb file to the data.features section?

bin:
  vg: "vg"
  vg_tmp: "vg"
  graphviz: "dot"
  fa22bit: "faToTwobit"
  bigbed: "bedToBigBed"
reference:
  chroms: "static/tomatoCHR.json"
  data:
    - name: "SL4"
      features:
        - name: "gene_annotation"
          url: "static/ITAG4.0_gene_models.gff3"
          chr_prefix: "SL4.0ch"
data:
  - name: "tomato_BGV006336"
    desc: "2021-01-25"
    chr_prefix: ""
    ref_id: "SL4"
    source:
      xg: "static/data.xg"
      csv: "static/clean_BGV006336.tomato_graphFile.pcf"
    features: 
      - name: "gene_annotation"
        url: "static/ITAG4.0_gene_models.bb"
        chr_prefix: "SL4.0ch"
    static_files:
      - name: 'Genes'
        url: 'static/ITAG4.0_gene_models.bb'
        viz: 'bigbed'

There is a way to check whether the backend works. For example, in our demo backend, the annotation for a region (e.g., chr20:312898-32302312) can be retrieved with commands like the following:

$ docker run --rm --init -p 8081:8081 momigteam/momig-backend   #wait until `Start server on 0.0.0.0:8081` is displayed.
$ curl "localhost:8081/api/v2/region?format=bed&path=chr1:0-200003"
{"static/ensGene.bb":[{"start_offset":4273,"stop_offset":19669,"id":0,"name":"gene_annotation","is_reverse":null,"attributes":["ENST00000326632","0","-","19669","19669","0","13","92,39,106,44,139,18,198,132,141,147,99,155,273,","0,589,1385,1493,2196,2337,2447,2822,3191,3504,3857,10326,15123,","ENSG00000146556","Q9H7N0_HUMAN"],"value":null},{"start_offset":42911,"stop_offset":44799,"id":1,"name":"gene_annotation","is_reverse":null,"attributes":["ENST00000359752","0","+","44799","44799","0","2","19,107,","0,1781,","ENSG00000197490","null"],"value":null},{"start_offset":52877,"stop_offset":53750,"id":2,"name":"gene_annotation","is_reverse":null,"attributes":["ENST00000379479","0","+","53750","53750","0","1","873,","0,","ENSG00000205292","OR4G11P"],"value":null},{"start_offset":58953,"stop_offset":59871,"id":3,"name":"gene_annotation","is_reverse":null,"attributes":["ENST00000326183","0","+","58953","59871","0","1","918,","0,","ENSG00000177693","OR4F5"],"value":null},{"start_offset":147646,"stop_offset":147749,"id":4,"name":"gene_annotation","is_reverse":null,"attributes":["ENST00000386603","0","-","147749","147749","0","1","103,","0,","ENSG00000209338","null"],"value":null}]}
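To read such a response more easily, the JSON can be summarized per annotation file. The following is a minimal TypeScript sketch; the `RegionFeature` shape is inferred from the sample response above and is an assumption, not a documented schema, and `countFeatures` is a hypothetical helper that is not part of MoMI-G. The sample input is a shortened copy of the response above with the attributes trimmed.

```typescript
// Shape inferred from the sample response above; an assumption, not a documented schema.
interface RegionFeature {
  start_offset: number;
  stop_offset: number;
  id: number;
  name: string;
  is_reverse: unknown; // only null appears in the sample, so the real type is unknown here
  attributes: string[];
  value: unknown; // only null appears in the sample
}

type RegionResponse = { [sourceFile: string]: RegionFeature[] };

// Hypothetical helper: count the features returned for each annotation file.
function countFeatures(responseText: string): { [sourceFile: string]: number } {
  const parsed = JSON.parse(responseText) as RegionResponse;
  const counts: { [sourceFile: string]: number } = {};
  for (const sourceFile of Object.keys(parsed)) {
    counts[sourceFile] = (parsed[sourceFile] || []).length;
  }
  return counts;
}

// Usage with a shortened copy of the sample response (attributes trimmed for brevity).
const sampleResponse = JSON.stringify({
  "static/ensGene.bb": [
    { start_offset: 4273, stop_offset: 19669, id: 0, name: "gene_annotation", is_reverse: null, attributes: ["ENST00000326632"], value: null },
    { start_offset: 42911, stop_offset: 44799, id: 1, name: "gene_annotation", is_reverse: null, attributes: ["ENST00000359752"], value: null },
  ],
});
const featureCounts = countFeatures(sampleResponse);
```

A count of zero for a file would suggest that the backend found no annotation in the requested region, which is worth ruling out before debugging the frontend.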

For demo page:
"No Data" is shown if there is no annotation at the selected locus.
For example, when I select a known gene position as the locus, the annotation is shown.
screenshot

@madeluis
Author

Thanks again for the explanation.

Yes, so everything works with your demo page now, and my backend is working properly (when I make a curl request with a region corresponding to a gene, I get the response that I would expect).

Unfortunately, I continue to have trouble with the frontend. This is what I've noticed though:

When I submit a region in your demo page, I can see that there is a request being made to a wig file:
Screen Shot 2021-02-22 at 8 11 58 AM

However, when I submit a region from my frontend (same region I used to test the backend), it looks like a request is not being made:
Screen Shot 2021-02-22 at 8 42 37 AM

Without digging more into your code, I was wondering if there could be a bug in your GraphWrapper.tsx script (it may not be a bug, but I don't quite follow the highlighted line):
Screen Shot 2021-02-22 at 8 46 43 AM

Since your request is being made to a wig file, I wanted to try that myself; I tried to use a BigWig file but it made no difference. So, I was wondering if my config.yaml was correct in this case:

bin:
  vg: "vg"
  vg_tmp: "vg"
  graphviz: "dot"
  fa22bit: "faToTwobit"
  bigbed: "bedToBigBed"
reference:
  chroms: "static/tomatoCHR.json"
  data:
    - name: "SL4"
      features:
        - name: "gene_annotation"
          url: "static/ITAG4.0_gene_models_noHeader.gff"
          chr_prefix: "SL4.0ch"
data:
  - name: "tomato_BGV006336"
    desc: "2021-01-25"
    chr_prefix: ""
    ref_id: "SL4"
    source:
      xg: "static/data.xg"
      csv: "static/merged.tomato_graphFile.pcf"
    features:
      - name: "gene_annotation"
        url: "static/ITAG4.0_gene_models.bw"
        chr_prefix: "SL4.0ch"
    static_files:
      - name: 'Genes'
        url: 'static/ITAG4.0_gene_models.bw'
        viz: 'bigwig'

Also, are you using wig or BigWig?

Again, thank you very much for all your help!

Contributor

6br commented Feb 23, 2021

In MoMI-G, after the subgraph is loaded (and visualized in the tube map), a query is issued to /region?format=bed&path=......
request_bed
This can be reproduced by using MoMI-G backend docker with CHM1 demo.

paths.filter(a => !(a.start === 0 && a.stop !== a.stop))

This code checks whether each path is valid. Sometimes the start can be zero or the end can be NaN, which might cause an error. This line removes such invalid paths.
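To make the predicate concrete, here is a minimal TypeScript sketch that applies the same filter to a few made-up paths. The `PathRange` interface and `dropInvalidPaths` are hypothetical names for illustration; the real objects in GraphWrapper.tsx may carry more fields. Because `a.stop !== a.stop` is true only when `stop` is NaN, the quoted line as written drops a path only when its start is 0 and its stop is NaN at the same time.

```typescript
// Minimal path shape for illustration; an assumption, not the type used in GraphWrapper.tsx.
interface PathRange {
  name: string;
  start: number;
  stop: number;
}

// Same predicate as the quoted line. `a.stop !== a.stop` is a NaN check,
// so a path is removed only when start is 0 AND stop is NaN.
function dropInvalidPaths(paths: PathRange[]): PathRange[] {
  return paths.filter((a) => !(a.start === 0 && a.stop !== a.stop));
}

// Made-up example paths covering the four combinations.
const examplePaths: PathRange[] = [
  { name: "ok", start: 100, stop: 200 },
  { name: "zeroStartNaNStop", start: 0, stop: NaN },
  { name: "zeroStartOnly", start: 0, stop: 500 },
  { name: "nanStopOnly", start: 100, stop: NaN },
];

// Names of the paths that survive the filter.
const keptNames = dropInvalidPaths(examplePaths).map((p) => p.name);
```

In this sketch, a path with a NaN stop but a nonzero start is kept, so a stricter check would be needed if such paths should also be removed.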

I recommend using bigbed format for LinearView annotations.

Actually, I found a bug in how paths in the subgraph are detected, and I fixed it. Could you try the latest master?

@madeluis
Author

Thank you very much for your prompt response.

I pulled the latest commit but I continue to have the same problem, so I dug further into your code and found where the problem is, but I don't know what is causing it. Let me explain:

With your demo: when I submit a region, I see that 5 requests are being made, and one of them is the /region?format... as you pointed out:
Screen Shot 2021-02-23 at 8 31 40 AM

With the frontend connecting to my server and using my data: when I submit a region, only two requests are being made, and the request for /region?format... is missing:
Screen Shot 2021-02-23 at 8 41 19 AM

I know the backend is serving the request because when I curl that request with that same region I get a response:
Screen Shot 2021-02-23 at 8 46 33 AM
So it is clear that the problem is that the frontend never makes the request.

I went back to the GraphWrapper.tsx to debug it, and I found that my path variable has content that looks like this:
Screen Shot 2021-02-23 at 9 20 56 AM
However, the referencePath variable is empty. So it seems that the condition in line 436 is never met:
Screen Shot 2021-02-23 at 9 32 34 AM

I'm pretty sure that this is where the issue is, but I don't know why. Could my data be somehow wrong? Do you see anything off with my path variable?

I don't know if you are open to the idea, but if it is easier to debug this issue live, I am happy to set up a zoom call.

Thanks again for your help!

Contributor

6br commented Feb 24, 2021

Thank you very much for your detailed report. I have now figured out why this happens. The MoMI-G backend handled the prefix chr and put indexOfFirstBase in the response JSON, but it did not support other chromosome prefixes, which can be specified in config.yaml. I fixed this bug in the MoMI-G backend and deployed it as a Docker image. Could you try the latest Docker image? I think indexOfFirstBase will be attached to the JSON response from the latest backend.
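The effect of a hard-coded prefix can be illustrated with a small TypeScript sketch. This is not MoMI-G's backend code; `stripChrPrefix` is a hypothetical helper, and "SL4.0ch01" is an assumed path name built from the chr_prefix in the config above. The point is only that a name carrying a custom prefix is not recognized when "chr" is assumed, but is recognized when the configured prefix is used.

```typescript
// Hypothetical helper, not MoMI-G's actual code: remove a configured chromosome
// prefix from a path name, returning null when the name does not start with it.
function stripChrPrefix(pathName: string, chrPrefix: string): string | null {
  if (pathName.indexOf(chrPrefix) !== 0) {
    return null;
  }
  return pathName.slice(chrPrefix.length);
}

// "SL4.0ch01" is an assumed tomato path name for illustration.
// With the hard-coded "chr" prefix the name is not recognized.
const withHardCodedPrefix = stripChrPrefix("SL4.0ch01", "chr");

// With the prefix taken from config.yaml (chr_prefix: "SL4.0ch") it is.
const withConfiguredPrefix = stripChrPrefix("SL4.0ch01", "SL4.0ch");
```

An empty prefix, as in `chr_prefix: ""`, leaves the name unchanged in this sketch.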

Author

madeluis commented Feb 24, 2021

Thank you so much for the fix, it works now!!

I have one last question though: is it possible to download the sequences for the regions that are on display in the graph view and/or the annotation?

In the demo page, I can see a Download button:
Screen Shot 2021-02-24 at 3 17 39 PM

However, with my frontend, there is no download option (or perhaps the page doesn't scroll down enough?):
Screen Shot 2021-02-24 at 3 17 52 PM

Is that what is supposed to happen, or is there a bug with how the layout is displayed?

Thank you!

@6br 6br mentioned this issue Feb 25, 2021
Contributor

6br commented Feb 25, 2021

> Thank you so much for the fix, it works now!!

Good to hear that!

> However, with my frontend, there is no download option (or perhaps the page doesn't scroll down enough?):

The download button was hidden behind the coordinate bar.

bar

I adjusted the bottom margin of the container, so now the download button is shown.

Screenshot 2021-02-25 11 15 43

Thank you for asking this!

@6br 6br closed this as completed in #66 Feb 25, 2021
@6br 6br reopened this Feb 25, 2021
Author

madeluis commented Mar 1, 2021

Thank you very much for quickly addressing this issue! That worked. However, the .csv file that gets downloaded when I use that button (both with the demo page and with my frontend) contains no information.

This is what I get with the demo page:
Screen Shot 2021-03-01 at 7 33 10 AM

Does this happen only on my end, or do you see the same? I was wondering if it could be the encryption of the file.

Thank you!

Contributor

6br commented Mar 2, 2021

Thank you for letting me know. I fixed the bug on the latest master. (The demo page is not fixed yet.)

Author

madeluis commented Mar 2, 2021

Thank you so much for all your help. Everything works perfectly now!
