Problem with .obo files still persists in new release #11

Open
VolkerH opened this issue Feb 1, 2022 · 8 comments

@VolkerH

VolkerH commented Feb 1, 2022

Hi,

I left a comment on the already closed issue #10, but depending on your notification settings you may not get notifications for comments on closed issues. Therefore I am opening a new issue.

As mentioned at the end of #10, unfortunately the new release 2.1.1 doesn't solve the issue with the download of the .obo files.
I still get an error message. The workaround of copying an existing Ontology folder with working .obo files into the working directory still works.

@nfransaert

Hi,

I see this issue has not yet been addressed, and I get the same error when trying to convert grd to imzML (using 2.1.1).

PS G:\My Drive\PhD\AIMS\jimzMLConverter-2.1.1> java -jar jimzMLConverter-2.1.1.jar imzML -p "D:\ToF-SIMS\Koen-Nico\aims-data\itmToGRD-testData\GRD\itm-header\i220617g_vds1_4.itm.properties.txt" "D:\ToF-SIMS\Koen-Nico\aims-data\itmToGRD-testData\GRD\GRD-and-header\i220617g_vds1_4.itm.grd"
dec 21, 2022 11:06:50 AM com.alanmrace.jimzmlconverter.MainCommand convert
INFO: Converting file D:\ToF-SIMS\Koen-Nico\aims-data\itmToGRD-testData\GRD\GRD-and-header\i220617g_vds1_4.itm.grd
dec 21, 2022 11:06:50 AM com.alanmrace.jimzmlconverter.MainCommand convert
INFO: Detected ION-TOF GRD file
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
        at java.lang.String.substring(Unknown Source)
        at com.alanmrace.jimzmlparser.obo.OBO.<init>(OBO.java:115)
        at com.alanmrace.jimzmlparser.obo.OBO.<init>(OBO.java:119)
        at com.alanmrace.jimzmlparser.obo.OBO.<init>(OBO.java:119)
        at com.alanmrace.jimzmlparser.obo.OBO.loadOntologyFromURL(OBO.java:244)
        at com.alanmrace.jimzmlparser.obo.OBO.getOBO(OBO.java:173)
        at com.alanmrace.jimzmlconverter.ImzMLConverter.getOBOTerm(ImzMLConverter.java:209)
        at com.alanmrace.jimzmlconverter.ImzMLConverter.<init>(ImzMLConverter.java:95)
        at com.alanmrace.jimzmlconverter.GRDToImzMLConverter.<init>(GRDToImzMLConverter.java:82)
        at com.alanmrace.jimzmlconverter.MainCommand.convert(MainCommand.java:286)
        at com.alanmrace.jimzmlconverter.MainCommand.main(MainCommand.java:186)

The issue is still a faulty "pato.obo": instead of the ontology, the file contains the HTML page of a "302 Found" redirect.

PS G:\My Drive\PhD\AIMS\jimzMLConverter-2.1.1\Ontologies> more .\pato.obo
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://raw.githubusercontent.com/pato-ontology/pato/master/pato.obo">here</a>.</p>
<hr>
<address>Apache/2.4.41 (Ubuntu) Server at purl.obolibrary.org Port 80</address>
</body></html>
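
A quick way to spot this kind of faulty download is to check whether a file in the Ontologies folder starts with markup rather than OBO header text. The following is only a minimal sketch: the function names are made up for illustration, and the check rests on the assumption that a valid .obo file never begins with "<".

```python
from pathlib import Path
from typing import List


def looks_like_html(path: Path) -> bool:
    """Return True if the first non-blank line of the file starts with '<'."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if stripped:
                return stripped.startswith("<")
    # An empty file contains no markup (it is not a usable ontology either).
    return False


def find_faulty_obo_files(ontology_dir: Path) -> List[Path]:
    """List the .obo files in ontology_dir that look like HTML pages."""
    return sorted(p for p in Path(ontology_dir).glob("*.obo") if looks_like_html(p))
```

Calling `find_faulty_obo_files(Path("Ontologies"))` from the converter's working directory would then list the files that need to be replaced by hand.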

The main goal of this comment is to draw attention toward the issue as the converter is practically unusable at this stage (for IONTOF data at least), and previous discussions seemed to point toward a fairly simple fix.

Secondly, for now I would be happy with a temporary workaround. @VolkerH, you mentioned that copying an existing Ontology folder is a workaround. I tried downloading the pato.obo from https://raw.githubusercontent.com/pato-ontology/pato/master/pato.obo, but when I run the converter the ontology folder is automatically updated with the faulty pato.obo. Do you have any suggestions?

Any response would be greatly appreciated!

@VolkerH
Author

VolkerH commented Jan 9, 2023

Hi @nfransaert

I've been on holiday for the last three weeks. If you haven't managed to implement a workaround in the meantime, I can look up how I solved this (it has been a while) and send instructions.

@VolkerH
Author

VolkerH commented Jan 9, 2023

@nfransaert ,

I just checked how I solved this.
In our application, we call jimzmlconverter from Python. I wrote a Python function that

  • creates a temporary directory
  • copies a bundled Ontologies folder with multiple .obo files into the temporary folder
  • changes the working directory to the temporary folder
  • calls jimzmlconverter as a separate process
  • cleans up the temporary folder

I can't see what you did differently or why it wouldn't work for you.

I can share a few code snippets (incomplete):

Context manager that does the steps above:

import os
import shutil
from contextlib import contextmanager
from pathlib import Path
from typing import Optional, Union

import spacem_maldi.Ontologies
from importlib_resources import files

...

@contextmanager
def tmp_dir_with_ontologies():
    """Change working directory to a temporary directory that contains the ontology files

    This is a workaround for failed downloading of .obo files:
    https://github.com/AlanRace/imzMLConverter/issues/10

    jimzmlconverter looks for .obo files in a subfolder 'Ontologies' below the working
    directory. We create a temporary directory including such a subfolder which we
    populate with bundled .obo files. When jimzmlconverter is executed in such a working
    directory it won't try (and fail) to re-download the ontologies.

    When exiting the context manager, the temporary directory is removed.
    """
    import tempfile

    previous_work_dir = os.getcwd()
    work_dir = tempfile.TemporaryDirectory()
    print(work_dir)
    # We bundle the Ontologies folder in the Python package. You can just copy the folder
    # from a known location on the filesystem instead.
    shutil.copytree(files(spacem_maldi.Ontologies) / ".", Path(work_dir.name) / "Ontologies")
    os.chdir(work_dir.name)
    try:
        yield
    finally:
        # Leave the temporary directory before deleting it: removing the current
        # working directory fails on Windows.
        os.chdir(previous_work_dir)
        work_dir.cleanup()

Using the context manager:

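import subprocess

with tmp_dir_with_ontologies():
    # Call jimzmlconverter as an external process. subprocess.run waits for the
    # process to finish, so the temporary directory is not removed too early.
    subprocess.run(....)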
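
For readers who do not have the spacem_maldi package, the same idea can be sketched without it. This is a self-contained approximation, not the code used in our application: `working_dir_with_ontologies`, `convert_grd` and their parameters are hypothetical names, the Ontologies folder is assumed to exist somewhere on disk, and the command line simply mirrors the `java -jar ... imzML -p ...` call shown earlier in this thread.

```python
import os
import shutil
import subprocess
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def working_dir_with_ontologies(ontologies_source):
    """Run the body inside a temporary directory that holds an 'Ontologies' copy."""
    previous = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        shutil.copytree(ontologies_source, Path(tmp) / "Ontologies")
        os.chdir(tmp)
        try:
            yield Path(tmp)
        finally:
            # Step out again before the temporary directory is deleted.
            os.chdir(previous)


def convert_grd(jar, properties_file, grd_file, ontologies_source):
    """Call the converter from a directory with known-good .obo files (sketch)."""
    # Resolve all paths first: relative paths would break after the chdir.
    command = [
        "java", "-jar", str(Path(jar).resolve()),
        "imzML", "-p", str(Path(properties_file).resolve()),
        str(Path(grd_file).resolve()),
    ]
    with working_dir_with_ontologies(Path(ontologies_source).resolve()):
        # run() blocks until the converter exits; check=True raises on failure.
        return subprocess.run(command, check=True)
```

Resolving the paths before changing directory is the main design point here, since the converter is started from a different working directory than the caller's.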

@VolkerH VolkerH closed this as completed Jan 9, 2023
@VolkerH VolkerH reopened this Jan 9, 2023
@VolkerH
Author

VolkerH commented Jan 9, 2023

Here is the Ontologies folder I bundle with our python package
Ontologies.zip

(Note: closed the issue by accident, therefore reopened)

@nfransaert

Hi @VolkerH ,

Thank you for your elaborate instructions. I got it to work using your supplied Ontologies folder. I suspect that I was still missing a .obo file which led to the creation of the Ontologies folder, even though I manually downloaded the pato.obo.

I tested the conversion with the test.grd and test.properties.txt files provided in this repo, and this worked fine (it created the .ibd, .imzML and .imzML.tmp.ibd). However, when trying to convert my actual data (.itm of ~600 MB and .grd of ~3.7 GB), the program takes about 2 hours to convert the files.

What was your experience with the time it took the converter to convert your files? Do you think this is the expected time it takes for these kinds of datasets, or that something else is going on in my case specifically?

Thanks again!

@VolkerH
Author

VolkerH commented Jan 25, 2023

Hi @nfransaert ,

I can't really give good advice on the speed. I set this up for users and haven't touched it for a year. I seem to remember that it was not super-fast (also tens of minutes, depending on the machine it is running on and the dataset size).

@nfransaert

@VolkerH

Ok, thank you again for sharing your workaround and for helping me out!

@Gscorreia89

Hi,

I have recently tried to use this converter but ran into these same issues caused by ontologies. After some debugging, I think I found the underlying problem and a fix. This is caused by the dependency of PSI-MS-CV ontology on the STATO ontology introduced 2 years ago: HUPO-PSI/psi-ms-CV@1eb58b8

The current jimzMLParser contains functionality to parse ontologies in the .obo format, but STATO is only available in .owl. The parser automatically downloads STATO.owl and fails because it cannot handle .owl. A definitive fix for this problem would be to detect the ontology format and add support for parsing .owl.

As a quick fix I have made a new .jar using the ontologies bundled by @VolkerH, which seems to work so far. I have also tried to make one with the latest version of all ontologies after converting STATO.owl to .obo, but it seems this ontology has some properties that break the .obo format and cannot be converted (at least using ROBOT).
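
To check which ontologies a given .obo file pulls in, its header can be scanned for import lines. The sketch below assumes the dependency is declared with `import:` header tags, as the OBO flat-file format allows; the function names are hypothetical and this is not how the parser itself resolves imports.

```python
from typing import List


def list_obo_imports(obo_text: str) -> List[str]:
    """Return the targets of 'import:' lines found in the OBO header."""
    imports = []
    for line in obo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # The first stanza (e.g. [Term]) marks the end of the header.
            break
        if stripped.startswith("import:"):
            imports.append(stripped[len("import:"):].strip())
    return imports


def non_obo_imports(obo_text: str) -> List[str]:
    """Return the imports that do not point at an .obo file (e.g. .owl)."""
    return [u for u in list_obo_imports(obo_text) if not u.lower().endswith(".obo")]
```

Running `non_obo_imports` on the text of a downloaded ontology file would flag any import, such as an .owl file, that an .obo-only parser cannot read.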

@AlanRace, please let me know if you would like more details. I am happy to help fix this.
