KeyError: 'scans' during matchms running #597

Open
anani-a-missinou opened this issue Jan 15, 2024 · 1 comment

anani-a-missinou commented Jan 15, 2024

Hi dear developers,

We have generated an in-silico HRMS/MS database focused on our model organism. The compounds it contains are structurally described in the literature but are absent from existing spectral databases.

Can you help me use your tool to match our experimental data?

BEGIN IONS
PEPMASS=302.0593882
CHARGE=1+
MSLEVEL=2
NAME=BraChemID_1312 CIF 3
INSTRUMENT=In-silico MS/MS by CFM-ID 4.4.7
ACTIVATION=CID
FILENAME=BraChemDB.mgf
SEQ=..
IONMODE=Positive
ADDUCT=M+H
ORGANISM=BraChemDB
GENUS=Brassica napus var. napobrassica
PI=Antoine Gravot
DATACOLLECTOR=Anani A. Missinou
FORMULA=C14H11N3O3S
SMILES=COc1ccc2nc3n4c(ncc-3c2c1)SCC4C(=O)O
INCHIKEY=NQSHYNHIOMOLTC-UHFFFAOYSA-N
INCHI=InChI=1S/C14H11N3O3S/c1-20-7-2-3-10-8(4-7)9-5-15-14-17(12(9)16-10)11(6-21-14)13(18)19/h2-5,11H,6H2,1H3,(H,18,19)
INCHIAUX=N/A
PUBMED=N/A
SUBMITUSER=A2Missinou
LIBQUALITY=3
CASNUMBER=N/A
171.05529 19.09
184.05054 18.34
198.06619 59.11
200.08184 11.58
212.02769 26.77
216.02261 19.76
226.04334 90.55
228.05899 41.93
230.03826 37.39
230.07464 19.59
242.03826 51.8
254.03826 13.43
256.05391 100.0
256.07167 13.54
258.06956 19.13
270.03317 15.26
272.04882 11.7
...
END IONS
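
For reference, the metadata keys that a record like this one carries can be listed with a few lines of plain Python. This is only a minimal sketch and not the matchms parser: `parse_mgf_header` is a hypothetical helper, the record is a shortened copy of the entry above, and lowercasing the keys is an assumption made here for convenient lookup.

```python
def parse_mgf_header(record: str) -> dict:
    """Collect KEY=VALUE lines from a single BEGIN IONS ... END IONS record."""
    metadata = {}
    for line in record.splitlines():
        line = line.strip()
        if not line or line in ("BEGIN IONS", "END IONS"):
            continue
        # Peak lines start with a digit ("171.05529 19.09"); header lines are KEY=VALUE.
        if line[0].isdigit() or "=" not in line:
            continue
        # Split on the first "=" only, so values such as INCHI=InChI=... stay intact.
        key, value = line.split("=", 1)
        metadata[key.strip().lower()] = value.strip()
    return metadata


# Shortened copy of the record above (assumption: remaining fields behave the same way).
record = """BEGIN IONS
PEPMASS=302.0593882
CHARGE=1+
NAME=BraChemID_1312 CIF 3
IONMODE=Positive
ADDUCT=M+H
171.05529 19.09
256.05391 100.0
END IONS"""

header = parse_mgf_header(record)
print(sorted(header))        # the keys present in the record
print("scans" in header)     # whether a scans field exists at all
```

A record without a SCANS line yields no 'scans' key, which is worth checking before indexing the metadata by that name.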

I have already installed the latest version of matchms, but I get an error when I run it on my file.

(matchms) $ cat run_matchms.py
from matchms.importing import load_from_mgf
from matchms.filtering import default_filters
from matchms.filtering import normalize_intensities
from matchms import calculate_scores
from matchms.similarity import CosineGreedy

# https://matchms.readthedocs.io/en/latest/api/matchms.importing.html
file = load_from_mgf("tests/testdata/TOTAL_BraChemDB_Neg_v1-2.mgf")

spectrums = []
for spectrum in file:
    spectrum = default_filters(spectrum)
    spectrum = normalize_intensities(spectrum)
    spectrums.append(spectrum)

scores = calculate_scores(references=spectrums,
                          queries=spectrums,
                          similarity_function=CosineGreedy(),
                          is_symmetric=True)

print(f"Size of matrix of computed similarities: {scores.scores.shape}")

query = spectrums[15]  # just an example
best_matches = scores.scores_by_query(query, 'CosineGreedy_score', sort=True)

for (reference, score) in best_matches[:10]:
    if reference is not query:
        print(f"Reference scan id: {reference.metadata['scans']}")
        print(f"Query scan id: {query.metadata['scans']}")
        print(f"Score: {score[0]:.4f}")
        print(f"Number of matching peaks: {score[1]}")

(matchms) $ pyton run_matchms.py
pyton: command not found
(matchms) $ python run_matchms.py
Size of matrix of computed similarities: (5995, 5995, 2)
Traceback (most recent call last):
  File "/mnt/d/project/postdoctoral/brassimet/Constitutive_Metabolite_Diversity/3.annotation/matchms/run_matchms.py", line 39, in <module>
    print(f"Reference scan id: {reference.metadata['scans']}")
KeyError: 'scans'

Thank you

hechth (Collaborator) commented Jan 17, 2024

Hi, there is no 'scans' key in the metadata of your spectra, which is why the lookup fails. If you just want the matching results, for example as a CSV table, you can use the Galaxy version of matchms to compute your matches: https://usegalaxy.eu/?tool_id=toolshed.g2.bx.psu.edu%2Frepos%2Frecetox%2Fmatchms_spectral_similarity%2Fmatchms_spectral_similarity%2F0.24.0%2Bgalaxy0&version=latest

This tool is also covered in a tutorial: https://training.galaxyproject.org/topics/metabolomics/tutorials/gc_ms_with_xcms/tutorial.html
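
If the script from the question should keep running locally, one option is to look up an identifier defensively instead of indexing 'scans' directly. The sketch below works on plain dictionaries standing in for spectrum.metadata. `pick_identifier` and the candidate key names are assumptions for illustration only, so it is worth printing the keys of your own spectra first to see which ones are actually present.

```python
def pick_identifier(metadata: dict,
                    candidates=("scans", "compound_name", "name", "title")) -> str:
    """Return the first non-empty value among the candidate keys, or a placeholder."""
    for key in candidates:
        value = metadata.get(key)  # .get() returns None instead of raising KeyError
        if value not in (None, ""):
            return str(value)
    return "<no identifier>"


# Stand-ins for reference.metadata and query.metadata (hypothetical values).
reference_metadata = {"compound_name": "BraChemID_1312 CIF 3", "ionmode": "positive"}
query_metadata = {"ionmode": "positive"}

print(f"Reference id: {pick_identifier(reference_metadata)}")
print(f"Query id: {pick_identifier(query_metadata)}")
```

In the loop from the question, a helper like this would take the place of the direct reference.metadata['scans'] and query.metadata['scans'] lookups.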
