Metabolomics hackathon: MS2 spectra matching for metabolite identification #13
Comments
Dear @mmattano, I am happy to inform you that your proposal has been selected for the DevMeeting2023! Participants will decide which hackathon to join after the pitch on Monday. Best,
Maybe Muyao and Lawrence could leave a short comment here so they also become participants in this issue! THX!
Thank you @tobiasko
Thanks @tobiasko
Hello everyone, I just created a Slack workspace for the DevMeeting and a channel named metabolomics for this hack. You should receive an invite to join by email. Best,
Summary paragraph
During the metabolomics hackathon, we explored spectral similarity scoring. To identify a metabolite from an MS1 or MS2 spectrum, a similarity score is used to match the spectrum in question against a database entry or, more commonly, an in-house library. The cosine similarity score is currently the most frequently used score in the field. Here, we set up a pipeline to compare multiple ways of scoring spectral similarity, along with an array of variations of their respective input parameters. The data, which were prepared specifically for the hackathon, also allowed us to compute statistics on false positives, false negatives, etc. Furthermore, we set up systems to test the robustness of these scores to intensity perturbations, which are very common in biological samples, and tested for a possible correlation between structural and spectral similarity.
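As a rough illustration of the two ideas in the summary, here is a minimal sketch of a binned cosine similarity score and an intensity perturbation test. It is not the hackathon pipeline: the function names (`bin_spectrum`, `cosine_score`, `perturb_intensities`), the fixed-width binning, the default `bin_width`, and the log-normal noise model are all assumptions made for this example.

```python
import math
import random
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum the intensities of (mz, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_score(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity between two spectra after binning (assumed scheme)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def perturb_intensities(peaks, sigma, rng):
    """Scale each intensity by log-normal noise (an assumed noise model)."""
    return [(mz, intensity * rng.lognormvariate(0.0, sigma))
            for mz, intensity in peaks]


# Hypothetical spectrum, invented for illustration only.
query = [(50.0, 10.0), (100.0, 100.0), (150.0, 30.0)]
rng = random.Random(0)
noisy = perturb_intensities(query, sigma=0.3, rng=rng)
print("self match:     ", cosine_score(query, query))
print("perturbed match:", cosine_score(query, noisy))
```

Repeating the perturbation over many random draws and several `sigma` values would give a distribution of scores per scoring function, which is one way to compare their robustness.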
MS2 spectra matching for metabolite identification
Abstract
One major open topic in untargeted metabolomics is identifying unknown compounds from mass spectra. As MS1 comparisons can be ambiguous (especially for small molecules), we need to look at MS2 spectra and compare them to public MS2 databases to differentiate compounds in the same mass range.
Currently, the best-performing methods for compound identification are GNPS and Sirius. They provide the user with a list of potential compounds, but in some cases the uncertainty is very high or multiple candidates are suggested, making the downstream analysis labor-intensive. GNPS improves its predictions by using molecular networks and taking biological information into account. Sirius improves its predictions by comparing the structural similarity of the compounds.
We would like to set up a novel system, with modular parts that can be tested separately. Each aspect of the pipeline can be improved/modified individually, and multiple methods can be combined as an ensemble. In doing so, this can also serve as a benchmark of existing scoring and matching functions and a testing playground for novel ideas.
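One possible shape for such a modular setup is sketched below: each scoring function shares the same signature, so scorers can be tested separately or averaged into an ensemble. This is only a sketch under assumptions. The two scorers (`shared_peak_fraction`, `intensity_weighted_overlap`), the m/z tolerance, the weighted-mean ensemble, and the toy library are hypothetical and are not the methods selected in the project.

```python
from typing import Callable, Dict, List, Optional, Tuple

Peaks = List[Tuple[float, float]]          # (m/z, intensity) pairs
Scorer = Callable[[Peaks, Peaks], float]   # common interface for all scorers


def _has_match(mz: float, peaks: Peaks, tol: float) -> bool:
    """True if any peak in `peaks` lies within `tol` of `mz`."""
    return any(abs(mz - other_mz) <= tol for other_mz, _ in peaks)


def shared_peak_fraction(a: Peaks, b: Peaks, tol: float = 0.01) -> float:
    """Fraction of peaks in `a` that have a partner in `b` within `tol`."""
    if not a:
        return 0.0
    return sum(1 for mz, _ in a if _has_match(mz, b, tol)) / len(a)


def intensity_weighted_overlap(a: Peaks, b: Peaks, tol: float = 0.01) -> float:
    """Fraction of the total intensity of `a` carried by peaks matched in `b`."""
    total = sum(intensity for _, intensity in a)
    if total <= 0.0:
        return 0.0
    matched = sum(i for mz, i in a if _has_match(mz, b, tol))
    return matched / total


def ensemble_score(a: Peaks, b: Peaks, scorers: Dict[str, Scorer],
                   weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean of the individual scores (equal weights by default)."""
    if weights is None:
        weights = {name: 1.0 for name in scorers}
    total_weight = sum(weights[name] for name in scorers)
    return sum(weights[name] * scorer(a, b)
               for name, scorer in scorers.items()) / total_weight


def rank_library(query: Peaks, library: Dict[str, Peaks],
                 scorers: Dict[str, Scorer]) -> List[Tuple[str, float]]:
    """Score the query against every library entry, best match first."""
    scored = [(name, ensemble_score(query, peaks, scorers))
              for name, peaks in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only.
scorers = {"shared": shared_peak_fraction, "overlap": intensity_weighted_overlap}
query = [(50.0, 10.0), (100.0, 100.0), (150.0, 30.0)]
library = {
    "compound_A": [(50.0, 12.0), (100.0, 90.0), (150.0, 25.0)],
    "compound_B": [(50.0, 5.0), (60.0, 100.0)],
}
for name, score in rank_library(query, library, scorers):
    print(name, round(score, 3))
```

Because every scorer is a plain function with the same signature, swapping one out or changing the weights does not require touching the ranking code, which also makes the same harness usable for benchmarking.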
Project Plan
The general purpose is to have a(n automated) workflow for MS2 spectra matching that does not just rely on cosine similarity scoring. Subsequently, we would like to
Technical Details
Main language: Python
Workflow includes: GNPS, SIRIUS
GitHub
Contact Information
Members of the metabolomics research group led by Thomas Moritz at the NNF Center for Basic Metabolic Research, Faculty of Health Research, University of Copenhagen
Matthias Mattanovich ([email protected])
Muyao Xi ([email protected])
Lawrence Egyir ([email protected])