Making sense of internal fragment ions #9
Comments
Dear @veitveit, I am happy to inform you that your proposal has been selected for the DevMeeting2023! Participants will decide which hackathon to join after the pitch on Monday. It would be good if Arthur could also leave a short comment so he becomes part of this issue. Best,
Hello everyone, I just created a Slack workspace for the DevMeeting and a channel named internal-fragment-ions for this hack. You should receive an invite to join by email. Best,
Unfortunately I won't be able to join the Developers' Meeting, but I'll quickly plug the new version of spectrum_utils, which makes it easy to annotate internal fragment ions (and other ion types). See an example and some more information in a recent preprint. Maybe this can be useful. 🙂
Hi Wout,
Yeah, this is new in v0.4.0, so make sure to update.
Hi everyone,
Short summary of the hackathon: We created a workflow for exploring internal ions from raw spectra and identified spectra given in various formats. A comprehensive nomenclature for internal ions allows their precise definition and the calculation of their masses. The nomenclature was implemented in a new tool for annotating fragment ions called fragannot. This tool outputs fragment annotations as a JSON file. Fragment Explorer then reads this JSON file to create fragment-centric and spectrum-centric statistics as well as multiple interactive visualizations.
Title
Making sense of internal fragment ions
Abstract
Peptide identification from fragment mass spectra uses only part of the contained information. Internal fragment ions, i.e. peptide fragments with both termini cleaved, have a high potential to provide further evidence about peptide identity. Although several database search engines offer the option to include internal ions, their actual use has so far been poorly explored. A big challenge lies in the large number of possible ions, and thus the difficulty in distinguishing them from background signals or other fragment ions. This hackathon project aims to shed more light on the applicability of internal ions by creating a framework to determine their characteristic patterns in MS data. We will provide statistics and extensive visualizations for internal ions in a given data set. For that we will employ both raw data files and identifications from a database search. This framework will lay the groundwork for the detection and utilization of characteristic internal ions in a data set, explore potential “fragment motifs”, and facilitate the distinction of actual internal ions from background noise. A clearer understanding of internal fragmentation will channel future efforts towards a more extensive use of internal ions in MS data processing, leading to higher peptide identification rates.
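To give a feeling for the "large number of possible ions" mentioned above, here is a minimal sketch that enumerates the internal fragment sequences of a peptide. The function name `internal_fragments`, the `min_length` parameter, and the example peptide are illustrative assumptions and not part of any existing tool.

```python
def internal_fragments(peptide: str, min_length: int = 2) -> list[tuple[int, int, str]]:
    """Enumerate internal fragments as (start, end, sequence).

    Positions are 1-based and inclusive. A fragment is internal when it
    contains neither the first nor the last residue of the peptide.
    Hypothetical helper for illustration only.
    """
    n = len(peptide)
    fragments = []
    # Start at residue 2 and stop at residue n-1 so both termini are excluded.
    for start in range(2, n):
        for end in range(start + min_length - 1, n):
            fragments.append((start, end, peptide[start - 1:end]))
    return fragments


if __name__ == "__main__":
    for length in (8, 15, 30):
        peptide = "A" * length  # placeholder sequence, only the length matters here
        n_internal = len(internal_fragments(peptide, min_length=1))
        n_terminal = 2 * (length - 1)  # one b and one y ion per backbone bond
        print(f"length {length}: {n_internal} internal vs {n_terminal} b/y fragments")
```

For a peptide of length n there are (n-2)(n-1)/2 internal fragment sequences when single residues are included, so the count grows quadratically while the b/y series grows linearly. Charge states, neutral losses, and different cleavage types would multiply this further.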
Project Plan
We suggest the following tasks for creating and testing the framework:
Nomenclature and definitions: Given the complexity and the large combinatorics of internal fragment ions, we will discuss and stringently define the nomenclature. Existing knowledge and nomenclatures will be assessed for their usability.
Data sets: Selection of about ten data sets from different MS technologies that will be used for testing and exploration. This will include bottom-up and top-down approaches, as well as different fragmentation types and acquisition methods.
Implementation: We plan to take advantage of the pyteomics tools for reading files and spectra, as well as of libraries such as spectrum_utils to extract fragment ions. Visualizations and further analysis will be done in Python and/or R, depending on the participants’ background.
Assessment and testing: Different statistical measures and motif algorithms will be tested and discussed. Interactive visualizations will be used to conveniently explore different subsets of one or multiple MS runs.
Software: We expect to develop a prototype that can process any data set provided in the standard file formats mzML and mzid, and optionally in widely used formats like mgf and pepXML.
These tasks will be discussed on the first day prior to their implementation. Depending on the skills and interest of the participants, we may define working groups for addressing them in the following days.
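As a rough illustration of the annotation step in the plan above, the following sketch computes theoretical m/z values for internal fragments and matches them against observed peaks. It is written in plain Python rather than with pyteomics or spectrum_utils so that it stays self-contained. It assumes unmodified residues and internal ions formed by one b-type and one y-type cleavage (m/z = sum of residue masses plus one proton per charge); the function names, the 20 ppm tolerance, and the peak list are hypothetical.

```python
PROTON = 1.007276  # mass of a proton in Da

# Monoisotopic residue masses in Da for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def internal_mz(fragment: str, charge: int = 1) -> float:
    """m/z of a b/y-type internal ion (assumption: no modifications)."""
    neutral = sum(RESIDUE_MASS[aa] for aa in fragment)
    return (neutral + charge * PROTON) / charge


def annotate_internal(peptide, peaks_mz, tol_ppm=20.0, min_length=2):
    """Return (start, end, sequence, observed m/z) for every peak that lies
    within tol_ppm of a singly charged internal fragment of the peptide."""
    n = len(peptide)
    hits = []
    # Positions 2..n-1 (1-based) exclude both peptide termini.
    for start in range(2, n):
        for end in range(start + min_length - 1, n):
            fragment = peptide[start - 1:end]
            theoretical = internal_mz(fragment)
            for observed in peaks_mz:
                if abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm:
                    hits.append((start, end, fragment, observed))
    return hits


if __name__ == "__main__":
    # Made-up peak list for illustration, not real data.
    peaks = [227.1026, 300.0, 328.1503]
    for start, end, fragment, observed in annotate_internal("PEPTIDEK", peaks):
        print(f"internal {start}-{end} {fragment}: observed m/z {observed}")
```

Note that internal fragments with the same amino acid composition are isobaric under this model, so a mass match alone cannot assign a peak to a unique fragment. This is one reason why the statistics and motif analysis described above are needed.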
Technical details
The programming language(s): Python and/or R. For faster implementations, we might collaborate with the hackathon focussing on Rust implementations.
Existing software that will be featured: Python libraries pyteomics and spectrum_utils
(Public) datasets that will be used and their availability
Given the available ground truth and the availability of different fragmentation types, we might use the ProteomeTools data sets (http://www.proteometools.org/).
Contact information
Arthur Grimaud
Protein Research Group
Department for Biochemistry and Molecular Biology
University of Southern Denmark
Campusvej 55
5230 Odense M / Denmark
[email protected]
Veit Schwämmle
Protein Research Group
Department for Biochemistry and Molecular Biology
University of Southern Denmark
Campusvej 55
5230 Odense M / Denmark
[email protected]