Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: I/O of SBO terms from SBML file #196

Closed
2 tasks done
edkerk opened this issue Nov 29, 2018 · 12 comments
Closed
2 tasks done

feat: I/O of SBO terms from SBML file #196

edkerk opened this issue Nov 29, 2018 · 12 comments
Labels
fixed in develop This issue is fixed and pushed to develop branch. Will be closed when fix appears in master branch.
Milestone

Comments

@edkerk
Copy link
Member

edkerk commented Nov 29, 2018

Description of the issue:

  • RAVEN is not able to I/O SBO terms that are specified in SBML files.

Reproducing this issue:

SBO terms are included in the SBML file as following (similar for metabolites and reactions):

<species boundaryCondition="false" constant="false" ...
   hasOnlySubstanceUnits="false" id="M_s_0001" name="(1-3)-beta-D-glucan" ...
   metaid="M_s_0001" sboTerm="SBO:0000247" compartment="ce" ...
   fbc:charge="0" fbc:chemicalFormula="C6H10O5">

When loaded in Matlab using TranslateSBML, SBO terms are easily accessible at (similar for metabolites and reactions):

modelSBML.species(:).sboTerm

Parsing this in exportModel and importModel should be relatively straight forward, and should be stored in the metMiriams and rxnMiriams structures.

Note that ravenCobraWrapper does already have the option to parse SBO terms from the MIRIAMS structures to their COBRA counterpart fields (metSBOTerms).

System information

  1. RAVEN version: all
  2. Operating system: all

I hereby confirm that I have:

@haowang-bioinfo
Copy link
Member

should be stored in the metMiriams and rxnMiriams structures.

COBRA deals with SBO terms in a different way by using metSBOTerms and rxnSBOTerms fields (according to yeast-GEM v8.3.3). Since there is an ongoing issue toward aligning RAVEN and COBRA model fields (#184), maybe good to have a discussion there about weather storing in xyzMiriams or separate fields before the implementation.

@edkerk
Copy link
Member Author

edkerk commented Dec 12, 2018

@Hao-Chalmers For merging the model fields, it was proposed to use the COBRA version, taking advantage of their new getModelFieldsForType and similar functionality. So solving this issue would indeed not be relevant for future versions of RAVEN that directly work on COBRA models, but it would still be good to fix this for the meanwhile for the classical RAVEN model format. It's not so hard to code, will dedicate an hour to before the end of the year.

@JonathanRob
Copy link
Collaborator

JonathanRob commented Aug 16, 2019

This feature is currently of interest to me, so I implemented the changes into exportModel and importModel (now on feat/SBO_SBML_IO branch).

Regarding changes to importModel, I originally planned to parse SBO terms from the .sboTerm fields of the SBML file (for metabolites, reactions, genes, and compartments), and add them to the corresponding miriam fields in the model structure. However, this leads to a situation where if a user exports a model (with no SBO terms specified in any miriam fields), then re-imports that SBML, the model will now have the default SBO terms (assigned in exportModel) loaded into all the miriam fields.

This seemed unnecessary to me, so I left the original formulation which will only import SBO terms from the SBML if it is part of the .annotation field (i.e., only if the SBO term was included in the miriam field when exported). This results in miriam fields that remain identical after export/reimport cycle, which I think is preferred.

I also have a quick question: The default SBO term assigned to genes by exportModel is SBO:0000252, which corresponds to "polypeptide chain". Is there a reason we are not using SBO:0000243, which corresponds to "gene"?

@simas232
Copy link
Collaborator

@JonathanRob, feel three to change the default SBO terms in exportModel to the most relevant ones. When I implemented OutputSBML into exportModel two years ago, I think I used SBO terms which were considered by COBRA developers during that time.

@edkerk
Copy link
Member Author

edkerk commented Aug 16, 2019

Perhaps check if particular SBO terms are part of a test in memote. If we'd like to use different SBO terms, we should then also advocate this with memote people (and COBRA?).

Regarding the I/O function you introduced, I'll check it out. I think it's also important that it can deal with non-RAVEN models, so where do COBRA- and cobrapy-exported models store the SBO terms in the XML file?

@JonathanRob
Copy link
Collaborator

@edkerk Ah, I should have mentioned, memote uses SBO:0000243 for genes, which is what I am suggesting to change it to.

But good point regarding functionality with COBRA-exported models, I will probably need to add a few things for that.

@edkerk
Copy link
Member Author

edkerk commented Aug 16, 2019

@JonathanRob In that case, definitely use SBO:0000243 for genes!

@edkerk edkerk added this to the v2.2.3 milestone Aug 16, 2019
@simas232
Copy link
Collaborator

@JonathanRob, regarding SBO terms parsing in importModel, we may add the check for uniqueness of SBO terms for each entity type separately. For instance, if there are at least two unique SBO terms between all metabolites, we may consider importing all SBO terms into metMiriams field. However, if the same SBO terms are used for all entities of the same type (e.g. genes), we may neglect them, since they would be added again to the model upon exportModel.

On the other hand, if we agree to implement this check, the changes would be made to exportModel as well. That is, there would be a check, whether xyzMiriams contains SBO terms, and if it does, then it would neglect the default SBO terms and use the ones available in xyzMiriams. But in such case, the written SBML file would have duplicate information (SBO terms in .annotation and .sboTerm), perhaps not the best idea after all.. Unless we do not write SBO terms to .annotation during the model export.

@edkerk
Copy link
Member Author

edkerk commented Aug 16, 2019

ravenCobraWrapper should also be updated to parse between geneSBOterms and geneMiriams. For rxns and mets it should already be working.

@JonathanRob
Copy link
Collaborator

@simas232 I think a check for uniqueness of SBO terms when importing could be a good option, since the user probably doesn't really need the terms if they're identical for every entry of a particular entity.

As for the exportModel aspect, this check is essentially already implemented; i.e., the list of all SBO terms is first populated with the default values, and they are overwritten if an SBO term is found in a corresponding xyzMiriams field.

Regarding the duplication of SBO term information in the SBML file (present in both .annotation and .sboTerm fields), maybe it would be better to remove it from the .annotation fields altogether. It seems that Cobra-style models anyway use .sboTerm instead of .annotation.

@simas232
Copy link
Collaborator

Nice! Okay, then we should try to implement these changes to importModel and exportModel accordingly and finally solve this issue.

@JonathanRob
Copy link
Collaborator

JonathanRob commented Aug 17, 2019

  • exportModel now only adds SBO terms to the SBML .sboTerm fields, instead of duplicating the information in the .annotation fields. I have also updated the default gene SBO term to SBO:0000243, which is what memote uses.
  • importModel now reads SBO terms from the SBML .sboTerm fields, which should be compatible with both RAVEN- and COBRA-generated SBML files. Note that SBO terms are not imported if they are all identical for a given entity (e.g., if all metabolites have the same SBO term, it will not be imported into the model structure).
  • ravenCobraWrapper now converts gene SBO terms between the .geneMiriams and .geneSBOTerms fields of RAVEN and COBRA model structures, respectively.

I'll soon make a PR to implement these changes.

@edkerk edkerk added the fixed in develop This issue is fixed and pushed to develop branch. Will be closed when fix appears in master branch. label Oct 2, 2019
@simas232 simas232 closed this as completed Nov 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fixed in develop This issue is fixed and pushed to develop branch. Will be closed when fix appears in master branch.
Projects
None yet
Development

No branches or pull requests

4 participants