Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: pmids #254

Merged
merged 24 commits into from
Jun 2, 2021
Merged

Conversation

mihai-sysbio
Copy link
Member

@mihai-sysbio mihai-sysbio commented Apr 30, 2021

Main improvements in this PR:

The reaction references field is inconsistent. This PR aims to update this field by:

  • removing uniprot ids
  • removing omim ids
  • removing hmdb ids
  • using only pmids
  • formatting the field correctly, ie PMID:\d+ separated by ;
  • highlight broken references by having the field start with ??

I hereby confirm that I have:

  • Tested my code on my own computer for running the model
  • Selected develop as a target branch

@mihai-sysbio
Copy link
Member Author

I'm thinking this PR is easier to review in stages, so I'll take a break for an intermittent review; please feel free to continue the work with more commits on the same branch.

Copy link
Collaborator

@JonathanRob JonathanRob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good; thank you for fixing this, @mihai-sysbio!

@haowang-bioinfo
Copy link
Member

haowang-bioinfo commented May 10, 2021

I wonder how's the reference field defined by SBML spec?

@mihai-sysbio
Copy link
Member Author

According to the documentation, looking for publication or pubmed, the closest example I found is:
image

In Human-GEM.xml this is stored as:

      <reaction metaid="R_HMR_3905" sboTerm="SBO:0000176" id="R_HMR_3905" reversible="false" fast="false" fbc:lowerFluxBound="FB2N0" fbc:upperFluxBound="FB3N1000">
        <notes>
          <body xmlns="http://www.w3.org/1999/xhtml">
            <p>Confidence Level: 0</p>
            <p>AUTHORS: PMID:10868354;PMID:12491384;PMID:12818203;PMID:14674758;PMID:15289102;PMID:15299346;PMID:15327949;PMID:15682493;PMID:15713978</p>
          </body>
        </notes>

based on Human-GEM.yml

      - references: "PMID:10868354;PMID:12491384;PMID:12818203;PMID:14674758;PMID:15289102;PMID:15299346;PMID:15327949;PMID:15682493;PMID:15713978"

To me, the format PMID:\d+ is a step forward in the right direction, and it is unclear how much further we can progress at this stage. Therefore, the scope of this PR is limited to bringing all fields to this format.

@haowang-bioinfo
Copy link
Member

@mihai-sysbio I wonder how could "broken references" be determined?

@mihai-sysbio
Copy link
Member Author

@mihai-sysbio I wonder how could "broken references" be determined?

I could not find these "broken" references by PMID nor DOI. Occasionally it is because they are printed books, and they have only ISBN.

@mihai-sysbio
Copy link
Member Author

Perhaps it would make sense to add a - notes: "" field for the reactions where the reference doesn't match the fixed PMID.

@haowang-bioinfo
Copy link
Member

haowang-bioinfo commented May 12, 2021

yes, it's possible to have a field rxnNotes that is a non-mandatory one defined in RAVEN.

@mihai-sysbio
Copy link
Member Author

@Hao-Chalmers I suspect rxnNotes is a Matlab field that is not mapped to the yml format used in the develop branch - or is it?

@haowang-bioinfo
Copy link
Member

@mihai-sysbio this field is possible to be mapped to yml file if needed.

@JonathanRob
Copy link
Collaborator

Another option is to add the rxnNotes field to the reaction annotation file. However, that would mean that the field would not be present in the yml model file.

@haowang-bioinfo
Copy link
Member

The corrections are huge in this PR - this is great! Is it okay to be merged? @mihai-sysbio

@mihai-sysbio
Copy link
Member Author

Is it okay to be merged?

I still have a few hundred corrections, I think. There is a checklist at the beginning of the PR where I'm keeping track of the subtasks. When they're done, I will mark the PR as "ready for review" (non-draft).

@haowang-bioinfo
Copy link
Member

fantastic

@mihai-sysbio mihai-sysbio marked this pull request as ready for review May 31, 2021 06:36
@mihai-sysbio
Copy link
Member Author

mihai-sysbio commented May 31, 2021

I've finished going through the references fields. I haven't created any notes or rxnNotes field - only highlighted relevant cases as broken references. There are 196 such cases in the 12673 reference lines, but since these include repetitions, I would guess the number of references to be looked up to be half that. I propose to raise an issue for fixing the remaining issues, easily identifiable when looking for - references: "??.

@haowang-bioinfo
Copy link
Member

I propose to raise an issue for fixing the remaining issues, easily identifiable when looking for - references: "??.

Great idea, please proceed to making one.

@mihai-sysbio mihai-sysbio mentioned this pull request Jun 1, 2021
3 tasks
@haowang-bioinfo
Copy link
Member

@mihai-sysbio can you merge develop into this PR so that it can be conveniently merge back.

@mihai-sysbio
Copy link
Member Author

@Hao-Chalmers surre, let me know if the above commit didn't do the trick.

@haowang-bioinfo
Copy link
Member

good job! @mihai-sysbio

@haowang-bioinfo haowang-bioinfo merged commit de4bafc into SysBioChalmers:develop Jun 2, 2021
@mihai-sysbio mihai-sysbio deleted the fix/pmids branch June 2, 2021 06:42
@haowang-bioinfo haowang-bioinfo mentioned this pull request Jun 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants