
API Loader - Optional deletion of existing molecules during Target upload. #26

Open
duncanpeacock opened this issue Jan 28, 2021 · 0 comments
Labels
enhancement New feature or request


@duncanpeacock
Owner

Source: Frank - Meeting 28/02/2021

Problem:

Currently, when a target set is uploaded, processing upserts the molecules in the target-set file and automatically deletes any existing molecules that are not in the file. There is concern about this automatic deletion of molecules from a usability/tracking perspective.

Proposed Solution:

Diamond would like the deletion to be made optional.

  1. A "delete molecules not in target file" flag could be added to the upload_tset page, unchecked by default (forcing the user to make a decision).
  2. If this flag is unchecked, the molecules would not be deleted.
  3. A list of the affected molecules would be provided in an exception list sent to the email address/upload results page.
  4. Alternatively, the upload results page could provide a link to a list of the molecules affected.
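The flow above could be sketched roughly as below. This is a minimal illustration only, assuming a simple comparison of molecule codes; the names (`process_target_set`, `delete_missing`) are hypothetical and not the actual loader API:

```python
def process_target_set(uploaded_codes, existing_codes, delete_missing=False):
    """Return (codes_to_upsert, codes_to_delete, exception_list).

    uploaded_codes: molecule codes present in the target-set file
    existing_codes: molecule codes currently stored for the target
    delete_missing: the proposed "delete molecules not in target file"
                    checkbox, unchecked (False) by default
    """
    # Molecules in the DB but absent from the uploaded file
    missing = set(existing_codes) - set(uploaded_codes)
    if delete_missing:
        # Current behaviour: delete anything not in the file
        return list(uploaded_codes), sorted(missing), []
    # Proposed default: keep the molecules, report them as exceptions
    return list(uploaded_codes), [], sorted(missing)
```

With the flag unchecked, nothing is deleted and the missing molecules surface in the exception list instead.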

Questions/Thoughts

  1. Concerned about knock-on effects of removing the link between the zip file and targets, as it's a basic relationship in the loading process. Need to check that all the follow-on processing still works and the files in the database are correct - specific concerns:
    a) The metadata.csv provided with the upload is used to create further files: hits_ids, sites and alternate names. What to do with these? metadata.csv will also now be a partial upload, so the other files will need to be upserted rather than deleted and recreated (to maintain the link with the target dataset). This makes it more complicated.
    b) It might be good to check whether we still need these files, or whether the data should now be stored as database tables instead - if so, better to spend the time doing the job properly?
    c) I'm also a bit concerned about the .zip files that were uploaded and will be downloadable at the end of the process. The downloadable file will need to reflect the whole target, and we probably need to store the uploaded file for comparison purposes.
  2. Do we also want the validate option to pick these up as exceptions? I would say yes from a user's perspective, but it's probably a couple of hours' extra work.
  3. The email is currently sent from a configured Gmail account - for Janssen we need to make this configurable for the Janssen email system. For Diamond, it could also optionally be configured to use the STFC mailer.
  4. The link giving a list of molecules that are not in the file could fall out of the .zip processing, i.e. a downloadable link to not-loaded.zip? It won't be stored in the database, so we only know by comparing the uploaded file with what is in the DB.
  5. It also occurs to me that we are only comparing the current file with what's in the DB - so effectively only the last update. If we are making database changes, we could also consider a fuller history - or will this be picked up as part of the data provenance project in the future?
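The not-loaded.zip idea in point 4 could be as simple as bundling the missing molecule codes into an in-memory archive served behind the download link. A minimal sketch; the helper name and the not-loaded.txt layout inside the archive are assumptions, not part of the existing loader:

```python
import io
import zipfile


def build_not_loaded_zip(missing_codes):
    """Bundle the codes of molecules absent from the upload into an
    in-memory zip, suitable for offering as a one-off download link
    (the list is not stored in the database)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # One code per line; sorted for a stable, readable listing
        zf.writestr("not-loaded.txt", "\n".join(sorted(missing_codes)))
    buf.seek(0)
    return buf
```

Because the archive is built on the fly from the file-vs-DB comparison, it sidesteps the need to persist the exception list anywhere.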
@duncanpeacock duncanpeacock added the enhancement New feature or request label Jan 28, 2021