aminsagar/COSMiCS

COSMiCS2021

These are the files required to reproduce the results in our Structure paper "Structure and Thermodynamics of Transient Protein-Protein Complexes by Chemometric Decomposition of SAXS data sets". Note 1: This is a preliminary version and will be upgraded. Note 2: COSMiCS uses MCR-ALS for decomposition, so the MCR-ALS scripts (command-line version) must be on the MATLAB path. MCR-ALS can be downloaded from http://www.mcrals.info/. A brief description of the files follows:

  1. COSMiCS_multi - The main COSMiCS script, capable of handling multiple SAXS datasets. It imports the data and organizes it into the matrices passed to the ALS procedure, performs PCA, lets the user define the Q-ranges for the different representations, generates the "csel" matrices for the selectivity restraint and the "vclos" matrices for the closure restraints, and calls all the other scripts. The script asks for the number of components to be decomposed (default = 3), the convergence criterion (default = 0.1%) and the maximum number of iterations (default = 1000).
  2. als_closure2 - The main script for ALS optimization. This is a modified version of the corresponding script from MCR-ALS. Currently, the constraints to be used during the optimization are defined by hand in the "wcons" variable. However, the values can also be passed on from the main COSMiCS_multi script. The values corresponding to the constraints are available in the same script.
  3. closure_multi - This script imposes closure and is a modification of the corresponding script from MCR-ALS, changed so that the stoichiometry of the species can be defined. The sequence of stoichiometries must correspond to the sequence of the curves given as initial estimates. For example, if the initial estimates matrix has the curve of the monomer followed by that of the dimer, the stoichiometry should be defined as [1,2]; in the reverse scenario it should be [2,1]. This also means that the identity of the curves has to be known beforehand. This can be done, for example, by calculating the radius of gyration of the initial estimates using external programs such as the ATSAS suite.
  4. compare2curves - For scaling and comparison of curves.
  5. crearMat_manualpure_rank - This script generates the input matrices with multiple representations of the input SAXS data. This includes generation of the Holtzer, Kratky and Porod representations followed by their scaling. In addition, it generates the matrices of initial estimates. In the currently provided script, it compares all the curves to the first curve of each dataset, finds the most different curve (in terms of chi-square) and assigns it as the third initial estimate. The first two initial estimates are the first curves of each dataset. This works for reproducing the results described in our paper but can be modified to adapt to different systems.
  6. plots - Generates the plots.
  7. publishReport_two - Publishes report of the optimization procedure including various plots, metrics of optimization, fits etc. The current script handles decomposing two datasets seamlessly but might need modifications for different numbers of datasets.
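
Two of the steps described above can be illustrated with a short sketch. It is written in plain Python rather than MATLAB and is not taken from the COSMiCS scripts: the function names (`apply_closure`, `chi_square`, `most_different_index`), the mass-balance form of the closure (stoichiometry-weighted sum of concentrations equal to a known total) and the unscaled chi-square are all assumptions made for illustration. The actual closure_multi and crearMat_manualpure_rank scripts may differ in detail.

```python
def apply_closure(conc_row, stoichiometry, total):
    """Rescale one row of concentrations so that the assumed mass balance
    sum(n_i * c_i) == total holds (total expressed in monomer units)."""
    if len(conc_row) != len(stoichiometry):
        raise ValueError("one stoichiometry value per species is required")
    weighted = sum(n * c for n, c in zip(stoichiometry, conc_row))
    if weighted == 0:
        return list(conc_row)  # nothing to rescale
    scale = total / weighted
    return [c * scale for c in conc_row]


def chi_square(curve, reference, sigma):
    """Mean squared, error-weighted difference between two curves
    sampled on the same Q grid (no scaling applied; an assumption)."""
    terms = [((a - b) / s) ** 2 for a, b, s in zip(curve, reference, sigma)]
    return sum(terms) / len(terms)


def most_different_index(curves, sigma):
    """Index of the curve with the largest chi-square to the first curve."""
    reference = curves[0]
    scores = [chi_square(c, reference, sigma) for c in curves]
    return max(range(len(curves)), key=scores.__getitem__)


# Initial estimates ordered as monomer, dimer -> stoichiometry [1, 2].
closed = apply_closure([0.5, 0.25], [1, 2], total=2.0)

# Same species in the reverse order (dimer, monomer) -> stoichiometry [2, 1].
closed_reversed = apply_closure([0.25, 0.5], [2, 1], total=2.0)

# Pick a third initial estimate from a toy dataset of three curves.
toy_curves = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.5], [2.0, 4.0, 6.0]]
third_estimate = most_different_index(toy_curves, sigma=[1.0, 1.0, 1.0])
```

The two closure calls show why the stoichiometry order matters: the vector has to follow the column order of the initial estimates, otherwise the mass balance is applied to the wrong species.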

General Recommendations

  1. It is recommended to remove the initial points of the SAXS curves if they deviate significantly from normal behavior due to inter-particle interactions. COSMiCS_multi asks for this and can remove a user-defined number of points from all the curves.
  2. The maximum Q value for the different representations (especially Holtzer and Kratky) should be chosen so as to include the peak and exclude the noisy high-Q data. According to our tests, the exact value is not critical, but leaving out the peak or including too much of the noisy data can negatively affect the decomposition.
  3. The current scaling protocol is appropriate for the kind of data described in the paper i.e. complexation or any data not involving radical changes in intensity. In case of experiments involving large changes in intensity, e.g. amyloid formation, scaling at high-Q might yield better results.
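
As a rough illustration of the first two recommendations, the following plain-Python sketch trims a curve and builds one representation from it. It is not part of COSMiCS: the helper names `trim_curve` and `representation` and the toy data are hypothetical, and the scaling applied by the real scripts is left out. The representations use the standard definitions q·I(q) (Holtzer), q²·I(q) (Kratky) and q⁴·I(q) (Porod).

```python
def trim_curve(q, intensity, n_skip=0, q_max=None):
    """Drop the first n_skip points and every point with q above q_max."""
    pairs = list(zip(q, intensity))[n_skip:]
    if q_max is not None:
        pairs = [(qi, ii) for qi, ii in pairs if qi <= q_max]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def representation(q, intensity, power):
    """Return q**power * I(q): power 1 = Holtzer, 2 = Kratky, 4 = Porod."""
    return [(qi ** power) * ii for qi, ii in zip(q, intensity)]


# Toy curve (made-up numbers, for illustration only).
q = [0.01, 0.02, 0.05, 0.10, 0.20, 0.40]
intensity = [120.0, 100.0, 80.0, 40.0, 10.0, 1.0]

# Remove one low-Q point and cut the Kratky range at q_max = 0.20.
q_k, i_k = trim_curve(q, intensity, n_skip=1, q_max=0.20)
kratky = representation(q_k, i_k, power=2)

# Quick check that the largest value is not at the first retained point,
# i.e. the chosen range reaches past the rising part of the curve.
peak_index = max(range(len(kratky)), key=kratky.__getitem__)
```

Inspecting where the maximum falls within the retained range is one simple way to check that a chosen Q cutoff still contains the peak before running the decomposition.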

In case of any problems, please write to [email protected]
