Best practices for using CellBox on different datasets #55

Open
Mustardburger opened this issue Jul 22, 2023 · 1 comment
@Mustardburger
Collaborator

For external users who want to use CellBox on their own dataset, what is the best practice for training the model? How many models, differing only in the random seed (--working_index), should be trained before the collection has enough statistical power? This question follows from the Network Interpretation part of the Methods section in the original CellBox paper, where 1000 models were trained for downstream analysis. CellBox and its ODE solver are sensitive to weight initialization: changing the random seed (--working_index) while keeping all other configs and arguments the same can lead to very different results. So, for a new dataset, should users train a single model, or multiple models with different random seeds, to get the best performance?

Mustardburger added the "help wanted" (Extra attention is needed) label on Jul 22, 2023
@DesmondYuan
Collaborator

Thanks for the question. Users are encouraged to bootstrap their training multiple times and check the training stability. The template config provided was fine-tuned on the dataset we used in the paper and could (and should) be changed when it is applied to a different dataset.
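As a rough illustration of what "checking training stability" could look like, here is a minimal sketch. It assumes you have already trained several models with different seeds and collected one interaction-weight matrix per model. The function name `summarize_seed_stability` and the simulated `weights` array are hypothetical and not part of CellBox.

```python
import numpy as np

def summarize_seed_stability(weights):
    """Summarize agreement across models trained with different seeds.

    weights: array of shape (n_models, n_nodes, n_nodes), one
    interaction matrix per trained model (hypothetical layout).
    """
    weights = np.asarray(weights, dtype=float)
    mean = weights.mean(axis=0)
    # Sample standard deviation across models for each interaction.
    std = weights.std(axis=0, ddof=1)
    # Fraction of models whose weight has the same sign as the ensemble mean.
    sign_agreement = (np.sign(weights) == np.sign(mean)).mean(axis=0)
    return mean, std, sign_agreement

# Stand-in data: 20 "models" scattered around a known 2x2 matrix.
# In practice these would be loaded from your own training outputs.
rng = np.random.default_rng(0)
true_w = np.array([[0.0, 1.0], [-1.0, 0.0]])
weights = true_w + 0.1 * rng.standard_normal((20, 2, 2))

mean, std, agree = summarize_seed_stability(weights)
print("mean weights:\n", mean)
print("std across seeds:\n", std)
print("sign agreement:\n", agree)
```

Interactions with a small spread and high sign agreement are the ones that are stable across seeds. If these summaries keep changing noticeably as more models are added, that suggests more seeds are needed.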

Another recommended practice is to tune the training configuration using random partitions first, and then use the leave-one-out scenario as a test.
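To make the two splitting schemes concrete, here is a small sketch. It assumes each experimental condition is described by the set of perturbations applied, and that "leave-one-out" means holding out every condition involving one chosen perturbation. Both the data layout and the helper names `random_partition` and `leave_one_out` are assumptions for illustration, not CellBox's own implementation.

```python
import random

def random_partition(n_conditions, test_fraction=0.2, seed=0):
    """Randomly split condition indices into train and test sets."""
    idx = list(range(n_conditions))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(round(test_fraction * n_conditions)))
    return sorted(idx[n_test:]), sorted(idx[:n_test])

def leave_one_out(conditions, held_out):
    """Hold out every condition that involves the given perturbation."""
    train = [i for i, c in enumerate(conditions) if held_out not in c]
    test = [i for i, c in enumerate(conditions) if held_out in c]
    return train, test

# Hypothetical conditions: single perturbations and pairwise combinations.
conditions = [("A",), ("B",), ("C",), ("A", "B"), ("A", "C"), ("B", "C")]

# Step 1: tune the config on random partitions.
train, test = random_partition(len(conditions), test_fraction=0.33, seed=0)

# Step 2: test the tuned config on a leave-one-out split.
loo_train, loo_test = leave_one_out(conditions, "A")
print("random partition:", train, test)
print("leave out A:", loo_train, loo_test)
```

The leave-one-out split is the harder test, since the model never sees the held-out perturbation during training, which is why it is better suited to evaluation than to config tuning.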
