Model training based on pretrained classification model does not work as expected #916

curtinmjc · 2024-11-06T22:14:05Z

Search before asking

I have searched the HUB issues and discussions and found no similar questions.

Question

Background:

I have a large dataset that exhausts the available Google Colab system RAM, even with the A100 runtime, so I decided to try splitting the dataset into multiple pieces.
To eventually have a model that covers the entire dataset, I thought I could use the Ultralytics HUB capability of basing a new model on a previously trained model.

What I did and learned:

In Train a Model - Step 2 of 3 (1st screenshot) I selected the Custom tab and picked my previous model that was trained on the first dataset. Doing that displayed a panel that showed the prior model was Pretrained = Yes. And the Advanced Model Configuration for the new model had the Pre-trained toggle button On (2nd screenshot).
I then went through the steps of starting new model training using Google Colab (3rd screenshot).
After training started I went back to the Ultralytics model web page which showed that training had started. It also showed the training parameter settings which included Pretrained = No (4th screenshot).
After training of the new model was complete, I did inference testing using the Ultralytics model's Preview tab. In all cases, the images associated with the original pretrained model (the "parent" of the new model) were incorrectly identified as one of the classes used in training the new model. I.e., the "parent" model's training images did not flow through to the new model.
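A quick way to see which labels each dataset piece can contribute is to compare their class folders. The sketch below assumes the usual folder-per-class classification layout (`<root>/train/<class_name>/...`) and assumes that a trained classifier can only predict labels present in the dataset it was trained on; neither assumption is confirmed in this thread. The helper names `class_names` and `compare_pieces` are hypothetical, not part of Ultralytics.

```python
from pathlib import Path


def class_names(dataset_root, split="train"):
    """List class-folder names under <dataset_root>/<split>/ in sorted order."""
    split_dir = Path(dataset_root) / split
    return sorted(p.name for p in split_dir.iterdir() if p.is_dir())


def compare_pieces(parent_root, child_root):
    """Compare the class folders of two dataset pieces (hypothetical helper)."""
    parent = set(class_names(parent_root))
    child = set(class_names(child_root))
    return {
        "shared": sorted(parent & child),        # labels present in both pieces
        "parent_only": sorted(parent - child),   # labels only the first piece has
        "child_only": sorted(child - parent),    # labels only the second piece has
    }
```

If `parent_only` is non-empty, images of those classes have no matching label in the second piece, which would be consistent with the Preview results described above.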

Additional

Questions:

Should the images of the pretrained "parent" model flow through to the "child" model?
Can you explain why the model's page (4th screenshot) shows Pretrained = No when the parameters specified in model creation Step 2 of 3 show Pretrained = Yes?

UltralyticsAssistant · 2024-11-06T22:14:36Z

👋 Hello @curtinmjc, thank you for raising an issue about Ultralytics HUB 🚀! Your contribution helps us improve and address any potential issues.

For more insights into the Ultralytics HUB, please explore our comprehensive HUB Docs:

Quickstart. Start training and deploying YOLO models with HUB effortlessly.
Datasets: Preparing and Uploading. Learn how to manage your datasets efficiently.
Projects: Creating and Managing. Organize your models into projects for better management.
Models: Training and Exporting. Understand the intricacies of training YOLO models and exporting them for deployment.
Integrations. Discover integration options for your models including TensorFlow, ONNX, and more.
Ultralytics HUB App. Leverage the Ultralytics App to run models on iOS and Android devices.
- iOS. Learn how to exploit YOLO models on iOS.
- Android. Explore capabilities on Android.
Inference API. Utilize our Inference API for cloud predictions with your models.

As this seems like a 🐛 Bug Report, we would appreciate it if you could provide a minimum reproducible example (MRE) to help us understand the issue better. This can include a detailed description along with code snippets or configuration files that showcase the problem.

On your specific concerns:

The expected behavior of pretrained models and whether the "parent" images should affect the "child" model is definitely something an Ultralytics engineer will look into for you.
The discrepancy between the Pretrained parameter settings in your workflow and the final model training page will also be addressed.

We aim to resolve all issues promptly and appreciate your patience. An Ultralytics engineer will follow up with you soon to provide further assistance. Thank you for your understanding! 😊

curtinmjc · 2024-11-07T20:17:10Z

As far as a minimum reproducible example (MRE) is concerned, the first model training used the Official YOLO11n classify architecture with my first custom dataset. The second model training started from the Custom model that I had trained initially and used my second custom dataset. I can provide you with the two Classify datasets, but I do not believe there is anything special about them.
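The two-stage sequence described above could be approximated outside HUB with the `ultralytics` Python package. This is only a sketch under assumptions: the dataset paths are placeholders, HUB may configure its runs differently, and `lost_labels` and `reproduce` are hypothetical helpers. It relies on `model.names` being an index-to-label mapping and on the trainer exposing the path of its best checkpoint as `trainer.best`.

```python
def lost_labels(parent_names, child_names):
    """Return parent labels that the child model no longer lists.

    Both arguments are index -> label mappings, the shape of `model.names`.
    """
    return sorted(set(parent_names.values()) - set(child_names.values()))


def reproduce(first_data, second_data, epochs=30):
    """Train stage 1 from the official weights, then stage 2 from stage 1."""
    # Imported lazily so the helper above works without the package installed.
    from ultralytics import YOLO

    parent = YOLO("yolo11n-cls.pt")                # official YOLO11n classify weights
    parent.train(data=first_data, epochs=epochs)   # first custom dataset (placeholder path)
    best = str(parent.trainer.best)                # best checkpoint written by stage 1
    parent_names = dict(YOLO(best).names)

    child = YOLO(best)                             # stage 2 starts from the custom model
    child.train(data=second_data, epochs=epochs)   # second custom dataset (placeholder path)
    return lost_labels(parent_names, dict(child.names))
```

Calling `reproduce("path/to/first_dataset", "path/to/second_dataset")` would return the labels from the first run that the second model cannot output, which is one concrete thing to attach to a report like this one.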

pderrenger · 2024-11-08T03:36:11Z

Hello @curtinmjc,

Thank you for providing additional context about your training process. It sounds like you're using a well-structured approach with the YOLO11n classify architecture and custom datasets. To address the issue you're experiencing, here are a few steps you can take:

Verify with Latest Versions: Ensure that you're using the latest version of the Ultralytics packages and the Ultralytics HUB. Updates often include bug fixes and improvements that might resolve your issue.
Check Model Configuration: Double-check the configuration settings for your second model training. Ensure that the pretrained model option is correctly set and that any changes in the configuration are saved before starting the training.
Review Training Logs: Examine the training logs for any discrepancies or warnings that might indicate why the pretrained setting is not being applied as expected.
Reproduce the Issue: If possible, try to reproduce the issue with a smaller subset of your data. This can help isolate the problem and make it easier to identify any specific causes.
Community Support: While I can't provide private support, I encourage you to share your findings and any additional questions on our GitHub Discussions or join our Discord community for further assistance from other users and the Ultralytics team.

Your feedback is invaluable, and we're here to help you get the most out of Ultralytics HUB. If you have any more details or questions, feel free to share them. 😊

curtinmjc · 2024-11-08T17:02:28Z

Thank you for your response. My answers to your listed steps:

Verify with Latest versions -- the training is happening in Google Colab. The versions are as follows: Ultralytics 8.3.28 🚀 Python-3.10.12 torch-2.5.0+cu121 CUDA:0 (NVIDIA A100-SXM4-40GB, 40514MiB)
Check Model Configuration -- I have screenshots of all sections of the web page Train a Model Step 2 of 3 and nothing was changed besides number of Epochs being reduced from 100 to 30 to save on training costs. (Note: I had the same issue with the behavior of pretrained models in prior days when I had Epochs set to the default 100.)
Review Training Logs -- I replicated the model training today to check the Google Colab logs. There is nothing in the logs after "Kernel started..." and nothing unusual in the Colab Output pane.
Reproduce the Issue -- I already did this before opening the GitHub Issue yesterday using a smaller subset of the data. The datasets used only contain three classes each.

pderrenger · 2024-11-09T04:07:38Z

Hello @curtinmjc,

Thank you for the detailed follow-up! It sounds like you've been thorough in your troubleshooting process. Let's see how we can further assist you:

Version Check: Your setup with Ultralytics 8.3.28, Python 3.10.12, and torch 2.5.0+cu121 looks good. It's always a good idea to ensure compatibility, and your versions appear to be recent. 🚀
Model Configuration: Since you've verified the settings and the issue persists even with different epoch settings, it might be worth checking if there are any cached configurations or settings that could be affecting the training process. Sometimes clearing the cache or starting a fresh session can help.
Training Logs: The absence of detailed logs can be tricky. Ensure that logging is enabled in your Colab environment. You might want to add some print statements or logging commands in your training script to capture more detailed outputs. This can help identify if the pretrained model is being loaded correctly.
Reproduce the Issue: Given that you've already tried with a smaller dataset, it might be beneficial to test with a completely different dataset or a different model architecture to see if the issue persists. This can help determine if the problem is specific to your current setup or more general.
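One low-effort way to capture what the trainer actually received is to read the `args.yaml` file that Ultralytics writes into each run directory. The sketch below assumes the run directory is reachable from the Colab session (its location, for example under `runs/classify/`, is an assumption) and that the file holds flat `key: value` lines. `read_run_args` is a hypothetical helper, not part of Ultralytics.

```python
from pathlib import Path


def read_run_args(run_dir, keys=("model", "pretrained", "data", "epochs")):
    """Return selected training arguments from <run_dir>/args.yaml.

    Uses a tiny line parser instead of a YAML library so the sketch has no
    dependencies; it only understands flat `key: value` lines and returns
    every value as a string.
    """
    found = {}
    for line in (Path(run_dir) / "args.yaml").read_text().splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in keys:
            found[key.strip()] = value.strip()
    return found
```

Printing `read_run_args("runs/classify/train")` after a run shows the `model` and `pretrained` values the trainer recorded, which can then be compared with what the HUB page displays.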

If the issue continues, consider sharing a minimal reproducible example with the community on GitHub Discussions or Discord. This can provide more insights and allow others to replicate and diagnose the problem.

Thank you for your patience and for working with the community to resolve this. If you have any more questions or need further assistance, feel free to reach out. 😊

curtinmjc added the "question" label (Further information is requested) on Nov 6, 2024

UltralyticsAssistant added the "classify" (Image Classification issues, PR's) and "HUB" (Ultralytics HUB issues) labels on Nov 6, 2024
