RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input xxx to have 3 channels, but got xx channels instead #25
Comments
Here are some of my attempts at fixing this error: LLaVA-UHD/llava_uhd/train/llava-uhd/adapt_clip.py Lines 311 to 357 in 302301b
In the forward pass of adapt_CLIPVisionTower, the input ... So I changed this to:
With this change, no error is reported here. But after this, in adapt_llava.py, there seems to be another magic number: LLaVA-UHD/llava_uhd/train/llava-uhd/adapt_llava.py Lines 201 to 204 in 302301b
I don't know what 8 and 4 mean, but this code reports another error:
Also, in this function cur_image_idx should be incremented automatically, but I can't see any. Please check this code and clear up my confusion, thanks!
Hi @wnzhyee! I've released another implementation of LLaVA-UHD here, which I believe is more stable and elegant. The new repo's code originates from this repo, but its overall quality is improved, and the training program has been tested to run normally without bugs. When I reviewed this old repo and tried to fix this ... You are very welcome to use it, and I look forward to your feedback.
Hi, I hit the same problem when reproducing the pretraining step. It happens when computing patch_embedding in
LLaVA-UHD/llava_uhd/train/llava-uhd/adapt_clip.py
Line 79 in 302301b
It seems a similar problem was raised about 2 months ago. Is there a specific timeline for fixing it?
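For anyone debugging this: the weight size [1024, 3, 14, 14] in the error message follows the PyTorch Conv2d layout (out_channels, in_channels / groups, kernel_h, kernel_w), so the patch embedding expects the input's channel axis to be 3. Below is a minimal sketch of a shape check that could be run on the tensor shape just before the patch embedding call. The helper `check_conv_input` and the 336x336 image size are hypothetical, chosen only for illustration; they are not part of the LLaVA-UHD code, and the sketch does not say what shape adapt_clip.py actually receives.

```python
def check_conv_input(input_shape, weight_shape, groups=1):
    """Return None if input_shape fits a Conv2d weight, else a diagnostic string.

    Hypothetical helper, not part of LLaVA-UHD. weight_shape follows the
    PyTorch Conv2d layout: (out_channels, in_channels // groups, kh, kw).
    """
    _out_ch, in_per_group, _kh, _kw = weight_shape
    expected = in_per_group * groups

    # Conv2d accepts batched (N, C, H, W) or unbatched (C, H, W) inputs only.
    if len(input_shape) not in (3, 4):
        return (f"expected a 3D or 4D input, got {len(input_shape)}D "
                f"shape {tuple(input_shape)}")

    channels = input_shape[-3]  # channel axis in both accepted layouts
    if channels == expected:
        return None

    msg = (f"expected {expected} channels, got {channels} "
           f"for shape {tuple(input_shape)}")
    if input_shape[-1] == expected:
        # Common cause: image left in (N, H, W, C) order.
        msg += "; last axis matches, so the input may be channels-last"
    return msg


# Weight size copied from the error message in this issue.
CLIP_PATCH_WEIGHT = (1024, 3, 14, 14)

# Illustrative shapes only (336 is an assumed image size).
print(check_conv_input((2, 3, 336, 336), CLIP_PATCH_WEIGHT))
print(check_conv_input((2, 336, 336, 3), CLIP_PATCH_WEIGHT))
print(check_conv_input((2, 6, 3, 336, 336), CLIP_PATCH_WEIGHT))
```

With a real tensor, passing `tuple(x.shape)` and `tuple(conv.weight.shape)` should be enough to see which axis the convolution treats as channels.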