train_utils media token fix #283

Merged
merged 2 commits into from
Dec 2, 2023

olo126
Collaborator

@olo126 olo126 commented Dec 2, 2023

Added lines to train_utils for a helper function that unwraps the model, and also set the media token id.

@@ -77,8 +79,11 @@ def train_one_epoch(
 batch_metadata_to_log[
     f"{datasets[dataset_ix].name}_num_tokens"
 ] = attention_mask.sum().item()
+model = unwrap_model(model)
+model.media_token_id = 400
Collaborator

Hmm, I don't think these should be here. Lines 82-84 look unnecessary.

 batch_metadata_to_log[f"{datasets[dataset_ix].name}_num_images"] = (
-    (input_ids == model.media_token_id).sum().item()
+    (input_ids == model.module.media_token_id).sum().item()
Collaborator

You should use the unwrap helper on the model here rather than accessing .module directly.
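
For illustration, here is a minimal sketch of what this suggestion could look like. It assumes `unwrap_model` simply returns `.module` when the attribute is present; the actual helper in train_utils is not shown in this thread and may differ. `ToyModel` and `ToyWrapper` are hypothetical stand-ins so the snippet runs without torch or an initialized process group.

```python
def unwrap_model(model):
    """Return the wrapped model if `model` is a wrapper (e.g. DDP), else `model`."""
    # Wrappers such as torch.nn.parallel.DistributedDataParallel expose the
    # underlying network as `.module`. Assumption: a bare model has no
    # attribute with that name.
    return model.module if hasattr(model, "module") else model


# Hypothetical stand-ins, not part of the repository.
class ToyModel:
    media_token_id = 400  # illustrative value taken from the diff in this PR


class ToyWrapper:
    def __init__(self, module):
        self.module = module


input_ids = [1, 400, 7, 400, 2]
for candidate in (ToyModel(), ToyWrapper(ToyModel())):
    # The same expression works whether or not the model is wrapped.
    media_token_id = unwrap_model(candidate).media_token_id
    num_images = sum(token == media_token_id for token in input_ids)
    print(num_images)
```

With a helper like this, the logging line reads `unwrap_model(model).media_token_id` and does not raise `AttributeError` when training runs without a wrapper, which a hard-coded `model.module` would.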

@@ -10,6 +10,8 @@
 from data_utils import DataInfo
 import random
 import numpy as np
+from torch.nn.parallel import DistributedDataParallel as DDP
Collaborator

Once lines 82-84 are removed, this import will be unused and should also be removed.

@anas-awadalla anas-awadalla merged commit eb6b8aa into mllm Dec 2, 2023
0 of 2 checks passed
@olo126 olo126 deleted the media_token_fix branch December 2, 2023 04:19