Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create extra samples with surplus images #272

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

isaac-chung
Copy link
Contributor

@isaac-chung isaac-chung commented Oct 7, 2023

Addresses issue #231

  • chunk and yield a sample every max_num_images valid images
  • refactor preprocess_gpt_interleaved and preprocess_interleaved

@isaac-chung
Copy link
Contributor Author

Few questions before I make more changes:

  • What is a good way to test this?
  • I've noticed some subtle differences between the image/text processing steps between preprocess_gpt_interleaved and preprocess_interleaved, just wondering if these could be consolidated. e.g.
    • the gpt one pads with shapes (3, 224, 224) while mmc4 depends on the image size.
    • mmc4 has 50% chance of keeping single image samples while gpt does not.
    • mmc4 avoid the situation where there's one token and it's at the end while gpt does not.

@anas-awadalla
Copy link
Collaborator

Yeah I think we can default to the mmc4 code for all of these.

For the first point this should be based on the image size. I think because most vision encoders have size 224x224 we just defaulted to that but that isn't the right way to do it.

@isaac-chung isaac-chung marked this pull request as ready for review October 8, 2023 13:24
@isaac-chung
Copy link
Contributor Author

@anas-awadalla here's a first complete draft. please let me know what you think.
Separately I think a pre-commit hook would benefit development. Could raise this in a separate PR if needed.

@anas-awadalla
Copy link
Collaborator

Great will check it tomorrow

@anas-awadalla anas-awadalla self-requested a review October 9, 2023 18:24
@anas-awadalla
Copy link
Collaborator

What would the hook contain? The code formatting?

@isaac-chung
Copy link
Contributor Author

What would the hook contain? The code formatting?

Yep, kind of like this or whatever else we want.

@anas-awadalla
Copy link
Collaborator

What would the hook contain? The code formatting?

Yep, kind of like this or whatever else we want.

This would be awesome!

@isaac-chung
Copy link
Contributor Author

@anas-awadalla hoping to get your feedback on this when you get a chance next 🙏

@anas-awadalla
Copy link
Collaborator

Sorry @isaac-chung got busy with a paper I am pushing. I will definitely review and merge this week tho!

@isaac-chung
Copy link
Contributor Author

No worries, good luck with the paper!

@isaac-chung
Copy link
Contributor Author

@anas-awadalla a gentle nudge to bubble this back up in your inbox. Hoping to close this soon 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants