Add options to precompute the epoch #569

Open: wants to merge 17 commits into base: main
knighton
Contributor

Add options to pre-generate the epoch. This should save a lot of time when substantial work happens between creating the StreamingDataset and iterating over it.

Pre-generation can run concurrently with the last third of init and beyond if you specify which epoch and sample offset to generate (init_pregen_epoch, init_pregen_sample). Note that this happens before any load_state_dict(), so if training is going to resume from somewhere other than 0:0, we won't know that at that time, although the user might. Also, we can't just pre-generate all the epochs at once because of RAM/scale concerns. Finally, DataLoader num_workers must be provided for this to work, as a rank process has no other way to know it short of resorting to the garbage collector trampoline.

Pre-generation can also happen on the fly, which is easier: set the bool arg pregen_next_epoch, which pre-generates epoch + 1:0 in the background once the current epoch has been generated (or loaded from a pre-generated copy).
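
To illustrate the on-the-fly mode, here is a minimal sketch of the pattern, not the code in this PR. The class `NextEpochPregenerator`, its `get` method, and the `generate` callback are hypothetical names; the sketch assumes that generating an epoch is a pure function of the epoch number and uses a plain thread as the background worker.

```python
import threading
from typing import Callable, Dict, List


class NextEpochPregenerator:
    """Hypothetical helper: serves epoch orderings and builds epoch + 1 in the background."""

    def __init__(self,
                 generate: Callable[[int], List[int]],
                 pregen_next_epoch: bool = True) -> None:
        self._generate = generate
        self._pregen_next_epoch = pregen_next_epoch
        self._lock = threading.Lock()
        self._ready: Dict[int, List[int]] = {}
        self._threads: Dict[int, threading.Thread] = {}

    def _work(self, epoch: int) -> None:
        # Runs in the background thread: generate, then publish the result.
        order = self._generate(epoch)
        with self._lock:
            self._ready[epoch] = order

    def _start(self, epoch: int) -> None:
        # Start at most one background generation per epoch.
        with self._lock:
            if epoch in self._ready or epoch in self._threads:
                return
            thread = threading.Thread(target=self._work, args=(epoch,), daemon=True)
            self._threads[epoch] = thread
            thread.start()

    def get(self, epoch: int) -> List[int]:
        # If this epoch is being pre-generated, wait for it to finish.
        with self._lock:
            thread = self._threads.pop(epoch, None)
        if thread is not None:
            thread.join()
        with self._lock:
            order = self._ready.pop(epoch, None)
        if order is None:
            # Nothing was pre-generated: generate synchronously.
            order = self._generate(epoch)
        if self._pregen_next_epoch:
            # Kick off epoch + 1 while the caller consumes the current epoch.
            self._start(epoch + 1)
        return order
```

With this shape, calling `get(0)` generates epoch 0 synchronously and returns immediately after starting epoch 1 in the background, so a later `get(1)` only has to wait for whatever part of that work is still unfinished.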

Details are controlled by pregen_epoch_timeout (defaults to 720 seconds, i.e. 12 min) and pregen_epoch_tick (defaults to 0xCAFE / 1337 / 42 ≈ 0.925, or just under a second).
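
For reference, the two defaults can be reproduced with the standard library, and a timeout/tick pair like this is typically used in a polling wait. The sketch below assumes that the tick is the polling interval and the timeout is the maximum total wait, which this description does not state explicitly. `wait_until` and `is_ready` are hypothetical names, not part of this PR.

```python
import math
import time
from typing import Callable, Optional

# Same values as the defaults in this PR, written with the standard library.
PREGEN_EPOCH_TIMEOUT = float(math.prod(range(1, 7)))  # 1*2*3*4*5*6 = 720 s = 12 min
PREGEN_EPOCH_TICK = 0xCAFE / 1337 / 42  # 51966 / 1337 / 42, about 0.925 s


def wait_until(is_ready: Callable[[], bool],
               timeout: Optional[float] = PREGEN_EPOCH_TIMEOUT,
               tick: float = PREGEN_EPOCH_TICK) -> bool:
    """Poll is_ready() every tick seconds; return False if timeout elapses first.

    A timeout of None waits indefinitely (an assumed reading of Optional[float]).
    """
    start = time.monotonic()
    while not is_ready():
        if timeout is not None and time.monotonic() - start >= timeout:
            return False
        time.sleep(tick)
    return True
```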

                 init_pregen_epoch: Optional[int] = None,
                 init_pregen_sample: Optional[int] = None,
                 pregen_next_epoch: bool = True,
                 pregen_epoch_timeout: Optional[float] = float(np.arange(1, 7).prod()),
                 pregen_epoch_tick: float = 0xCAFE / 1337 / 42,
                 num_workers: Optional[int] = None,

    def _push_back_pregen_epoch_todo(self, todo_filename: str, epoch: int, sample: int) -> None:
    def _pop_front_pregen_epoch_todo(self, todo_filename: str) -> Tuple[int, int, int]:
    def _request_pregen_epoch(self, epoch: int, sample: int) -> None:
    def _each_pregen_epoch_todo(self) -> Iterator[Tuple[int, int]]:
    def _pregen_epoch_loop(self) -> None: