FM-v4 branch into main #752
Merged
101 commits
ae4add3
Update BalancedBatchSampler to use datasets' `data_sizes` method
nimashoghi 01fe2b4
Remove python 3.10 syntax
nimashoghi 2bf8213
Documentation
nimashoghi 7ba5b8a
Added set_epoch method
a367d1e
Format
46e3c57
Changed "resolved dataset" message to be a debug log to reduce log spam
87714f5
Minor changes to support multitask
abhshkdz 3105359
add in pickle data set; add in stat functions for combining mean and …
misko e170f53
checksums for equiformer
misko 3ea4dc4
detach compute metrics and add checksum function for linear layer
misko bbda257
Merge branch 'main' into fm-v2-pickle
misko 102667f
change name to dataset_configs
misko 2c571fd
add seed option
misko 319a597
remove pickle dataset
misko d1f2ccf
remove pickle dataset
misko 1e7548d
add experimental datatransform to ase_dataset
misko 845bce3
update with main
lbluque 86da069
clean up batchsampler and tests
lbluque ff628dd
base dataset class
lbluque 122197f
move lin_ref to base dataset
lbluque fb4ce16
inherit basedataset for ase dataset
lbluque c9e1759
filter indices prop
lbluque 85b8ab9
updated import for ase dataset
wood-b e227de3
Merge branch 'fm-v3' of github.com:Open-Catalyst-Project/ocp into fm-v3
wood-b 95d3e6f
added create_dataset fn
wood-b b6c640e
yaml load fix
lbluque 7fa1904
create dataset function instead of filtering in base
lbluque 04c96bf
remove filtered_indices
lbluque ea35b57
make create_dataset and LMDBDatabase importable from datasets
lbluque dc98285
create_dataset cleanup
lbluque 2339916
test create_dataset
lbluque 9b58cc7
use metadata.natoms directly and add it to subset
lbluque 63c03fc
use self.indices to handle shard
lbluque 76322aa
rename _data_sizes
lbluque 0e7e4a8
merge with main-legacy
lbluque bb41b13
merge with main-legacy + no more data_sizes
lbluque b4e22bc
fix Subset of metadata
lbluque a6cc2c2
fix up to be mergeable
misko 7033d10
merge in monorepo
misko 899a227
small fix for import and keyerror
misko 29b6e68
minor change to metadata, added full path option
wood-b f9b15cd
Merge branch 'main' into balanced-batch-sampler+base-dataset
wood-b dc59f96
import updates
wood-b 505cc24
Merge branch 'balanced-batch-sampler+base-dataset' into fm-v4
wood-b 44234b7
minor fix to base dataset
wood-b 63348fd
skip force_balance and seed
misko 45a2b4a
adding get_metadata to base_dataset
wood-b 64b8df2
implement get_metadata for datasets; add tests for max_atoms and bala…
misko fec7fc7
merge in basedataset branch
misko 80fea27
a[:len(a)+1] does not throw error, change to check for this
misko 80c8e6b
Merge branch 'balanced-batch-sampler+base-dataset' into fm-v4
misko f4910bc
bug fix for base_dataset
wood-b e93e73f
max atoms branch
misko 883a15f
fix typo
misko e58a53a
Merge branch 'max_atoms' into fm-v4
misko 9b87082
do pbc per system
misko d8cf857
add option to use single system pbc
misko 061abf9
add multiple mapping
misko 0b3f9fe
merge
misko 5277b4f
Merge branch 'fm-v4-add-multiple-mapping' into fm-v4
misko 18c15f8
lint and github workflow fixes
misko fb24889
track parent checkpoint for logger grouping
mshuaibii 57a2eaf
add generator to basedataset
misko 870fd22
Merge branch 'fm-v4' of github.com:Open-Catalyst-Project/ocp into fm-v4
misko e50120d
check path relative to yaml file
misko 2e557ad
add load and exit flag to base_trainer
misko 7ef4aec
add in merge mean and std code to utils
misko 20e62b5
add log when passing through mean or computing; check other paths for…
misko 87869b6
add qos flag
misko 0850f34
use slurm_qos instead of qos
misko 75b7e9e
fix includes
misko 49dfca7
fix set init
misko 94f6ce1
merge main
rayg1234 e10575c
Merge remote-tracking branch 'origin/main' into fm-v4
rayg1234 5743a59
adding new notebook for using fairchem models with NEBs without CatTS…
brookwander 4880d0c
merge main
rayg1234 692147d
Merge branch 'main' into fm-v4
mshuaibii 881890e
Merge remote-tracking branch 'origin/main' into fm-v4
rayg1234 f284190
Merge branch 'main' into fm-v4
misko ed0e936
merge
misko 25327b0
merge main
misko 089de08
Merge branch 'balanced-batch-sampler+base-dataset' into fm-v4
misko 8be3c78
remove files with diff whitespace
misko 14a073b
Merge branch 'main' into fm-v4
misko 7a71c46
add resolution flag to escn
misko 2aca348
Merge branch 'add_resolution_flag_to_escn' into fm-v4
misko 371eb31
try to revert oxides
misko a23434c
revert typing
misko f11ac5e
remove white space
misko 8951360
extra line never reached
misko 67229dc
move out of fmv4 into dev
misko b031719
Merge branch 'main' into fm-v4
misko fc269b8
move avg num nodes
misko 039f9e6
Merge branch 'main' into fm-v4
rayg1234 3ec098c
Merge branch 'main' into fm-v4
misko 21eecd4
Merge remote-tracking branch 'origin' into fm-v4
rayg1234 f3e1c38
optional import from experimental
misko f2302bf
fix lint
misko 69648fb
add comments, refactor common trainer args in a single dictionary
misko 0b4c5ee
add comments, refactor common trainer args in a single dictionary
misko 07efac0
remove parent
misko
@@ -8,6 +8,12 @@

 if TYPE_CHECKING:
     from torch_geometric.data import Data
+from contextlib import suppress
+
+with suppress(ImportError):
+    # TODO remove this in favor of a better solution
+    # We should never be importing * from a module
+    from fairchem.experimental.foundation_models.multi_task_dataloader.transforms.data_object import * # noqa


 class DataTransforms:

Review comment (on the wildcard import): comment

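The hunk above guards an optional import with `contextlib.suppress(ImportError)`. As a minimal sketch of that pattern, assuming a hypothetical module name (`optional_pkg_that_is_probably_missing`) and a hypothetical helper (`has_optional_features`), the example below binds the module to an explicit name instead of using `import *`. This is one possible direction for the TODO, not necessarily what the authors intend, and it is not the fairchem code.

```python
import importlib
from contextlib import suppress

# Hypothetical module name, chosen only so the import fails in this sketch.
OPTIONAL_MODULE = "optional_pkg_that_is_probably_missing"

optional_module = None
with suppress(ImportError):
    # If the import raises ImportError (or its subclass ModuleNotFoundError),
    # the block exits silently and optional_module stays None.
    optional_module = importlib.import_module(OPTIONAL_MODULE)

stdlib_module = None
with suppress(ImportError):
    # A module that does exist is bound normally.
    stdlib_module = importlib.import_module("json")


def has_optional_features() -> bool:
    """Hypothetical helper: report whether the optional module was importable."""
    return optional_module is not None
```

Binding the module to a name makes the fallback explicit: callers can check `has_optional_features()` rather than relying on names that may or may not have been injected by a wildcard import.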
@@ -13,6 +13,7 @@
 import logging
 import os
 import random
+import sys
 from abc import ABC, abstractmethod
 from itertools import chain
 from typing import TYPE_CHECKING
@@ -232,6 +233,8 @@ def load(self) -> None:
         self.load_loss()
         self.load_optimizer()
         self.load_extras()
+        if self.config["optim"].get("load_datasets_and_model_then_exit", False):
+            sys.exit(0)

     def set_seed(self, seed) -> None:
         # https://pytorch.org/docs/stable/notes/randomness.html

Review comment (on the `load_datasets_and_model_then_exit` check): why is this needed if this is the last line of the init anyways?

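To illustrate how a "load, then exit" flag like the one in this hunk behaves, here is a minimal sketch. `TinyTrainer` and `run_load` are hypothetical stand-ins, not the fairchem trainer; only the config key `load_datasets_and_model_then_exit` and the `sys.exit(0)` call come from the diff. The sketch relies on the fact that `sys.exit` raises `SystemExit`, so a caller or test can observe the exit code.

```python
import sys


class TinyTrainer:
    """Hypothetical stand-in for a trainer; not the fairchem class."""

    def __init__(self, config: dict) -> None:
        self.config = config
        self.loaded: list[str] = []

    def load(self) -> None:
        # Pretend to run each loading step in order.
        for step in ("datasets", "model", "loss", "optimizer", "extras"):
            self.loaded.append(step)
        # Same check as in the diff: exit cleanly once everything has loaded.
        if self.config["optim"].get("load_datasets_and_model_then_exit", False):
            sys.exit(0)


def run_load(config: dict):
    """Hypothetical helper: run load() and report the exit code, if any."""
    trainer = TinyTrainer(config)
    try:
        trainer.load()
    except SystemExit as exc:  # sys.exit raises SystemExit
        return trainer, exc.code
    return trainer, None


exiting_trainer, exit_code = run_load(
    {"optim": {"load_datasets_and_model_then_exit": True}}
)
normal_trainer, no_exit_code = run_load({"optim": {}})
```

With the flag set, every loading step still runs before the process exits with status 0; without it, `load()` returns normally.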
@@ -792,6 +795,22 @@ def update_best(
                     disable_tqdm=disable_eval_tqdm,
                 )

+    def _aggregate_metrics(self, metrics):
+        aggregated_metrics = {}
+        for k in metrics:
+            aggregated_metrics[k] = {
+                "total": distutils.all_reduce(
+                    metrics[k]["total"], average=False, device=self.device
+                ),
+                "numel": distutils.all_reduce(
+                    metrics[k]["numel"], average=False, device=self.device
+                ),
+            }
+            aggregated_metrics[k]["metric"] = (
+                aggregated_metrics[k]["total"] / aggregated_metrics[k]["numel"]
+            )
+        return aggregated_metrics
+
     @torch.no_grad()
     def validate(self, split: str = "val", disable_tqdm: bool = False):
         ensure_fitted(self._unwrapped_model, warn=True)
@@ -833,20 +852,7 @@ def validate(self, split: str = "val", disable_tqdm: bool = False):
             metrics = self._compute_metrics(out, batch, evaluator, metrics)
             metrics = evaluator.update("loss", loss.item(), metrics)

-        aggregated_metrics = {}
-        for k in metrics:
-            aggregated_metrics[k] = {
-                "total": distutils.all_reduce(
-                    metrics[k]["total"], average=False, device=self.device
-                ),
-                "numel": distutils.all_reduce(
-                    metrics[k]["numel"], average=False, device=self.device
-                ),
-            }
-            aggregated_metrics[k]["metric"] = (
-                aggregated_metrics[k]["total"] / aggregated_metrics[k]["numel"]
-            )
-        metrics = aggregated_metrics
+        metrics = self._aggregate_metrics(metrics)

         log_dict = {k: metrics[k]["metric"] for k in metrics}
         log_dict.update({"epoch": self.epoch})

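The refactored `_aggregate_metrics` sums each metric's `total` and `numel` across processes and only then divides. A minimal single-process sketch of that arithmetic follows, assuming a hypothetical `all_reduce_sum` stand-in for the distributed sum (the real code calls `distutils.all_reduce` with `average=False`); the metric name and the per-rank values are made up for illustration.

```python
def all_reduce_sum(values):
    """Hypothetical stand-in for a distributed sum: adds the per-rank values."""
    return sum(values)


def aggregate_metrics(per_rank_metrics):
    """Combine per-rank {'total', 'numel'} entries into one global metric."""
    aggregated = {}
    for k in per_rank_metrics[0]:
        total = all_reduce_sum([m[k]["total"] for m in per_rank_metrics])
        numel = all_reduce_sum([m[k]["numel"] for m in per_rank_metrics])
        aggregated[k] = {"total": total, "numel": numel, "metric": total / numel}
    return aggregated


# Made-up values: rank 0 saw 3 samples (mean 2.0), rank 1 saw 1 sample (mean 4.0).
rank0 = {"energy_mae": {"total": 6.0, "numel": 3}}
rank1 = {"energy_mae": {"total": 4.0, "numel": 1}}

result = aggregate_metrics([rank0, rank1])
mean_of_rank_means = (6.0 / 3 + 4.0 / 1) / 2
```

Here the global metric is 10.0 / 4 = 2.5, while averaging the two per-rank means would give 3.0. Reducing `total` and `numel` separately keeps the result correct when ranks process different numbers of samples.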
comment on what this does