Commit

Fix model checkpoint saving issue when using PEFT: there is no check for whether the directory already exists, resulting in an error when using distributed training
gioannides committed Oct 26, 2024
1 parent 6a6bb03 commit a14d760
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion optimum/neuron/utils/peft_utils.py
@@ -176,7 +176,7 @@ def state_dict(self):

 adapter_shards_dir_model = os.path.join(output_dir, "adapter_shards", "model")
 if not os.path.isdir(adapter_shards_dir_model):
-    os.makedirs(adapter_shards_dir_model)
+    os.makedirs(adapter_shards_dir_model, exist_ok=True)

 dummy_mod = DummyModule()
 neuronx_distributed.trainer.save_checkpoint(
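
Note on the change: the failure described in the commit message looks like a check-then-create race. Under distributed training, several workers can presumably pass the `os.path.isdir` check before any of them has created the directory, and every worker after the first then fails in `os.makedirs` with `FileExistsError`. The sketch below replays that interleaving deterministically in a single process. `simulate_race` and the temporary `output_dir` are hypothetical names used only for illustration; they are not part of optimum-neuron.

```python
import os
import tempfile


def simulate_race(create):
    """Replay the interleaving where another worker creates the directory
    between this worker's existence check and its own create call.

    `create` is the directory-creation callable under test.
    Returns True if `create` survives the interleaving, False if it raises
    FileExistsError.
    """
    with tempfile.TemporaryDirectory() as output_dir:
        target = os.path.join(output_dir, "adapter_shards", "model")

        # Worker A checks first and sees that the directory is missing.
        missing = not os.path.isdir(target)

        # Worker B creates the directory before worker A gets to act.
        os.makedirs(target, exist_ok=True)

        try:
            if missing:
                # Worker A acts on its now stale check result.
                create(target)
        except FileExistsError:
            return False
        return True


# Behaviour before the commit: plain os.makedirs fails in this interleaving.
before_ok = simulate_race(os.makedirs)

# Behaviour after the commit: exist_ok=True tolerates the existing directory.
after_ok = simulate_race(lambda path: os.makedirs(path, exist_ok=True))

print("before:", before_ok, "after:", after_ok)
```

With `exist_ok=True` the surrounding `os.path.isdir` check is no longer needed for correctness, although leaving it in place, as this commit does, is harmless.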