merge upstream #64
Commits on Aug 12, 2023
- Fixed final value of cosine decay lr (#1011) (commit eda5aae)
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
- commit 01b5e22
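For context on the cosine-decay fix above: the usual bug is that the schedule does not end exactly at the configured minimum. A minimal sketch of a cosine schedule that does (illustrative only, not the gpt-neox implementation):

```python
import math

def cosine_decay_lr(step, max_steps, max_lr, min_lr):
    """Cosine decay from max_lr to min_lr. At step == max_steps the
    cosine term reaches -1, so the schedule ends exactly at min_lr."""
    progress = min(step, max_steps) / max_steps
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

At `step == 0` this returns `max_lr`, and at `step == max_steps` it returns `min_lr` rather than decaying past it or stopping short.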
Commits on Aug 22, 2023
- commit d8bcd97
Commits on Aug 28, 2023
- commit 43ea51c
  * Update README.md (repeated sixteen times)
  * Update NeoXArgs docs automatically (twice)
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
Commits on Sep 13, 2023
- Bump transformers version and update enwik8 link (#1024) (commit 2922bef)
  * Update transformers version
  * Update the enwik8 URL to the one HF uses; the old one is down
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: github-actions <[email protected]>
Commits on Sep 15, 2023
- commit 960ed3d
- commit 97e376c
  * Update README.md: fix broken link
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
- commit 7821aa7
- commit cdc94ee
- commit 737c913
- commit c883e8c
Commits on Sep 18, 2023
- Fix bf16 for zero > 0 and pipeline parallelism > 0 (#1032) (commit d9166bf)
  * Fix bugs so we can use bf16 with zero > 0
  * Typo fixes
  * With the DeepSpeed updates there may be no need to do grad_accum in fp32
  * Add warning about necessity of fp32 grad_accum with bf16, pp>0, and zero1
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: github-actions <[email protected]>
- Remove support for lazy dataset implementation (#1033) (commit fcd5f92)
  * Remove lazy dataset implementation option
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
- Fix SequentialWrapper Generation (pipe_parallel_size = 0) (#1031) (commit 70af6e8)
  * Fix SequentialGeneration
Commits on Sep 20, 2023
- commit 8903a96
- Fix register_buffer parameter (#1036) (commit 0ce77ab)
- Add flash 2.x message to README.md (#1037) (commit 444c0ef)
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
Commits on Sep 23, 2023
- Add s3 checkpoint syncing (#1010) (commit f9503b7)
  * add s3 checkpoint syncing
  * remove CPCargo requirement
  * Make s3 imports try-except and separate requirements to s3 file
  * Announce feature
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
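The "try-except" import pattern mentioned in #1010 typically looks like the sketch below. The package name `boto3` and the function name are assumptions for illustration; the actual gpt-neox code may guard different modules:

```python
# Guarded optional dependency: if the s3 extras from the separate
# requirements file are not installed, importing this module still works,
# and only actually *using* s3 syncing raises a clear error.
try:
    import boto3  # hypothetical optional s3 dependency
    HAS_S3 = True
except ImportError:
    boto3 = None
    HAS_S3 = False

def maybe_sync_checkpoint(path, bucket):
    """Upload a checkpoint to s3 if the optional requirements are present."""
    if not HAS_S3:
        raise RuntimeError(
            "s3 checkpoint syncing requested, but the s3 requirements are not installed"
        )
    # a real implementation would upload `path` to `bucket` here
```

This keeps the core training path importable on machines without the s3 extras, which is the point of splitting them into their own requirements file.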
Configuration menu - View commit details
-
Copy full SHA for 390d37c - Browse repository at this point
Copy the full SHA 390d37cView commit details -
Configuration menu - View commit details
-
Copy full SHA for e431ff5 - Browse repository at this point
Copy the full SHA e431ff5View commit details
Commits on Sep 25, 2023
- Remove the NeoX implementation of GPT2Tokenizer (#1042) (commit 2ab05be)
  * Try out just using the HF implementation
  * Rely solely on HF tokenizer
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: github-actions <[email protected]>
- Pre-compute RoPE embeddings in fp32 (#1041) (commit 3bfedf4)
  * Pre-commit
  * Sequence dimension is 0
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
Commits on Sep 27, 2023
- Patch LR Annealing Bug (#1046) (commit ba51ca0)
  * Ensure that LR annealing is correct even after loading from checkpoint; patch from Eric Nguyen
  * Test whether we need the whole patch; turns out we do not need the entire patch, just one line
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: Eric Nguyen <[email protected]>
  Co-authored-by: github-actions <[email protected]>
- Improve FLOPS Calculation (#1044) (commit 5f36401)
  * Use Megatron-DeepSpeed flops calculation
  * Direct comparison of FLOPS calculations
  * Remove test logging
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
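For context on the FLOPS bookkeeping in #1044, the Megatron-style per-iteration estimate has roughly the shape below. Treat the constants as an assumption (96 corresponds to forward + backward with full activation recomputation in the Megatron papers), not as gpt-neox's exact code:

```python
def train_flops_per_iteration(batch, seq_len, layers, hidden, vocab):
    """Approximate transformer training FLOPs per iteration, Megatron-style:
    96 * B * s * L * h^2 * (1 + s/(6h) + V/(16*L*h)).
    The s/(6h) term accounts for attention score computation and the
    V/(16*L*h) term for the output embedding / logits layer."""
    return (
        96 * batch * seq_len * layers * hidden ** 2
        * (1 + seq_len / (6 * hidden) + vocab / (16 * layers * hidden))
    )
```

Dividing this by iteration time and device count gives the per-GPU FLOPS figure that training logs typically report.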
Commits on Sep 28, 2023
- Adding boilerplate coverity scan to submit to public analysis (#1047) (commit 5fa85ad)
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
Commits on Sep 29, 2023
- Add section to the README detailing how to start distributed jobs (#1048) (commit f44db66)
  * Add documentation about kicking off distributed jobs
  * Added more info on run command modification and cleaned up a bit
  * Slight cleanup
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
- commit 2c60645
  * Fix readme typo
  * More typos
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
- commit b14d6f7
- commit 93cac79
  * Update CITATION.cff
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
Commits on Oct 1, 2023
- commit 7a8569f
Commits on Oct 2, 2023
- Organize the `tools` directory (#1055) (commit 3f43f07)
  * Re-organize the folder
  * Add README.md files for each subdirectory
  * Clarify the difference between HF scripts
  * Fix tools paths, including megatron imports
  * Flesh out ckpts, bash tools, and datasets READMEs
  * Delete tools/ckpts/merge_mp_partitions.py since it's based on a very old Megatron
  * Formatting
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: Stella Biderman <[email protected]>
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
Commits on Oct 4, 2023
- Add documentation about using labelled datasets (#1056) (commit f6ac04d)
  * Add documentation and an informative error
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: github-actions <[email protected]>
Commits on Oct 17, 2023
- LR scheduler fix no longer breaks inference (#1060) (commit e001a04)
  * Add lr_scheduler check for inference
  * Update NeoXArgs docs automatically
  Signed-off-by: Dashiell Stander <[email protected]>
  Co-authored-by: github-actions <[email protected]>
Commits on Oct 20, 2023
- commit b02d989
  * Initial commit; test set, fixed readme and docstring
  * Refactor Lion implementation
  Co-authored-by: kamathis4 <[email protected]>
Commits on Oct 31, 2023
- Fix lion optimizer documentation (#1067) (commit e277bc7)
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
- Fix preprocess_data.py link (#1064) (commit f574f22)
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
Commits on Nov 1, 2023
- Edge-casing for multi-GPU HF-to-NeoX conversion (#1065) (commit fcc5af5)
  * Edge-casing for multi-GPU HF to sequential case
  * Cleanup whitespace
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
- commit 8c9fc00
- Pin version of `lm_eval` (#1070) (commit a10f69c)
  * Pin lm_eval version
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
- commit 41f019e
Commits on Nov 5, 2023
- commit 90aa131
Commits on Nov 7, 2023
- When processing mlp.dense_4h_to_h.bias and attention.dense.bias, tp_ranks are not reflected, so strange results always appear when tp_ranks is greater than 1 (commit 04dc2ba)
- Merge pull request #1072 from kyuheejang/Fixing-neox-to-huggingface: fixing convert neox to huggingface bug (commit f214358)
Commits on Nov 8, 2023
- commit d8028f8
Commits on Nov 16, 2023
- commit 10bf788
  * Update neox_args.py: these attention configuration options were missing from the docs; this will fix that
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
Commits on Nov 22, 2023
- commit f48d3a6
  * Update README.md
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
Commits on Nov 30, 2023
- commit efea81f
Commits on Dec 4, 2023
- commit 3be59a4
  * Use `.yml` extensions in README to reflect extensions used in `configs/` folder
  * Rename `save_interval` -> `checkpoint_factor`
  * Mark expected failures in existing tests
  * Fix minor typos
  * Allow creation of checkpoint at iteration 0 when `do_train=False` (helpful for unit tests because it allows use of a randomly initialised model)
  * Delete duplicated `test_fused_kernels.py`; primary version lives in `tests/model/test_fused_kernels.py`
  * Avoid initializing CUDA whenever `megatron` is imported (resolves `Cannot re-initialize CUDA in forked subprocess` error when running distributed unit tests)
  * Extend suite of unit tests
- commit a2b2020
  * Update coverity_scan.yml (a long series of iterations: update build command to avert empty cwd in build metrics, add verbose to debug curl, print trace to examine build metrics xml)
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
Commits on Dec 6, 2023
- Corrects FLOPs formula as per #1093 (#1094) (commit 050f560)
  * Update logging.py
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
Commits on Dec 19, 2023
- Remove myself as a code owner as I shouldn't be approving PRs (commit f19b2ec)
Commits on Dec 20, 2023
- Bump transformers from 4.30.2 to 4.36.0 in /requirements (#1097) (commit 07166da)
  Bumps [transformers](https://github.com/huggingface/transformers) from 4.30.2 to 4.36.0.
  Signed-off-by: dependabot[bot] <[email protected]>
- Pins old DeeperSpeed until bug is fixed (#1095) (commit 9283eff)
  * There is a bug in upstream DeepSpeed (microsoft/DeepSpeed#4781) that we didn't catch before synching with main; this pins the prior commit so the bug doesn't impact users
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
Commits on Dec 22, 2023
- commit 9eef954
- commit a48e09e
- Update NeoXArgs docs automatically (github-actions, committed Dec 22, 2023) (commit 613e5a6)
- commit be7eeda
- commit 2117afc
- Update NeoXArgs docs automatically (github-actions, committed Dec 22, 2023) (commit 8dba5b6)
- commit f161245
  * Add qk normalization
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
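QK normalization (added in commit f161245 above) normalizes queries and keys before the attention dot product so logit magnitudes stay bounded. A dependency-free RMS-norm sketch of the idea, not the actual gpt-neox code (which operates on batched tensors and typically has learnable scales):

```python
import math

def rms_norm(vec, eps=1e-6):
    # RMS-normalize a single head-dimension vector.
    scale = math.sqrt(sum(v * v for v in vec) / len(vec) + eps)
    return [v / scale for v in vec]

def qk_norm_logit(q, k):
    # Normalizing q and k first makes the attention logit invariant
    # to the raw magnitudes of the query/key projections.
    qn, kn = rms_norm(q), rms_norm(k)
    return sum(a * b for a, b in zip(qn, kn)) / math.sqrt(len(q))
```

Because both vectors are normalized, scaling `q` or `k` by a large constant leaves the logit essentially unchanged, which helps keep attention softmax inputs in a stable range.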
Configuration menu - View commit details
-
Copy full SHA for 7fb3b3c - Browse repository at this point
Copy the full SHA 7fb3b3cView commit details -
Configuration menu - View commit details
-
Copy full SHA for a7509f0 - Browse repository at this point
Copy the full SHA a7509f0View commit details -
Configuration menu - View commit details
-
Copy full SHA for 8eaac4e - Browse repository at this point
Copy the full SHA 8eaac4eView commit details -
Update NeoXArgs docs automatically
github-actions committedDec 22, 2023 Configuration menu - View commit details
-
Copy full SHA for 4d5a811 - Browse repository at this point
Copy the full SHA 4d5a811View commit details -
Merge pull request #1099 from EleutherAI/StellaAthena-patch-4-1
Update README.md
Configuration menu - View commit details
-
Copy full SHA for 05cc29c - Browse repository at this point
Copy the full SHA 05cc29cView commit details -
Configuration menu - View commit details
-
Copy full SHA for e25446e - Browse repository at this point
Copy the full SHA e25446eView commit details -
Merge pull request #1102 from EleutherAI/StellaAthena-patch-4
More readme updates
Configuration menu - View commit details
-
Copy full SHA for 287f9f7 - Browse repository at this point
Copy the full SHA 287f9f7View commit details
Commits on Dec 23, 2023
- commit b27e409
  * Add lm-eval v0.4.0
  * Rename evaluate.py to avoid shadowing HF evaluate library; document the new filename
  * Handle results format differently
  * Update hanging evaluate.py scripts
  * Add triviaqa to default eval_tasks
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
- commit 1148a0f
Commits on Dec 26, 2023
- commit e5a7ea7
  * Update neox_args.py: changed some default values to correspond to values that we generally recommend people use
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
Commits on Jan 4, 2024
- commit eca6b1a
  * Fix syntax errors
  * Make pre-commit fixes across repo
  * Ensure correct version of clang-format in CI
  Co-authored-by: Yang Zhang <[email protected]>
- Fix install, Dockerfile, CI (#1104) (commit 98716eb)
  * Add missing jinja2 dep (missing transitive dep of lm_eval)
  * Fix Dockerfile: only devel has nvcc, needed to build packages; don't rebuild fused kernels if no relevant change
  * Ensure Dockerfile builds in CI; also ensures that install actually works
  Co-authored-by: Yang Zhang <[email protected]>
Commits on Jan 5, 2024
- Fused Rotary Embeddings (fixed) (#1108) (commit 77605ca)
  * Create fused_rotary_positional_embedding.cpp/.h and fused_rotary_positional_embedding_cuda.cu; port the fix from NVIDIA/apex#1750
  * Update neox_args.py, setup.py, initialize.py, __init__.py, test_fused_kernels.py, and transformer.py; create fused_rope.py and 125M_fused_rope.yml
  * Add `self.rope_fusion = neox_args.rope_fusion` so that `ParallelSelfAttention` knows if we're using rope fusion
  * Just checked and this should work for bf16; or, at least, the reason I originally thought it wouldn't doesn't apply
  * Fix fused rope: just needed to bring in the latest headers/sources, and call into it the right way from transformers.py
  * Add rope_fusion arg to all ymls
  * Update NeoXArgs docs automatically
  Co-authored-by: Stella Biderman <[email protected]>
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
  Co-authored-by: Yang Zhang <[email protected]>
- Add pythia 14M and 31M configs (#1111) (commit f14782a)
  * Add pythia 14M config
  * Create 31M.yml
Commits on Jan 9, 2024
- Add docker compose and change containerized setup instructions to use it (#1113) (commit e6e944a)
  * Add docker compose, update readme docker instructions to utilize it
  * Add logging limits to docker-compose files
  * Change data mount from /gpt-neox/data to /data/ (prevents possible errors if the user already has a /data/ directory in their /gpt-neox/ folder)
  * Update README.md: make the code blocks into blocks in the changed parts
  * Make the docker-compose spinup tidier; avoid config bloat by only providing the updated paths
  * Apply precommit
  Co-authored-by: Quentin Anthony <[email protected]>
Commits on Jan 11, 2024
- commit 92b1b6f
Commits on Jan 13, 2024
- Bump jinja2 from 3.1.2 to 3.1.3 in /requirements (#1120) (commit 90f70ff)
  Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.2 to 3.1.3.
  Signed-off-by: dependabot[bot] <[email protected]>
Commits on Jan 19, 2024
- Enable passing of `--account` to `srun` / SlurmLauncher (#1126) (commit 6399155)
  * Add `account` to Deepspeed args
  * Add handling of `account` when `deepspeed_slurm` is set
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
Commits on Jan 24, 2024
- commit 7a8fa2f
  * Update copyrights
  * Nvidia copyright years
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
Commits on Jan 26, 2024
- commit 3d8fec0
  * Add simple util for CUDA timings
  * Add fused layernorm kernel from Megatron (closes #952)
  * Change default fused layernorm to false
  * Update test_setup.yml and test_train_base.yml
  Co-authored-by: Yang Zhang <[email protected]>
  Co-authored-by: jahatef <[email protected]>
  Co-authored-by: Jacob Hatef <[email protected]>
Commits on Jan 29, 2024
- commit e5602c3
  * Contributing guide (CONTRIBUTING.md)
  * Remove microsoft references and link on main readme
  * Pre-commit
  * Update NeoXArgs docs automatically
  Co-authored-by: github-actions <[email protected]>
  Co-authored-by: Quentin Anthony <[email protected]>
Commits on Jan 30, 2024
- commit 1c133bf
Commits on Feb 1, 2024
-
Update lm_eval v0.4 to PyPI dependencies (#1141)
* Update requirements.txt * Update requirements.txt * Update NeoXArgs docs automatically * add note to neox_args.py * pre-commit * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit 032ec8c)
Commits on Feb 5, 2024
* Remove 'gas' configuration variable * Remove gas from configs and config documentation * Update training.py
(commit 91c44bc)
Commits on Feb 8, 2024
Improve Conversion Utilities (#1124)
* draft: unify sequential + PPModule conversion scripts * Update NeoXArgs docs automatically * draft: pull out model param names / model definition * Update NeoXArgs docs automatically * tested: neox models with TP = 1, PipelineModule, work * Update NeoXArgs docs automatically * draft: Llama + GQA QKV resharding * Update NeoXArgs docs automatically * update Llama conversion script to support Mistral and GQA * Update NeoXArgs docs automatically * test Mistral-7B conversion * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically * push documentation on imports / Llama loading * push further readme updates (Mistral included) * Preventconversions for unsupported featurees, disclaim in README * Update NeoXArgs docs automatically * revert PR#1072 RowParallel bias conversion error * remove sequential_to_hf and module_to_hf scripts, deprecated in favor of convert_neox_to_hf.py * Update NeoXArgs docs automatically * pre-commit * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit f7373f8)
Commits on Feb 21, 2024
Fixes distributed tests, and skips tests that are broken. (#1149)
* Fixes distributed tests, and skips tests that are broken. * Update NeoXArgs docs automatically * improve pytest msgs and remove commented code * pre-commit * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit 412cf6e)
* Fixes distributed tests, and skips tests that are broken. * memory profiling for gpt-neox. Only works for pp=0, pp=1+ needs DS commits. * Update NeoXArgs docs automatically * adds memory profiling for pipeline parallel * Update NeoXArgs docs automatically * fix spacing * Update NeoXArgs docs automatically * fix spacing again * Update NeoXArgs docs automatically * get rid of unwanted changes * Update NeoXArgs docs automatically * get rid of file * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically * add nsight systems support * remove tests changes again * Update NeoXArgs docs automatically * add tests * Update NeoXArgs docs automatically * Update training.py * Update NeoXArgs docs automatically * Add assertion message * pre-commit * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit 46d179c)
Commits on Feb 23, 2024
add profiling to readme (#1154)
* add profiling to readme * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
(commit eee03b2)
* Switch default command for docker image * Rename pythia paths docker file for clarity * Update docker build to use python 3.10 * Update github workflows to use ubuntu 22.04 and python 3.10 * Bump pytorch library patch versions * Add pytest-html for reasonably formatted test reports * Fix build after torch and cuda version bump * Fix apex install for newer version 1) This, empirically, works, as tested by running the build and kicking off training. 2) Apex documentation says it is incorrect syntax and deprecated. 3) It takes so long to compile that it is probably, all by itself, something that needs fixing. 4) I will probably pull the fused adamw out of apex. 5) It has been building for twenty minutes so I am going to go do something else. * Fix pip version to ensure apex compilation remains good * Fix unit test for evaluate * Fix pip requirement Prevents possible build issues with apex especially across divergent pip versions * Update dockerfile to point to stripped-down apex repo * Revert "Update dockerfile to point to stripped-down apex repo" This reverts commit 40c7656. * Update apex version in dockerfile * Switch to downloading prebuilt apex wheel * Clean up docker copy commands * Have docker build conditionally get binaries or build apex * Apply precommit
(commit a7638a8)
* Switch default command for docker image * Rename pythia paths docker file for clarity * Fix unit test for evaluate * Update readme for testing to omit --forked argument * Add pytest-html to requirements-dev.txt * Revert "Update readme for testing to omit --forked argument" This reverts commit 19021fc. * Add data/ directory and .bin and .idx files in /tests/data to .gitignore This makes it so that git doesn't try to let you commit (or force you to stash) data files * Make .gitignore for data files slightly more elegant * Add utility script for doing token counts on processed datasets * Run precommit hook * Fix token count script, run precommit
(commit 72d1803)
Draft PR Adding mistral 0.1 (#1131)
* add support for flash attention 2 * change cosine decay to chinchilla style * set default warmup to none so that warmup_iters can be set * fixed bug * fixed chinchilla lr * add s3 checkpoint syncing * rotary embedding in fp32 * fix for seq_len < max_seq_len * some fixes, still not working * ?' : * fix bugs; evaluate on step 0 * first attempt at gqa * gqa works in kv_heads==query_heads case * gqa working * workaround for FSX quota * update with llemma * update with recent PR * README and requirements updated * Added Mistral config * Added sliding window through flash attention 2 * Added sliding window * Mistral should likely use mp=2 like llama2 * Update gitignore * Removed unused CPCargo import * Conversion script (WIP) * Fixed missing slurm environ vars * updated mistral config * updated job script * initial commit conversion mistral hf to sequential * Added stacking q, k, v appropriately for mp ranks * pp=0 support from end of 2023 * Cleaning up config and removing Autoconfig in conversion script * Cleaned up conversion example script * cleanup: add back configs folder, discard Llemma readme * cleanup: remove llemma lr sched changes, re-add requirements/ folder * docs: add explanation of intermediate_size behavior * args: add argument checking for num_kv_heads, clean up usage syntax * args: prevent num KV heads < TP worldsize * readd triton flash attn func * cleanup: use tools/ dir from main * docs: re-add mistral , GQA as supported * cleanup: delete duplicate tools/ files * cleanup: use fp32 rope (non-fused) from main * cleanup: no longer block out GQA codepaths in conversion scripts * cleanup: gqa code a bit * add llama2, llemma configs * add non-flash GQA ; refactor modeling code * clean up mistral config for commit * further cleanup configs dir * remove slurm script from llemma * update seqlen params for codellama, llemma configs * add more comments to GQA code, and make reshapes more readable * make inv_freq non-persistent * actually, just ensure mistral 
has inv_freqs as a persistent buffer * non-flash GQA works, so ensure arguments.py permits it * no longer use our own copies of flash attention interface functions * remove unused mpu util fn * delete unused config file * fix diff on mpu/utils.py * remove slurm scripts that won't be in this PR * run pre-commit * update tests for conversion scripts * add flash version check for sliding window * pre-commit --------- Co-authored-by: zhangir-azerbayev <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit f36aed7)
Commits on Feb 26, 2024
[Bug?] Fix profiling argument names (#1155)
* possibly fix profiling flag names * actually, profile_backward already exists * Update NeoXArgs docs automatically * neox_args.profile was also used some places, update that too * Update NeoXArgs docs automatically * profiling --> profile * Update NeoXArgs docs automatically * Revert neox_arguments.md changes * Update NeoXArgs docs automatically * Update gen_docs since __name__ only returns the Literal for string args with Python 3.10 * Update NeoXArgs docs automatically * Another update to preserve non-literals * Update NeoXArgs docs automatically * add union * Update NeoXArgs docs automatically * pre-commit * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit 9663802)
Commits on Feb 29, 2024
* Update cpu_ci.yml Updating the workflow to point CPU workflow towards self hosted runner versus Github provided runners * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
(commit 3c03fc7)
Commits on Mar 2, 2024
Improve argument validation for Flash-attn + SWA (#1162)
* Improve argument validation for Flash-attn + SWA * Update NeoXArgs docs automatically * don't pass window_size if not necessary * Update NeoXArgs docs automatically * Update 7B.yml * Update NeoXArgs docs automatically * apply precommit * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
(commit 19596b0)
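The kind of check this commit describes can be sketched framework-independently. The helper below is hypothetical (the real validation lives in GPT-NeoX's argument-checking code, and the exact flash-attn window semantics are an assumption here); it illustrates only passing a window size when a sliding window is actually configured, and failing fast otherwise:

```python
def swa_flash_kwargs(use_flash, sliding_window, seq_len):
    """Hypothetical sketch: validate sliding-window attention (SWA) settings
    and build extra kwargs for a flash-attention call.

    Returns {} when no sliding window is configured, so window_size is
    never passed in that case (matching "don't pass window_size if not
    necessary" above)."""
    if sliding_window is None:
        return {}
    if not use_flash:
        raise ValueError("sliding window attention requires flash attention")
    if not (0 < sliding_window <= seq_len):
        raise ValueError(f"sliding_window must be in (0, {seq_len}]")
    # Assumed (left, right) look-back/look-ahead convention; causal LMs use right=0.
    return {"window_size": (sliding_window, 0)}
```

Keeping the kwarg absent (rather than passing a sentinel) avoids changing behavior on flash-attn versions that predate sliding-window support.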
Commits on Mar 4, 2024
Single node Pythia 14M training on ngc pytorch 24.02 container (#1170)
* Pythia 14M training on ngc pytorch 24.02 container * pre-commit --------- Co-authored-by: Quentin Anthony <[email protected]>
(commit 119950c)
Remove unnecessary fp32/bf16 conversion (#1169)
* feat: remove unnecessary bf16 conversions since no collective op is performed * pre-commit --------- Co-authored-by: Quentin Anthony <[email protected]>
(commit 7b8187a)
Ignore markdown for pre-commit (#1171)
* ignore markdown for pre-commit * only ignore end of file and trailing whitespace * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
(commit 31cfe52)
Make rotary freqs buffer non-persistent (#1168)
* make inv_freq non-persistent by default * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit e109bf5)
Support Lion with Zero Optimizer (#1166)
* feat: deepspeed zero lion support * feat: bump DeeperSpeed version to one that includes DeepSpeed FusedLion * feat: bump DeeperSpeed version to include pipeline logging fix * pre-commit --------- Co-authored-by: Quentin Anthony <[email protected]>
(commit df8cf24)
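For context, a fused Lion optimizer computes the published Lion update rule per parameter: the update direction is the sign of an interpolated momentum, with decoupled weight decay. A plain-Python sketch of that rule (not DeepSpeed's kernel):

```python
def lion_step(w, g, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion update over lists of scalar weights w, grads g, momentum m.
    Returns (new_w, new_m)."""
    sign = lambda x: (x > 0) - (x < 0)
    new_w, new_m = [], []
    for wi, gi, mi in zip(w, g, m):
        c = beta1 * mi + (1 - beta1) * gi            # interpolated direction
        new_w.append(wi - lr * (sign(c) + wd * wi))  # sign update + decoupled decay
        new_m.append(beta2 * mi + (1 - beta2) * gi)  # momentum EMA
    return new_w, new_m
```

Because the update magnitude is always exactly `lr` (plus decay), Lion typically wants a smaller learning rate than AdamW, which is why it pairs naturally with a fused ZeRO-partitioned implementation rather than a drop-in swap.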
Commits on Mar 7, 2024
* Add DeepSpeed MoE Thanks to dayofthepenguin for extensive testing Closes #479 * Update NeoXArgs docs automatically * pre-commit * Update NeoXArgs docs automatically --------- Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit 86758c3)
Commits on Mar 8, 2024
remove best_download as dependency (#1179)
* Update requirements.txt * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
(commit 63b9fa1)
(commit 90d4cb3)
- Eliminate already installed apt packages - sparse attn requirement led to a triton downgrade - flash attn is already part of the ngc container (in another version that is compatible with TE)
(commit 8c13642)
When using kv cache and flash attention in conjunction, it's crucial to set the causal parameter of flash_varlen_qkv_fn to False. Failing to do so will lead to inaccurate results. (#1178)
(commit c1fa994)
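The failure mode behind this fix can be reproduced with plain softmax attention: during incremental decoding the single new query corresponds to the last position in the cache, so a causal mask aligned to query row 0 (top-left convention) hides every cached key except the first; disabling causal masking lets the query see the whole cache. The toy vectors below are illustrative only, not GPT-NeoX code:

```python
import math

def single_query_attn(q, keys, vals, visible=None):
    """Softmax attention for one query over a KV cache (pure Python).
    visible[i] == False hides cached key i, as a causal mask would."""
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    if visible is not None:
        scores = [s if v else float("-inf") for s, v in zip(scores, visible)]
    mx = max(scores)
    exps = [math.exp(s - mx) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return [sum(w * v[j] for w, v in zip(weights, vals)) for j in range(len(vals[0]))]

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # cache: positions 0, 1, 2
vals = [[1.0], [2.0], [3.0]]
q = [0.5, 0.5]                               # query for the newest token (position 2)

full = single_query_attn(q, keys, vals)                           # causal=False: sees whole cache
clipped = single_query_attn(q, keys, vals, [True, False, False])  # misaligned causal row 0
```

`clipped` collapses onto the first cached value while `full` mixes all three, which is exactly the "inaccurate results" the commit message warns about.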
Remove gas from Pythia configs (#1181)
Fixes #1165 Co-authored-by: Yang Zhang <[email protected]>
(commit 1e7abe7)
Fix moe_loss in gpt_j_residual path (#1180)
Fixes #1174 Co-authored-by: Yang Zhang <[email protected]>
(commit 82ddc66)
Commits on Mar 10, 2024
Add Mamba Architecture (#1157)
* initial mamba support (no kernels, no parallelism) * Mamba runs! Also, add flags for sel. scan and conv1d fused kernels * Update NeoXArgs docs automatically * add mamba_inner_fn ; try really hard to make A_log and D no-WD and stored in fp32 * cleanup print statements * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically * add draft conversion script (tested working TP=1) * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically * update parallelism checks for mamba--partition activations works * add mamba requirements * clean up and better comment mamba code * clean up and better comment mamba code * update arg validation in mamba * more cleanup * add flag for fp32 Alog/D, add init_methods support for mamba * Update NeoXArgs docs automatically * update conversion script name, add docstring * name conversion script * Update NeoXArgs docs automatically * add demo configs * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically * add arguments to control conv and (in,out)_proj biases in mamba separately * Update NeoXArgs docs automatically * make x_proj bias also controlled by flag * Update NeoXArgs docs automatically * pre-commit, add comments * Update NeoXArgs docs automatically * Add mamba import print * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit 6809bbc)
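One fiddly detail called out above is keeping Mamba's A_log and D parameters out of weight decay (and stored in fp32). A hypothetical grouping helper illustrating the convention; the parameter names and weight-decay value here are assumptions for illustration, not GPT-NeoX's exact code:

```python
def build_param_groups(named_params, no_wd_leaves=("A_log", "D"), wd=0.1):
    """Split (name, param) pairs into decay / no-decay optimizer groups,
    exempting leaf names like A_log and D from weight decay."""
    decay, no_decay = [], []
    for name, param in named_params:
        leaf = name.rsplit(".", 1)[-1]  # last dotted component of the name
        (no_decay if leaf in no_wd_leaves else decay).append(name)
    return [
        {"params": decay, "weight_decay": wd},
        {"params": no_decay, "weight_decay": 0.0},
    ]
```

The resulting list matches the param-groups shape most optimizers accept, so the exemption lives in one place instead of being scattered through model code.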
Commits on Mar 13, 2024
Switch to using Cuda Flash Attn for Alibi (#1183)
* add cuda support for flash attn w/ alibi, warn of deprecation of triton * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
(commit 03186de)
Commits on Mar 15, 2024
Mamba + Tensor Parallel Support (#1184)
* TP works! * merge TP mamba changes with most current MambaLayer * cleanup TP, confirmed working still * make shapes with TP>1 work with conversion * tested and PP works, so no need for assert blocking it in arguments * update comment * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit 277141e)
Commits on Mar 19, 2024
[ZeRO-3] Partitioned init with deepspeed.zero.Init() (#1190)
* added ds zero.Init() to get_model * Clean up conditional with block * pre-commit --------- Co-authored-by: Quentin Anthony <[email protected]>
(commit 7267a74)
Commits on Mar 26, 2024
edouardoyallon committed Mar 26, 2024 (commit e6b5261)
Merge pull request #1196 from edouardoyallon/typo_readme
ENH Small typo in the README
(commit 4085302)
(commit 1960b66)
(commit 3616658)
Commits on Apr 1, 2024
making PR triggered CPU test for changes to megatron (#1195)
* making PR triggered CPU test for changes to megatron * Update NeoXArgs docs automatically * pre-commit * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
(commit 977448e)
[AMD] Supporting fused kernels build using JIT (#1188)
* initial JIT load functions * passing neox_arge to load() as optional for easy testing * modified headers for correct copyright statements
(commit 51a7de9)
[ZeRO-3] Ensured passing neox deepspeed_config when using partitioned init (#1191)
* added ds zero.Init() to get_model * Clean up conditional with block * pre-commit * ensured deepspeed configs are passed to init --------- Co-authored-by: Quentin Anthony <[email protected]>
(commit 01657aa)