merge upstream #64

Merged: 117 commits, Apr 4, 2024
Commits (117)
eda5aae
Fixed final value of cosine decay lr (#1011)
kshitijkg Aug 12, 2023
01b5e22
turn hyphens into underscores in merge20b.py (#986)
wunderalbert Aug 12, 2023
d8bcd97
Update Dockerfile (#1014)
xu-song Aug 22, 2023
43ea51c
README Update (#1017)
StellaAthena Aug 28, 2023
2922bef
Bump transformers version and update enwik8 link (#1024)
dashstander Sep 13, 2023
960ed3d
Fix Generation with Sequential Model (#1026)
xu-song Sep 15, 2023
97e376c
Fix broken link (#1022)
StellaAthena Sep 15, 2023
7821aa7
add llama training config (#1023)
xu-song Sep 15, 2023
cdc94ee
Create README_llama.md (#1027)
Quentin-Anthony Sep 15, 2023
737c913
Rename README_llama.md to README.md (#1028)
Quentin-Anthony Sep 15, 2023
c883e8c
Add llama generation script (#1030)
xu-song Sep 15, 2023
d9166bf
Fix bf16 for zero > 0 and pipeline parallelism > 0 (#1032)
dashstander Sep 18, 2023
fcd5f92
Remove support for lazy dataset implementation (#1033)
dashstander Sep 18, 2023
70af6e8
Fix SequentialWrapper Generation (pipe_parallel_size = 0) (#1031)
xu-song Sep 18, 2023
8903a96
integrated flash attention 2 (#1035)
a663E-36z1120 Sep 20, 2023
0ce77ab
Fix register_buffer parameter (#1036)
xu-song Sep 20, 2023
444c0ef
Add flash 2.x message to README.md (#1037)
Quentin-Anthony Sep 20, 2023
f9503b7
Add s3 checkpoint syncing (#1010)
haileyschoelkopf Sep 23, 2023
390d37c
Fixed final value of linear decay lr (#1039)
foggy-frost-forest Sep 23, 2023
e431ff5
Fix final value of exponential decay lr (#1040)
Quentin-Anthony Sep 23, 2023
2ab05be
Remove the NeoX implementation of GPT2Tokenizer (#1042)
dashstander Sep 25, 2023
3bfedf4
Pre-compute RoPE embeddings in fp32 (#1041)
dashstander Sep 25, 2023
ba51ca0
Patch LR Annealing Bug (#1046)
dashstander Sep 27, 2023
5f36401
Improve FLOPS Calculation (#1044)
dashstander Sep 27, 2023
5fa85ad
adding boilerplate coverity scan to submit to public analysis (#1047)
jaimemcc-intel Sep 28, 2023
f44db66
Add section to the README detailing how to start distributed jobs (#1…
dashstander Sep 29, 2023
2c60645
Fix readme typos (#1049)
Quentin-Anthony Sep 29, 2023
b14d6f7
Update citation list (#1052)
Quentin-Anthony Sep 29, 2023
93cac79
Update CITATION.cff (#1053)
Quentin-Anthony Sep 29, 2023
7a8569f
Remove duplicated hf_config (#1054)
xu-song Oct 1, 2023
3f43f07
Organize the `tools` directory (#1055)
dashstander Oct 2, 2023
f6ac04d
Add documentation about using labelled datasets (#1056)
dashstander Oct 4, 2023
e001a04
LR scheduler fix no longer breaks inference (#1060)
dashstander Oct 17, 2023
b02d989
Lion Optimizer (#1062)
andylolu2 Oct 20, 2023
e277bc7
fix lion optimizer documentation (#1067)
jahatef Oct 31, 2023
f574f22
Fix preprocess_data.py link (#1064)
Quentin-Anthony Oct 31, 2023
fcc5af5
Edge-casing for multi-GPU HF-to-NeoX conversion (#1065)
haileyschoelkopf Nov 1, 2023
8c9fc00
Create tools __init__.py for import (#1068)
Quentin-Anthony Nov 1, 2023
a10f69c
Pin version of `lm_eval` (#1070)
haileyschoelkopf Nov 1, 2023
41f019e
fixed case when ntasks_per_node is used instead (#1069)
AIproj Nov 1, 2023
90aa131
Update README.md
StellaAthena Nov 5, 2023
04dc2ba
When processing mlp.dense_4h_to_h.bias and attention.dense.bias, tp_r…
kyuheejang Nov 7, 2023
f214358
Merge pull request #1072 from kyuheejang/Fixing-neox-to-huggingface
StellaAthena Nov 7, 2023
d8028f8
Resolve error in the `test_neoxargs_usage` unit test (#1074)
mkerin Nov 8, 2023
10bf788
Update neox_args.py (#1081)
jahatef Nov 16, 2023
f48d3a6
Update README.md (#1082)
StellaAthena Nov 22, 2023
efea81f
Update README.md
StellaAthena Nov 30, 2023
3be59a4
Extend ci suite (#1080)
mkerin Dec 4, 2023
a2b2020
Patch coverity scan (#1090)
jaimemcc-intel Dec 4, 2023
050f560
Corrects FLOPs formula as per 1093 (#1094)
StellaAthena Dec 6, 2023
f19b2ec
Update CODEOWNERS
StellaAthena Dec 19, 2023
07166da
Bump transformers from 4.30.2 to 4.36.0 in /requirements (#1097)
dependabot[bot] Dec 20, 2023
9283eff
Pins old DeeperSpeed until bug is fixed (#1095)
StellaAthena Dec 20, 2023
9eef954
Update README.md
StellaAthena Dec 22, 2023
a48e09e
Update README.md
StellaAthena Dec 22, 2023
613e5a6
Update NeoXArgs docs automatically
invalid-email-address Dec 22, 2023
be7eeda
Update README.md
StellaAthena Dec 22, 2023
2117afc
Update README.md
StellaAthena Dec 22, 2023
8dba5b6
Update NeoXArgs docs automatically
invalid-email-address Dec 22, 2023
f161245
Add QK Normalization (#1100)
lintangsutawika Dec 22, 2023
7fb3b3c
Update README.md
StellaAthena Dec 22, 2023
a7509f0
Update README.md
StellaAthena Dec 22, 2023
8eaac4e
Merge branch 'main' into StellaAthena-patch-4-1
StellaAthena Dec 22, 2023
4d5a811
Update NeoXArgs docs automatically
invalid-email-address Dec 22, 2023
05cc29c
Merge pull request #1099 from EleutherAI/StellaAthena-patch-4-1
StellaAthena Dec 22, 2023
e25446e
Merge branch 'main' into StellaAthena-patch-4
StellaAthena Dec 22, 2023
287f9f7
Merge pull request #1102 from EleutherAI/StellaAthena-patch-4
StellaAthena Dec 22, 2023
b27e409
Lm eval 0.4.0 support (#1101)
haileyschoelkopf Dec 23, 2023
1148a0f
Update README.md
StellaAthena Dec 23, 2023
e5a7ea7
Update neox_args.py (#1107)
StellaAthena Dec 26, 2023
eca6b1a
Fix repo for CI (#1106)
yang Jan 4, 2024
98716eb
Fix install, Dockerfile, CI (#1104)
yang Jan 4, 2024
77605ca
Fused Rotary Embeddings (fixed) (#1108)
yang Jan 5, 2024
f14782a
Add pythia 14M and 31M configs (#1111)
segyges Jan 5, 2024
e6e944a
Add docker compose and change containerized setup instructions to use…
segyges Jan 9, 2024
92b1b6f
Fix openwebtext2 downloader, backport improvements to DataDownloader …
segyges Jan 11, 2024
90f70ff
Bump jinja2 from 3.1.2 to 3.1.3 in /requirements (#1120)
dependabot[bot] Jan 13, 2024
6399155
Enable passing of `--account` to `srun` / SlurmLauncher (#1126)
haileyschoelkopf Jan 19, 2024
7a8fa2f
update copyrights (#1128)
jahatef Jan 24, 2024
3d8fec0
fused layernorm (#1105)
yang Jan 26, 2024
e5602c3
Contributing Guide (#1138)
jahatef Jan 29, 2024
1c133bf
moved eval import and added to docs (#1139)
R0n12 Jan 30, 2024
032ec8c
Update lm_eval v0.4 to PyPI dependencies (#1141)
haileyschoelkopf Feb 1, 2024
91c44bc
Remove gas (beano) (#1144)
segyges Feb 5, 2024
f7373f8
Improve Conversion Utilities (#1124)
haileyschoelkopf Feb 8, 2024
412cf6e
Fixes distributed tests, and skips tests that are broken. (#1149)
jahatef Feb 21, 2024
46d179c
Memory profiling (#1153)
jahatef Feb 21, 2024
eee03b2
add profiling to readme (#1154)
jahatef Feb 23, 2024
a7638a8
Python version update (#1122)
segyges Feb 23, 2024
72d1803
Minor changes (#1125)
segyges Feb 23, 2024
f36aed7
Draft PR Adding mistral 0.1 (#1131)
AIproj Feb 23, 2024
9663802
[Bug?] Fix profiling argument names (#1155)
haileyschoelkopf Feb 26, 2024
3c03fc7
Update cpu_ci.yml (#1159)
jaimemcc-intel Feb 29, 2024
19596b0
Improve argument validation for Flash-attn + SWA (#1162)
haileyschoelkopf Mar 2, 2024
119950c
Single node Pythia 14M training on ngc pytorch 24.02 container (#1170)
tf-nv Mar 4, 2024
7b8187a
Remove unnecessary fp32/bf16 conversion (#1169)
DayOfThePenguin Mar 4, 2024
31cfe52
Ignore markdown for pre-commit (#1171)
Quentin-Anthony Mar 4, 2024
e109bf5
Make rotary freqs buffer non-persistent (#1168)
haileyschoelkopf Mar 4, 2024
df8cf24
Support Lion with Zero Optimizer (#1166)
DayOfThePenguin Mar 4, 2024
86758c3
Add MoE (#1129)
yang Mar 7, 2024
63b9fa1
remove `best_download` as dependency (#1179)
haileyschoelkopf Mar 8, 2024
90d4cb3
Fix documentation for --jsonl-keys argument of preprocess_data script…
KeitaW Mar 8, 2024
8c13642
clean up dockerfile: (#1175)
tf-nv Mar 8, 2024
c1fa994
When using kv cache and flash attention in conjunction, it's crucial …
chaochen99 Mar 8, 2024
1e7abe7
Remove gas from Pythia configs (#1181)
yang Mar 8, 2024
82ddc66
Fix moe_loss in gpt_j_residual path (#1180)
yang Mar 8, 2024
6809bbc
Add Mamba Architecture (#1157)
haileyschoelkopf Mar 10, 2024
03186de
Switch to using Cuda Flash Attn for Alibi (#1183)
haileyschoelkopf Mar 13, 2024
277141e
Mamba + Tensor Parallel Support (#1184)
haileyschoelkopf Mar 15, 2024
7267a74
[ZeRO-3] Partitioned init with `deepspeed.zero.Init()` (#1190)
R0n12 Mar 19, 2024
e6b5261
Small typo in the README
Mar 26, 2024
4085302
Merge pull request #1196 from edouardoyallon/typo_readme
StellaAthena Mar 26, 2024
1960b66
Added more papers
StellaAthena Mar 26, 2024
3616658
Update README.md
StellaAthena Mar 26, 2024
977448e
making PR triggered CPU test for changes to megatron (#1195)
jaimemcc-intel Apr 1, 2024
51a7de9
[AMD] Supporting fused kernels build using JIT (#1188)
R0n12 Apr 1, 2024
01657aa
[ZeRO-3] Ensured passing neox deepspeed_config when using partitioned…
R0n12 Apr 1, 2024
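For reference, this commit range can be inspected locally with plain git. A minimal sketch, assuming the EleutherAI repository has already been fetched as a remote named `upstream` (the remote name is an assumption, and the two-dot range excludes the oldest commit itself):

```bash
# List the merged upstream commits locally (assumes the EleutherAI repo is
# configured as the "upstream" remote; eda5aae..01657aa excludes eda5aae itself).
git fetch upstream
git log --oneline eda5aae..01657aa
```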
2 changes: 1 addition & 1 deletion .github/CODEOWNERS
@@ -1 +1 @@
* @EleutherAI/pm-gptneo
* @Quentin-Anthony
60 changes: 60 additions & 0 deletions .github/workflows/coverity_scan.yml
@@ -0,0 +1,60 @@
name: Coverity
on:
workflow_dispatch:
inputs:
build_version:
description: "Version of GPT-NeoX being submitted for scan"
required: false
default: "GPT-NeoX build version"
build_description:
description: "Description of the current build"
required: false
default: "Current build of GPT-NeoX"

jobs:
coverity:

runs-on: ubuntu-latest

env:
COV_USER: ${{ secrets.COV_USER }}
COVERITY_PROJECT: ${{ secrets.COVERITY_PROJECT }}
COVERITY_TOKEN: ${{ secrets.COVERITY_TOKEN }}

steps:
- uses: actions/checkout@v2
with:
path: gpt-neox

- name: Install utils
run: |
sudo apt update -y && sudo apt upgrade -y
sudo apt install curl jq wget -y

- name: Coverity Download
run: |
wget https://scan.coverity.com/download/linux64 --post-data "token=$COVERITY_TOKEN&project=$COVERITY_PROJECT" -O coverity_tool.tgz --no-verbose
mkdir $GITHUB_WORKSPACE/coverity && tar xvf coverity_tool.tgz -C $GITHUB_WORKSPACE/coverity --strip-components=1
$GITHUB_WORKSPACE/coverity/bin/cov-configure --python
$GITHUB_WORKSPACE/coverity/bin/cov-configure --gcc

- name: Coverity Scan and Upload
run: |
set -x
pushd $GITHUB_WORKSPACE
cd $GITHUB_WORKSPACE/gpt-neox
$GITHUB_WORKSPACE/coverity/bin/cov-build --dir $GITHUB_WORKSPACE/cov-int --no-command --fs-capture-search ./
popd
tar caf build-results.bz2 cov-int
curl --form token=$COVERITY_TOKEN \
--form email=$COV_USER \
--form [email protected] \
--form version="${{ inputs.build_version }}" \
--form description="${{ inputs.build_description }}" \
https://scan.coverity.com/builds?project=$COVERITY_PROJECT

- name: Upload Scan Build as Artifact
uses: actions/upload-artifact@v3
with:
name: coverity-build-${{ github.sha }}
path: build-results.bz2
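The job above is `workflow_dispatch`-only, so it has to be started manually. A minimal sketch of triggering it from the command line with the GitHub CLI, assuming `gh` is installed and authenticated against this fork and that the workflow is addressed by its file name:

```bash
# Hypothetical manual trigger of the Coverity workflow via the GitHub CLI.
# The input names match the workflow_dispatch inputs defined above.
gh workflow run coverity_scan.yml \
  -f build_version="gpt-neox-$(git rev-parse --short HEAD)" \
  -f build_description="Manual Coverity scan"
```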
3 changes: 2 additions & 1 deletion .github/workflows/cpu_ci.yml
@@ -4,7 +4,8 @@ on: "push"

jobs:
run-tests:
runs-on: ubuntu-latest
#runs-on: ubuntu-latest
runs-on: [ 'test', 'self-hosted' ]
steps:
- uses: actions/checkout@v3

69 changes: 69 additions & 0 deletions .github/workflows/cpu_ci_on_pr.yml
@@ -0,0 +1,69 @@
name: "Pull Request CPU Tests"

on:
pull_request:
paths: # job only triggers when the PR changes files under megatron directory
- "megatron/**"

jobs:
run-tests:
runs-on: [ 'test', 'self-hosted' ]
steps:
- name: Checkout Repo
uses: actions/checkout@v4

- name: Install Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
cache: "pip"
cache-dependency-path: "**/requirements*.txt"

- name: Upgrade Pip
run: python -m pip install --upgrade pip

- name: Set up Docker repository # this should possibly be done by the worker before the job starts in the interest of execution time?
run: |
# Add Docker's official GPG key:
sudo apt-get update -y
sudo apt-get install ca-certificates curl -y
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
- name: Docker installation # this should possibly be done by the worker before the job starts in the interest of execution time?
run: |
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin -y
sudo docker run hello-world
- name: Prepare data
run: |
python prepare_data.py -d ./data
- name: Remove previous container
run: |
if docker ps -a | grep -q "$CONTAINER"; then
echo "Container already exists, deleting it..."
docker rm -f $CONTAINER
fi
- name: Create container
run: |
export NEOX_DATA_PATH='./data/enwik8'
export NEOX_CHECKPOINT_PATH='/mnt/sda/checkpoints' #todo: where do I get this?
docker compose run -d --build --name $CONTAINER gpt-neox tail -f /dev/null
- name: Install test requirements
run: |
docker exec $CONTAINER pip install -r /workspace/requirements-dev.txt
- name: Execute CPU tests 1
run: |
docker exec $CONTAINER sh -c "cd gpt-neox && pytest tests -m cpu"
- name: Execute CPU tests 2
run: |
docker exec $CONTAINER sh -c "cd gpt-neox && PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python pytest tests -m cpu"
- name: Generate report
run: |
docker exec $CONTAINER python -m http.server --directory htmlcov 8000
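For debugging the job above without a self-hosted runner, the same steps can be replayed locally. A rough sketch, assuming Docker Compose v2 is installed and the repository root is the working directory (the checkpoint path is a placeholder, just as it is in the workflow itself):

```bash
# Local replay of the PR CPU-test job defined above (paths are placeholders).
export CONTAINER=neox-cpu-test
export NEOX_DATA_PATH='./data/enwik8'
export NEOX_CHECKPOINT_PATH='./checkpoints'   # placeholder, as in the workflow
python prepare_data.py -d ./data
docker compose run -d --build --name $CONTAINER gpt-neox tail -f /dev/null
docker exec $CONTAINER pip install -r /workspace/requirements-dev.txt
docker exec $CONTAINER sh -c "cd gpt-neox && pytest tests -m cpu"
docker rm -f $CONTAINER   # clean up, mirroring the "Remove previous container" step
```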
15 changes: 12 additions & 3 deletions .github/workflows/pull_request.yml
@@ -4,18 +4,27 @@ on: [pull_request]

jobs:
pre-commit:
runs-on: ubuntu-20.04
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v4
with:
python-version: 3.8
python-version: 3.10
cache: "pip"
cache-dependency-path: "**/requirements*.txt"
# Need the right version of clang-format
- run: pip install -r requirements/requirements-dev.txt
- uses: pre-commit/[email protected]
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
-
name: Docker build
id: docker_build
uses: docker/build-push-action@v2

update-documentation:
runs-on: ubuntu-20.04
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
with:
5 changes: 5 additions & 0 deletions .gitignore
@@ -137,6 +137,7 @@ data/**/*.bin
data/**/*.json*
data/**/*.txt
data/**/*.gz
data/**/*.zip
data/**/*.np*
data/**/*.npy
checkpoints/
@@ -150,3 +151,7 @@ test_logs/
logs/
tensorboard/
src/

# test data files
tests/data/*.bin
tests/data/*.idx
8 changes: 5 additions & 3 deletions .pre-commit-config.yaml
@@ -8,18 +8,19 @@ repos:
- id: check-yaml
- id: destroyed-symlinks
- id: end-of-file-fixer
exclude: docs/CNAME
exclude: ^(docs/CNAME/|configs/neox_arguments.md)
- id: fix-byte-order-marker
- id: fix-encoding-pragma
args: [--remove]
- id: mixed-line-ending
args: [--fix=lf]
- id: requirements-txt-fixer
- id: trailing-whitespace
- repo: https://gitlab.com/daverona/pre-commit-cpp
exclude: ^(docs/CNAME/|configs/neox_arguments.md)
- repo: https://gitlab.com/daverona/pre-commit/cpp
rev: 0.8.0
hooks:
- id: clang-format # formatter of C/C++ code based on a style guide: LLVM, Google, Chromium, Mozilla, and WebKit available
- id: clang-format # formatter of C/C++ code based on a style guide: LLVM, Google, Chromium, Mozilla, and WebKit available
args: []

- repo: https://github.com/psf/black
@@ -36,3 +37,4 @@ repos:
--check-filenames,
--check-hidden,
]
exclude: tests/data/hf_cache/tokenizer/gpt2.json
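The updated hooks can be exercised locally before pushing. A short sketch, assuming `pre-commit` and `clang-format` are installed as described in the CONTRIBUTING.md added later in this diff:

```bash
# Run the updated pre-commit suite locally (the same hooks CI runs).
pre-commit run --all-files
# Or run only the C/C++ formatter hook from the daverona repo referenced above.
pre-commit run clang-format --all-files
```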
24 changes: 21 additions & 3 deletions CITATION.cff
@@ -4,6 +4,9 @@ authors:
- affiliation: EleutherAI
family-names: Andonian
given-names: Alex
- affiliation: EleutherAI
family-names: Anthony
given-names: Quentin
- affiliation: EleutherAI
family-names: Biderman
given-names: Stella
@@ -34,15 +37,30 @@ authors:
- affiliation: EleutherAI
family-names: Pieler
given-names: Michael
- affiliation: EleutherAI
family-names: Phang
given-names: Jason
- affiliation: EleutherAI
family-names: Purohit
given-names: Shivanshu
- affiliation: EleutherAI
family-names: Schoelkopf
given-names: Hailey
- affiliation: EleutherAI
family-names: Stander
given-names: Dashiell
- affiliation: EleutherAI
family-names: Songz
given-names: Tri
- affiliation: EleutherAI
family-names: Phil
given-names: Wang
family-names: Tigges
given-names: Curt
- affiliation: EleutherAI
family-names: Thérien
given-names: Benjamin
- affiliation: EleutherAI
family-names: Wang
given-names: Phil
- affiliation: EleutherAI
family-names: Weinbach
given-names: Samuel
@@ -55,7 +73,7 @@ license: "Apache-2.0"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://www.github.com/eleutherai/gpt-neox"
title: "GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch"
version: "0.0.1"
version: "2.0.0"
doi: "10.5281/zenodo.5879544"
date-released: 2021-08-23
...
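The expanded author list and version bump can be sanity-checked with an external tool such as `cffconvert` (not part of this repository; a sketch, assuming the tool reads `CITATION.cff` from the current directory by default):

```bash
# Validate the citation metadata and emit a BibTeX entry (cffconvert is an
# external tool and its defaults are assumed here).
pip install cffconvert
cffconvert --validate
cffconvert -f bibtex
```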
86 changes: 86 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,86 @@
# Contributing
GPT-NeoX welcomes your contributions!

## Prerequisites
GPT-NeoX uses [pre-commit](https://pre-commit.com/) to ensure that formatting is
consistent across GPT-NeoX. First, ensure that `pre-commit` is installed with
`pip install pre-commit`. Next, the pre-commit hooks must be installed once
before commits can be made:
```bash
pre-commit install
```
Please install `clang-format` from Conda:
```bash
conda install clang-format
```

Afterwards, our suite of formatting tests runs automatically before each `git commit`. You
can also run these manually:
```bash
pre-commit run --all-files
```
If a formatting test fails, it will fix the modified code in place and abort
the `git commit`. After looking over the changes, you can `git add <modified files>`
and then repeat the previous `git commit` command.


## Testing
GPT-NeoX tracks two types of tests: unit tests and more costly model convergence tests.
Unit tests are found in `tests/unit/` and the model convergence tests are found in
`tests/model/`.

### Unit Tests
[PyTest](https://docs.pytest.org/en/latest/) is used to execute tests. PyTest can be
installed from PyPI via `pip install pytest`. Simply invoke `pytest --forked` to run the
unit tests:
```bash
pytest --forked tests/unit/
```
You can also provide the `-v` flag to `pytest` to see additional information about the
tests. Note that [pytest-forked](https://github.com/pytest-dev/pytest-forked) and the
`--forked` flag are required to test CUDA functionality in distributed tests.

### Model Tests
To execute model tests, first install GPT-NeoX. Next, execute the model test driver:
```bash
cd tests/model/
pytest run_sanity_check.py
```
Note that the `--forked` flag is not necessary for the model tests.

## Contributor License Agreement
This project welcomes contributions and suggestions. Most contributions require you to
agree to a Contributor License Agreement (CLA) declaring that you have the right to, and
actually do, grant us the rights to use your contribution. For details, visit
https://cla-assistant.io/EleutherAI/gpt-neox.

When you submit a pull request, a CLA bot will automatically determine whether you need
to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply
follow the instructions provided by the bot. You will only need to do this once across
all repos using our CLA.

## New Feature Contribution Guidelines
Unlike a bug fix or an improvement to an existing feature (where users usually submit a PR directly and we review it), adding a new feature to GPT-NeoX requires several steps: (1) proposal and discussion, (2) implementation and verification, (3) release and maintenance. This general guideline applies to all new feature contributions. Core GPT-NeoX team members may complete step 1 internally.

### Step 1: Proposal and Discussion
We ask users to first post their intended feature in an issue. This issue needs to include:

* A description of the proposed feature.
* A motivation of why it will be useful to GPT-NeoX users.
* A rough design of how you implement the feature inside GPT-NeoX.
* (Important) Results or planned experiments to demonstrate the effectiveness and correctness of the feature.
* If the feature only affects performance and does not affect training convergence, we require testing on a fraction of training to demonstrate that the training/validation loss are consistent with baseline, and that the performance is better than baseline.
* If the feature does affect training convergence, we require testing the whole training to demonstrate that the feature achieves better/on-par final model quality and training performance compared to baseline.

Based on the issue we shall discuss the merit of the new feature and decide whether to accept or decline the proposal. Once accepted and after we confirm the design and implementation plan, we are ready for step 2.

### Step 2: Implementation and Verification
The contributor will proceed to implement the feature, and the GPT-NeoX team will provide guidance/help as needed. The required deliverables include:

* A PR to [EleutherAI/GPT-NeoX](https://github.com/EleutherAI/gpt-neox) including (1) the feature implementation (2) unit tests (3) documentation (4) example usage.
* In the implementation (code, documentation, tutorial), we require the feature author to record their GitHub username as a contact method for future questions/maintenance.

After receiving the PRs, we will review them and merge them after necessary tests/fixes.

### Step 3: Release and Maintenance
After the PRs are merged, we will announce the feature on our website (with credit to the feature author). We ask the feature author to commit to the maintenance of the feature.
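For illustration, a condensed sketch of the setup workflow this guide describes (conda for `clang-format` is assumed, as in the Prerequisites section above):

```bash
# End-to-end contributor setup, condensing the steps described above.
pip install pre-commit pytest pytest-forked
conda install clang-format
pre-commit install                # hooks now run on every `git commit`
pre-commit run --all-files        # one-off check of the whole tree
pytest --forked -v tests/unit/    # unit tests; --forked is needed for CUDA tests
```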