# References

A collection of articles, papers, and techniques considered in this project.

- Knowledge Distillation: GPT-4 acts as the teacher, generating high-quality Go samples on which to fine-tune the model (see the sketch below).
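
A minimal sketch of how such a distillation step could look, assuming the OpenAI Python SDK is used as the teacher interface; the task list, prompt wording, and output file are illustrative placeholders rather than the project's actual pipeline.

```python
# Hypothetical sketch: generate Go training samples with GPT-4 as the teacher.
# The tasks, prompts, and output file below are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

TASKS = [
    "Write a Go function that reverses a string.",
    "Write a Go function that parses a JSON config file into a struct.",
]

def generate_sample(task: str) -> str:
    """Ask the teacher model for an idiomatic Go solution to a task."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are an expert Go programmer. Reply with Go code only."},
            {"role": "user", "content": task},
        ],
        temperature=0.2,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Store prompt/completion pairs as JSONL for the fine-tuning pipeline.
    with open("go_samples.jsonl", "w") as f:
        for task in TASKS:
            f.write(json.dumps({"prompt": task, "completion": generate_sample(task)}) + "\n")
```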

## Papers

| Name | Description |
| --- | --- |
| Evaluating Large Language Models Trained on Code | Foundational Codex paper; introduces the HumanEval benchmark and large language models for code. |
| CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X | 13B multilingual code model based on GPT. |
| NarrowBERT: Accelerating Masked Language Model Pretraining and Inference | A modified transformer encoder that more than doubles masked language model pretraining throughput. |
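
Since HumanEval scores models with pass@k, a small sketch of the unbiased estimator described in the Codex paper may be useful; the function name and interface here are illustrative.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the Codex paper: the probability that
    at least one of k samples passes, given n generated samples of which
    c are correct, i.e. 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0
    return float(1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: 200 samples per problem, 30 passing, estimate pass@1.
print(pass_at_k(200, 30, 1))
```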

## Articles

| Name | Description |
| --- | --- |
| How to train your own Large Language Models | General advice on training code LLMs, with good guidance on the data pipeline. |
| MosaicBERT: Pretraining BERT from Scratch for $20 | Optimized BERT training recipe. |

## Libraries

| Name | Description |
| --- | --- |
| Microsoft DeepSpeed | Efficient, fast LLM training with optimized Transformer layers and backprop implementations. |
| NVIDIA FasterTransformer | Optimized inference for Transformer models. |
| OpenAI Triton | Language and compiler for writing highly efficient custom deep-learning primitives on GPUs. |
| Torch Compile | Compiles PyTorch models into optimized kernels. |
| PyTorch 2.0 Nightly Release | Includes Triton support when using Torch Compile. |
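
As a quick illustration of how Torch Compile is used (and where Triton enters the picture on recent PyTorch builds), here is a minimal sketch; the toy model and tensor shapes are placeholders.

```python
# Minimal sketch of torch.compile (PyTorch 2.0+). On CUDA devices the default
# "inductor" backend lowers eligible ops to Triton kernels; the model below is
# a placeholder, not the project's architecture.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))
compiled_model = torch.compile(model)  # default backend is "inductor"

x = torch.randn(8, 512)
loss = compiled_model(x).sum()
loss.backward()  # the backward pass also runs through the compiled graph
```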