v0.12.1
Composer v0.12.1
Composer v0.12.1 is released! Install via pip:

    pip install --upgrade mosaicml==0.12.1

New Features
- In-Context Learning (#1876)
  With Composer and MosaicML Cloud you can now evaluate LLMs on in-context learning tasks (LAMBADA, HellaSwag, PIQA, and more) hundreds of times faster than other evaluation harnesses. Please see our "Blazingly Fast LLM Evaluation for In-Context Learning" blog post for more details!
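For readers new to in-context learning evaluation, the following is a minimal, standard-library-only sketch of two ideas behind such tasks: building a few-shot prompt from solved examples, and picking a multiple-choice answer by comparing per-token log-probabilities. The function names, the example data, and the log-probability values are all hypothetical, and the scoring rule shown is one common choice, not necessarily the one Composer uses.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(
    examples: Sequence[Tuple[str, str]],
    query: str,
    delimiter: str = "\n",
) -> str:
    """Prepend solved (context, answer) pairs to an unsolved query."""
    shots = [f"{context} {answer}" for context, answer in examples]
    return delimiter.join(shots + [query])


def pick_choice(token_logprobs_per_choice: Sequence[Sequence[float]]) -> int:
    """Return the index of the choice with the highest mean token log-probability.

    Assumes every choice has at least one token.
    """
    means = [sum(lps) / len(lps) for lps in token_logprobs_per_choice]
    return max(range(len(means)), key=means.__getitem__)


# Two solved examples ("shots") followed by the query the model must complete.
prompt = build_few_shot_prompt(
    examples=[
        ("The capital of France is", "Paris."),
        ("The capital of Japan is", "Tokyo."),
    ],
    query="The capital of Italy is",
)

# Made-up per-token log-probabilities for three candidate continuations.
choice = pick_choice([[-2.0, -3.0], [-0.5, -0.7], [-1.5, -4.0]])
```

In a real harness the log-probabilities would come from a forward pass of the model over each candidate continuation; here they are hard-coded so the selection logic can be read in isolation.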
- Added support for CoreWeave Object Storage (#1915)
  CoreWeave object storage is compatible with boto3. Uploading objects to it is almost exactly like writing to S3, except that an endpoint_url must be set via the S3_ENDPOINT_URL environment variable. For example:

      import os

      os.environ['S3_ENDPOINT_URL'] = 'https://object.las1.coreweave.com'

      from composer.trainer import Trainer

      # model and train_dataloader are assumed to be defined elsewhere.
      # Save checkpoints every epoch to s3://my_bucket/checkpoints
      trainer = Trainer(
          model=model,
          train_dataloader=train_dataloader,
          max_duration='10ep',
          save_folder='s3://my_bucket/checkpoints',
          save_interval='1ep',
          save_overwrite=True,
          save_filename='ep{epoch}.pt',
          save_num_checkpoints_to_keep=0,  # delete all checkpoints locally
      )

      trainer.fit()

  Please see our checkpointing documentation for more details.
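As a rough illustration of how an environment variable like S3_ENDPOINT_URL can redirect an S3-compatible client, here is a small sketch using only the standard library. The helper name s3_client_kwargs is hypothetical and this is not Composer's internal implementation; it only shows the general pattern of turning the variable into the endpoint_url keyword argument that boto3 clients accept.

```python
import os
from typing import Dict, Mapping, Optional


def s3_client_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword arguments for an S3-compatible client.

    Hypothetical helper: if S3_ENDPOINT_URL is set, forward it as
    endpoint_url; otherwise return no overrides so the client falls
    back to its default endpoint.
    """
    env = os.environ if env is None else env
    endpoint_url = env.get("S3_ENDPOINT_URL")
    return {"endpoint_url": endpoint_url} if endpoint_url else {}


# Explicit mappings are passed here so the example does not depend on the
# real process environment.
coreweave_kwargs = s3_client_kwargs(
    {"S3_ENDPOINT_URL": "https://object.las1.coreweave.com"}
)
default_kwargs = s3_client_kwargs({})

# With boto3 installed, the result could be used as:
#   boto3.client("s3", **coreweave_kwargs)
```

Keeping the endpoint in the environment means the same s3:// paths work against either backend without code changes.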
- Automatic logging of Trainer hparams (#1855)
  Hyperparameter arguments passed to the Trainer are now automatically logged. Simply set the Trainer argument auto_log_hparams=True.
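To make the idea of automatic hyperparameter logging concrete, the sketch below shows one way a constructor can capture its own arguments and hand them to a logger. ToyLogger and ToyTrainer are hypothetical stand-ins written for this illustration; they do not reproduce Composer's Trainer or its logging internals, and only the auto_log_hparams flag name is taken from the release notes.

```python
from typing import Any, Dict


class ToyLogger:
    """Stand-in for a logger; it simply stores what it is given."""

    def __init__(self) -> None:
        self.hparams: Dict[str, Any] = {}

    def log_hyperparameters(self, hparams: Dict[str, Any]) -> None:
        self.hparams.update(hparams)


class ToyTrainer:
    """Hypothetical trainer that can record its own constructor arguments."""

    def __init__(
        self,
        logger: ToyLogger,
        max_duration: str = "1ep",
        save_interval: str = "1ep",
        auto_log_hparams: bool = False,
    ) -> None:
        # At the top of __init__, locals() holds exactly the constructor arguments.
        init_args = dict(locals())
        # Drop entries that are not hyperparameters.
        init_args.pop("self")
        init_args.pop("logger")
        if auto_log_hparams:
            logger.log_hyperparameters(init_args)


logged = ToyLogger()
ToyTrainer(logged, max_duration="10ep", auto_log_hparams=True)

silent = ToyLogger()
ToyTrainer(silent, max_duration="10ep")  # flag left at its default of False
```

With the flag enabled, the logger receives every constructor argument, including defaults the caller never spelled out, which is the main convenience of logging hyperparameters automatically.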
Bug Fixes
- Update Docker images to use "posix_prefix" paths (#1854)
- Disable new notebook in CI (#1875)
- [Fix] Enable logging of metrics from Callbacks to ConsoleLogging (#1884)
- Ensure loggers run init event before callbacks in Engine (#1890)
- Raise an error in FSDP meta tensor initialization if there's no initialization functions, fix associated flaky FSDP test (#1905)
- Add primitive list support (#1906)
- Add logic for shifting labels before computing metrics (#1913)
- Fixes mis specified dependency (#1919)
- pin setuptools in build requirements (#1926)
- Pin pip<23 in Docker images (#1936)
- Fix bug in trainer.eval and add test cases for test_console_logger (#1937)
What's Changed
- Rename GradMonitor -> OptimizerMonitor; add functionality to log optimizer-specific metrics to assist loss spike investigation by @bmosaicml in #1743
- Add GCS uri support for loading and saving checkpoints by @eracah in #1833
- HF factory function tests by @dakinggg in #1832
- Fix doc issue, Trainer hparam log_to_console defaults to False by @eracah in #1840
- Removed YAHP references from Docs by @bandish-shah in #1841
- Typo by @nguyenhoan1988 in #1843
- Fix source code links in docs by @bandish-shah in #1844
- add importorskip by @dakinggg in #1847
- Update Docker images to use "posix_prefix" paths by @mvpatel2000 in #1854
- Fix typo by @standardAI in #1849
- ConsoleLogger: log first batch and first epoch when using console_log_interval by @eracah in #1860
- Simpler auto log hparams by @eracah in #1855
- Fix typos by @cclauss in #1850
- Bump sphinxext-opengraph from 0.7.3 to 0.7.4 by @dependabot in #1851
- Bump coverage[toml] from 6.5.0 to 7.0.1 by @dependabot in #1853
- Bump traitlets from 5.7.0 to 5.8.0 by @dependabot in #1852
- Bump ipython from 7.32.0 to 8.8.0 by @dependabot in #1865
- Update monai requirement from <0.10,>=0.9.1 to >=0.9.1,<1.2 by @dependabot in #1869
- Bump sphinxcontrib-katex from 0.9.3 to 0.9.4 by @dependabot in #1868
- Bump coverage[toml] from 7.0.1 to 7.0.4 by @dependabot in #1867
- Upgrade docker images to torch==1.13.1 by @abhi-mosaic in #1863
- add more useful info to state by @dakinggg in #1848
- Feature/lambada evaluator by @bmosaicml in #1845
- multi-node distributed training, submitit & composer integration demo by @YilunKuang in #1753
- Daily tests by @mvpatel2000 in #1870
- Disable new notebook in CI by @mvpatel2000 in #1875
- Update deepspeed by @mvpatel2000 in #1864
- fix fail fast in daily by @mvpatel2000 in #1880
- Fix getting started docs by @mvpatel2000 in #1878
- Speed up test_lm_task_evaluation by @mvpatel2000 in #1879
- Fix unprotected import by @mvpatel2000 in #1874
- add ignore_modules to fsdp by @vchiley in #1877
- Change vision image by @mvpatel2000 in #1881
- Fix eval_forward in the ComposerModel ABC by @eracah in #1871
- Fix fsdp weight tying by @bcui19 in #1856
- Bump pytest from 7.2.0 to 7.2.1 by @dependabot in #1886
- Bump ipykernel from 6.19.2 to 6.20.1 by @dependabot in #1887
- Bump gitpython from 3.1.28 to 3.1.30 by @dependabot in #1888
- Update Vision Image in Pytest by @mvpatel2000 in #1882
- Streaming data tests by @dakinggg in #1842
- Add NLP Algorithms Tests by @nik-mosaic in #1839
- rename HF notebook by @dakinggg in #1873
- Ensure loggers run init event before callbacks in Engine by @eracah in #1890
- [Fix] Enable logging of metrics from Callbacks to ConsoleLogging by @eracah in #1884
- Updating how we load metrics in a state_dict so we don't add extra memory overhead by @bcui19 in #1892
- Getting daily tests passing by @dakinggg in #1893
- Bump nbsphinx from 0.8.10 to 0.8.12 by @dependabot in #1897
- Fix docker image by @mvpatel2000 in #1894
- Add primitive list support by @mvpatel2000 in #1906
- Raise an error in FSDP meta tensor initialization if there's no initialization functions, fix associated flaky FSDP test by @bcui19 in #1905
- Gpu Test by @mvpatel2000 in #1907
- Update docker with FFCV fix by @mvpatel2000 in #1908
- Restore GPU tests by @mvpatel2000 in #1909
- Update workflow names by @mvpatel2000 in #1910
- Enable daily gpu tests by @mvpatel2000 in #1911
- Tweak daily GPU tests by @mvpatel2000 in #1912
- Daily GPU Tests -- Change to Git Commit by @mvpatel2000 in #1914
- Add logic for shifting labels before computing metrics by @alextrott16 in #1913
- Add coreweave object store support. by @eracah in #1915
- Fixes mis specified dependency by @dakinggg in #1919
- Bump coverage[toml] from 7.0.4 to 7.1.0 by @dependabot in #1923
- Update importlib-metadata requirement from <6,>=5.0.0 to >=5.0.0,<7 by @dependabot in #1921
- pin setuptools in build requirements by @dakinggg in #1926
- Remove synthetic testing infrastructure for HF/NLP by @dakinggg in #1895
- Add upgrade flags to pip installs by @dakinggg in #1916
- Temporarily pin pip to <23 by @dakinggg in #1930
- add link protection by @mvpatel2000 in #1927
- Cleaning up error checking for FSDP sharding strategies with fp32 precision by @bcui19 in #1925
- Fix mcp script to avoid follow by @mvpatel2000 in #1932
- Emit Eval progress in console logging by @eracah in #1917
- Remove Fused LayerNorm deprecation by @nik-mosaic in #1931
- Add EFA Support for Multinode in AWS by @mvpatel2000 in #1891
- remove jenkins gpu tests by @mvpatel2000 in #1933
- Typo due to stale MCLI docs by @mvpatel2000 in #1934
- Pin pip<23 in Docker images by @bandish-shah in #1936
- Fix bug in trainer.eval and add test cases for test_console_logger by @eracah in #1937
- Add few shot and multiple choice to ICL evaluation by @bmosaicml in #1876
- Disable test_streaming_datasets in pytest-daily by @bandish-shah in #1939
New Contributors
- @bmosaicml made their first contribution in #1743
- @nguyenhoan1988 made their first contribution in #1843
- @standardAI made their first contribution in #1849
- @cclauss made their first contribution in #1850
- @YilunKuang made their first contribution in #1753
- @vchiley made their first contribution in #1877
Full Changelog: v0.12.0...v0.12.1