Releases: determined-ai/determined

0.14.0

05 Feb 21:39

Changelog

9ee2fa4 chore: bump version: 0.14.0rc4 -> 0.14.0
e7da518 docs: Minor edits to release notes for 0.14.0.
31c3ad4 edit
b09f452 edit
3f57533 Tweak release notes.
90cdf6e Tweaks for release notes.
def0d9b docs: Release notes for 0.14.0.
82ffa7e chore: bump version: 0.14.0rc3 -> 0.14.0rc4
55a3c22 chore: revert default images and framework versions (#1936)
5c9081b chore: bump version: 0.14.0rc2 -> 0.14.0rc3
94820c3 docs: edit model debug doc (#1925)
ef702dc fix: correct the comparison function when numbers are fractions [DET-4969] (#1924)
8369e11 refactor: paginate experiment trials [DET-4900, DET-4921, DET-4922] (#1892)
017764f fix: correct cancel confirm button label to confirm [DET-4966] (#1922)
77740fa fix: buffer the trial log in the correct order [DET-4931] (#1912)
1ddb99b chore: bump version: 0.14.0rc1 -> 0.14.0rc2
21c704d chore: improve resource pool details presentation [DET-4968] (#1926)
836414b fix: remove clickable style from trial info table [DET-4967] (#1923)
b03a114 fix: add default query limit and add missing sort by state [DET-4919] (#1921)
fd94f24 docs: fix incorrect reference in docs (#1919)
7a1f051 fix: typos in model debugging doc (#1918)
2b0031e fix: fix a utilization calculation error in hgi resource bar for cpu slots [DET-4913] (#1911)
5de2710 docs: clean up resource pool docs and add release notes (#1917)
4f6b6ba feat: support resource pools in det-deploy local agent-up [DET-4938] (#1906)
6c3150b refactor: update active experiments [DET-4915] (#1910)
4a86995 chore: bump version: 0.14.0rc0 -> 0.14.0rc1
024b9fa fix: correct best and latest metric sort by params for the GET experiment trials API (#1915) [DET-4920]
733dca8 fix: add non-scalar metric expectation to protobufs [DET-4893] [DET-4911] (#1876)
fb4e119 feat: support more fields to sortBy in /api/v1/experiments/trials [DET-4219, DET-4920] (#1899)
a152fd3 chore: Bump images and versions to Tensorflow 2.4.1 (#1913)
099d5b2 chore: let CLI verify the master using combined system/custom certs (#1859) [DET-4666]
e25c054 docs: add model debug doc (#1895)
c0dd89e chore: reword resource pool ui presentation [DET-4925] (#1898)
6f84f76 chore: bump version: 0.14.0.dev0 -> 0.14.0rc0
10af5ee chore: bump version: 0.13.14 -> 0.14.0.dev0
e933032 chore: bump version: 0.13.14rc0 -> 0.13.14
fe7973b chore: bump version: 0.13.14.dev0 -> 0.13.14rc0
22be8c5 revert: "Revert "fix: migrate trial log ID to bigint (#1792)" (#1901)" (#1902)
1e0948d Revert "fix: migrate trial log ID to bigint (#1792)" (#1901)
4a262bc chore: save user preference for cluster view [DET-4926] (#1896)
63abc66 fix: migrate trial log ID to bigint (#1792)
142b9d1 feat: show resource pools without connected agents [DET-4924] (#1897)
6987f34 chore: move task messages to sproto (#1891)
24a12bc docs: add topic guide for commands and shells (#1886) [DET-4901]
c353fa1 feat: Documentation and CI of support for NVIDIA A100's and Google A2 instances (#1888)
fa85d33 chore: fix type errors in IPC code (#1885)
058ba7d docs: improve resource pool docs (#1865)
702add8 chore: update trials API name (#1873)
e434619 fix: render resource pools in order (#1883)
55885bc chore: Upgrading environment and dependencies to PyTorch 1.7 and TensorFlow 2.4 (#1851)
ea9abae fix: don't lose logs of short-lived commands (#1882) [DET-4907]
2b8a99c fix: trial hangs when it fails to write to the DB (#1877)
d0b88fe fix: allow NULL trials.request_id for backwards compat (#1881)
434094b ci: tolerate longer time for concurrent log uploading [DET-4908] (#1879)
facd721 ci: fix kubernetes configuration resolution [DET-4909] (#1880)
fbe48c8 fix: CI failures caused by resource pool merge (#1864)
9a0d051 perf: move experiment API filtering and pagination to database [DET-4770] (#1803)
6be4a49 fix: altered tf.config function call to be compatible with tf 1.15 [DET-4852] (#1836)
e61415f build: fix for make check-schemas on non-GNU build machines (#1871)
d4a6a9a chore: update CLI commands to display resource pool [DET-4677] (#1709)
85267b5 docs: clarify preemptible instance doc for static and dynamic agents (#1870)
811b7dc chore: expose det-deploy AWS profile support [DET-4891] (#1868)
4bc23a9 chore: restart in-progress HP importance computation on master restart [DET-4675] (#1844)
d114d46 chore: remove the deprecated PyTorch API [DET-3262] (#1784)
d97735f fix: allow larger gRPC response bodies (#1869)
f987602 feat: enable new hgi-aware cluster page [DET-4854] (#1855)
7d4f248 build: avoid re-downloading codegen binary (#1838)
916be66 fix: handle learning curve edge cases [DET-4832] (#1827)
3810fc5 ci: check that Go dependencies are tidied (#1852)
93332bf test: print start time for each E2E test (#1867)
b7721a4 fix: order migrations in the order they landed (#1866)
71aa544 expand details in resource pool modal [DET-4884] (#1857)
54fd6fc feat: add resource pools (#1846)
05266cd docs: Release notes for 0.13.13. (#1843)
e657379 chore: bump version: 0.13.13.dev0 -> 0.13.14.dev0
5e88f2b feat: swap master restart to be snapshot based (#1745) [DET-816]
0db30c6 fix: don't set default trial log limit in CLI (#1856)
a8d4dd2 fix: fix CLI log tailing with elastic (#1853) [DET-4883]
354cdfa fix: another place scheduler config for resource pool not being inherited (#1854)
831235e feat: add resource pool column to tasks list (#1831)
a1821c8 feat: add resource pool column to experiment list (#1819)
ae8011c fix: webui trial logs should not use negative offset (#1845)
d812ab7 chore: connect HGI UI to its API [DET-4638] (#1837)
4fb65a4 fix: scheduler config for resource pool not being inherited (#1847)
6523a8c chore: update cluster utilization overview [DET-4346] (#1788)
9f57c9d docs: fix readme to clarify gpu vs cpu
5f661c8 chore: add custom error for torch's ReduceLROnPlateau (#1849)
2617e74 chore: bump taiko-video version to fix ffmpeg / screenshot save race condition (#1850)
2f2e5ee perf: index as few log fields as possible to increase elasticsearch ingest speed (#1848)
c6b1f0e fix: increase trial log timestamp resolution to support milliseconds [DET-4861] (#1841)
f9d31f9 chore: enable some more Go linters (#1839)
b4b1fe2 chore: retry for more errors when uploading to GCS (#1794)
2d2e96e chore: fix duplicates in elastic log ids (#1834)
79fe5ea chore: add missing apiKey update to internal streaming sdk (#1833)
57a9acc chore: update storybook to resolve github security vulnerability for highlight.js (#1808)
342527c chore: fix trial log following logic (#1832) [DET-4850]
91e2800 chore: Endpoint and infrastructure for hyperparameter importance computation [DET-4464] (#1707)
adc4361 chore: experiment API returns resource pool info [DET-4572] (#1711)
9a6da7a chore: fix ExitedReason log (#1829)
cf3accb fix: dars_penntreebank_pytorch example [DET-4841] (#1822)
513136d fix: show zoom out tip when zoomed into learning curve chart (#1828)
128531e chore: Revert DET-4688, do not support single-trial experiments in trials-sample endpoint [DET-4840] (#1824)
0b24334 fix: update model def button to be a raw link (#1826)
79df210 chore: various elastic fixes (#1825) [DET-4839]
b8d9e20 fix: update types to support new log levels (#1823)
9ae41f5 fix: revert broken user-facing change with experiment config logic (#1821)
b021498 docs: Add Lunch and Learn promotion to README.md (#1815)

Docker images

  • docker pull determinedai/determined-master:0.14.0
  • docker pull determinedai/determined-master:9ee2fa43
  • docker pull determinedai/determined-master:9ee2fa4321ff127bd0a08a90d15fa524d73b597c
  • docker pull determinedai/determined-dev:determined-master-9ee2fa43
  • docker pull determinedai/determined-dev:determined-master-9ee2fa4321ff127bd0a08a90d15fa524d73b597c

0.13.13

26 Jan 00:19

Changelog

d352a9b chore: bump version: 0.13.13rc5 -> 0.13.13
4da0d0e docs: Release notes for 0.13.13. (#1843)
d982d2a chore: bump version: 0.13.13rc4 -> 0.13.13rc5
f4471de fix: don't set default trial log limit in CLI (#1856)
ec92844 fix: fix CLI log tailing with elastic (#1853) [DET-4883]
c5513cf fix: another place scheduler config for resource pool not being inherited (#1854)
a7e0709 chore: bump version: 0.13.13rc3 -> 0.13.13rc4
e2a7846 perf: index as few log fields as possible to increase elasticsearch ingest speed (#1848)
0a164fc fix: scheduler config for resource pool not being inherited (#1847)
c4dc32f fix: webui trial logs should not use negative offset (#1845)
bec1579 chore: bump version: 0.13.13rc2 -> 0.13.13rc3
64a9ff2 chore: fix duplicates in elastic log ids (#1834)
882b7f9 chore: fix trial log following logic (#1832) [DET-4850]
376619a chore: add missing apiKey update to internal streaming sdk (#1833)
6d906b3 chore: bump version: 0.13.13rc1 -> 0.13.13rc2
6419473 chore: bump version: 0.13.13rc0 -> 0.13.13rc1
60acd7d fix: dars_penntreebank_pytorch example [DET-4841] (#1822)
8dba87e fix: show zoom out tip when zoomed into learning curve chart (#1828)
9506c87 chore: Revert DET-4688, do not support single-trial experiments in trials-sample endpoint [DET-4840] (#1824)
14aa243 fix: update model def button to be a raw link (#1826)
6ce2882 chore: various elastic fixes (#1825) [DET-4839]
a69fc05 fix: update types to support new log levels (#1823)
2b1790c fix: revert broken user-facing change with experiment config logic (#1821)
9e6ca59 chore: bump version: 0.13.13.dev0 -> 0.13.13rc0
7fb3926 chore: lock api state for backward compatibility check
6c1840e chore: webui support elastic search trial logs [DET-4616] (#1801)
0dd6900 fix: requested stops shouldn't be treated as errored (#1818)
1bcda69 chore: Update bumpversion (#1820)
7808e6f chore: return timestamp and level per log from trial logs API [DET-4825] (#1814)
0faa677 docs: add doc for det-deploy aws list (#1813)
9dacff0 feat: enable learning curve [DET-4776, DET-4792] (#1796)
955177e fix: the examples that use the old Pytorch APIs (#1810)
b783761 chore: TF RNG in Estimators test would require a session [DET-4624] (#1811)
5296282 feat: det-deploy aws list (#1790)
cf40ffd chore: fix typos in release notes (#1812)
05210e0 chore: change API response fields from placeholders to empty strings (#1809)
9e687e5 fix: CI failing due to PR #1724 (#1807)
e3ab038 docs: release notes for 0.13.12 (#1782)
0a89c12 chore: remove tensorpack test [DET-4790] (#1793)
204d363 feat: add HGI table view [DET-4634 DET-4637] (#1778)
fbec0fd refactor: update default user filters to All when user is an admin (#1799)
bcc8ef0 ci: reduce webui e2e-tests flakes [DET-4973] (#1798)
20f5fe1 feat: add API endpoint that returns information about the resource pool (#1724)
77df93f docs: deprecate old PyTorch API (#1783)
6608389 fix: always encode trial ID as string [DET-4789] (#1800)
9307710 feat: add metric column to learning curve table (#1785)
1e2dc78 chore: update trials-sample endpoint to support a single trial experiments [DET-4688] (#1791)
4dbf58d chore: return consistent log IDs [DET-4789] (#1775)
4c1be3a docs: add docs for elasticsearch-backed trial log features (#1768)
bb0ab5e chore: deal gracefully with missing values in trials-sample [DET-4771] (#1787)
74868c6 chore: tune Fluent Bit logging performance [DET-4714] (#1643)
552bc1f docs: add instructions to get started with Determined locally to README and docs (#1747)
31c53ad style: update WebUI style (#1767)
4f60859 ci: build React in development mode for E2E tests (#1789)
7856533 fix: learning curve tuning [DET-4375, DET-4692, DET-4711] (#1753)
4c8af11 chore: remove IDE Setup how-to doc [DET-4769] (#1786)
e5928d9 test: add master logs test [DET-4684] (#1780)
4aab166 chore: Support changing default images in det-deploy on cloud [DET-4689] (#1729)
fd080e5 docs: edit YAML topic guide (#1734)
f7c99ce chore: put index template in integrations to avoid precision related race (#1776)
9005f14 fix: webui responsive table showing unneeded horizontal scrollbar [DET-4710] (#1760)
adf587d chore: resolve security packages [DET-4733] (#1773)
0bb5e51 refactor: update tests to be more reliable (#1779)
33e280c ci: remove Cypress WebUI tests [DET-4580, DET-4755] (#1777)
0cba2c5 docs: update task configurations [DET-4731] (#1762)
000a4ab ci: increase resource class for packaging steps (#1774)
3f82a2c feat: add hgi basic card view and slots bar [DET-4635 DET-4632] (#1717)
000293a fix: relax type expectations for hparam values [DET-4742 DET-4744] (#1763)
dff17ae feat: json-schema for experiment config validation (#1715)
6aaed77 fix: workaround boto3+minio bug (#1770)
805b29f docs: fix incorrect field name in docker registry creds docs (#1769)
f6107db fix: add endTime to metric workloads [DET-4743] (#1772)
b9ecc47 chore: fix up Go dependencies (#1766)
08a3e73 fix: use cookie token from sso if applicable (#1765)
c63891a chore: show more information if Fluent Bit exits (#1764)
7e170c0 chore: switch library used for connecting to PostgreSQL [DET-4592] (#1761)
97b929b chore: update caniuse dev package (#1756)
0015ebb chore: refactor & migrate experiment details and trials [DET-4020] (#1730)
ed55758 fix: check for a cookie token and verify auth with it [DET-4732] (#1758)
b86afac Release notes for 0.13.11 (#1759)
b5b41d2 fix: add boolean to accepted hparam types [DET-4727] (#1757)
a1fe795 fix: don't request GPUs for Fluent Bit container (#1755)
074eae5 feat: add searcher-specific InvalidHP logic [DET-4334, DET-4335] (#1698)

Docker images

  • docker pull determinedai/determined-master:0.13.13
  • docker pull determinedai/determined-master:d352a9ba
  • docker pull determinedai/determined-master:d352a9ba0d291a75263d10218b70b88132e78678
  • docker pull determinedai/determined-dev:determined-master-d352a9ba
  • docker pull determinedai/determined-dev:determined-master-d352a9ba0d291a75263d10218b70b88132e78678

0.13.12

12 Jan 04:51

Changelog

6f6280e chore: bump version: 0.13.12rc2 -> 0.13.12
335057a chore: bump version: 0.13.12rc1 -> 0.13.12rc2
690c725 docs: Minor grammatical change for 0.13.12 release notes.
27758a4 chore: bump version: 0.13.12rc0 -> 0.13.12rc1
dcc5949 docs: Release notes for 0.13.12.
3ce5a95 chore: bump version: 0.13.12.dev0 -> 0.13.12rc0
9ef6fc6 chore: bump version: 0.13.11 -> 0.13.12.dev0
a19eb5f fix: relax type expectations for hparam values [DET-4742 DET-4744] (#1763)
798fa4f fix: add endTime to metric workloads [DET-4743] (#1772)
783c810 fix: use cookie token from sso if applicable (#1765)

Docker images

  • docker pull determinedai/determined-master:0.13.12
  • docker pull determinedai/determined-master:6f6280e7
  • docker pull determinedai/determined-master:6f6280e7807996eebe64df4e3503d1b08fc63c57
  • docker pull determinedai/determined-dev:determined-master-6f6280e7
  • docker pull determinedai/determined-dev:determined-master-6f6280e7807996eebe64df4e3503d1b08fc63c57

0.13.11

07 Jan 18:03

Changelog

16860f3 chore: bump version: 0.13.11rc6 -> 0.13.11
68015b7 chore: bump version: 0.13.11rc5 -> 0.13.11rc6
8fdc4bc Revert "chore: add priority scheduling"
42631e4 chore: bump version: 0.13.11rc4 -> 0.13.11rc5
03e6406 chore: add priority scheduling
2c299b2 chore: bump version: 0.13.11rc3 -> 0.13.11rc4
2f7207c chore: bump version: 0.13.11rc2 -> 0.13.11rc3
de6b50d fix: check for a cookie token and verify auth with it [DET-4732] (#1758)
a743bf4 fix: add boolean to accepted hparam types [DET-4727] (#1757)
75cec07 Release notes for 0.13.11 (#1759)
81b5780 chore: bump version: 0.13.11rc1 -> 0.13.11rc2
8098ee4 fix: don't request GPUs for Fluent Bit container (#1755)
73d7c62 chore: bump version: 0.13.11rc0 -> 0.13.11rc1
cb24990 chore: bump version: 0.13.11.dev0 -> 0.13.11rc0
bf60ee4 chore: bump version: 0.13.10.dev0 -> 0.13.11.dev0
31bb660 chore: lock api state for backward compatibility check
2242c2c chore: add F1 score example of pytorch custom reducers [DET-4724] (#1752)
b9a9408 chore: hack tensorboard support to include custom metrics (#1750)
ddf8ee9 feat: support custom reducers for PyTorch (#1647)
c7fdddf feat: support agent label for "det-deploy local agent-up" [DET-4713] (#1748)
0e240a4 feat: learning curve [DET-4445] (#1731)
6a9e731 style: sort interface keys and type literals (#1699)
1c818d8 test: fix race condition in test-intg-agent (#1741)
bab2337 chore: fix mnist_data_layer convergence test (#1744)
ef59afa ci: split CircleCI tests more carefully (#1740)
1947408 feat: enable configuring trial logs backend on Kubernetes (#1737)
f8b89bc ci: upgrade version of CircleCI Helm orb (#1739)
d1ece88 feat: rebase onto horovod 0.21.0 [DET-4668] (#1720)
883bc5d fix: command priority not respected [DET-4674] (#1735)
3f87b24 test: use updated GKE version (#1736)
c68b32b docs: add topic guide for priority scheduler [DET-4670] (#1703)
15794fa chore: log through Fluent Bit on Kubernetes [DET-4622] (#1712)
bc52313 chore: migrate getInfo endpoint [DET-4406] (#1713)
8ccb40d fix: webui table horizontal scroll [DET-4660] (#1722)
f991a04 feat: support multiple backward call per train_batch in pytorch [DET-4667] (#1732)
ecc7853 build: allow parallel runs of js and css checks (#1727)
f5e58c3 chore: update to Go 1.15 (#1716)
ac52ec8 fix: fix asha max concurrent trials (#1719)
eb76a2c fix: fix missing field in function call from bad merge (#1726)
73799c5 test: integration test fluent with postgres and elasticsearch backends (#1705)
1a1f756 fix: accept None-type hyperparameters with --local (#1704)
01cde0f chore: webui MultiSelect storybook [DET-4040] (#1714)
5cc385a docs: clarify some Kubernetes-related docs (#1708)
374e747 fix: BERT SQuAD example works with latest stable transformers [DET-4680] (#1718)
8aa6cd3 feat: support order by in trial logs api [DET-4647] (#1706)
96ca862 chore: update command APIs to return resource pool info [DET-4568 DET-4569 DET-4570 DET-4571] (#1710)
d71dfc4 style: fix mobile steps table [DET-4669] (#1701)
491b062 chore: provide telemetry information in new api [DET-4642] (#1672)
2faade7 chore: priority scheduler unit tests [DET-4513] (#1658)
cc491f3 chore: migrate trial details endpoint [DET-4021] (#1674)
b3d70e2 chore: set MinCapacity for RDS for secure det-deploy to 2 (#1535)
7127d26 chore: Updates to 0.13.10.dev0. (#1702)
aa1389e feat: hp viz skeleton [DET-4494, DET-4495, DET-4545] (#1618)
8b35c38 test: integration test elastic-backend trial logs APIs (#1675)
d5a913d chore: restart fluentbit on failures [DET-4665] (#1696)
194af45 chore: add resource pools mock api [DET-4639] (#1662)
b1c2930 chore: add dev hgi cluster page and stat overview [DET-4633] (#1676)
7ed21d4 style: correct mobile viewport [DET-4664] (#1695)
d9add13 feat: port of DETR (#1470)
39b6d12 chore: minor copy update to trial log datetime filters (#1692)
f0817a8 chore: hide the trial logs filters when there are no filter options (#1690)
f013ea3 chore: camelcase required api attr names [DET-4648] (#1681)
c4c6694 chore: fix a shadow var declaration (#1691)
4b0cf95 fix: prevent mobile tabbar from opening new window for Master Logs [DET-4654] (#1688)
72cbe09 docs: document setting priorities in experiment config (#1687)
919d9cf chore: sort trial log's filter options (#1686)
a29ccd6 fix: add abort controller to trial log endpoints (#1689)
db1b7ca chore: move dev dependencies out of dependencies (#1679)
5c69af3 fix: add K8s disclaimer for mmdetection example (#1683)
27a67b6 ci: fix windows cli test (#1685)

Docker images

  • docker pull determinedai/determined-master:0.13.11
  • docker pull determinedai/determined-master:16860f3f
  • docker pull determinedai/determined-master:16860f3fd2495af53913a9a62d7330898004b671
  • docker pull determinedai/determined-dev:determined-master-16860f3f
  • docker pull determinedai/determined-dev:determined-master-16860f3fd2495af53913a9a62d7330898004b671

0.13.10

11 Dec 01:03

Changelog

93b9369 chore: bump version: 0.13.10rc6 -> 0.13.10
7a6d5e2 Revert "chore: bump version: 0.13.10rc6 -> 0.13.10"
df0a42d docs: Release notes for 0.13.10. (#1693)
b5f9c60 chore: bump version: 0.13.10rc6 -> 0.13.10
898aa79 chore: lock api state for backward compatibility check
bc522c7 chore: bump version: 0.13.10rc5 -> 0.13.10rc6
6e6fa1c chore: restart fluentbit on failures [DET-4665] (#1696)
df00694 style: correct mobile viewport [DET-4664] (#1695)
237cf19 chore: bump version: 0.13.10rc4 -> 0.13.10rc5
9bc6a9b chore: fix a shadow var declaration (#1691)
e8035b5 chore: bump version: 0.13.10rc3 -> 0.13.10rc4
434ed71 chore: hide the trial logs filters when there are no filter options (#1690)
ba1ac13 fix: prevent mobile tabbar from opening new window for Master Logs [DET-4654] (#1688)
fe052e4 chore: sort trial log's filter options (#1686)
a687ec5 docs: document setting priorities in experiment config (#1687)
094f3dc chore: bump version: 0.13.10rc2 -> 0.13.10rc3
554706d fix: add abort controller to trial log endpoints (#1689)
0a5ed77 chore: bump version: 0.13.10rc1 -> 0.13.10rc2
9f67d2c ci: fix windows cli test (#1685)
a3b830c chore: bump version: 0.13.10rc0 -> 0.13.10rc1
099bfe4 fix: add K8s disclaimer for mmdetection example (#1683)
f7cfdca chore: bump version: 0.13.10.dev0 -> 0.13.10rc0
25092a3 chore: bump version: 0.13.9.dev0 -> 0.13.10.dev0
99a848f feat: support configuring priority scheduler in det-deploy [DET-4508] (#1682)
9054865 feat: read logs from elasticsearch [DET-4621] (#1637)
7da1266 fix: let shells work through an HTTP proxy/load balancer [DET-4469] (#1677)
c4b2df0 feat: expose some Fluent Bit logs in the agent (#1680)
7106bef fix: flush searcher events more often (#1673) [DET-4644]
cc351ef fix: template merging for bind mounts [DET-4630] (#1678)
63e13a4 feat: webui trial logs improvements [DET-4228, DET-4480, DET-4481, DET-4482, DET-4483, DET-4594] (#1650)
f4a3afd feat: display task priorities in CLI [DET-4515, DET-4516] (#1639)
7ed3d9c feat: make priority scheduler configurable and add docs [DET-4641] (#1667)
33a1885 fix: limit concurrent restores to avoid resource exhaustion [DET-4556] (#1666)
a2a807e feat: allow task requests to receive a resource pool field [DET-4342] (#1600)
912c375 chore: Improvements to metrics streaming endpoints [DET-4532] (#1664)
85fee6a docs: update k8s limitations [DET-4582] (#1671)
6bc6bd6 docs: reorganize the documentation contents (#1663)
155a08b fix: allow dots in map config keys (#1665)
64550ef chore: Work around possible bug when PyTorch opens checkpoint files [DET-4614] (#1661)
3c276f0 ci: add taiko video plugin (#1632)
f1dbf70 chore: migrate fork/continue experiment endpoint [DET-4023] (#1659)
ab6bdfb test: integration test trial logs API (#1456)
6e0dae1 chore: Ensuring examples always call .contiguous() before .view() [DET-4613] (#1653)
d05e222 feat: priority scheduler with preemption [DET-4512] (#1634)
2e5f4af chore: migrate wait page to react [DET-4521] (#1559)
7aecb05 test: add responsive navbar test [DET-4603] (#1656)
4ce4623 test: avoid swallowing errors in taiko (#1657)
466a226 test: reduce state sharing between e2e tests [DET-4487] (#1543)
764e969 build: set server address for react preview build. (#1654)
d7fb07b feat: allow default password in kubernetes [DET-4435] (#1624)
80187b9 feat: support elasticsearch as a trial logging backend [DET-4179] (#1542)
0929c8a refactor: update filters to dynamically collapse when needed [DET-4584] (#1629)
8f011f3 fix: don't replay duplicates in master restart [DET-4525, DET-4599] (#1655)
24428f6 feat: add telemetry for schedulers [DET-4517] (#1645)
ef4e9ae chore: extend on complete hook for overflow actions (#1651)
93007ed chore: webui migrate ActiveExperiemnt context polling to new API [DET-4068] (#1560)
d94824e refactor: add abort api calls [DET-4585] (#1630)
e7321ee docs: fix rest-api side menu links [DET-4598] (#1640)
0322807 ci: run some E2E AWS CI tests with the master using TLS [DET-4606] (#1529)
78ec837 feat: support validation_steps in configure_fit() [DET-4529] (#1649)
68b7b08 feat: support max zero slot containers for resource pools [DET-4309,DET-4340] (#1507)
7cbd5da docs: fixes for EKS docs. (#1636)
dd89200 fix: limit --local --test mode to 1 gpu [DET-4602] (#1648)
f0eba7d chore: lock api state for backward compatibility check (#1644)
c26de48 chore: bump version: 0.13.8.dev0 -> 0.13.9.dev0 (#1641)
1ad4042 chore: fix fluent lua filter (#1642)
00d38b9 fix: fix wait page url for notebook launch request [DET-4586] (#1631)
fe901d5 fix: track best validation metric for darts_cnn hp search benchmark (#1638)
1d58515 feat: add documentation for AWS custom tags (#1621)
4dcea3e chore: Checking for more instances of EagerTensors [DET-4566] (#1633)
a77657d fix: unets_tf_keras example data download (#1635)
1210b77 feat: return resource_pool in agent GET APIs [DET-4567] (#1616)
422b900 chore: Adding NVIDIA Tesla A100 details for GCE [DET-4583] (#1628)
e3c240a style: responsive webui [DET-4417, DET-4420] (#1501)
2166ad7 test: fix CircleCI timing-based splitting for E2E tests (#1622)
be1ed46 chore: fix keras validation for dtrain (#1626)
8384e5c chore: increase timeout for metrics stream tests that are flaky on CI infra [DET-4581] (#1627)
fb86450 ci: add gauge taiko [DET-4576] (#1619)
4855079 feat: Add custom tags to AWSClusterConfig (#1561)
2e2c121 fix: clean up data handling in TFKerasTrial (#1564)
74d4196 feat: support per command shmSize settings [DET-4577] (#1620)
4601794 fix: tensorboard to load from experiment list via table batch (#1617)
4e7e738 ci: update task and authentication tests to be more reliable (#1614)
b323c24 Fix typo (#1611)
bff6f27 feat: ALBERT on SQuAD 2.0 example (#1609)
d63897f chore: include offset in trial log IDs returned to webui [DET-4561] (#1608)
e1e514a fix: webui unarchive button loading state [DET-4017] (#1594)
7509e1a chore: webui migrate killTask to new API [DET-4019] (#1589)
8d1eb2d docs: Release notes for 0.13.8. (#1603)
4fbb6e8 fix: honor DET_MASTER_CERT_NAME with shells (#1604)
43f4901 chore: update CODEOWNERS to be opt-in (#1595)
eb45fee chore: add endpoint to stream trial log fields [DET-4479] (#1537)
769a5d8 chore: bumpenvs (#1599)
55c6ccb fix: allow clients to override the expected master cert address [DET-4547] (#1588)
da38010 feat: change the priority scheduler to round robin scheduler [DET-4514] (#1596)
6a37a5f chore: bumpenvs for NCCL update (#1593)
ec9abc3 feat: implement API endpoint for sampling streams of metrics from the best trials [DET-4441] (#1571)
d8f536e feat: InvalidHP Searcher Ability [DET-4333] (#1550)
8b6ab79 feat: propagate task priorities to resource pools [DET-4510] (#1577)
e8a6784 chore: Tag metrics streaming APIs as Internal due to their less-stable or less-supported status [DET-4546] (#1592)
175f212 fix: issue data type warnings in to_device (#1591)
5e9b6d3 fix: tensorboard with absolute storage_path (#1590)
aedab51 chore: webui migrate getAgents to new API [DET-3844] (#1576)
1bb1990 feat: add configurations for priority scheduler [DET-4507, DET-4509] (#1565)

Docker images

  • docker pull determinedai/determined-master:0.13.10
  • docker pull determinedai/determined-master:93b93697
  • docker pull determinedai/determined-master:93b9369791d9278be68e720ab65c5328d3fed5b9
  • docker pull determinedai/determined-dev:determined-master-93b93697
  • docker pull determinedai/determined-dev:determined-master-93b9369791d9278be68e720ab65c5328d3fed5b9

0.13.9

20 Nov 21:13

Changelog

3f0ec0c chore: bump version: 0.13.9rc4 -> 0.13.9
149a461 docs: Release notes for 0.13.9 (#1623)
06f56dd chore: bump version: 0.13.9rc3 -> 0.13.9rc4
0b35cea chore: bump version: 0.13.9rc2 -> 0.13.9rc3
bb8be81 docs: More changes for release notes for 0.13.9.
cbcc7b1 feat: support per command shmSize settings [DET-4577] (#1620)
5490862 chore: bump version: 0.13.9rc1 -> 0.13.9rc2
d7b20a3 fix: tensorboard to load from experiment list via table batch (#1617)
a76e0cf chore: bump version: 0.13.9rc0 -> 0.13.9rc1
d1992c3 chore: bump version: 0.13.9.dev0 -> 0.13.9rc0
fa24de5 docs: Release notes for 0.13.9.
c96d738 chore: include offset in trial log IDs returned to webui [DET-4561] (#1608)
7fac747 chore: bump version: 0.13.8 -> 0.13.9.dev0
5264be3 docs: Release notes for 0.13.8. (#1603)

Docker images

  • docker pull determinedai/determined-master:0.13.9
  • docker pull determinedai/determined-master:3f0ec0ce
  • docker pull determinedai/determined-master:3f0ec0ce8d9dbe4256a630156480661f3a8c2ff1
  • docker pull determinedai/determined-dev:determined-master-3f0ec0ce
  • docker pull determinedai/determined-dev:determined-master-3f0ec0ce8d9dbe4256a630156480661f3a8c2ff1

0.13.8

18 Nov 00:01

Changelog

e129534 chore: bump version: 0.13.8rc4 -> 0.13.8
29b7568 chore: bump version: 0.13.8rc3 -> 0.13.8rc4
0298ec4 chore: bump version: 0.13.8rc2 -> 0.13.8rc3
ce59a44 chore: bump version: 0.13.8rc1 -> 0.13.8rc2
b9c36f1 fix: honor DET_MASTER_CERT_NAME with shells (#1604)
d40f50f chore: bump version: 0.13.8rc0 -> 0.13.8rc1
27439a5 chore: bumpenvs (#1599)
9aedc57 fix: allow clients to override the expected master cert address [DET-4547] (#1588)
66fa2ce chore: bumpenvs for NCCL update (#1593)
0bb6bcb chore: Tag metrics streaming APIs as Internal due to their less-stable or less-supported status [DET-4546] (#1592)
d15d9bc chore: bump version: 0.13.8.dev0 -> 0.13.8rc0
9f8f306 chore: fix old checkpoint export apis [DET-4538] (#1582)
9b5dcb4 feat: support trial log filtering in CLI [DET-4489] (#1429)
38c2c8f docs: Document which network ports PEDL uses for inter-agent communic… (#1464)
ac1a57c chore: webui migrate patchExperiment to new API [DET-4017] (#1557)
69c3ae7 docs: add rest api to the references toc [DET-4381] (#1586)
33aed96 fix: fix sign in page link to docs (#1587)
be27ffe fix: restore keras TensorBoard wrapper (#1580)
c0a3603 chore: unets_tf_keras README note (#1581)
b1c8c1c fix: check for error when converting k8s watch objects (#1572)
d18b187 chore: add current page to trial logs breadcrumb [DET-4530] (#1569)
6442e38 chore: server-side hashing for password change [DET-4534] (#1574)
743ed58 fix: fix wait page asset paths [DET-4496 DET-4497] (#1551)
3e387e0 test: use synthetic data for gpu detection test (#1573)
a850b04 fix: avoid hangs during validation for TFKeras [DET-4434] (#1555)
71c9845 fix: fix experiment archive action going out of sync [DET-4535] (#1575)
899913e chore: add a link to release notes in update notification [DET-4097] (#1552)
95d257d fix: fix ctr-click failing to open experiment rows [DET-4531] (#1570)
911355a feat: support configure_fit() in TFKerasTrial (#1566)
43a18a7 docs: clean up API docs [DET-4399] (#1568)
d6ea9f1 feat: add streaming endpoint for trial metrics at a specific point [DET-4440] (#1562)
e8a5a14 feat: add navigation sidebar and breadcrumbs to log views [DET-4394] (#1546)
2b96d77 fix: use agent's master location overrides for Fluent Bit (#1567)
df51683 chore: fix reporting for verbose=1 (#1563)
5b73331 feat: update trial route [DET-4402] (#1549)
bcc0ea8 add a test to disable and enable slots. (#1548)
ce2db3e fix: webui label select misalignment [DET-4416] (#1554)
0dba3de feat: add streaming endpoints with metric metadata for future UI work [DET-4439] (#1538)
765275a feat: add support for TFKeras models that subclass tf.keras.Model [DET-4393, DET-4103, DET-3257, DET-3217] (#1495)
5bdc0e3 build: add a pre-release script [DET-4414] (#1505)
85d6e68 chore: correct the percentage reporting for verbose=1 (#1558)
9567e8e fix: make trial actor still handle ContainerLog messages (#1553)
264c8fb feat: use Fluent Bit for trial logs [DET-4178] (#1462)
108a0e1 fix: fix disabling slots [DET-4492] (#1547)
47f3572 fix: fix not being able to find resource pool [DET-4477] (#1544)
7f57be6 fix: keras callbacks [DET-4299] [DET-4202] (#1458)
e5aeb58 style: add alert icon [DET-4430] (#1532)
58ba70f feat: make slot type configurable. [DET-4308] (#1484)
3e23aed chore: remove protonet_omniglot_pytorch nightly test (#1541)
e4a1376 test: avoid clearing auth token in e2e tests [DET-4486] (#1539)
14ebb55 feat: portable webui builds and pr preview [DET-4324] (#1422)
a152663 chore: bump lib pq (#1533)
fa2d592 chore: lower threshold for some convergence tests (#1530)
5566a3b chore: webui make page/tab title more descriptive [DET-2151] (#1486)
4d66329 fix: make pip happy again (#1534)
6958dc6 feat: support filtering in trial logs api [DET-4177] (#1427)
9f7ed8f chore: fix master and agent release targets (#1531)
590d2dd chore: webui migrate killExperiment to new API [DET-4016] (#1521)
6c7fddd feat: allow tensorboard to run startup hooks [DET-4187] (#1463)
1a979cc feat: allow det-deploy aws to specify subnet for simple deployment type (#1515)
5e8a023 chore: bump version: 0.13.7.dev0 -> 0.13.8.dev0
d655dad docs: Release notes for 0.13.7. (#1526)
2bab88a chore: add new fields to trial logs [DET-4176] (#1373)
7fb09b4 docs: add FAQ about TF2 (#1523)
279c5c6 chore: make gen-attributions.py play nicer (#1525)
37fc88f refactor: clean up code for master-sent messages (#1520)
d06dbea feat: make DB service type configurable in Helm chart (#1522)
c44c9f7 chore: fix e2e convergence tests (#1524)
80b47b5 docs: Add TLS disclaimer (#1519)
4d4b1e8 fix: saving & restoring RNG state for Keras & Estimators [DET-3743] (#1492)
711cb69 fix: add missing actor startup for create experiment (#1517)
8d246c6 test: add api pagination refactor and test [DET-4425] (#1504)
09a07f4 chore: sync agent go checksum file (#1516)
4c84784 fix: avoid showing tooltip when hovering outside of the nav bar [DET-4284] (#1461)
5c74a6c fix: incorrect directory path for mmdetection tests (#1514)

Docker images

  • docker pull determinedai/determined-master:0.13.8
  • docker pull determinedai/determined-master:e1295346
  • docker pull determinedai/determined-master:e12953467c9de3d9289c8aca882d0993d89c23dd
  • docker pull determinedai/determined-dev:determined-master-e1295346
  • docker pull determinedai/determined-dev:determined-master-e12953467c9de3d9289c8aca882d0993d89c23dd

0.13.7

30 Oct 16:33

Changelog

18ce58d chore: bump version: 0.13.7rc5 -> 0.13.7
193122d chore: fix master and agent release targets (#1531)
87cf286 chore: sync agent go checksum file (#1516)
678a0ec docs: Release notes for 0.13.7. (#1526)
5b76a94 chore: bump version: 0.13.7rc4 -> 0.13.7rc5
3eb5442 chore: make gen-attributions.py play nicer (#1525)
21a9754 chore: bump version: 0.13.7rc3 -> 0.13.7rc4
411e682 docs: Add TLS disclaimer (#1519)
19a113d fix: add missing actor startup for create experiment (#1517)
b7f7726 chore: bump version: 0.13.7rc2 -> 0.13.7rc3
e9dcbe4 chore: bump version: 0.13.7rc1 -> 0.13.7rc2
64e249f chore: bump version: 0.13.7rc0 -> 0.13.7rc1
64c75f2 chore: bump version: 0.13.7.dev0 -> 0.13.7rc0
fb50977 fix: webui experiment list goes to first page when changing filters [DET-4377] (#1508)
bce4fb6 fix: update postgres and perl version [DET-4395] (#1491)
137b806 fix: fix to allow det-deploy to upgrade existing clusters [DET-4427] (#1511)
f81fbe0 feat: make spot instances available in det-deploy [DET-4339] (#1494)
c03afa4 docs: spot instances docs and fixes [DET-4338] (#1487)
c3307db fix: disable deletion protection on Cloud SQL instances [DET-4428] (#1512)
a5ddcae test: disable cypress wait check [DET-4423] (#1503)
bee5c7a fix branch name cannot contain slash for ci (#1506)
a02f196 feat: add cluster name to master config [DET-3953] (#1474)
345c630 docs: minor updates to k8s docs (#1509)
11fce70 chore: autogenerate attributions files [DET-4433] (#1493)
8ecebed docs: remove outdated k8 limitations (#1475)
9bed2d5 test: add unit tests for agent resource manager [DET-4134] (#1477)
9b7a9a7 feat: support mmdetection library in Determined (#1438)
6e7a77d chore: remove protonet_omniglot_pytorch from nightly (#1510)
07ac185 feat: improved metric selector [DET-4122] (#1421)
e7802db fix: fix api pagination index out of bound error (#1499)
87aa6fd docs: deprecate SHA notice [DET-4336] (#1500)
03e0a92 chore: basic create experiment endpoint [DET-4386] (#1455)
a1fe3dd chore: propagate horovod worker process retcode (#1496)
254766e docs: add remove steps topic guide [DET-3876] (#1485)
d13ec69 chore: fixing Huggingface example and nightly tests (#1497)
5e366ff chore: webui stop polling on experiment terminal state [DET-4305] (#1467)
335e8a4 fix: webui remove checkpoint button for deleted checkpoint [DET-4286] (#1466)
8baef3d chore: make nightly tests parallel (#1490)
74f2c94 chore: split /master endpoint [DET-4408] (#1483)
6f32498 update time for convergence tests (#1489)
2a7ba13 chore: migrate webui get current user to new api [DET-4015] (#1459)
1cccde1 chore: fix nightly test failures [DET-4397] (#1479)
7711ef9 fix: fix handling of boolean hyperparameters in trial view [DET-4412] (#1480)
bd229fb fix: preserve rank_id in logs [DET-4413] (#1468)
0093954 chore: update buf image (#1478)
04feba5 feat: add routing logic for multiple resource pools [DET-4132, DET-4302] (#1398)
64a0f8d feat: add shell, command, and notebook launch APIs [DET-4094] (#1454)
f090caf fix: fix setting default for fit field in master.yaml (#1469)
4c11398 fix: fix a parameter typo in ci (#1473)
c7e7e42 fix: fix missing swagger definitions [DET-4213] (#1437)
22cdc82 ci: show python environment via pip freeze (#1472)
29f6284 feat: webui add labels filter for experiments [DET-4117] (#1465)
f188a1d chore: webui improve float values visualization [DET-4127] (#1432)
4832e4c chore: set enable_cors for test aws cluster [DET-4326] (#1439)
3668ea5 chore: add tests for cheap examples (#1430)
17c64da fix: rst formatting issue (#1460)
2630f74 docs: update CONTRIBUTING.md to point to new locations of examples [D… (#1457)
de2f55d fix: specify numpy version in requirements.txt (#1408)
b5399b2 feat: support AWS spot instances [DET-4191] (#1415)
c97dc52 fix: improve performance of agents endpoint for k8s [DET-4073] (#1450)
4e89b90 fix: set up swagger static deploy (#1449)
a6d5c3c build: add buf breaking change detection (#1442)
3c23308 fix: change eks cluster setup docs to pass fmt check (#1453)
1a0954a chore: react dependency update [DET-4379] (#1447)
267fca5 chore: add common launch params to tensorboard API [DET-4214] (#1436)
0c83af3 chore: add EKS cluster setup documentation [DET-4028] (#1425)
2f2d38e style: add enforcement of reST formatting (#1399)
0a9c33a fix: show tensorboard sources for CLI deployed tensorboards [DET-4372] (#1441)
321b263 fix: style fix for long checkpoint names and minor copy change for task batch modal confirmation [DET-4376] (#1446)
05634df fix: instruct protoc to generate camelCase names. (#1448)
f8440c1 fix: update task names for new trials [DET-4832] (#1451)
7f0b097 chore: bump version: 0.13.6.dev0 -> 0.13.7.dev0
3e0e299 docs: Clear out release note candidates.
b945a27 docs: release notes for 0.13.6. (#1444)
415acf8 fix: update model registry link to REST API docs (#1445)
2cfc166 fix: show more help text and version info in det-deploy [DET-4373] (#1443)
9d9f2c2 chore: remove useless support_determined_native calls (#1440)
3763bbd chore: remove tests for native parallel (#1435)

Docker images

  • docker pull determinedai/determined-master:0.13.7
  • docker pull determinedai/determined-master:18ce58d8
  • docker pull determinedai/determined-master:18ce58d82ddeff7f03bb47ac43e8635aad7a691c
  • docker pull determinedai/determined-dev:determined-master-18ce58d8
  • docker pull determinedai/determined-dev:determined-master-18ce58d82ddeff7f03bb47ac43e8635aad7a691c

0.13.6

14 Oct 23:22

Changelog

35cd77a chore: bump version: 0.13.6rc1 -> 0.13.6
84c3513 chore: bump version: 0.13.6rc0 -> 0.13.6rc1
7b3ade0 docs: Clear out release note candidates.
186e348 fix: show more help text and version info in det-deploy [DET-4373] (#1443)
4b1cdd0 docs: release notes for 0.13.6. (#1444)
88c9cc6 chore: bump version: 0.13.6.dev0 -> 0.13.6rc0
68bdba5 fix: correct humanReadableFloat error on Experiment Detail page [DET-4354] (#1431)
de4ec0a feat: add opentracing to actor system [DET-4212] (#1327)
c9bb9b2 docs: add docs for TLS usage and configuration [DET-4364] (#1419)
f6e36fe feat: support storageClass configuration in Helm Chart [DET-4357] (#1434)
5f34b58 fix: webui metric chart not displaying log scale properly [DET-4246] (#1418)
588cb70 chore: add telemetry for k8s vs. agents [DET-4234] (#1411)
689b06f chore: update horovod version (#1413)
e145d89 make AWS and GCP agent image optional (#1417)
c608fd1 ci: update gke version (#1424)
f0289b5 fix: webui preserve colors on metrics chart [DET-4247] (#1400)
70d3ae1 fix: make webui chart legend transparent to show data behind [DET-4218] (#1420)
a6cb9be docs: update to new version of custom Sphinx theme (#1414)
153131f fix: don't fail master if restoring non-terminal exp from DB [DET-4074] (#1397)
69afa95 chore: webui add experiment label editing on experiment detail page [DET-3972] (#1356)
e23d113 chore: make make -C proto build idempotent (#1390)
bec697f fix: support for --local-state-path with det-deploy gcp [DET-4277] (#1402)
eb07743 fix doc for agent starting period and idle timeout (#1416)
bfcb874 feat: support configuring CPU and Mem reqs for DB in helm chart [DET-4032] (#1412)
2564555 chore: handle failed build in update bumpenvs script (#1395)
92de5cd chore: remove mixed mode TLS workaround from the agent (#1410)
dbce507 fix: always load default system TLS certificates in the harness (#1409)
1dd949f chore: add a formatter for protobufs (#1405)
2bd7e3a fix: webui trial info checkpoint size label update [DET-4250] (#1401)
a867df3 fix: use the correct target cancel state for canceling experiment [DET-4257] (#1404)
ed51d5a docs: bundle static swagger-ui with docs [DET-4210] (#1376)
53d974e chore: fix dependency issue for windows tests (#1406)
6c9e941 chore: add swagger authentication spec [DET-4272] (#1396)
4450b45 chore: include workloads in trial endpoint [DET-4036] (#1342)
8c8037f chore: add post experiment swagger spec (#1363)
46643b2 chore: rebase onto horovod 0.20.0 [DET-4225] (#1388)
9028be2 chore: bump version: 0.13.5.dev0 -> 0.13.6.dev0
3361c3b docs: get rid of staged release notes for 0.13.5, in preparation for the next release.
876833d docs: Release notes for 0.13.5. (#1392)
6701cd7 chore: set default startup in det-deploy to 20m (#1394)
c20deab chore: bump tf test versions (#1378)
5bf7eea chore: introduce resource pool and resource manager [DET-4131,DET-4136] (#1365)
c767f02 ci: update from deprecated remote docker versions [DET-4262] (#1393)
89a5906 fix: update agents context polling to block before next poll [DET-4264] (#1385)
5d2d4bc fix: det-deploy deprovisions GCP agents despite long master names [DET-4271] (#1391)
198d64e fix: experiment chart legend labelling line as 'trace 0' (#1389)
1926250 feat: increase max disconnected and idle period [DET-4267] (#1386)
38b0c95 fix: commands (TensorBoards, notebooks, etc.) should not be preempted [DET-4157] (#1346)
20a7bc7 docs: fix broken links (#1387)
68f3568 docs: add a tf.layers-in-Estimator example (#1383)
4d0ba2f feat: don't log through agent 0 [DET-4180] (#1344)
f1ff54e chore: fix typo in a docstring (#1384)
915fb50 fix: update percent utility to handle out of range numbers (#1381)
c8ee63e chore: fix possible syntax error when parsing experiment labels request [DET-4265] (#1382)
346bcc4 chore: fix typo in helm chart (#1379)
8d64cf9 fix: don't accept stale socket connections [DET-4203] (#1367)
5f4c490 fix: webui tweak select for better layout [DET-4123] (#1351)
693098e fix: webui trial chart render metrics with same name [DET-4169] (#1350)

Docker images

  • docker pull determinedai/determined-master:0.13.6
  • docker pull determinedai/determined-master:35cd77a2
  • docker pull determinedai/determined-master:35cd77a202dfe084a5c9655e8291f14c1a1c14a8
  • docker pull determinedai/determined-dev:determined-master-35cd77a2
  • docker pull determinedai/determined-dev:determined-master-35cd77a202dfe084a5c9655e8291f14c1a1c14a8

0.13.5

01 Oct 21:48

Changelog

91e7015 chore: bump version: 0.13.5rc2 -> 0.13.5
e4c378f docs: Release notes for 0.13.5. (#1392)
6c4316f chore: bump version: 0.13.5rc1 -> 0.13.5rc2
17bbfb3 fix: update agents context polling to block before next poll [DET-4264] (#1385)
60cc59c fix: det-deploy deprovisions GCP agents despite long master names [DET-4271] (#1391)
dda3a61 fix: experiment chart legend labelling line as 'trace 0' (#1389)
8b263ca feat: increase max disconnected and idle period [DET-4267] (#1386)
681e3f5 docs: fix broken links (#1387)
d2963e9 docs: add a tf.layers-in-Estimator example (#1383)
70081fe chore: bump version: 0.13.5rc0 -> 0.13.5rc1
44729d6 fix: update percent utility to handle out of range numbers (#1381)
864a03a chore: fix possible syntax error when parsing experiment labels request [DET-4265] (#1382)
12e5da5 chore: fix typo in helm chart (#1379)
9ab8c69 fix: don't accept stale socket connections [DET-4203] (#1367)
504936c fix: webui tweak select for better layout [DET-4123] (#1351)
5d2268f fix: webui trial chart render metrics with same name [DET-4169] (#1350)
1d90b95 chore: bump version: 0.13.5.dev0 -> 0.13.5rc0
9cb1082 chore: bumpenvs (#1377)
b3dcd3f feat: support new AWS regions [DET-3837] (#1297)
6ea8977 chore: replicate client side hash server side (#1372)
22cb5ee chore: update webui test tools (#1368)
2e8ba3e fix: send credentials for streaming endpoints in dev environment [DET-4240] (#1359)
365359a chore: revert old login [DET-4242] (#1370)
3edc110 feat: allow yogadl to connect to the master over TLS (#1369)
8df0554 chore: unit tests for kubernetes pod actor [DET-4107] (#1353)
4fa71b0 fix: check k8 message length before parsing [DET-4241] (#1360)
8a28e82 docs: specify that helm 3 is required (#1371)
cac219f test: fix broken k8 test (#1375)
06a342d feat: support initContainers and sidecar containers for k8s [DET-4026] (#1335)
df799c6 fix: update test cluster port config for e2e-tests (#1366)
cf9b717 chore: simplify provisioner and scheduler protocol (#1352)
5514328 chore: make Go tests and Python installs quieter (#1364)
cc42045 fix: update model registry to new api (#1361)
68e75c1 chore: Reduce default RDS capacity and enable auto pause. [DET-4220] [DET-4219] (#1341)
71affdd chore: add /api/v1 alias for create experiment endpoint [DET-4162] (#1321)
3834b2a chore: print error for non-release helm install (#1349)
a4269b3 chore: add basic create tensorboard endpoint [DET-4095] (#1331)
a915388 test: update nightly test paths (#1355)
5ace732 chore: fixes for example configuration files (#1358)
2e429f3 fix: fail correctly on preclose checkpoint failures [DET-4189] (#1323)
41b8da0 Add integration tests for det-deploy local (#1326)
bd06f35 chore: add API call to list all defined labels for experiments [DET-4099] (#1337)
015903f feat: restructure examples and add READMEs (#1301)
b09d436 chore: allow dying workers to exit in peace (#1347)
321b856 chore: quiet grpc log levels down (#1343)
35e8100 feat: expose _TrainContext.from_config() (#1336)
4f51c98 feat: introduce logStreamActor & notebook logs endpoint (#1248)
e669ab4 chore: format docs rst files (#1345)
9d3525c chore: remove tp tests (#1334)
6b978d3 chore: remove zmq patches in keras [DET-2708] (#1340)
4320966 remove maxSlotsPerPod hardcoding from values.yaml (#1339)
bf20bf0 docs: ensure canonical URLs are set for generated doc pages (#1338)
9bae553 chore: small refactor of Helm chart (#1330)
84e67eb chore: provide best checkpoint, latest, and best validations for trial endpoints (#1315)
9a550f0 docs: Revise README and docs landing page (#1325)
c522b51 feat: support configuring TLS via Helm chart [DET-4030] (#1310)
c12c401 test: remove nightly tensorpack tests (#1328)
75e0fb0 feat: allow the master to do everything over just one port (#1320)
b6c4ff0 feat: allow dynamic agents to use TLS to connect to the master (#1313)
0a4dcc7 fix: don't unset allReady for trials [DET-4197] (#1322)
7ef772b docs: fix phrasing from Tensorpack removal (#1324)
1c8328d fix: Do not perform worker health checks during termination [DET-4175] (#1318)
8212f58 feat: "det-deploy gcp down" deprovisions dynamic agents [DET-4155] (#1314)
de3ee14 chore: remove support for TensorpackTrial [DET-4181] (#1319)
49e60d0 chore: remove dependency on React Plotly [DET-3724] (#1287)
7f1f13a chore: add helper to register functions as actors (#1302)
36a0bcf refactor: use async polling [DET-4182] (#1317)
ff5e92d refactor: prevent rapid polling calls from making multiple polling timers [DET-4183] (#1316)
97d5c37 chore: reduce scheduler complexity [DET-4082] (#1237)
41fd9b3 chore: move workload out of searcher (#1303)
f527f0c fix: fix typo in Helm chart [DET-3781] (#1312)
d234439 chore: sunset elm [DET-3907] (#1164)
2a39680 chore: bump version: 0.13.4.dev0 -> 0.13.5.dev0 (#1311)
43af93e chore: create storybook stories for Navigation and UserSelectFilter components [DET-4013] [DET-4041] (#1240)
971e0d1 chore: webui react storybook for StateSelectFilter component [DET-4039] (#1252)
b02935e docs: Release notes for 0.13.4. (#1309)
f7ee07a docs: REST API docs fixes (#1308)
9046f1b fix: remove tensorboard source column from task list [DET-4173] (#1306)
8353525 fix: update command logs response expectation [DET-4167] (#1304)
44c68ee feat: lazily launch TensorBoards [DET-4156] (#1293)
03e1183 fix: deselect selected rows when table batch actions are done [DET-4128] (#1299)

Docker images

  • docker pull determinedai/determined-master:0.13.5
  • docker pull determinedai/determined-master:91e70159
  • docker pull determinedai/determined-master:91e70159db41fb52dd677db5713f6edeb12a0430
  • docker pull determinedai/determined-dev:determined-master-91e70159
  • docker pull determinedai/determined-dev:determined-master-91e70159db41fb52dd677db5713f6edeb12a0430