- abort find_alignment on empty input (#1090)
- Fix truncated words list when the replacement character is decoded (#1089)
- fix github language stats getting dominated by jupyter notebook (#1076)
- Fix alignment between the segments and the list of words (#1087)
- Use tiktoken (#1044)
- kwargs in decode() for convenience (#1061)
- fix all_tokens handling that caused more repetitions and discrepancy in JSON (#1060)
- fix typo in CHANGELOG.md
- Fix the repetition/hallucination issue identified in #1046 (#1052)
- Use triton==2.0.0 (#1053)
- Install triton in x86_64 linux only (#1051)
- update setup.py to specify python >= 3.8 requirement
- remove auxiliary audio extension (#1021)
- apply formatting with `black`, `isort`, and `flake8` (#1038)
- word-level timestamps in `transcribe()` (#869)
- Decoding improvements (#1033)
- Update README.md (#894)
- Fix infinite loop caused by incorrect timestamp tokens prediction (#914)
- drop python 3.7 support (#889)
- handle printing even if sys.stdout.buffer is not available (#887)
- Add TSV formatted output in transcript, using integer start/end time in milliseconds (#228)
- Added `--output_format` option (#333)
- Handle `XDG_CACHE_HOME` properly for `download_root` (#864)
- use stdout for printing transcription progress (#867)
- Fix bug where mm is mistakenly replaced with hmm in e.g. 20mm (#659)
- print '?' if a letter can't be encoded using the system default encoding (#859)

The first versioned release available on PyPI