Merge pull request #109 from breezedeus/dev
bugfixes
breezedeus authored May 20, 2024
2 parents bdfe05d + 021eaab commit dc64fbc
Showing 4 changed files with 36 additions and 2 deletions.
12 changes: 12 additions & 0 deletions docs/RELEASE.md
@@ -1,5 +1,17 @@
# Release Notes

## Update 2024.05.20:**V1.1.0.4** Released

Major changes:

* set `table_as_image` as `True` if `self.table_ocr` is not available.
* fix typo: https://github.com/breezedeus/Pix2Text/pull/108 . Thanks to [@billvsme](https://github.com/billvsme).

Major changes (Chinese section, translated):

* If `self.table_ocr` is not available, set `table_as_image` to `True`.
* Fix typo: https://github.com/breezedeus/Pix2Text/pull/108 . Thanks to [@billvsme](https://github.com/billvsme).

## Update 2024.05.19:**V1.1.0.3** Released

Major changes:
2 changes: 1 addition & 1 deletion pix2text/__version__.py
@@ -2,4 +2,4 @@
# [Pix2Text](https://github.com/breezedeus/pix2text): an Open-Source Alternative to Mathpix.
# Copyright (C) 2022-2024, [Breezedeus](https://www.breezedeus.com).

-__version__ = '1.1.0.3'
+__version__ = '1.1.0.4'
20 changes: 20 additions & 0 deletions pix2text/page_elements.py
@@ -195,6 +195,16 @@ def to_markdown(
root_url: Optional[str] = None,
markdown_fn: Optional[str] = 'output.md',
) -> str:
"""
Convert the Page to markdown.
Args:
out_dir (Union[str, Path]): The output directory.
root_url (Optional[str]): The root url for the saved images in the markdown files.
markdown_fn (Optional[str]): The markdown file name. Default is 'output.md'.
Returns: The markdown string.
"""
out_dir = Path(out_dir)
out_dir.mkdir(exist_ok=True, parents=True)
self.elements.sort()
@@ -293,6 +303,16 @@ def to_markdown(
root_url: Optional[str] = None,
markdown_fn: Optional[str] = 'output.md',
) -> str:
"""
Convert the Document to markdown.
Args:
out_dir (Union[str, Path]): The output directory.
root_url (Optional[str]): The root url for the saved images in the markdown files.
markdown_fn (Optional[str]): The markdown file name. Default is 'output.md'.
Returns: The markdown string.
"""
out_dir = Path(out_dir)
out_dir.mkdir(exist_ok=True, parents=True)
self.pages.sort(key=lambda page: page.number)
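The context lines of this hunk show `to_markdown` creating the output directory and ordering pages by number before rendering. A minimal sketch of that pattern follows. `SketchPage` and `to_markdown_sketch` are hypothetical stand-ins, not Pix2Text classes. Joining page text with blank lines and writing the file only when `markdown_fn` is set are assumptions, and `root_url` is left out because its handling is not visible in the diff.

```python
import tempfile
from pathlib import Path
from typing import List, Optional, Union


class SketchPage:
    """Hypothetical stand-in for a page; not the Pix2Text class."""

    def __init__(self, number: int, text: str):
        self.number = number
        self.text = text


def to_markdown_sketch(
    pages: List[SketchPage],
    out_dir: Union[str, Path],
    markdown_fn: Optional[str] = 'output.md',
) -> str:
    # Same first steps as the diff: normalise the path and create it,
    # including missing parents, without failing if it already exists.
    out_dir = Path(out_dir)
    out_dir.mkdir(exist_ok=True, parents=True)
    # Same ordering as the diff: sort pages in place by page number.
    pages.sort(key=lambda page: page.number)
    # Assumption: pages are joined with a blank line between them.
    md = '\n\n'.join(page.text for page in pages)
    # Assumption: the file is written only when a file name is given.
    if markdown_fn:
        (out_dir / markdown_fn).write_text(md, encoding='utf-8')
    return md


# Usage: pages passed out of order, nested directory that does not exist yet.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'nested' / 'out'
    pages = [SketchPage(2, 'second'), SketchPage(1, 'first')]
    result = to_markdown_sketch(pages, target)
    saved = (target / 'output.md').read_text(encoding='utf-8')
```

Because `mkdir` is called with `parents=True` and `exist_ok=True`, a caller can pass a nested path without preparing it first, and repeated calls on the same directory do not raise.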
4 changes: 3 additions & 1 deletion pix2text/pix_to_text.py
@@ -99,7 +99,7 @@ def from_config(
# layout_parser = LayoutParser.from_config(layout_config, device=device)
layout_parser = DocXLayoutParser.from_config(layout_config, device=device)
text_formula_ocr = TextFormulaOCR.from_config(
-text_formula_config, enable_formula=enable_formula, device=device
+text_formula_config, enable_formula=enable_formula, device=device, **kwargs
)
if enable_table:
table_ocr = TableOCR.from_config(
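The one-line change in this hunk forwards `**kwargs` to `TextFormulaOCR.from_config`. Before it, extra keyword arguments given to the outer `from_config` never reached the text and formula recogniser. The sketch below illustrates the difference with a hypothetical stand-in class. `SketchTextFormulaOCR`, both builder functions, and the `model_dir` option are illustrative names only, not the Pix2Text API.

```python
from typing import Any, Dict, Optional


class SketchTextFormulaOCR:
    """Hypothetical stand-in; not the Pix2Text class."""

    def __init__(self, config: Dict[str, Any], enable_formula: bool,
                 device: str, extra: Dict[str, Any]):
        self.config = config
        self.enable_formula = enable_formula
        self.device = device
        self.extra = extra  # whatever arrived through **kwargs

    @classmethod
    def from_config(cls, config: Optional[Dict[str, Any]] = None,
                    enable_formula: bool = True, device: str = 'cpu',
                    **kwargs: Any) -> 'SketchTextFormulaOCR':
        return cls(config or {}, enable_formula, device, dict(kwargs))


def build_without_forwarding(config, device='cpu', **kwargs):
    # Old behaviour: kwargs are accepted here but silently dropped.
    return SketchTextFormulaOCR.from_config(
        config, enable_formula=True, device=device
    )


def build_with_forwarding(config, device='cpu', **kwargs):
    # New behaviour: kwargs are passed through to the inner factory.
    return SketchTextFormulaOCR.from_config(
        config, enable_formula=True, device=device, **kwargs
    )


# 'model_dir' is a made-up option name used only for illustration.
old = build_without_forwarding({}, model_dir='/tmp/models')
new = build_with_forwarding({}, model_dir='/tmp/models')
```

In this sketch `old.extra` is empty while `new.extra` carries the option, which is the behavioural difference the added `**kwargs` introduces.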
@@ -252,6 +252,8 @@ def recognize_page(
layout_kwargs = deepcopy(kwargs)
layout_kwargs['resized_shape'] = resized_shape
layout_kwargs['table_as_image'] = kwargs.get('table_as_image', False)
+if self.table_ocr is None:
+layout_kwargs['table_as_image'] = True
layout_out, column_meta = self.layout_parser.parse(
img0.copy(), **layout_kwargs,
)
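The two added lines in this hunk force `table_as_image` to `True` whenever no table recogniser is loaded, overriding the caller's value. The logic can be isolated as a small function. `build_layout_kwargs` is a hypothetical helper written for illustration, not a function in Pix2Text, and the `resized_shape` value used below is arbitrary.

```python
from copy import deepcopy
from typing import Any, Dict, Optional


def build_layout_kwargs(
    table_ocr: Optional[object], resized_shape: int, **kwargs: Any
) -> Dict[str, Any]:
    # Mirrors the hunk: copy caller kwargs, then set the layout options.
    layout_kwargs = deepcopy(kwargs)
    layout_kwargs['resized_shape'] = resized_shape
    layout_kwargs['table_as_image'] = kwargs.get('table_as_image', False)
    # New in this commit: without a table recogniser, tables cannot be
    # parsed, so they are always treated as images.
    if table_ocr is None:
        layout_kwargs['table_as_image'] = True
    return layout_kwargs


no_ocr_default = build_layout_kwargs(None, 768)
no_ocr_override = build_layout_kwargs(None, 768, table_as_image=False)
with_ocr_default = build_layout_kwargs(object(), 768)
with_ocr_opt_in = build_layout_kwargs(object(), 768, table_as_image=True)
```

Note that an explicit `table_as_image=False` from the caller is overridden when `table_ocr` is `None`. When a table recogniser is present, the caller's choice is respected and the default stays `False`.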