
ocrd_tool: allow object for path_in_archive of resources #235

@kba commented Dec 13, 2022

While debugging bertsky/ocrd_detectron2#14, I realized that my assumption that every archive contains only a single resource was wrong. The detectron2 models consist of a PyTorch NN and a YAML description. With the current schema, this requires redundancy in the description (one resource entry per file) and downloading the same archive twice.

With this change (and corresponding implementation in core), it would be possible to simplify

- description: DocBank via LayoutLM X101-FPN config
  name: DocBank_X101.yaml
  type: archive
  path_in_archive: X101/X101.yaml
  size: 526
  url: https://layoutlm.blob.core.windows.net/docbank/model_zoo/X101.zip
- description: DocBank via LayoutLM X101-FPN config
  name: DocBank_X101.pth
  type: archive
  path_in_archive: X101/model.pth
  size: 835606605
  url: https://layoutlm.blob.core.windows.net/docbank/model_zoo/X101.zip

to

- description: DocBank via LayoutLM X101-FPN config
  name: DocBank_X101.pth
  type: archive
  path_in_archive:
    DocBank_X101.pth: X101/model.pth
    DocBank_X101.yaml: X101/X101.yaml
  size: 783884362
  url: https://layoutlm.blob.core.windows.net/docbank/model_zoo/X101.zip
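
To make the proposed semantics concrete, here is a minimal sketch of how a resolver could treat both forms of path_in_archive. It is not the implementation in core: the function names `normalize_path_in_archive` and `extract_resources` are hypothetical, and the sketch assumes a ZIP archive whose members are plain files (folders are not handled).

```python
import zipfile
from pathlib import Path
from typing import Dict, List, Union


def normalize_path_in_archive(
    name: str, path_in_archive: Union[str, Dict[str, str]]
) -> Dict[str, str]:
    """Return a mapping {target file name: path inside the archive}.

    A plain string (the existing form) maps the resource's own name to
    that path; an object (the proposed form) is used as given.
    """
    if isinstance(path_in_archive, str):
        return {name: path_in_archive}
    return dict(path_in_archive)


def extract_resources(
    archive: zipfile.ZipFile,
    name: str,
    path_in_archive: Union[str, Dict[str, str]],
    dest_dir: Path,
) -> List[Path]:
    """Copy every requested archive member into dest_dir under its target name."""
    written = []
    mapping = normalize_path_in_archive(name, path_in_archive)
    for target_name, member in mapping.items():
        target = dest_dir / target_name
        # ZipFile.read returns the member's bytes; raises KeyError if absent
        target.write_bytes(archive.read(member))
        written.append(target)
    return written
```

With the object form from the example above, a single download of X101.zip would yield both DocBank_X101.pth and DocBank_X101.yaml.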

Also, this way the progress bar would work again, because the size attribute would always refer to the archive, not the file/folder in the archive.
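
The arithmetic behind the progress bar point can be sketched as follows. This assumes the progress bar divides the bytes downloaded so far by the declared size; how core actually computes it is not shown here, and `download_fraction` is a hypothetical helper.

```python
def download_fraction(bytes_received: int, declared_size: int) -> float:
    """Fraction of the download completed, relative to the declared size."""
    if declared_size <= 0:
        raise ValueError("declared_size must be positive")
    return bytes_received / declared_size


ARCHIVE_SIZE = 783884362  # size of X101.zip as given in the proposed entry
YAML_SIZE = 526           # size of X101/X101.yaml as given in the current entry

halfway = ARCHIVE_SIZE // 2

# Declared size refers to the archive: the fraction stays within [0, 1].
print(download_fraction(halfway, ARCHIVE_SIZE))

# Declared size refers to a file inside the archive: the fraction
# overshoots 1.0 long before the download is finished.
print(download_fraction(halfway, YAML_SIZE) > 1.0)
```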

@kba kba requested a review from bertsky December 13, 2022 15:40