Generate per-workspace CER + WER #35

Open
mikegerber opened this issue Oct 17, 2020 · 4 comments

@mikegerber
Member

mikegerber commented Oct 17, 2020

  1. A per-workspace CER/WER is easy to calculate from the individual per-page CER/WER and the character/word counts.
  2. But how to save a global JSON report in the METS? It would not "manifest a physical page", which OCR-D seems to demand for any file.
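To make point 1 concrete, here is a minimal sketch of the aggregation. It assumes each per-page report provides `cer`, `wer`, `n_characters` and `n_words`, and that each rate was computed relative to that page's own count. The `PageReport` class and `aggregate_reports` function are hypothetical names for illustration, not existing code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PageReport:
    """Hypothetical container for one page's evaluation result."""
    cer: float
    n_characters: int
    wer: float
    n_words: int


def aggregate_reports(reports: Iterable[PageReport]) -> dict:
    """Combine per-page rates into workspace-wide rates.

    Each page's rate is weighted by its character (or word) count,
    which recovers total errors divided by total length, provided
    the per-page rates were normalised by those same counts.
    """
    reports = list(reports)
    total_chars = sum(r.n_characters for r in reports)
    total_words = sum(r.n_words for r in reports)
    char_errors = sum(r.cer * r.n_characters for r in reports)
    word_errors = sum(r.wer * r.n_words for r in reports)
    return {
        # Guard against empty workspaces to avoid division by zero.
        "cer": char_errors / total_chars if total_chars else 0.0,
        "wer": word_errors / total_words if total_words else 0.0,
        "n_characters": total_chars,
        "n_words": total_words,
    }
```

Note that this is not the same as averaging the per-page CER values directly: a short page with a high error rate would otherwise count as much as a long, clean one.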
mikegerber added the enhancement label Oct 17, 2020
mikegerber self-assigned this Oct 17, 2020
@mikegerber
Member Author

mikegerber commented Oct 17, 2020

> But how to save a global JSON report in the METS? It would not "manifest a physical page" which OCR-D seems to demand for any file

@kba @bertsky @cneud Any thoughts on this?

@bertsky
Contributor

bertsky commented Oct 17, 2020

> But how to save a global JSON report in the METS? It would not "manifest a physical page" which OCR-D seems to demand for any file
>
> @kba @bertsky @cneud Any thoughts on this?

Yes, this became possible when we agreed on an "official" way to have global (document-wide) files in the METS. You just add the file without a pageId (i.e. without a reference to the file in the structMap) and use a certain convention for the file ID (with FULLDOWNLOAD, IIRC).

Now that different MIME types are allowed in fileGrps, having an output fileGrp with page-wise and document-global reports should be no problem.
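As a rough illustration of what "without a pageId" means at the METS level, the sketch below adds a file entry to a fileGrp using only Python's standard library and deliberately creates no `mets:fptr` in the structMap. It does not use the OCR-D API. The helper name `add_global_file`, the fileGrp name, the file ID and the path are all made up for the example, and the exact ID convention is not spelled out in this thread.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def add_global_file(mets_root, file_grp, file_id, href,
                    mimetype="application/json"):
    """Add a document-wide file to `file_grp` (hypothetical helper).

    The file is only listed in the fileSec; no mets:fptr is added to
    the structMap, so it is not tied to any physical page.
    """
    file_sec = mets_root.find(f"{{{METS}}}fileSec")
    if file_sec is None:
        file_sec = ET.SubElement(mets_root, f"{{{METS}}}fileSec")

    # Reuse the fileGrp if it already exists, otherwise create it.
    grp = next((g for g in file_sec.findall(f"{{{METS}}}fileGrp")
                if g.get("USE") == file_grp), None)
    if grp is None:
        grp = ET.SubElement(file_sec, f"{{{METS}}}fileGrp",
                            {"USE": file_grp})

    file_el = ET.SubElement(grp, f"{{{METS}}}file",
                            {"ID": file_id, "MIMETYPE": mimetype})
    ET.SubElement(file_el, f"{{{METS}}}FLocat",
                  {"LOCTYPE": "OTHER", "OTHERLOCTYPE": "FILE",
                   f"{{{XLINK}}}href": href})
    return file_el


# Usage with made-up names: one global report next to page-wise ones.
root = ET.Element(f"{{{METS}}}mets")
add_global_file(root, "OCR-D-EVAL", "OCR-D-EVAL_GLOBAL",
                "OCR-D-EVAL/report.json")
```

In a real workspace the OCR-D tooling would normally manage the METS, so this is only meant to show the resulting structure.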

@bertsky
Contributor

bertsky commented Oct 30, 2020

BTW I believe having a measurement of CER standard deviation or variance is also useful. See here for an implementation.
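For illustration only (this is not the linked implementation), the spread of the per-page CER values could be computed with the standard library as below. The function name `cer_spread` is hypothetical, and the sketch treats every page equally rather than weighting by page length.

```python
import statistics
from typing import Sequence


def cer_spread(page_cers: Sequence[float]) -> dict:
    """Mean, variance and standard deviation of per-page CER values.

    Uses the population variants (pvariance/pstdev), i.e. the pages
    at hand are treated as the whole document, not as a sample.
    """
    if not page_cers:
        raise ValueError("need at least one per-page CER value")
    return {
        "mean": statistics.fmean(page_cers),
        "variance": statistics.pvariance(page_cers),
        "stdev": statistics.pstdev(page_cers),
    }
```

A high standard deviation alongside a low overall CER would point to a few bad pages rather than uniformly mediocre output.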

@mikegerber
Member Author

(Closing the issue was an accident, I often hit the wrong buttons)
