Better interpretation of requester_accuracy_target #42

Open
reckart opened this issue Oct 17, 2021 · 0 comments
Labels
enhancement New feature or request

Is your feature request related to a problem? Please describe.
The HUMAN Protocol definition describes requester_accuracy_target as:

(float) 0-1 (optional) stop asking when min repeats is met and task accuracy exceeds this target.

However, the workload management in INCEpTION does not currently support a quality-based definition of when a document is completely annotated. Thus, the HUMAN Protocol adapter currently maps requester_accuracy_target to the confidence threshold of a threshold-based merge strategy:

If set to a decimal value between 0.0 and 1.0, the annotators' labels are merged automatically before the result submission. If unset (default), no automatic merging is performed as part of the results submission. If more than one label is assigned to a span, this parameter controls how many annotators must have chosen the majority label over the second-best label for the majority label to be considered for auto-merging. If this parameter is 0, the majority label is always merged unless there is a tie with the second-best label. If the parameter is 1, the majority label is used only if all annotators assigned it unanimously (i.e. there is no second-best label).
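
To make the quoted behaviour concrete, here is a minimal sketch of such a threshold-based merge for a single span. It is not INCEpTION's actual implementation: the function name merge_label and the confidence formula (best votes minus second-best votes, divided by best votes) are assumptions, chosen only because they reproduce the two boundary cases described above (0 merges unless there is a tie, 1 merges only on unanimity).

```python
from collections import Counter
from typing import Optional, Sequence


def merge_label(labels: Sequence[str], threshold: float) -> Optional[str]:
    """Return the majority label for one span, or None if it is not merged.

    `labels` holds one label per annotator who annotated the span.
    The confidence formula below is an assumption, not INCEpTION's code.
    """
    if not labels:
        return None

    ranked = Counter(labels).most_common()
    best_label, best_votes = ranked[0]
    second_votes = ranked[1][1] if len(ranked) > 1 else 0

    # A tie between the best and second-best label is never merged.
    if best_votes == second_votes:
        return None

    # Assumed confidence: 1.0 when unanimous, approaching 0.0 near a tie.
    confidence = (best_votes - second_votes) / best_votes
    return best_label if confidence >= threshold else None
```

For example, merge_label(["PER", "PER", "ORG"], 0.0) yields "PER", while the same labels with a threshold of 1.0 yield None because the choice was not unanimous.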

This means the setting currently does not control when a document is considered done. Instead, it is applied after a document has already been deemed done (because it was annotated by the minimum number of annotators) to determine how the annotators' annotations are merged into an auto-curated representation.

Describe the solution you'd like
A document should be considered done only when the following rules are satisfied:

  • every annotation in the document has been confirmed by at least requester_min_repeats annotators
  • every annotation in the document meets the confidence threshold given by requester_accuracy_target, as per the threshold-based merge strategy
  • exception: once a document has been annotated by requester_max_repeats annotators, it is considered done even if the two conditions above are not met; annotations that do not meet both conditions will then not be part of the auto-curated representation
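
The proposed rules could be sketched roughly as follows. This is only an illustration of the request, not existing INCEpTION or adapter code: the function name is_document_done, the span_labels mapping and the confidence formula are all hypothetical, and "confirmed by at least requester_min_repeats annotators" is interpreted here as the majority label of a span having at least that many votes.

```python
from collections import Counter
from typing import Mapping, Sequence


def is_document_done(
    span_labels: Mapping[str, Sequence[str]],  # span id -> one label per annotator
    annotators_finished: int,                  # annotators who completed the document
    min_repeats: int,                          # requester_min_repeats
    max_repeats: int,                          # requester_max_repeats
    accuracy_target: float,                    # requester_accuracy_target
) -> bool:
    # Hard stop: at max repeats the document is done regardless of quality.
    if annotators_finished >= max_repeats:
        return True

    # Never done before the minimum number of annotators has finished.
    if annotators_finished < min_repeats:
        return False

    for labels in span_labels.values():
        ranked = Counter(labels).most_common()
        if not ranked:
            return False
        best_votes = ranked[0][1]
        second_votes = ranked[1][1] if len(ranked) > 1 else 0

        # Rule 1: the majority label needs at least min_repeats votes.
        if best_votes < min_repeats:
            return False
        # Rule 2: no tie, and the assumed confidence must meet the target.
        if best_votes == second_votes:
            return False
        if (best_votes - second_votes) / best_votes < accuracy_target:
            return False

    return True
```

With this sketch, a document keeps being offered to further annotators until every span clears both checks or requester_max_repeats is reached, whichever comes first.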
@reckart reckart added the enhancement New feature or request label Oct 17, 2021