detect document language across all partitioners #1627
… Unstructured data
35e82b9 to 01e29df
Just realized the auto partitioner was not yet updated. We need to pass the languages parameter through to each partitioner.
Also, please bump the dev version in …
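A minimal sketch of the pass-through requested above, assuming a dispatcher that picks a partitioner by file extension. The stub functions, the PARTITIONERS table, and partition_auto_sketch are hypothetical stand-ins for illustration, not the library's actual signatures.

```python
import os
from typing import Callable, Dict, List, Optional


# Hypothetical stand-ins for the per-format partitioners. Each one accepts
# the same `languages` kwarg so the dispatcher can forward it unchanged.
def _partition_text_stub(content: str, languages: Optional[List[str]] = None) -> dict:
    return {"partitioner": "text", "languages": languages}


def _partition_html_stub(content: str, languages: Optional[List[str]] = None) -> dict:
    return {"partitioner": "html", "languages": languages}


# Assumed mapping from file extension to partitioner (illustrative only).
PARTITIONERS: Dict[str, Callable[..., dict]] = {
    ".txt": _partition_text_stub,
    ".html": _partition_html_stub,
}


def partition_auto_sketch(
    filename: str, content: str, languages: Optional[List[str]] = None
) -> dict:
    """Pick a partitioner by extension and forward `languages` to it."""
    suffix = os.path.splitext(filename)[1].lower()
    if suffix not in PARTITIONERS:
        raise ValueError(f"no partitioner registered for {suffix!r}")
    # The point of the review comment: the kwarg must not be dropped here.
    return PARTITIONERS[suffix](content, languages=languages)
```

If the dispatcher dropped the kwarg, every partitioner would fall back to its own default, so a user-supplied language would be ignored without any error.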
changes lgtm! update final changelog version / fix all conflicts and tests before merging
…res update (#1704) This pull request includes updated ingest test fixtures. Please review and merge if appropriate. Co-authored-by: Coniferish <[email protected]>
Summary
Closes #1534 and #1535
Detects document language using the langdetect package.
Creates new kwargs for the user to set the document language (languages) or to detect the language at the element level instead of the default document level (detect_language_per_element).
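The behavior described in the summary could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: tag_languages and its detector argument are hypothetical, and only langdetect.detect (which takes a string and returns a code such as "en") is a real call.

```python
from typing import Callable, List, Optional, Sequence


def _default_detector() -> Callable[[str], str]:
    """Return langdetect.detect, imported lazily so the sketch loads without it."""
    try:
        from langdetect import detect  # real function: detect(text) -> language code
    except ImportError as exc:
        raise RuntimeError("install langdetect or pass detector=...") from exc
    return detect


def tag_languages(
    elements: Sequence[str],
    languages: Optional[List[str]] = None,
    detect_language_per_element: bool = False,
    detector: Optional[Callable[[str], str]] = None,
) -> List[List[str]]:
    """Return one list of language codes per element (hypothetical helper)."""
    # A user-supplied language list wins; no detection is run.
    if languages:
        return [list(languages) for _ in elements]

    detect = detector or _default_detector()

    if detect_language_per_element:
        # Element level: each element's text is detected on its own.
        return [[detect(text)] for text in elements]

    # Default, document level: detect once on the joined text and
    # apply that single result to every element.
    doc_language = detect("\n".join(elements))
    return [[doc_language] for _ in elements]
```

Per-element detection runs on much shorter strings, so it is likely to be noisier than a single document-level pass. Note that langdetect.detect raises an exception for text with no detectable features (for example an empty string), which a real implementation would need to handle.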