It seems that the eds.normalizer pipe has no effect.
I am using edsnlp version 0.13.1.
How to reproduce the bug
import edsnlp

config = dict(
    lowercase=True,
    accents=True,
    quotes=False,
    spaces=False,
    pollution=True,
)
nlp = edsnlp.blank("eds")
nlp.add_pipe("eds.normalizer", config=config)
text = "Pneumopathie à NBNbWbWbNbWbNBNbNbWbW `coronavirus'"
doc = nlp(text)
print(doc.text)
The printed text is unchanged.
Hi @OlivierHassanaly!
doc.text always contains the original text of the document. To use the results of the eds.normalizer pipeline, you should use edsnlp.utils.doc_to_text.get_text, as shown here: http://aphp.github.io/edsnlp/latest/pipes/core/normalizer/#usage
import edsnlp
from edsnlp.utils.doc_to_text import get_text

config = dict(
    lowercase=True,
    accents=True,
    quotes=False,
    spaces=False,
    pollution=True,
)
nlp = edsnlp.blank("eds")
nlp.add_pipe("eds.normalizer", config=config)
text = "Pneumopathie à NBNbWbWbNbWbNBNbNbWbW `coronavirus'"
doc = nlp(text)
print(get_text(doc, attr='TEXT', ignore_excluded=True))
# Out: Pneumopathie à `coronavirus'
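To make the idea concrete, here is a toy sketch in plain Python of why the original text stays untouched while a separate accessor returns the cleaned or normalized version. It is not how EDS-NLP is implemented: `Token`, `normalize`, and `render` are hypothetical names made up for this illustration, the pollution set is hard-coded, and tokens are simply joined with spaces.

```python
import unicodedata
from dataclasses import dataclass


@dataclass
class Token:
    text: str               # original surface form, never modified
    norm: str               # normalized form, stored next to the original
    excluded: bool = False  # flagged as pollution, skipped only on request


def strip_accents(s: str) -> str:
    # Decompose characters, then drop the combining accent marks.
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def normalize(words, pollution):
    # Annotate each word instead of rewriting it (hypothetical helper).
    return [
        Token(text=w, norm=strip_accents(w.lower()), excluded=w in pollution)
        for w in words
    ]


def render(tokens, attr="text", ignore_excluded=False):
    # Rebuild a string from the chosen attribute, optionally skipping
    # the tokens that were flagged as excluded.
    kept = [t for t in tokens if not (ignore_excluded and t.excluded)]
    return " ".join(getattr(t, attr) for t in kept)


words = ["Pneumopathie", "à", "NBNbWbWbNbWbNBNbNbWbW", "coronavirus"]
tokens = normalize(words, pollution={"NBNbWbWbNbWbNBNbNbWbW"})

original = render(tokens)  # same as the input, like doc.text
cleaned = render(tokens, attr="text", ignore_excluded=True)
normalized = render(tokens, attr="norm", ignore_excluded=True)

print(original)
print(cleaned)
print(normalized)
```

In this sketch the normalization step only adds annotations, so the caller has to ask explicitly for the attribute they want and whether excluded tokens should be skipped.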