Paying attention to trailing whitespace in string values #105

Open
vruusmann opened this issue Mar 9, 2018 · 1 comment
Comments

@vruusmann
Member

According to the PMML specification, string values are permitted to contain trailing whitespace. So, "that was a stupid decision" and "that was a stupid decision " should be considered equal when they are compared using the built-in equal function.
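To make the intended comparison semantics concrete, here is a minimal sketch in plain Java. The class and method names are hypothetical and are not part of JPMML-Evaluator; the sketch also assumes that "whitespace" means whatever `Character.isWhitespace` accepts, which may not match the PMML definition exactly.

```java
final class TrailingWhitespaceEquality {

    private TrailingWhitespaceEquality() {
    }

    // Hypothetical helper: removes trailing whitespace only, leading whitespace is kept.
    static String stripTrailingWhitespace(String value) {
        int end = value.length();
        while (end > 0 && Character.isWhitespace(value.charAt(end - 1))) {
            end--;
        }
        return value.substring(0, end);
    }

    // Hypothetical helper: compares two strings while ignoring trailing whitespace on both sides.
    static boolean equalIgnoringTrailingWhitespace(String left, String right) {
        return stripTrailingWhitespace(left).equals(stripTrailingWhitespace(right));
    }
}
```

With this helper, the two example strings above compare as equal, while strings that differ in leading whitespace still compare as different.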

The problem has two sides to it. First, trailing whitespace may be present in user input values. Second, trailing whitespace may be present inside PMML documents (e.g. <Constant>value </Constant>).

It's easy to deal with user input, as the InputField#prepare(Object) method should simply strip the trailing whitespace from the value. It is much more difficult to do anything about PMML documents, because the JPMML-Evaluator library is not allowed to modify the in-memory org.dmg.pmml.PMML class model object as it pleases.
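The user input side could look roughly like the sketch below. It is not the actual InputField#prepare(Object) implementation; `InputValueNormalizer` and `prepareInput` are hypothetical names, and the sketch assumes that only `String` values need treatment while everything else passes through unchanged.

```java
final class InputValueNormalizer {

    private InputValueNormalizer() {
    }

    // Hypothetical preparation step: strips trailing whitespace from String inputs,
    // returns every other value (including null) as-is.
    static Object prepareInput(Object rawValue) {
        if (rawValue instanceof String) {
            return stripTrailingWhitespace((String) rawValue);
        }
        return rawValue;
    }

    // Removes trailing whitespace only, as judged by Character.isWhitespace (an assumption).
    static String stripTrailingWhitespace(String value) {
        int end = value.length();
        while (end > 0 && Character.isWhitespace(value.charAt(end - 1))) {
            end--;
        }
        return value.substring(0, end);
    }
}
```

Because the value is cleaned once at the point of entry, none of the later string operations need to know about trailing whitespace.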

One workaround would be to implement a visitor (e.g. org.jpmml.model.visitors.StringValueNormalizer), which traverses and fixes the PMML class model object before the JPMML-Evaluator library gets to see it.
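The idea of a one-time normalization pass can be illustrated with a toy tree. The `Element` class below is a stand-in invented for this sketch, not an org.dmg.pmml class, and the code makes no claim about how a real StringValueNormalizer visitor would be written against the JPMML-Model visitor API.

```java
import java.util.ArrayList;
import java.util.List;

final class StringValueNormalizerSketch {

    private StringValueNormalizerSketch() {
    }

    // Toy stand-in for a class model element that carries an optional string value.
    static final class Element {
        String value; // may be null
        final List<Element> children = new ArrayList<>();

        Element(String value) {
            this.value = value;
        }

        Element add(Element child) {
            children.add(child);
            return this;
        }
    }

    // Walks the tree depth-first, strips trailing whitespace in place,
    // and returns the number of values that were changed.
    static int normalize(Element element) {
        int changed = 0;
        if (element.value != null) {
            String stripped = stripTrailingWhitespace(element.value);
            if (!stripped.equals(element.value)) {
                element.value = stripped;
                changed++;
            }
        }
        for (Element child : element.children) {
            changed += normalize(child);
        }
        return changed;
    }

    // Removes trailing whitespace only, as judged by Character.isWhitespace (an assumption).
    static String stripTrailingWhitespace(String value) {
        int end = value.length();
        while (end > 0 && Character.isWhitespace(value.charAt(end - 1))) {
            end--;
        }
        return value.substring(0, end);
    }
}
```

The pass runs once, before evaluation starts, and the caller explicitly opts in to having the model modified.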

It would be a bad idea to wrap every string operation in a "does the string value contain trailing whitespace?" check. That is computationally too costly, given that 99.99% of PMML documents contain proper string values.

@vruusmann
Member Author

vruusmann added a commit to jpmml/jpmml-model that referenced this issue Mar 11, 2018