Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

substring #327

Open
NicoLaval opened this issue May 20, 2024 · 4 comments
Open

substring #327

NicoLaval opened this issue May 20, 2024 · 4 comments
Labels
bug Something isn't working Needs more information

Comments

@NicoLaval
Copy link
Collaborator

NicoLaval commented May 20, 2024

@noahboerger you reported substr is not supported.

I thought it was, for scalars, components and datasets.

Could you report a failed example please?

@NicoLaval NicoLaval added question Further information is requested bug Something isn't working labels May 20, 2024
@NicoLaval
Copy link
Collaborator Author

Substring syntax (In Trevas param2 is the start_index, and param3 the length. The user test seems to assume, that param2 is the start_index and param3 the end_index (inclusive))

@noahboerger could you confirm substr doesn't cause exception throw but a wrong result because of the previous remark?

@NicoLaval NicoLaval added Needs more information and removed question Further information is requested labels May 20, 2024
@noahboerger
Copy link
Collaborator

The first note on substr was just that it is currently not supported in the shape DS_r := substr( DS_1, 1, 2).

The different syntax also seems to be the correct one according to the reference manual (p. 76 f.).
So i would propose to close the issue as it is only an issue for our used testcases.

@NicoLaval
Copy link
Collaborator Author

This syntax has to work, and is implemented in Trevas I think.
The condition on the handled dataset is: dataset with only string measures.

@noahboerger, can you provide a failed example with the stack trace please?

@noahboerger
Copy link
Collaborator

@NicoLaval you are right, with a dataset with only string measures it is working also on my side. The dataset i have tested it with had the wrong format

Id_1 Id_2 num_1 Text_1
IDENTIFIER IDENTIFIER MEASURE MEASURE
STRING STRING DOUBLE STRING

The concrete test case i have executed is the one from BdI "11. conditional/if-then-else_2", which is trying to execute it with a dataset that is also containing a measure with an additional double type. This testcase has failed with the following error:

Occured error

Exception

fr.insee.vtl.engine.exceptions.FunctionNotFoundException: function 'substr(Double, Long, Long)' not found
  at fr.insee.vtl.engine.visitors.expression.functions.GenericFunctionsVisitor.invokeFunction(GenericFunctionsVisitor.java:146)
  at fr.insee.vtl.engine.visitors.expression.functions.StringFunctionsVisitor.visitSubstrAtom(StringFunctionsVisitor.java:183)
  at fr.insee.vtl.engine.visitors.expression.functions.StringFunctionsVisitor.visitSubstrAtom(StringFunctionsVisitor.java:20)
  at fr.insee.vtl.parser.VtlParser$SubstrAtomContext.accept(VtlParser.java:3074)
  at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:18)
  at fr.insee.vtl.engine.visitors.expression.ExpressionVisitor.visitStringFunctions(ExpressionVisitor.java:254)
  at fr.insee.vtl.engine.visitors.expression.ExpressionVisitor.visitStringFunctions(ExpressionVisitor.java:41)
  at fr.insee.vtl.parser.VtlParser$StringFunctionsContext.accept(VtlParser.java:1098)
  at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:46)
  at fr.insee.vtl.parser.VtlBaseVisitor.visitFunctionsExpression(VtlBaseVisitor.java:90)
  at fr.insee.vtl.engine.visitors.expression.ExpressionVisitor.visitFunctionsExpression(ExpressionVisitor.java:491)
  at fr.insee.vtl.engine.visitors.expression.ExpressionVisitor.visitFunctionsExpression(ExpressionVisitor.java:41)
  at fr.insee.vtl.parser.VtlParser$FunctionsExpressionContext.accept(VtlParser.java:629)
  at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:18)
  at fr.insee.vtl.engine.visitors.AssignmentVisitor.visitAssignment(AssignmentVisitor.java:51)
  at fr.insee.vtl.engine.visitors.AssignmentVisitor.visitTemporaryAssignment(AssignmentVisitor.java:59)
  at fr.insee.vtl.parser.VtlParser$TemporaryAssignmentContext.accept(VtlParser.java:372)
  at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:18)
  at fr.insee.vtl.engine.VtlScriptEngine.evalStream(VtlScriptEngine.java:263)
  at fr.insee.vtl.engine.VtlScriptEngine.eval(VtlScriptEngine.java:282)
  at java.scripting/javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:262)
  at fr.insee.trevas.jupyter.VtlKernel.eval(VtlKernel.java:305)
  at io.github.spencerpark.jupyter.kernel.BaseKernel.handleExecuteRequest(BaseKernel.java:334)
  at io.github.spencerpark.jupyter.channels.ShellChannel.lambda$bind$0(ShellChannel.java:64)
  at io.github.spencerpark.jupyter.channels.Loop.lambda$new$0(Loop.java:21)
  at io.github.spencerpark.jupyter.channels.Loop.run(Loop.java:78)

The test from BdI assumes that the additional non string measure is kept without being manipulated. But I think your implementation is right, as the table of VTL-ML-Operators in the reference manual (p. 17), requires the input to be in the following form:

 op ::
dataset { measure<string> _+ }
| component<string>
| string
pattern1, pattern2 ::
component<string>
| string

so i think it the correct behaviour to fail here.

I did not dive so deep into this test case when i executed it the first time, but then it is a wrong test case. I am not sure if the trevas error message could be a bit more clear, as the engine still tries to execute the replace-operator even with a mismatching input dataset.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Needs more information
Projects
None yet
Development

No branches or pull requests

2 participants