Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow nested parsers on MessageProducer level (v2) #2078

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

DmitryAstafyev
Copy link
Collaborator

This PR isn't for to be merged. It's a suggestion in the scope of work on #2073

Solution limitations:

  • only parsers without lifetime can be included as nested parsers (for example DLT parser cannot be nested)

The previous version is here #2077

Comment on lines +167 to +169
item.resolve(Some(Err(String::from(
"Fail to extract printable message",
))));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed continue;

@DmitryAstafyev DmitryAstafyev marked this pull request as draft August 27, 2024 11:27
Copy link
Member

@AmmarAbouZor AmmarAbouZor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @DmitryAstafyev for the alternative approach to solve the nested parsers problem. Here are my thoughts about it regarding the plugins system and some general thoughts:

I think this approach wouldn't affect the current implementation of the plugins system because we are only allowing the parser to return strings, so they will fall into the category of the default implementation to the new methods.
But if we decide to let the plugins return types other than strings then they would be involved like the other methods on LogMessage trait, but this wouldn't increase the complexity of the code by much at that point.

Another point to keep in mind that we want to change return type of the parse() method to be an iterator of results which could add some complexity for the solution in the producer loop.

I still think the main point here is how often such a situation could happen, because changing the main logic because of one special case wouldn't make much sense in the case of one -or very few- special cases. On the other hand, having a lot of parsers that have another parser nested in them wouldn't be a good solution if many parsers need a nested parser within them to performe the parser correctly.

From the performance point of view, I don't think these changes will have an impact on the parsers that don't need nested one, because all the code for that case will be removed at the compile time thanks generics in the producer loop.
But for the parsers that need the nesting, I think the direct nesting would be better for the performance because it's saving as moving data around between parser and producer and having to save them in a HashMap. We still need to measure here to know which tradeoff are we facing here

@DmitryAstafyev
Copy link
Collaborator Author

DmitryAstafyev commented Aug 27, 2024

Something, that I didn't expect to be honest :/

Screenshot from 2024-08-27 15-55-06

Test inputs:

  • same file
  • same workflow
  • same OS (linux)
  • measurement starts after MessageProducer has been created (all parsers are created, all stuff are ready)

Test workflow:

  • open chipmunk
  • open file
  • reopen file 6 times (without closing chipmunk, same instance)

@DmitryAstafyev
Copy link
Collaborator Author

The major approach is: that the parser doesn't own any nested parsers, but still can request additional (nested) parsing. At the same time, only the initial parser knows how a complete message should be represented. The Message Producer doesn't know it, the initial parser - knows. That's why the initial parser has a way to request an addition (nested) parsing.

At the same time, this solution gives flexibility and can be easily scaled.

To compare two ways:
Parser owns nested parser(s)

  • addition (nested) parsing happens inside the initial parser. It might cause problems with error handling, because now the work of the initial parser depends not only on itself but also on the nested parser's work.
  • the work of the nested parser is out of the control of MessageProducer. For example, if we would have some configuration for the nested parsing approach (like disable/enable) that logic will go inside the initial parser; in the other hand, if an initial parser doesn't own nested parsers and requires additional parsing - MessageProducer can keep such settings under control.

Parsers doesn't own nested parser(s) and requires parsing

  • MessageProducer keeps control of error handling of nested parsers
  • MessageProducer can be configured to enable/disable nested parsing

In the same time a weak point of suggested solution - an initial parser should keep in memory result of nested parsing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants