FST transducer support #1087

Open
eharcevs opened this issue Sep 8, 2023 · 9 comments

@eharcevs

eharcevs commented Sep 8, 2023

Describe your feature request

I'm trying to upgrade from regex-automata 0.1.9 and have noticed that newer versions no longer have the transducer feature, which allowed using a regex DFA as an FST Automaton.
Is there a plan to bring this feature back? If not, what would be a workaround for using a regex with fst?
Thanks in advance

@BurntSushi
Member

Yes, I just haven't gotten around to it yet. Note that you can always write the trait impls yourself for a wrapper type, so you can unblock yourself if necessary. It could be some time before I get to it.

I will probably add the impls in a new crate or as an optional feature on fst itself. I haven't decided yet.
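
The wrapper approach mentioned above could look roughly like the following. This is only a minimal sketch: `FstAutomaton` and `ByteDfa` are simplified, hypothetical stand-ins for the real `fst` and `regex-automata` traits (which have more methods and different signatures), and `DfaAdapter`, `AbStar`, and `accepts` are illustrative names that do not come from either crate. The point is just the pattern: a local newtype owns the DFA and forwards each automaton operation to it.

```rust
/// Hypothetical, simplified stand-in for the shape of an FST automaton trait.
trait FstAutomaton {
    type State;
    fn start(&self) -> Self::State;
    fn is_match(&self, state: &Self::State) -> bool;
    fn can_match(&self, state: &Self::State) -> bool;
    fn accept(&self, state: &Self::State, byte: u8) -> Self::State;
}

/// Hypothetical, simplified stand-in for a byte-oriented regex DFA.
trait ByteDfa {
    fn start_state(&self) -> usize;
    fn next_state(&self, current: usize, byte: u8) -> usize;
    fn is_match_state(&self, state: usize) -> bool;
    fn is_dead_state(&self, state: usize) -> bool;
}

/// Local wrapper type: because it is defined in the downstream crate,
/// that crate is free to implement the automaton trait for it.
struct DfaAdapter<D>(D);

impl<D: ByteDfa> FstAutomaton for DfaAdapter<D> {
    type State = usize;

    fn start(&self) -> usize {
        self.0.start_state()
    }
    fn is_match(&self, state: &usize) -> bool {
        self.0.is_match_state(*state)
    }
    fn can_match(&self, state: &usize) -> bool {
        // Once the DFA is dead, no continuation of the key can match.
        !self.0.is_dead_state(*state)
    }
    fn accept(&self, state: &usize, byte: u8) -> usize {
        self.0.next_state(*state, byte)
    }
}

// Toy hand-written DFA for the anchored pattern `ab*`, used for illustration.
const DEAD: usize = 0;
const START: usize = 1;
const MATCH: usize = 2;

struct AbStar;

impl ByteDfa for AbStar {
    fn start_state(&self) -> usize {
        START
    }
    fn next_state(&self, current: usize, byte: u8) -> usize {
        match (current, byte) {
            (START, b'a') => MATCH,
            (MATCH, b'b') => MATCH,
            _ => DEAD,
        }
    }
    fn is_match_state(&self, state: usize) -> bool {
        state == MATCH
    }
    fn is_dead_state(&self, state: usize) -> bool {
        state == DEAD
    }
}

/// Drives an automaton over a whole key, stopping early when it cannot match.
fn accepts<A: FstAutomaton>(aut: &A, key: &[u8]) -> bool {
    let mut state = aut.start();
    for &byte in key {
        if !aut.can_match(&state) {
            return false;
        }
        state = aut.accept(&state, byte);
    }
    aut.is_match(&state)
}
```

With these stand-ins, `accepts(&DfaAdapter(AbStar), b"abb")` walks the key byte by byte through the wrapper. A real adapter would implement the actual `fst` trait for a wrapper around an actual `regex-automata` DFA, and would likely need extra care around end-of-input handling and anchored versus unanchored start states.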

@eharcevs
Author

eharcevs commented Sep 8, 2023

Got it, thanks!

@eharcevs eharcevs closed this as completed Sep 8, 2023
@BurntSushi
Member

We can leave this open to track when it's done. Thanks!

@formbook

please Mr Sushi, deliver us from version 0.1.10

@tisonkun

FYI, I implemented it using the wrapper pattern in GreptimeTeam/greptimedb#3575.

I think the implementation could be contributed back upstream, so that it is maintained and evolves together with upstream. Porting it back might also uncover some issues that should be fixed.

I have some time to make a patch, but I wanted to ask here first whether that's the right way to go.

@BurntSushi
Member

Sorry, but unfortunately I'm not going to have time to review this any time soon. Submitting a patch probably isn't fruitful at this time, because it's probably going to require some guidance from me on how the crates should be structured. For example, I don't think I want to take a dependency on fst inside of regex-automata. I think I want to flip that around so that fst has an optional dependency on regex-automata. And the same for aho-corasick: https://github.com/BurntSushi/aho-corasick/blob/56256dca1bcd2365fd1dc987c1c06195429a2e2c/Cargo.toml#L32-L48

@BurntSushi
Member

Note that for aho-corasick, I actually wrote out the implementation for it with tests, but it's not currently included in the crate: https://github.com/BurntSushi/aho-corasick/blob/56256dca1bcd2365fd1dc987c1c06195429a2e2c/src/transducer.rs

The unanchored/anchored API layer (and perhaps something more sophisticated) will be needed for regex-automata I think. But I haven't given it a ton of thought yet.

@tisonkun

Thanks for your feedback! I don't want to overcommit, so I'll stop here. If I happen to have time to make the change in the fst crate, I'll comment there.

Then it seems we can close this issue as "won't do" in this crate and perhaps create a mirror issue in fst?

@BurntSushi
Member

I think I'm fine leaving this issue open. If you want to open one on fst too, that's fine.
