Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Str.dropPrefix and Str.dropSuffix builtins #7007

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

lukewilliamboswell
Copy link
Collaborator

Implements the builtins from #664

Note I changed the api from what was described in the issue. This was easier to implement and test, and I think will be nicer to use.

  • dropPrefix : Str, Str -> Result Str {} to dropPrefix : Str, Str -> Str
  • dropSuffix : Str, Str -> Result Str {} to dropSuffix : Str, Str -> Str

Also, I am not 100% around the use of substringUnsafe here. I think it's ok because we are starting with a valid utf8 string, and using a valid utf8 string to drop so I don't think there is anyway to drop and now have invalid utf8. Is there a way to test this? maybe we should do some fuzzing here?

@lukewilliamboswell lukewilliamboswell mentioned this pull request Aug 17, 2024
50 tasks
@lukewilliamboswell
Copy link
Collaborator Author

dropPrefix = \haystack, prefix ->
if Str.startsWith haystack prefix then
start = Str.countUtf8Bytes prefix
len = Num.subSaturated (Str.countUtf8Bytes haystack) start

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this subtraction guaranteed to never exercise the saturating behavior since start = Str.countUtf8Bytes prefix <= Str.countUtf8Bytes haystack since Str.startsWith haystack prefix? It would probably be more performant to use wrapping subtraction.

@@ -752,6 +754,7 @@ countUtf8Bytes : Str -> U64

## string slice that does not do bounds checking or utf-8 verification
substringUnsafe : Str, U64, U64 -> Str
# = RocStr, start, len -> RocStr

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray comment

Suggested change
# = RocStr, start, len -> RocStr

@timotree3
Copy link

Also, I am not 100% around the use of substringUnsafe here. I think it's ok because we are starting with a valid utf8 string, and using a valid utf8 string to drop so I don't think there is anyway to drop and now have invalid utf8. Is there a way to test this? maybe we should do some fuzzing here?

I agree with your reasoning as long as Str.startsWith haystack prefix == Bool.true guarantees that haystack starts with the same bytes as prefix. If it was weakened to something like "haystack starts with characters that are the same as prefix after some kind of normalization" then this implementation would become buggy.

Copy link

Thank you for your contribution! Sometimes PRs end up staying open for a long time without activity, which can make the list of open PRs get long and time-consuming to review. To keep things manageable for reviewers, this bot automatically closes PRs that haven’t had activity in 60 days. This PR hasn’t had activity in 30 days, so it will be automatically closed if there is no more activity in the next 30 days. Keep in mind that PRs marked Closed are not deleted, so no matter what, the PR will still be right here in the repo. You can always access it and reopen it anytime you like!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants