Replies: 2 comments
-
Hi @noau, thanks for reaching out! If I understand correctly, you would like to choose a top semantic level (such as sentences) and return one unit at a time at that level, splitting at a lower level only when a unit is too big? That makes sense. You might get a little closer by using the range syntax with a lower desired size and a higher max, but I don't think it will do exactly what you are looking for. It would be less greedy, but it may still be hard to find the right desired size. You can give it a try, and I will open an issue for this as a feature request. I think it could be a nice feature.
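To make the behaviour described above concrete, here is a minimal sketch in plain Python. It is not this package's API: `split_sentences_with_fallback` and the regex-based sentence detection are hypothetical stand-ins, used only to illustrate "one chunk per sentence, split lower only when a sentence is too big".

```python
import re

# Hypothetical helper, not part of any library. A sentence boundary is assumed
# to be whitespace that follows '.', '!' or '?', which is a simplification.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences_with_fallback(text: str, max_chars: int) -> list[str]:
    """Return one chunk per sentence; split on words only if a sentence is too long."""
    chunks = []
    for sentence in SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        if len(sentence) <= max_chars:
            # The sentence fits, so it is returned on its own (no greedy merging).
            chunks.append(sentence)
            continue
        # Fallback: pack words greedily, but only inside this one sentence.
        # A single word longer than max_chars is still emitted as its own chunk.
        current = ""
        for word in sentence.split():
            candidate = f"{current} {word}" if current else word
            if current and len(candidate) > max_chars:
                chunks.append(current)
                current = word
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks
```

With a generous `max_chars`, each sentence comes back as its own chunk regardless of how much room is left; the limit only matters for sentences that exceed it.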
-
Hello. To generalize this request, it would help to have a feature that first splits a text into paragraphs and then splits each paragraph into sentences. The user can of course manage the multi-level aspect themselves. The default length limits, if any, can be generous, perhaps 10K for a paragraph and 1K for a sentence. Hopefully this is not out of scope for this package, since it already contains code to split semantically.
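As a rough illustration of the two-level idea, the sketch below splits on blank lines for paragraphs and on sentence-ending punctuation for sentences. It uses only the standard library; the function name and both regexes are assumptions made for this example, not anything this package provides, and the length limits are left out for brevity.

```python
import re

# Assumptions for illustration: paragraphs are separated by blank lines, and a
# sentence ends at '.', '!' or '?' followed by whitespace.
PARAGRAPH_BREAK = re.compile(r"\n\s*\n")
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_paragraphs_then_sentences(text: str) -> list[list[str]]:
    """Return a list of paragraphs, each given as a list of its sentences."""
    result = []
    for paragraph in PARAGRAPH_BREAK.split(text.strip()):
        sentences = [s for s in SENTENCE_END.split(paragraph.strip()) if s]
        if sentences:
            result.append(sentences)
    return result
```

Keeping the result nested lets the caller decide how to handle the multi-level structure, for example by flattening it or by keeping paragraph boundaries as metadata.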
-
Thanks for your great work! I would like to know whether it's possible to split strings purely at a given semantic level, instead of splitting greedily and stopping only when a chunk exceeds a given size limit. For example, the two sentences above would be split into just
at the sentence level, ignoring the size limits.