Fix tokenizer behavior on multiple contiguous split characters #271

Merged (1 commit) on Jul 17, 2024

@lucaong (Owner) commented on Jul 17, 2024

Multiple contiguous spaces or punctuation characters should be clustered together and treated as a single separator when splitting.
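
To illustrate the behavior described above, here is a minimal sketch, not the actual change in this pull request. The separator patterns and the function names `tokenizeSingle` and `tokenizeClustered` are hypothetical and chosen only for this example. It compares splitting on a single separator character with splitting on a run of separator characters.

```typescript
// Hypothetical separator patterns for illustration only;
// these are NOT the regular expressions used by the library.
const SINGLE_SEPARATOR = /[\s.,;:!?]/;     // matches exactly one space or punctuation character
const CLUSTERED_SEPARATOR = /[\s.,;:!?]+/; // matches a run of one or more such characters

// Splits on every individual separator character, so adjacent
// separators leave empty strings between them.
function tokenizeSingle(text: string): string[] {
  return text.split(SINGLE_SEPARATOR);
}

// Splits on whole runs of separator characters, so adjacent
// separators count as one split point.
function tokenizeClustered(text: string): string[] {
  return text.split(CLUSTERED_SEPARATOR);
}

const sample = "hello,  world"; // a comma followed by two spaces

console.log(tokenizeSingle(sample));    // ["hello", "", "", "world"]
console.log(tokenizeClustered(sample)); // ["hello", "world"]
```

With the single-character pattern, the comma and the two spaces each act as a separate split point, which yields empty tokens. Clustering them avoids that.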

@lucaong merged commit a8e6765 into master on Jul 17, 2024
2 checks passed