Fix jieba bug (#163)
moria97 authored Aug 23, 2024
1 parent 49961e4 commit 312bac4
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/pai_rag/utils/tokenizer.py
@@ -14,7 +14,7 @@
 ## PUT in utils file and add stopword in TRIE structure.
 def jieba_tokenizer(text: str) -> List[str]:
     tokens = []
-    for w in jieba.lcut(text):
+    for w in jieba.cut(text):
         token = w.lower()
         if not stop_trie.match(token):
             tokens.append(token)
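
The commit message does not say what the bug was. For context, `jieba.cut` returns a generator and `jieba.lcut` returns a list, and the loop consumes either one token at a time. The sketch below reproduces the loop in self-contained form so that the filtering logic can be run without the rest of the module. `StopTrie`, `whitespace_segment`, and `tokenize` are hypothetical stand-ins, not the project's code: the real `stop_trie` and its `match` semantics are not shown in this diff, and exact whole-token matching is assumed here.

```python
from typing import Callable, Iterable, List


class StopTrie:
    """Hypothetical stand-in for the module's stop_trie (assumed exact-match)."""

    def __init__(self, words: Iterable[str]) -> None:
        self.root: dict = {}
        for word in words:
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node["$end"] = True  # marks the end of a complete stop word

    def match(self, token: str) -> bool:
        """Return True only if the whole token is a stored stop word."""
        node = self.root
        for ch in token:
            if ch not in node:
                return False
            node = node[ch]
        return "$end" in node


def whitespace_segment(text: str) -> Iterable[str]:
    """Stand-in segmenter (assumption): yields tokens lazily, as jieba.cut does."""
    for piece in text.split():
        yield piece


def tokenize(
    text: str,
    segment: Callable[[str], Iterable[str]],
    stop_trie: StopTrie,
) -> List[str]:
    """Same loop as jieba_tokenizer, with the segmenter passed in."""
    tokens = []
    for w in segment(text):
        token = w.lower()
        if not stop_trie.match(token):
            tokens.append(token)
    return tokens


stop_trie = StopTrie(["the", "of"])
print(tokenize("The Art of Search", whitespace_segment, stop_trie))
# → ['art', 'search']
```

Because the loop only iterates over the segmenter's result, it accepts a generator or a list equally; passing `str.split` (which returns a list) in place of `whitespace_segment` yields the same tokens.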
