fix: add tests for whitespace token issue in JiebaTokenizer
lyirs committed Apr 28, 2024
1 parent df21c83 commit 2805e5d
Showing 1 changed file with 5 additions and 0 deletions.
tests/nlu/tokenizers/test_jieba_tokenizer.py: 5 additions & 0 deletions
@@ -37,6 +37,11 @@ def create_jieba(config: Optional[Dict] = None) -> JiebaTokenizer:
             ["Micheal", "你好", "吗", "?"],
             [(0, 7), (7, 9), (9, 10), (10, 11)],
         ),
+        (
+            "安装 rasa 应用",
+            ["安装", "rasa", "应用"],
+            [(0, 2), (3, 7), (8, 10)],
+        ),
     ],
 )
 def test_jieba(text, expected_tokens, expected_indices):
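
For orientation, the parametrize entries above feed test_jieba, whose body lies outside this hunk. Below is a minimal sketch of what that test likely does with the new whitespace case; it assumes rasa's Message class, TEXT constant, and the Tokenizer.tokenize(message, attribute) API, and reuses the file's existing create_jieba helper named in the hunk header.

# Sketch only; not part of this commit. Assumes rasa's Message/TEXT and the
# Tokenizer.tokenize(message, attribute) API; create_jieba is the helper
# already defined earlier in this test module (see the hunk header above).
from rasa.shared.nlu.constants import TEXT
from rasa.shared.nlu.training_data.message import Message


def test_jieba(text, expected_tokens, expected_indices):
    tokenizer = create_jieba()

    tokens = tokenizer.tokenize(Message(data={TEXT: text}), attribute=TEXT)

    # The spaces in "安装 rasa 应用" must not surface as tokens: only
    # ["安装", "rasa", "应用"] should remain, with offsets (0, 2), (3, 7),
    # (8, 10) that skip over the whitespace.
    assert [t.text for t in tokens] == expected_tokens
    assert [t.start for t in tokens] == [i[0] for i in expected_indices]
    assert [t.end for t in tokens] == [i[1] for i in expected_indices]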
