Sudachi version 0.7.0

github-actions released this 16 Aug 03:00

Highlights

  • The Tokenizer.tokenize API now returns MorphemeList instead of List<Morpheme>. This change is ABI-incompatible with previous versions, so applications that use Sudachi must be recompiled. It should be source-compatible: code that uses Sudachi should need no changes.
  • New API: MorphemeList.split re-splits a C-mode token sequence into a lower-level (finer) split mode without re-analyzing the whole string.
  • Added a relaxed boundary matching mode for the Regex OOV handler.
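
The return-type change in the first highlight can be illustrated with a minimal sketch. It does not use Sudachi itself: Token, TokenList and the whitespace-based tokenize below are hypothetical stand-ins for Morpheme, MorphemeList and Tokenizer.tokenize, and the sketch assumes only that the new return type is a subtype of List. Caller code written against the old List-returning signature still compiles, which is the source-compatibility part. On the JVM, however, a compiled call site records the method's return type, so binaries built against the old signature would fail to link (typically with NoSuchMethodError) until they are recompiled.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

public class CompatSketch {
    // Hypothetical stand-in for a morpheme; NOT Sudachi's Morpheme interface.
    static final class Token {
        final String surface;
        Token(String surface) { this.surface = surface; }
    }

    // Hypothetical stand-in for MorphemeList: a dedicated type that is still a List.
    static final class TokenList extends AbstractList<Token> {
        private final List<Token> items;
        TokenList(List<Token> items) { this.items = new ArrayList<>(items); }
        @Override public Token get(int index) { return items.get(index); }
        @Override public int size() { return items.size(); }
    }

    // "New-style" signature: returns the specific list type instead of List<Token>.
    // Splitting on single spaces is a toy replacement for real analysis.
    static TokenList tokenize(String text) {
        List<Token> out = new ArrayList<>();
        for (String part : text.split(" ")) {
            out.add(new Token(part));
        }
        return new TokenList(out);
    }

    // Caller written as if tokenize still returned List<Token>.
    // It compiles unchanged because TokenList is a List<Token>.
    static List<String> surfaces(String text) {
        List<Token> tokens = tokenize(text);
        List<String> result = new ArrayList<>();
        for (Token t : tokens) {
            result.add(t.surface);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(surfaces("a b c"));
    }
}
```

Callers that want methods specific to the new type would declare the variable as the specific type (TokenList here) rather than List<Token>.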