Filters
Adds the ability to filter tokens. Use the Filter() method on Segmenter and Scanner.
Any func([]byte) bool can serve as a filter, with arbitrary logic. One included filter is Wordlike, which removes whitespace and punctuation, returning only ‘words’ in the common sense.
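To illustrate the func([]byte) bool shape, here is a minimal standalone sketch using only the standard library. The names hasLetterOrDigit and keep are hypothetical and not part of this package; keep merely stands in for the iteration that Segmenter or Scanner would perform, and hasLetterOrDigit is only a rough approximation of what a word-like test might check, not the logic of the included Wordlike filter.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// hasLetterOrDigit is a hypothetical filter (not part of the library).
// It reports whether the token contains at least one letter or digit.
func hasLetterOrDigit(token []byte) bool {
	for len(token) > 0 {
		r, size := utf8.DecodeRune(token)
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			return true
		}
		token = token[size:]
	}
	return false
}

// keep is a hypothetical helper standing in for the library's own iteration.
// It returns only the tokens for which the filter returns true.
func keep(tokens [][]byte, filter func([]byte) bool) [][]byte {
	var out [][]byte
	for _, t := range tokens {
		if filter(t) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	tokens := [][]byte{
		[]byte("Hello"), []byte(","), []byte(" "), []byte("world"), []byte("!"),
	}
	for _, t := range keep(tokens, hasLetterOrDigit) {
		fmt.Printf("%q\n", t)
	}
}
```

Because a filter is just a function value, any predicate with this signature (length checks, dictionary lookups, and so on) could be used in the same way.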
See also the Contains() and Entirely() methods, which allow creation of filters based on Unicode categories.