imbue11235/words

Words

The Go package words extracts words from a string according to a fixed set of rules.

Rules

  1. Invalid UTF-8 strings will not be split.
  2. Hyphenated words are split into individual words unless hyphenated words are allowed. E.g. "small-town" => []string{"small", "town"}
  3. Spaces, punctuation and symbols are discarded unless the corresponding option is enabled. E.g. "my_string here" => []string{"my", "string", "here"}
  4. Consecutive characters of the same type are grouped into one word.
  5. If the current character is lowercase and the previous word ended with an uppercase letter, that uppercase letter is moved to the start of the lowercase word. E.g. "YAMLParser" => []string{"YAML", "Parser"}
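
The rules above can be sketched with the standard library alone. This is a simplified illustration, not the package's actual implementation: `kind`, `classify` and `splitWords` are hypothetical names, the options are not modeled, letters without case are treated like separators, and returning an invalid UTF-8 string as a single element is an assumption about what "not split" means.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// kind is a coarse character class used to group runes into words.
type kind int

const (
	other kind = iota // spaces, punctuation, symbols and anything else: dropped here
	lower
	upper
	digit
)

func classify(r rune) kind {
	switch {
	case unicode.IsLower(r):
		return lower
	case unicode.IsUpper(r):
		return upper
	case unicode.IsDigit(r):
		return digit
	default:
		return other
	}
}

// splitWords is a hypothetical stand-in that follows rules 1-5 with default behavior.
func splitWords(s string) []string {
	// Rule 1: invalid UTF-8 is not split (assumed: returned as one element).
	if !utf8.ValidString(s) {
		return []string{s}
	}
	var words []string
	var cur []rune
	prev := other
	flush := func() {
		if len(cur) > 0 {
			words = append(words, string(cur))
			cur = nil
		}
	}
	for _, r := range s {
		k := classify(r)
		switch {
		case k == other:
			// Rules 2 and 3: hyphens, spaces, punctuation and symbols end a word and are dropped.
			flush()
		case k == lower && prev == upper:
			// Rule 5: the trailing uppercase letter moves to the new lowercase word.
			last := cur[len(cur)-1]
			cur = cur[:len(cur)-1]
			flush()
			cur = append(cur, last, r)
		case k != prev:
			// Rule 4: a change of character type starts a new word.
			flush()
			cur = append(cur, r)
		default:
			// Rule 4: same type as the previous character, so keep growing the word.
			cur = append(cur, r)
		}
		prev = k
	}
	flush()
	return words
}

func main() {
	fmt.Printf("%q\n", splitWords("YAMLParser"))     // ["YAML" "Parser"]
	fmt.Printf("%q\n", splitWords("small-town"))     // ["small" "town"]
	fmt.Printf("%q\n", splitWords("my_string here")) // ["my" "string" "here"]
}
```

Tracking only the previous character's type keeps the sketch to a single pass over the string; rule 5 is the one case that has to look back and take a letter from the word being built.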

Installation

$ go get github.com/imbue11235/words

Usage

Basic usage

words.Extract("Do you prefer camelCase to snake_case?") 
// => []string{"Do", "you", "prefer", "camel", "case", "to", "snake", "case"}

words.Extract("YAMLParser")
// => []string{"YAML", "Parser"}

words.Extract("Bose QC35")
// => []string{"Bose", "QC", "35"}
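
One thing the extracted slice is handy for is converting between naming conventions. The sketch below is only an illustration: `toSnakeCase` is a hypothetical helper that is not part of the package, and the input slice is hard-coded to the result documented above for "YAMLParser" so that the example runs without the dependency.

```go
package main

import (
	"fmt"
	"strings"
)

// toSnakeCase joins already-extracted words into a snake_case identifier.
// Hypothetical helper for illustration; not provided by the words package.
func toSnakeCase(parts []string) string {
	lowered := make([]string, len(parts))
	for i, p := range parts {
		lowered[i] = strings.ToLower(p)
	}
	return strings.Join(lowered, "_")
}

func main() {
	// Stand-in for the documented result of words.Extract("YAMLParser").
	parts := []string{"YAML", "Parser"}
	fmt.Println(toSnakeCase(parts)) // yaml_parser
}
```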

With options

To further customize the extraction, pass one or more options to Extract after the input string.

Punctuation

To include punctuation marks as separate elements:

words.Extract("So, now punctuation will be included.", words.IncludePunctuation())
// => []string{"So", ",", "now", "punctuation", "will", "be", "included", "."}

Spaces

To include runs of spaces as separate elements:

words.Extract("So   many   spaces", words.IncludeSpaces())
// => []string{"So", "   ", "many", "   ", "spaces"}

Symbols

To include symbols as separate elements:

words.Extract("Some>String", words.IncludeSymbols())
// => []string{"Some", ">", "String"}

Hyphenated words

To keep hyphenated words intact:

words.Extract("An anti-clockwise direction", words.AllowHyphenatedWords())
// => []string{"An", "anti-clockwise", "direction"}

Multiple options

To combine several options in one call:

words.Extract("Using multiple options!", words.IncludeSpaces(), words.IncludePunctuation())
// => []string{"Using", " ", "multiple", " ", "options", "!"}
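
Passing any number of option values after the input, as in the calls above, resembles Go's functional options pattern. The following is a minimal sketch of that pattern, assuming (without confirmation from the package source) that the options work this way. `config`, `Option` and `newConfig` are hypothetical names, and the two option constructors are local stand-ins that only mirror the names used in this README.

```go
package main

import "fmt"

// config holds the switches an extractor might consult (hypothetical).
type config struct {
	includePunctuation bool
	includeSpaces      bool
}

// Option mutates a config; each constructor below returns one.
type Option func(*config)

// IncludePunctuation is a local stand-in, not the package's function.
func IncludePunctuation() Option {
	return func(c *config) { c.includePunctuation = true }
}

// IncludeSpaces is a local stand-in, not the package's function.
func IncludeSpaces() Option {
	return func(c *config) { c.includeSpaces = true }
}

// newConfig starts from the defaults (everything off) and applies
// the given options in order.
func newConfig(opts ...Option) config {
	cfg := config{}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := newConfig(IncludeSpaces(), IncludePunctuation())
	fmt.Printf("%+v\n", cfg) // {includePunctuation:true includeSpaces:true}
}
```

With this shape, a call without options keeps the default behavior, and new options can be added later without changing the function signature.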

License

This project is licensed under the MIT license.