While built-in string methods have limited flexibility and regular expressions have limited expressive power, both can still be leveraged in creative ways to implement scalable workflows that process and analyze text data. This article explores these tools and introduces a few useful peripheral techniques within the context of a use case involving a large text data corpus: the set of article abstracts found in the English-language edition of Wikipedia.
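As a rough illustration of the trade-off described above, the sketch below tokenizes a single sentence first with built-in string methods and then with a regular expression. The sample text and variable names are hypothetical and are not taken from the Wikipedia corpus or from the article's own code.

```python
import re

# Hypothetical sample text standing in for one abstract (not from the corpus).
abstract = "Python is a programming language. It was first released in 1991."

# Built-in string methods: simple and fast, but splitting on whitespace
# leaves punctuation attached to tokens (e.g. "language.").
words_by_split = abstract.lower().split()

# Regular expression: matching runs of letters drops punctuation and digits.
words_by_regex = re.findall(r"[a-z]+", abstract.lower())

# A pattern for standalone four-digit numbers, such as years.
years = re.findall(r"\b\d{4}\b", abstract)

print(words_by_split)
print(words_by_regex)
print(years)
```

Neither approach is a complete tokenizer; the point is only that each tool handles some cases cleanly and needs help with others.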
python-supply/strings-regular-expressions-and-text-data-analysis