nltk · arildm · Nov 11, 2016
diff --git a/book/ch03.rst b/book/ch03.rst
@@ -1855,16 +1855,15 @@ NLTK's Regular Expression Tokenizer
 
 The function ``nltk.regexp_tokenize()`` is similar to ``re.findall()`` (as
 we've been using it for tokenization).  However, ``nltk.regexp_tokenize()``
-is more efficient for this task, and avoids the need for special treatment of parentheses.
-For readability we break up the regular expression over several lines
-and add a comment about each line.  The special ``(?x)`` "verbose flag"
-tells Python to strip out the embedded whitespace and comments.
+is more efficient for this task.  For readability we break up the regular
+expression over several lines and add a comment about each line.  The special
+``(?x)`` "verbose flag" tells Python to strip out the embedded whitespace and comments.
 
     >>> text = 'That U.S.A. poster-print costs $12.40...'
     >>> pattern = r'''(?x)    # set flag to allow verbose regexps
-    ...     ([A-Z]\.)+        # abbreviations, e.g. U.S.A. 
-    ...   | \w+(-\w+)*        # words with optional internal hyphens
-    ...   | \$?\d+(\.\d+)?%?  # currency and percentages, e.g. $12.40, 82%
+    ...     (?:[A-Z]\.)+        # abbreviations, e.g. U.S.A. 
+    ...   | \w+(?:-\w+)*        # words with optional internal hyphens
+    ...   | \$?\d+(?:\.\d+)?%?  # currency and percentages, e.g. $12.40, 82%
     ...   | \.\.\.            # ellipsis
     ...   | [][.,;"'?():-_`]  # these are separate tokens; includes ], [
     ... '''
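
Note on the change above: the switch from ``(...)`` to ``(?:...)`` matters whenever the pattern is handed to a ``re.findall()``-style matcher, because ``re.findall()`` returns the contents of the capturing groups instead of the whole match as soon as the pattern contains any. The sketch below uses only the standard ``re`` module to show the difference on the example text from the diff. It assumes, without confirming it from this diff, that this is the behaviour the removed "special treatment of parentheses" sentence referred to. The variable names are illustrative.

```python
import re

text = 'That U.S.A. poster-print costs $12.40...'

# Old pattern from the diff: capturing groups.
capturing = r'''(?x)    # set flag to allow verbose regexps
    ([A-Z]\.)+          # abbreviations, e.g. U.S.A.
  | \w+(-\w+)*          # words with optional internal hyphens
  | \$?\d+(\.\d+)?%?    # currency and percentages, e.g. $12.40, 82%
  | \.\.\.              # ellipsis
  | [][.,;"'?():-_`]    # these are separate tokens; includes ], [
'''

# New pattern from the diff: the same alternatives with non-capturing groups.
noncapturing = r'''(?x)     # set flag to allow verbose regexps
    (?:[A-Z]\.)+            # abbreviations, e.g. U.S.A.
  | \w+(?:-\w+)*            # words with optional internal hyphens
  | \$?\d+(?:\.\d+)?%?      # currency and percentages, e.g. $12.40, 82%
  | \.\.\.                  # ellipsis
  | [][.,;"'?():-_`]        # these are separate tokens; includes ], [
'''

# With three capturing groups, findall yields one 3-tuple per match,
# holding only the last text captured by each group (or '').
capturing_tokens = re.findall(capturing, text)

# With non-capturing groups, findall yields the full matched strings.
noncapturing_tokens = re.findall(noncapturing, text)

print(capturing_tokens)
print(noncapturing_tokens)
# → ['That', 'U.S.A.', 'poster-print', 'costs', '$12.40', '...']
```

With the capturing version each result is a tuple such as ``('A.', '', '')`` for ``U.S.A.``, which is not usable as a token list. The non-capturing version returns the tokens directly.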