The `regex` parameter in the `clean_text` function is just a workaround. The `desc` field in the lc data contains various updates following the pattern: `r'(\d+|Borrower)\s+added\s+on\s+(\d{2}/\d{2}/\d{2})\s+>'`
The point of the regex, and of the parts of the function that reference it, is to identify these update markers, extract all dates other than the first update date (since that first "update" was actually the original posting), and delete anything that was entered after those later date markers. I can think of other situations in which it would be useful to identify a body of text by regex and then delete everything after the nth occurrence of that regex.
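To make the "nth occurrence" idea concrete, here is a minimal sketch in Python. It is not the current `clean_text` implementation; `truncate_at_nth_match` is a hypothetical helper name, and cutting at the start of the nth marker (rather than just after it) is an assumption about the desired behavior.

```python
import re

# Pattern quoted in this issue for update markers in the `desc` field.
UPDATE_MARKER = r'(\d+|Borrower)\s+added\s+on\s+(\d{2}/\d{2}/\d{2})\s+>'


def truncate_at_nth_match(text, pattern, n):
    """Cut `text` off at the start of the n-th match of `pattern` (1-based).

    Hypothetical helper: if there are fewer than n matches, the text is
    returned unchanged.
    """
    for i, match in enumerate(re.finditer(pattern, text), start=1):
        if i == n:
            return text[:match.start()]
    return text


# Made-up sample text: keep the original posting (first marker),
# drop everything from the second marker onward.
desc = ("Borrower added on 01/02/13 > Need a loan for a car. "
        "Borrower added on 01/05/13 > Updated my income.")
kept = truncate_at_nth_match(desc, UPDATE_MARKER, 2)
```

With `n=2` this reproduces the behavior described above for `desc`; other values of `n` would cover the more general case.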
It would also be useful to pass a dictionary as the regex parameter, where each key names the field of the data that the corresponding regex should be applied to. If a field has no matching key in the dictionary, the entire regex portion of the clean_text function can be skipped for that field.
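One possible shape for the dictionary-driven version is sketched below. The function name `clean_fields`, the record-as-dict layout, and the `n` parameter are all assumptions made for illustration; the real `clean_text` signature may differ.

```python
import re


def clean_fields(record, patterns, n=2):
    """Truncate each text field at the n-th match of its own pattern.

    `record` maps field names to values; `patterns` maps field names to
    regex strings. Fields without an entry in `patterns` (and non-string
    values) are passed through untouched, i.e. the regex step is skipped.
    """
    cleaned = {}
    for field, value in record.items():
        pattern = patterns.get(field)
        if pattern is None or not isinstance(value, str):
            cleaned[field] = value
            continue
        matches = list(re.finditer(pattern, value))
        if len(matches) >= n:
            value = value[:matches[n - 1].start()]
        cleaned[field] = value
    return cleaned


# Made-up sample record; only `desc` has a pattern, so `title` is skipped.
marker = r'(\d+|Borrower)\s+added\s+on\s+(\d{2}/\d{2}/\d{2})\s+>'
record = {
    "desc": "Borrower added on 01/02/13 > First post. 5 added on 02/03/13 > Edit.",
    "title": "5 added on 02/03/13 > not a desc field",
}
result = clean_fields(record, {"desc": marker})
```

Keeping the lookup as a plain `patterns.get(field)` means adding a new field to clean only requires a new dictionary entry, not a change to the function.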
If the above enhancements are made, clean_text can be used as a general-purpose tool for cleaning text fields. As the function stands right now, it can only be used for the one very specific task for which it was originally created.