Scenario
I want to decode a document to an AnnotatedPlainTextDocument using a list of stopwords. If all tokens are removed when doing so, the process fails.
Example
As a minimal reproducible example, consider the following subcorpus:
(The subcorpus is chosen because it is very short)
Now let's assume that we want to decode the subcorpus to an AnnotatedPlainTextDocument while removing stopwords:
This fails because all tokens are removed:
Issue
The initial issue is that the data.table ts becomes empty if the stoplist is applied (polmineR/R/decode.R, line 102 in 650c75f):
if (!is.null(stoplist)) ts <- ts[!ts[["word"]] %in% stoplist]
This results in an error later when the annotation object is created since some slots in the object are not empty.
Discussion
I assume that the obvious part of the solution is to check whether ts is empty (i.e. whether nrow(ts) == 0L) after applying the list of stopwords. However, I am not sure what should be returned here.
Normally, the return value would be an annotation object. Is returning NULL compatible with the usual workflows here, or would it be better to return an empty AnnotatedPlainTextDocument instead?
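To make the proposed check concrete, here is a minimal sketch of the guard. It runs outside polmineR on a toy data.table: the helper drop_stopwords() and the example tokens are hypothetical and only illustrate the idea. Returning NULL with a warning is just one of the two options raised above, not a settled choice.

```r
library(data.table)

# Hypothetical helper, not part of polmineR: applies the stoplist to a token
# table and guards against the case where no tokens are left.
drop_stopwords <- function(ts, stoplist = NULL) {
  # Same filter as in decode.R: keep rows whose word is not in the stoplist.
  if (!is.null(stoplist)) ts <- ts[!ts[["word"]] %in% stoplist]
  # Proposed guard: stop early instead of building an annotation object
  # from an empty token table.
  if (nrow(ts) == 0L) {
    warning("all tokens were removed by the stoplist; returning NULL")
    return(NULL)
  }
  ts
}

# Toy token table standing in for the decoded token stream (assumed layout).
ts <- data.table(cpos = 0:2, word = c("the", "of", "and"))

# One token survives the stoplist.
kept <- drop_stopwords(ts, stoplist = c("the", "of"))

# Every token is removed, so the guard is triggered.
emptied <- suppressWarnings(
  drop_stopwords(ts, stoplist = c("the", "of", "and"))
)
```

If an empty AnnotatedPlainTextDocument turns out to fit the usual workflows better, only the body of the nrow(ts) == 0L branch would need to change.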
This is somewhat related to issue #285.