Error when calculating local_alignment #82

Open
ManuelBurghardt opened this issue Jan 11, 2018 · 1 comment

Comments

@ManuelBurghardt

Background: To trace Shakespearean intertextuality, I tokenize Shakespeare texts (hypotexts) into 9-grams and align each n-gram (align_local) with other texts (hypertexts), e.g. by Terry Pratchett or Charles Dickens. I loop through all the n-grams and return only alignments above a certain alignment score. To speed things up a little, I use only every third n-gram, which should still leave enough overlap not to miss any potential quotes (WyrdSisters_Macbeth_minimal.R.zip).
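
The n-gram selection described above can be sketched in base R. This is only an illustration under assumptions: `word_ngrams` is a hypothetical helper (not part of textreuse), the sample text is made up, and the attached script may tokenize differently.

```r
# Hypothetical helper: build word n-grams from a single string (base R only).
word_ngrams <- function(text, n = 9L) {
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  words <- words[nzchar(words)]                 # drop empty tokens
  if (length(words) < n) return(character(0))   # text too short for one n-gram
  starts <- seq_len(length(words) - n + 1L)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1L)], collapse = " "),
         character(1))
}

hypotext <- paste("Double, double toil and trouble; fire burn and",
                  "cauldron bubble. Cool it with a baboon's blood")
ngrams <- word_ngrams(hypotext, n = 9L)

# Keep every third n-gram (positions 1, 4, 7, ...); consecutive
# selected 9-grams then still share six words.
selected <- ngrams[seq_along(ngrams) %% 3L == 1L]
```

Each element of `selected` would then be passed to align_local together with the hypertext, keeping only results above the chosen score.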

Problem: I occasionally get the following error message:

Error in b_out[out_i] <- b_orig[row_i - 1] : replacement has length zero

Here is some more context via a screenshot from my console:

[Image: screenshot_error]

I cannot reproduce the error reliably, but it seems to depend on how I set the count variable, which determines the n-gram I start with. I assume the error has something to do with how the Smith-Waterman algorithm builds up its matrix of values or, looking into the textreuse code, more concretely with how the output vectors are constructed:

  # Place our first known values in the output vectors
  b_out[out_i] <- b_orig[row_i - 1]
  a_out[out_i] <- a_orig[col_i - 1]
  out_i = out_i + 1L # Advance the out vector position
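
The message itself can be reproduced in plain R without textreuse. The following is a minimal sketch, assuming (this is not confirmed from the package source) that `row_i` ends up as 1, so that `row_i - 1` is 0; the vectors are made-up stand-ins for the ones in the snippet above. In R, indexing with 0 returns a zero-length vector, and assigning that to a single element fails.

```r
# Made-up stand-ins for the vectors used in the snippet above
b_orig <- c("to", "be", "or", "not")
b_out  <- character(4)
out_i  <- 1L
row_i  <- 1L   # assumption: the traceback starts in the first row

# b_orig[0] is character(0), so there is nothing to assign
zero_len <- b_orig[row_i - 1]

msg <- tryCatch({
  b_out[out_i] <- b_orig[row_i - 1]
  "no error"
}, error = function(err) conditionMessage(err))

print(length(zero_len))
print(msg)
```

The same message would also appear if `row_i` were itself empty (for example `integer(0)`), so checking the value of `row_i` and `col_i` for a failing n-gram might narrow down the cause.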

A related problem seems to be described on Stack Overflow, but without a real solution.

Since the overall approach works quite well for discovering verbatim or near-verbatim Shakespeare text reuse in other hypertexts, I would be really happy to understand what is happening here and how I might fix it.

@DaniSchenk

This is not a solution or an answer to your question, but wrapping the call in tryCatch keeps a for loop or apply call running when the error occurs. Maybe it helps someone.

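# Return FALSE instead of stopping when align_local() fails
alignment <- tryCatch({
  align_local(x, hypertext)
}, error = function(err) {
  FALSE  # the alignment raised an error
}, warning = function(war) {
  FALSE  # a warning also aborts the call and yields FALSE
})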
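
Building on the tryCatch idea, it may also help to record which n-grams fail, so the error can be reproduced afterwards with a single input. This is a sketch under assumptions: `collect_alignments` and `fake_align` are hypothetical names, and `fake_align` only stands in for align_local so the example runs without textreuse.

```r
# Hypothetical wrapper: align_fun stands in for textreuse::align_local
collect_alignments <- function(ngrams, hypertext, align_fun) {
  results <- vector("list", length(ngrams))
  failed  <- character(0)
  for (i in seq_along(ngrams)) {
    out <- tryCatch(
      align_fun(ngrams[[i]], hypertext),
      error = function(err) {
        # Remember the input that triggered the error, then carry on
        failed <<- c(failed, ngrams[[i]])
        NULL
      }
    )
    if (!is.null(out)) results[[i]] <- out
  }
  list(results = results, failed = failed)
}

# Stand-in aligner for demonstration only: fails on one specific input
fake_align <- function(a, b) {
  if (a == "bad") stop("replacement has length zero")
  paste(a, "vs", b)
}

res <- collect_alignments(c("good", "bad", "fine"), "text", fake_align)
print(res$failed)
```

With the real data, `res$failed` would hold the n-grams to feed back into align_local one at a time.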
