get_dbpedia_uris() aborts: Error: protect(): protection stack overflow #52
The error does not occur when I break up the entire corpus into smaller pieces (legislative periods), but I see it for the 17th legislative period of GERMAPARL2. Observations:
To be tested experimentally: options(expressions = 5e5)
We might also look at
This is a minimal version of the code I used that resulted in the error:
library(RcppCWB)
library(polmineR)
library(dplyr)
library(dbpedia)
logfile <- tempfile()
p_strucs <- s_attr("GERMAPARL2", s_attribute = "ne", registry = Sys.getenv("CORPUS_REGISTRY")) %>%
  s_attr_size() %>%
  (`-`)(1) %>%
  seq(from = 0L, to = .) %>%
  get_region_matrix(corpus = "GERMAPARL2", s_attribute = "ne", strucs = .) %>%
  .[, 1L] %>%
  cl_cpos2struc(corpus = "GERMAPARL2", s_attribute = "p", cpos = .) %>%
  unique()
p_types <- cl_struc2str(corpus = "GERMAPARL2", s_attribute = "p_type", struc = p_strucs)
p_strucs_speech <- p_strucs[which(p_types == "speech")]
paras <- corpus("GERMAPARL2") %>%
  subset(p %in% !!p_strucs_speech) %>%
  subset(protocol_lp == "17") %>%
  split(s_attribute = "p", values = FALSE)
uritab <- get_dbpedia_uris(
  x = paras,
  language = getOption("dbpedia.lang"),
  max_len = 5600L,
  confidence = 0.35,
  support = 20,
  api = getOption("dbpedia.endpoint"),
  logfile = logfile,
  retry = 3,
  verbose = FALSE,
  expand_to_token = TRUE,
  progress = TRUE,
  s_attribute = "ne_type"
)
To get a better understanding of the issue, I tried to provoke it as follows:
library(data.table)
dt <- data.table(
  A = 1:100,
  B = 1:100,
  C = 1:100,
  D = 1:100,
  E = 1:100,
  F = 1:100,
  G = 1:100,
  H = 1:100,
  I = 1:100,
  J = rep(list(a = "asdf", b = "asdf", c = "sdf"), times = 100)
)
dts <- lapply(1:500000, function(i) copy(dt))
foo <- rbindlist(dts)
But it works without a problem, unfortunately. How can we provoke the error?
Confirmed: The error does not occur when we drop the column "types" with list values. Dropping the column is implemented now only for |
I think I can second that. With the nested lists in |
We now have the argument |
I added a paragraph explaining this issue in the documentation of the |
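For readers who hit the same error, the workaround confirmed above (dropping list-valued columns before the tables are combined) could look roughly like the sketch below. This is not the package's implementation: `make_tab()` and `drop_list_cols()` are hypothetical helpers, and the toy tables only borrow the column name "types" from this thread.

```r
library(data.table)

# Hypothetical stand-in for one annotation table per paragraph.
# Only the column name "types" is taken from the thread; the data is invented.
make_tab <- function(i) {
  data.table(
    doc = i,
    start = 1:3,
    types = list(c("a", "b"), "c", character(0))  # list-valued column
  )
}
tabs <- lapply(1:100, make_tab)

# Hypothetical helper: keep only the columns that are not lists.
drop_list_cols <- function(dt) {
  is_list <- vapply(dt, is.list, logical(1))
  keep <- names(dt)[!is_list]
  dt[, keep, with = FALSE]
}

# Bind the flattened tables; no list column reaches rbindlist().
flat <- rbindlist(lapply(tabs, drop_list_cols))
```

The obvious cost is that the information stored in the list column is lost, so this only fits cases where that column is not needed downstream.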
Running this ...
Results in this error:
Error: protect(): protection stack overflow
See this question on Stack Overflow for a potential solution:
https://stackoverflow.com/questions/32826906/how-to-solve-protection-stack-overflow-issue-in-r-studio
So should I include something such as
before this expression?
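As a rough sketch of what trying this could look like: base R lets you raise the `expressions` limit for the current session and restore it afterwards. Whether this helps here is an assumption to be tested, not a confirmed fix. The message itself comes from R's pointer protection stack, whose size is fixed when R starts (command-line option `--max-ppsize`), so raising `expressions` alone may not be enough.

```r
# Raise the limit on nested expressions; 5e5 is the largest value R accepts.
# options() returns the previous settings so they can be restored later.
old <- options(expressions = 5e5)
raised <- getOption("expressions")

# ... run the call that triggered the error here ...

# Restore the previous setting.
options(old)

# The pointer protection stack cannot be resized from within a session.
# It has to be set at startup, for example from a shell:
#   R --max-ppsize=500000
```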