Reading speeches #282

ChristophLeonhardt · 2023-11-20T14:03:06Z

Until recently, the workflow for reading speeches from speech bundles in polmineR was the following:

speeches <- corpus("GERMAPARL2") %>%
  subset(protocol_date == "2001-09-12") %>%
  subset(p) %>%
  as.speeches(
    s_attribute_date = "protocol_date",
    s_attribute_name = "speaker_name"
  )

speeches[[1]] %>% read()

This is also described in a blog post for a GermaParl2 beta release here.

In polmineR v0.8.9.9001 (and possibly earlier versions), this no longer works. It throws an error: "s-attribute does not have values".

What does work instead is the following:

speeches <- corpus("GERMAPARL2") %>%
  subset(protocol_date == "2001-09-12") %>%
  subset(p_type == "speech") %>%
  as.speeches(
    s_attribute_date = "protocol_date",
    s_attribute_name = "speaker_name"
  )

speeches[[1]] %>% read()

The only change is in the second subset() call, which filters on p_type == "speech" instead of p.

However, I assume that the results are not identical: the second scenario omits all interjections (although they remain visible in the output of read()), while the first scenario keeps all paragraphs.

What is the expected behavior in these scenarios, and how can I read speeches without first removing the interjections (in case I want to analyze the speech bundle further)?
