Updated approx_pos method in dataset.py #48

hendrixjoseph · 2022-05-06T13:48:09Z

Updated ParsedDictionaryDefinitionDataset approx_pos method in dataset.py.

The structure of stanza.models.common.doc.Word appears to have changed, which causes the method approx_pos(cls, nlp, sentence, lookup_idx, lookup_len) to fail.

The Word object now looks something like:

{
  "id": 6,
  "text": "a",
  "upos": "DET",
  "xpos": "DT",
  "feats": "Definite=Ind|PronType=Art",
  "start_char": 23,
  "end_char": 24
}

An upside is that start_char and end_char can now be read directly from the Word object instead of being extracted with a regex.
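For reviewers, here is a minimal sketch of the idea, not the actual diff in this PR. It uses SimpleNamespace stand-ins rather than real stanza Word objects so it runs without downloading a model. The helper names word_span and pos_at are hypothetical, and the fallback that parses offsets out of a misc string is my assumption about the older layout.

```python
import re
from types import SimpleNamespace

# Assumed older layout: offsets packed into a `misc` string.
_MISC_SPAN = re.compile(r"start_char=(\d+)\|end_char=(\d+)")


def word_span(word):
    """Return (start_char, end_char) for a Word-like object."""
    start = getattr(word, "start_char", None)
    end = getattr(word, "end_char", None)
    if start is not None and end is not None:
        # Newer layout: offsets are plain attributes, no regex needed.
        return start, end
    # Fallback (assumption): parse the offsets out of `misc`.
    match = _MISC_SPAN.search(getattr(word, "misc", None) or "")
    if match is None:
        raise ValueError("no character offsets found on word")
    return int(match.group(1)), int(match.group(2))


def pos_at(words, lookup_idx, lookup_len):
    """Return the upos of the first word overlapping the lookup span, or None."""
    lookup_end = lookup_idx + lookup_len
    for word in words:
        start, end = word_span(word)
        if start < lookup_end and end > lookup_idx:
            return word.upos
    return None


# Stand-ins mirroring the fields shown in the Word example above.
new_style = SimpleNamespace(text="a", upos="DET", start_char=23, end_char=24)
old_style = SimpleNamespace(text="a", upos="DET", misc="start_char=23|end_char=24")
```

With a real pipeline, the same helpers would be applied to the words of each parsed sentence (for example, sentence.words on a stanza Document), assuming those objects expose start_char and end_char as shown above.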

I've tested the change in Google Colab.

