CollatinusDecliner produces incorrect results with words such as 'omnis' and 'parens'. This is a separate problem from the one reported at #1127 (problems with 'puer'), and is not solved by the changes to lat.py recommended there.
Python version: 3.9.13
CLTK version: CLTK 1.1.6
Windows 10
Running the following script:
from cltk.morphology.lat import CollatinusDecliner
decliner = CollatinusDecliner()
print(decliner.decline("omnis",False,False))
We see the following erroneous output:
[('omnis', '--s---mn-'), ('omnis', '--s---mv-'), ('omnem', '--s---ma-'), ('omniem', '--s---ma-'), ('omnis', '--s---mg-'), ('omniis', '--s---mg-') etc.]
There is no such form as 'omniem' or 'omniis'. CollatinusDecliner has created both "omn-" and "omni-" as roots for the same root_id.
We should expect to see (and Collatinus gets this right):
[('omnis', '--s---mn-'), ('omnis', '--s---mv-'), ('omnem', '--s---ma-'), ('omnis', '--s---mg-'), etc.]
Running the following script:
from cltk.morphology.lat import CollatinusDecliner
decliner = CollatinusDecliner()
print(decliner.decline("parens",False,False))
We see the following output:
[('parens', '--s---mn-'), ('parens', '--s---mv-'), ('pareiem', '--s---ma-'), ('parentem', '--s---ma-'), ('pareiis', '--s---mg-'), ('parentis', '--s---mg-')...
There are no forms in 'parei-' (such as 'pareiem' or 'pareiis').
We would expect to see:
[('parens', '--s---mn-'), ('parens', '--s---mv-'), ('parentem', '--s---ma-'), ('parentis', '--s---mg-')
Once again, CollatinusDecliner has created both "parent-" and "parei-" as roots
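A quick way to spot the affected slots is to group the (form, tag) pairs by tag and list the tags that carry more than one form. This is only a sketch: `forms_by_tag` and `tags_with_multiple_forms` are hypothetical helpers, not part of CLTK, and the sample data is copied from the output above. More than one form per tag can be legitimate in Latin, so the result is a list of candidates to inspect, not a list of errors.

```python
from collections import defaultdict


def forms_by_tag(declined):
    """Group (form, tag) pairs by tag, keeping the order of the forms."""
    grouped = defaultdict(list)
    for form, tag in declined:
        grouped[tag].append(form)
    return dict(grouped)


def tags_with_multiple_forms(declined):
    """Return only the tags that have more than one form attached."""
    return {
        tag: forms
        for tag, forms in forms_by_tag(declined).items()
        if len(forms) > 1
    }


# Sample copied from the 'parens' output reported in this issue.
sample = [
    ("parens", "--s---mn-"),
    ("parens", "--s---mv-"),
    ("pareiem", "--s---ma-"),
    ("parentem", "--s---ma-"),
    ("pareiis", "--s---mg-"),
    ("parentis", "--s---mg-"),
]

print(tags_with_multiple_forms(sample))
# {'--s---ma-': ['pareiem', 'parentem'], '--s---mg-': ['pareiis', 'parentis']}
```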
Certainly in the case of 'parens' there seems to be a problem in the "cltk_data\lat\model\lat_models_cltk\lemmata\collatinus\collected.json" file. The model for 'parens' is given as 'infans', and in the models section for 'infans' we see the following root info:
"infans": {"R": {"0": ["2", ""], "1": ["2", "i"], "2": ["2", "issim"], "4": ["K", null], "5": ["2", "i"]}
This cannot be right, as there is no circumstance in which we would remove two letters and replace them with an 'i' (which is what has happened here).
Oddly we find the same root info for 'fortis' which is the model for 'omnis' (and 'fortis' also declines incorrectly):
"fortis": {"R": {"0": ["2", ""], "1": ["2", "i"], "2": ["2", "issim"], "4": ["K", null], "5": ["2", "i"]}
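To make the effect of this root info concrete, here is a minimal sketch that applies each rule to a lemma. It assumes, as the paragraph above infers, that a rule such as ["2", "i"] means "drop the last 2 characters, then append 'i'" and that "K" means "keep the canonical form". `apply_root_rule` is a hypothetical helper, not the CLTK implementation.

```python
def apply_root_rule(lemma, rule):
    """Apply one [deletion, addition] rule from the model's "R" section.

    Assumed semantics: "K" keeps the lemma unchanged; otherwise the
    deletion count is removed from the end and the addition is appended.
    """
    deletion, addition = rule
    if deletion == "K":
        return lemma
    count = int(deletion)
    stem = lemma[:-count] if count else lemma
    return stem + (addition or "")


# Root info quoted above for 'fortis' (null in the JSON becomes None).
model_roots = {
    "0": ["2", ""],
    "1": ["2", "i"],
    "2": ["2", "issim"],
    "4": ["K", None],
    "5": ["2", "i"],
}

for lemma in ("parens", "omnis"):
    roots = {rid: apply_root_rule(lemma, rule) for rid, rule in model_roots.items()}
    print(lemma, roots)
# parens {'0': 'pare', '1': 'parei', '2': 'pareissim', '4': 'parens', '5': 'parei'}
# omnis {'0': 'omn', '1': 'omni', '2': 'omnissim', '4': 'omnis', '5': 'omni'}
```

Under these assumptions root 1 comes out as 'parei' and 'omni', which matches the spurious stems in the output reported above.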
I've been trying to work out exactly how the decliner works (and how it works in Collatinus itself) and I may have got this wrong - but it seems to me that:
a] if there is an entry in the lemma-entry for geninf, this should be assigned to root_id 1
b] if there is an entry in the lemma-entry for perf, this should be assigned to root_id 2
c] and in both cases these should replace anything in the root data from the model.
I don't know quite what is happening in lines 126-129 of lat.py:
        if model_root_id in original_roots:
            returned_roots[model_root_id].extend(original_roots[model_root_id])
            returned_roots[model_root_id] = list(set(returned_roots[model_root_id]))
    original_roots.update(returned_roots)
but it looks as if we end up with multiple options for various root_ids (e.g. original_roots[1] = "parent, parei"). Replacing the above lines with the following (note that the first line is at the same indent level as 'original_roots.update(returned_roots)' in lat.py) seems to fix this, though I don't know whether it breaks something else:
    for model_root_id in returned_roots:
        if model_root_id not in original_roots:
            original_roots[model_root_id] = returned_roots[model_root_id]
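The difference between the two merge strategies can be reproduced on toy data. The sketch below is not the lat.py code: `merge_union` and `merge_lemma_wins` are hypothetical stand-ins for what the two snippets appear to do, and the root values (geninf giving 'parent', the model giving 'parei') are assumed from the output reported in this issue.

```python
def merge_union(original_roots, returned_roots):
    """Mimic the current behaviour: keep every root from both sources."""
    merged = dict(original_roots)
    for root_id, roots in returned_roots.items():
        merged[root_id] = sorted(set(merged.get(root_id, [])) | set(roots))
    return merged


def merge_lemma_wins(original_roots, returned_roots):
    """Mimic the proposed fix: model roots only fill in missing root ids."""
    merged = dict(original_roots)
    for root_id, roots in returned_roots.items():
        if root_id not in merged:
            merged[root_id] = roots
    return merged


# Assumed values: root 1 from the lemma entry (geninf), the rest from the model.
original_roots = {"1": ["parent"]}
returned_roots = {"0": ["pare"], "1": ["parei"], "4": ["parens"]}

print(merge_union(original_roots, returned_roots)["1"])       # ['parei', 'parent']
print(merge_lemma_wins(original_roots, returned_roots)["1"])  # ['parent']
```

With the second strategy the geninf and perf roots from the lemma entry take precedence, which is the behaviour described in points a] to c] above; whether anything in CLTK relies on the union is not checked here.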
However, there also seems to be a problem in the collected.json data: under the model for 'infans', the ending data for the neuter singular nom, voc and acc (pos 37, 38, 39) is given as root ID 1, ending "-ns". The ending should be the same as the masc/fem n/v sing, correctly given at pos 13/14 and 25/26 as root ID 4 and no ending (i.e. the canonical form). As a result, the output for the neuter form is either *parentns or *pareins instead of the expected parens.
This error seems to be in Collatinus itself: in the modeles.la file, the entry for infans reads:
modele:infans
pere:fortis
des:37-39:1:ns
des+:18,30,42:1:ē
des+:22,34,46:1:ŭm3
In Collatinus, des 37-39 refer to the neut sing n/v/a - either the root id should be 0 (remove two letters, then add ns) or 37-39 should be 4:K (I think)
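The effect of that des line can be sketched as follows, assuming the layout `des:<positions>:<root id>:<ending>` described above and that a form is simply root plus ending. `parse_des` and `build_forms` are hypothetical helpers that handle only a single number or a simple range like 37-39, not the comma-separated lists used in the des+ lines; the root values are assumed from the output reported earlier in this issue.

```python
def parse_des(line):
    """Split a 'des:<positions>:<root id>:<ending>' line into its parts."""
    _, positions, root_id, ending = line.split(":")
    start, _, end = positions.partition("-")
    last = int(end) if end else int(start)
    return list(range(int(start), last + 1)), root_id, ending


def build_forms(roots, root_id, ending):
    """Concatenate every root registered under root_id with the ending."""
    return [root + ending for root in roots[root_id]]


# Assumed roots for 'parens': root 0 drops two letters, root 4 is the lemma.
roots = {"0": ["pare"], "1": ["parent", "parei"], "4": ["parens"]}

positions, root_id, ending = parse_des("des:37-39:1:ns")
print(positions)                            # [37, 38, 39]
print(build_forms(roots, root_id, ending))  # ['parentns', 'pareins']
print(build_forms(roots, "0", "ns"))        # ['parens']
print(build_forms(roots, "4", ""))          # ['parens']
```

Both alternatives suggested above (root 0 with ending "ns", or root 4 with no ending) give 'parens' under these assumptions.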
EDIT: in the most up-to-date branch of Collatinus (the Medieval one), this error with infans has been corrected