Regnal names/epithets #802
Comments
Titles and other miscellaneous nominal constructions are a sore spot in general, e.g. #757. |
Well that's messy... |
Using |
While having no particular language expertise, using |
Among the three proposals, as already expressed in other discussions over the past months I think that The first problem I see to better understand the case is that UPOS here is uniformly UD table
That is, independently of the chosen UPOSs, these attributes are expressed as expressions with a verbal (participial) or adjectival head, or the like, in any case attributively. The agreement in instrumental, masculine and singular seems to signal that. I do not see any particular "horizontal" relation like By the way, out of curiosity, how is a regnal name defined in literature? In general, I would take "epithet" as a cover term for a kind of attributive ( |
Right, this is exactly the issue I was thinking but had not been able to articulate @Stormur, thanks for the input. Good point on the POS tagging/deprels of the morphemes, I think that is the better way to do it rather than pure PROPN. On the deprel of the nominals: Ashokan Prakrit, like Sanskrit, hardly differentiates the categories of adjective and noun, and so a list of nominals like this syntactically behaves the same as an adjective + adjective + noun construction--all elements agree in case/number/gender with the final element. So it certainly cannot be a headless construction as necessitated by I agree now that The argument for In that sense maybe you are in the right direction with the @nschneid and @amir-zeldes's recent paper (https://arxiv.org/pdf/2108.12928.pdf) has some discussion on titles for English which are relevant here. I like the suggestion of There's no good linguistic discussion of epithets/regnal names in Sanskrit or MIA that I can find, I just picked a useful descriptive name for this phenomenon. Generally only Indologists study this kind of stuff and they are more interested in the semantics or history of these terms rather than the syntax. (And admittedly, no one is really working on synchronic Middle Indo-Aryan linguistics as of the past decade.) |
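The agreement observation above (all nominals sharing case, number and gender with the final element) can be checked mechanically. The following is only a minimal sketch: the function names are hypothetical, and the feature values are filled in from the description in this thread (instrumental, masculine, singular), not copied from the treebank.

```python
def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Ins|Number=Sing' into a dict."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def agrees_with_last(tokens, keys=("Case", "Gender", "Number")):
    """True if every (form, feats) token matches the final token on the given keys."""
    last = parse_feats(tokens[-1][1])
    return all(
        parse_feats(feats).get(key) == last.get(key)
        for _, feats in tokens[:-1]
        for key in keys
    )


# Assumption: illustrative feature values based on the thread's description,
# not the actual annotation in UD_Prakrit-DIPI.
phrase = [
    ("priyena", "Case=Ins|Gender=Masc|Number=Sing"),
    ("dasinā", "Case=Ins|Gender=Masc|Number=Sing"),
    ("rāña", "Case=Ins|Gender=Masc|Number=Sing"),
]

print(agrees_with_last(phrase))
```

A check like this only confirms that the elements agree; it does not by itself decide which dependency relation should link them.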
The draft compares a lot of different options, but my favorite would be to fold this under the suggested For the ADJ/NOUN issue, we've run into this in several languages, and as best as I can see the only way to maintain the distinction is to talk about nominals with inherent gender (so maybe there is "priestess" and "priest", but those are distinguished by derivation, not inflection, and each has its own inflection), vs. flexible gender, which is the ADJ class. In some languages, the "flexible" words are understood to be nominalized into NOUNs in context, and only attributive and predicative contexts are treated as ADJ, while argument filler instances are treated as NOUN. In languages where such conversions are rare, like English, the UD guidelines to keep them as ADJ are applied more liberally (e.g. "the poor", which also stands out by not having a plural inflection). |
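The inherent-versus-flexible gender criterion described above could be approximated over a corpus roughly as follows. This is a hedged sketch only: `gender_profile` is a hypothetical helper, the observations are toy data ("ADJ_X" is a placeholder lemma), and a real treebank would need to handle sparse attestation and annotation errors.

```python
from collections import defaultdict


def gender_profile(observations):
    """Label each lemma 'flexible' if attested with more than one gender, else 'inherent'."""
    genders = defaultdict(set)
    for lemma, gender in observations:
        genders[lemma].add(gender)
    return {
        lemma: "flexible" if len(seen) > 1 else "inherent"
        for lemma, seen in genders.items()
    }


# Toy (lemma, Gender) pairs; "ADJ_X" is a hypothetical adjective-like lemma.
toy = [
    ("priest", "Masc"),
    ("priestess", "Fem"),
    ("ADJ_X", "Masc"),
    ("ADJ_X", "Fem"),
]

print(gender_profile(toy))
```

Here "priest" and "priestess" each come out as inherent because they are separate lemmas related by derivation, while the placeholder lemma that inflects for both genders comes out as flexible.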
This issue comes up very often in Latin (and not only there), too, and touches upon some fundamental questions of annotation. I'll try to explain my point of view very briefly:
So, if you say that priya is usually an adjective, I think it should stay like that, especially if it keeps a referential function (in this case to rāña or a similar intended entity). Full substantivisation is a further step and entails e.g. that the adjectival element becomes fixated with the same gender independently from context. It is not what is happening here, at least not in this expression. That, in my opinion, rules out |
Thanks for the input @amir-zeldes and @Stormur, really clears up the problem on Would a subtype |
In the interest of not proliferating too many labels, we've only advocated for subtyping the nmod case. But actually are you sure in this specific case you don't want priya- to be If so, then I think |
@amir-zeldes well, not necessarily. It is true that the caseless stem form only occurs in non-final morphemes in compounds but the UD deprel The Sanskrit grammatical tradition has a very comprehensive analysis of compound types (in fact, it would be pretty neat to try to map that to UD relations.) Ashokan priya-dasin (< Skt. priya-darśin) is a karmadʰāraya compound, i.e. it can be rephrased as a nominative case modifier priyo dasin (Skt. priyaḣ darśin) "loving looker" where the adjective is pretty clearly describing the noun. I think there's a bit of computational work on Sanskrit compound parsing by Amba Kulkarni but it has not been explored in UD, only kāraka-style dependency formalisms which were popular in Indian comp ling for a while. |
OK, so it sounds like "yes, they are the typical Indo-European compounding stems", but there is a tradition that sub-types them based on the category of the stem and the semantic roles, right? I think in Germanic languages the tradition has been to call these "compounds" regardless of the constituent properties (so in German, we speak of A+N compounds, but they are still called compounds). Recently this was discussed for English "hot dog": And then continued here: I think if normal adjectival modification looks distinct from the stem+noun construction (unlike in English), it maybe makes sense not to call this As a side note, Latin and Greek UD TBs do not tokenize such modifiers at all, so there is more inconsistency there. |
Thinking about this a bit more, @Stormur noted that in Latin an
Right, exactly, and this seems to be the convention retained by the Sanskrit dependency corpora, so it would make sense for Ashokan Prakrit (and other future Middle Indo-Aryan) corpora to stick to that. But I'm not sure how far this convention should go, since it seems to start encroaching on semantic territory with e.g. deva-putra "son of a god" which would be deprel'd with It seems |
This double seems a bit redundant to me, in that an amod is usually some kind of "description". But maybe some specific label for regal titles might be envisioned?
This feature is already covered at the morphological level with the combination |
This is in fact an issue that will need to be discussed (and that I have in my notebook). At the moment, Latin is using the Now, we see different cases here. While it is more or less obvious that it does not make sense to split princeps (morphologically and lexically) or sicut (functionally), even if we can pursue their etymological iter, a case can be made for animadverto. But Latin is not very satisfying in this sense, as there are no clear, large-scale compounding strategies as in Greek, Sanskrit or German. There, the argument for splitting is much stronger than for animadverto: you have systematic methods and morphological instruments that are regularly applied and can simply be seen as "fusive counterparts" of more analytic strategies applied by other languages. So, if we have e.g. γεωλογία in Greek, Latin would refer to it as scientia terrae, and so on. Morphologically different, but syntactically equivalent. |
Maybe I repeat myself, but I'd like to point out that what we call "nominalisation" or "substantivisation" of an adjective most of the time actually just refers to a possible variation in syntactic behaviour and does not really imply a change at the "lexeme level" (pardon me if my expressions here are not very technical, or vague). So I would even refrain from saying that an adjective is "nominalised", and just note that it can be the head of an NP (whereas a language like English usually requires a dummy element like one), where however some head is implied. Now, I was asking about definitions of regnal names, or epithets, because I think I see a small confusion: some attributes associated with the names of kings & co. become standardised and then can have a sort of life of their own; this probably makes us reanalyse them as independent, thus favouring an annotation as |
I think the tension in these analyses is about what's more important: the The individual relations analysis (modifiers as For languages like English, where compounds are spelled apart and there is little formal difference between a compound modifier and an independent word, I can see how it is more tempting to choose the second option and use multiple types (currently, For languages in which there is a clear difference between compound modifier and independent word modifiers, this is less tempting, because using the normal relation (amod, obj etc.) clashes with the expected form of a word in these roles (e.g. accusative marking on an obj). The way to identify that something is an obj inside such a compound without marking is primarily semantic, not syntactic. It also creates a problem when the modifier has very ad hoc semantics, such as Downing's (1977) "apple juice seat" meaning "seat in Finally if the language has a tradition of spelling compounds together, like German or Greek, then many TBs will simply not tokenize the compound components, and avoid having to make this decision. I'm not sure if "Universal" UD guidelines will be helpful here, since traditions clearly differ across languages, and it's probably not realistic to revise data to be consistent across so many datasets... But I am curious what @dan-zeman and other people who are involved in guidelines across languages think about this tension. |
I looked around the corpus and there are instances where "king" in the epithet gets dropped, e.g. in the edition of this sentence in the edict at Kalsi: iyaṁ dʰaṁma-lipi Devānaṁpiyēna Piyadas[i]nā [lēkʰit]ā (some parts hard to read but this is the consensus reading). That makes me want to treat the titles as nominals with |
It is difficult to say anything sufficiently general and cross-linguistically valid. I might prefer For languages like German, a similar decision may have to be made in the enhanced representation under the emerging proposal from Dagstuhl on optional compound splitting. |
@AdamFarris and I have been annotating some Aśokan Prakrit texts over in the UD_Prakrit-DIPI repo. (These were inscriptions commissioned by the Mauryan king Aśoka in the 3rd century BCE and represent the earliest written stage of Middle Indo-Aryan after Sanskrit.)
One issue that has come up is how to deal with Aśoka's regnal names: Devānaṃ-priyena Priya-dasinā rāña "beloved-of-the-gods looking-with-kindness King" (note that each nominal here is in instrumental case). Sanskrit nominal compounds like this are always headed by the last nominal, so currently we have this (using Compound=Yes for non-declined parts of compounds like priya, like UD_Sanskrit-UFAL does per #539):
UD table
The issue is that a couple of different dependency relations between the various elements are possible here:
- appos (I'm sure the title orders could be flipped here and it's fine as long as rāña "king" is last, so that fits this)
- compound
- flat:name (but these aren't really part of his name, they're special titles that have also been used by other kings, e.g. Devanampiya Tissa of Anuradhapura).
I'm uncertain about which one is best. (I also don't think this issue is Aśokan-specific, hence I put it here.)