Infer alignment IDs for sentences and divs #5

Open
mlj opened this issue Jul 6, 2017 · 2 comments

@mlj
Member

mlj commented Jul 6, 2017

Currently, only objects whose alignment IDs are set explicitly upstream (for whatever reason) have alignment IDs in PROIEL XML. This behaviour is not obvious to end users, who may expect alignment to be indicated on all objects. Since the alignment of sentences and div elements can easily be inferred from token alignments, we should consider adding the missing IDs in post-processing or having upstream fill them in automatically.
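
A rough sketch of what the post-processing variant could look like, in Python. This is not the actual implementation. It assumes that `div`, `sentence` and `token` elements carry `id` and `alignment-id` attributes, that a token's `alignment-id` points at a token in the aligned source, and that a majority vote is an acceptable way to pick the aligned sentence or div. The function name `infer_alignment_ids` is made up for illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def infer_alignment_ids(target_root, aligned_root):
    """Fill in missing alignment-id attributes on sentences and divs.

    Hypothetical helper: explicit alignment IDs are never overwritten.
    """
    # Index the aligned source: token id -> sentence id, sentence id -> div id.
    token_to_sentence = {}
    sentence_to_div = {}
    for div in aligned_root.iter("div"):
        for sentence in div.iter("sentence"):
            sentence_to_div[sentence.get("id")] = div.get("id")
            for token in sentence.iter("token"):
                token_to_sentence[token.get("id")] = sentence.get("id")

    def most_common(ids):
        # Majority vote over the candidates; None when there is no evidence.
        counts = Counter(i for i in ids if i is not None)
        return counts.most_common(1)[0][0] if counts else None

    for div in target_root.iter("div"):
        # Sentences first, so that div inference can use the inferred values.
        for sentence in div.iter("sentence"):
            if sentence.get("alignment-id") is None:
                inferred = most_common(
                    token_to_sentence.get(t.get("alignment-id"))
                    for t in sentence.iter("token")
                )
                if inferred is not None:
                    sentence.set("alignment-id", inferred)
        if div.get("alignment-id") is None:
            inferred = most_common(
                sentence_to_div.get(s.get("alignment-id"))
                for s in div.iter("sentence")
            )
            if inferred is not None:
                div.set("alignment-id", inferred)
    return target_root
```

Sentences or divs with no aligned tokens at all are left without an `alignment-id`, which matches the case of unannotated or unaligned material.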

@mlj mlj added the enhancement label Jul 6, 2017
@mlj mlj assigned mlj and daghaug Jul 6, 2017
@mlj
Member Author

mlj commented Jul 8, 2017

Divs now get inferred alignments wherever they lack explicit ones. This leaves only a few divs without any alignment ID. In latin-nt, some of these are in unannotated parts, but some are in annotated parts. We should look into why this has happened:

    <div id="542">
    <div id="543">
    <div id="545">
    <div id="548">
    <div id="549">
    <div id="551">
    <div id="553">
    <div id="554">
    <div id="555">
    <div id="556">
    <div id="558">
    <div id="559">
    <div id="561">
    <div id="563">
    <div id="564">
    <div id="565">
    <div id="566">
    <div id="568">
    <div id="569">
    <div id="570">
    <div id="571">
    <div id="572">
    <div id="591">

The other NTs are complete, except that the Gothic NT has incipits and explicits, which are of course unaligned. (In Marianus these are unannotated and therefore do not appear in releases.) We should consider merging such divs with the actual text divs to eliminate them:

    <div id="693">
    <div id="710">
    <div id="711">
    <div id="755">
    <div id="770">
    <div id="771">
    <div id="785">
    <div id="786">
    <div id="793">
    <div id="794">
    <div id="801">
    <div id="810">
    <div id="815">
    <div id="816">
    <div id="820">
    <div id="821">
    <div id="828">
    <div id="833">

@mlj
Member Author

mlj commented Apr 8, 2018

This is now included in release 20180408.

Keeping this open until all remaining issues with alignment-id on divs are resolved.
