Two or more capitals #214

Open
KariRudjord opened this issue Jan 15, 2019 · 19 comments

@KariRudjord

Can we implement this new OUP rule for sequences of capital letters (not acronyms)?

When two or more words in capitals appear in text (in headlines, the first three words of a chapter, etc.), use 3 x dot 6 (6-6-6) before (Oh, scary!) and 1 x dot 6 (6) after. This shall span the spaces, so that there is only one (6-6-6) in front of the capitalized words and one (6) after the last word in capitals.

[image]

(Today there are 2 x dot 6 (6-6) before capitals, and the (6-6) is repeated after every space.)
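
As a rough illustration of the rule described above, here is a minimal Python sketch. It only annotates print text with the capital signs and does not translate letters to braille. The function names `is_caps_word` and `mark_capitals` are hypothetical and not part of liblouis, and punctuation is simply left attached to its word, so the position of the closing sign relative to punctuation is not addressed here.

```python
DOT6 = "\u2820"  # braille pattern dots-6


def is_caps_word(word: str) -> bool:
    """True if the word contains letters and all of them are upper case."""
    letters = [c for c in word if c.isalpha()]
    return bool(letters) and all(c.isupper() for c in letters)


def mark_capitals(text: str) -> str:
    """Annotate print text with capital signs; letters stay as print text."""
    words = text.split(" ")
    out = []
    i = 0
    while i < len(words):
        if not is_caps_word(words[i]):
            word = words[i]
            # Ordinary word: one dot 6 if it starts with a capital letter.
            out.append(DOT6 + word.lower() if word[:1].isupper() else word)
            i += 1
            continue
        # Collect the run of consecutive all-caps words starting at i.
        j = i
        while j < len(words) and is_caps_word(words[j]):
            j += 1
        run = " ".join(w.lower() for w in words[i:j])
        if j - i >= 2:
            # Proposed rule: one 6-6-6 before the run, one 6 after it.
            out.append(DOT6 * 3 + run + DOT6)
        else:
            # A lone word in capitals: 6-6, or a single 6 for one letter.
            n_letters = sum(c.isalpha() for c in run)
            out.append(DOT6 * (2 if n_letters > 1 else 1) + run)
        i = j
    return " ".join(out)


print(mark_capitals("This IS A HEADLINE with several words"))
```

Mixed-case words such as names with an inner capital are not handled; the sketch only distinguishes all-caps words from words with an initial capital.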

@josteinaj
Member

@bertfrees do you think this is feasible in liblouis or should we just do it in our block-translate.xsl?

@bertfrees
Collaborator

Should be feasible yes. We can start with some Liblouis YAML tests in any case. Can one of you do that?

@josteinaj
Member

Great. We'll create some tests.

@KariRudjord: so this is only for headlines but not in normal paragraphs? Is the following correct?

Headlines


Headline: THIS IS A HEADLINE WITH SEVERAL WORDS

Braille: ⠠⠠⠠⠞⠓⠊⠎⠀⠊⠎⠀⠁⠀⠓⠑⠁⠙⠇⠊⠝⠑⠀⠺⠊⠞⠓⠀⠎⠑⠧⠑⠗⠁⠇⠀⠺⠕⠗⠙⠎⠠


Headline: This IS A HEADLINE with several words

Braille: ⠠⠞⠓⠊⠎⠀⠠⠠⠠⠊⠎⠀⠁⠀⠓⠑⠁⠙⠇⠊⠝⠑⠠⠀⠺⠊⠞⠓⠀⠎⠑⠧⠑⠗⠁⠇⠀⠺⠕⠗⠙⠎


Headline: This is a headline with several words

Braille: ⠠⠞⠓⠊⠎⠀⠊⠎⠀⠁⠀⠓⠑⠁⠙⠇⠊⠝⠑⠀⠺⠊⠞⠓⠀⠎⠑⠧⠑⠗⠁⠇⠀⠺⠕⠗⠙⠎



Paragraphs


Paragraph: THIS IS A PARAGRAPH WITH SEVERAL WORDS

Braille: ⠠⠠⠞⠓⠊⠎⠀⠠⠠⠊⠎⠀⠠⠁⠀⠠⠠⠏⠁⠗⠁⠛⠗⠁⠏⠓⠀⠠⠠⠺⠊⠞⠓⠀⠠⠠⠎⠑⠧⠑⠗⠁⠇⠀⠠⠠⠺⠕⠗⠙⠎


Paragraph: This IS A PARAGRAPH with several words

Braille: ⠠⠞⠓⠊⠎⠀⠠⠠⠊⠎⠀⠠⠁⠀⠠⠠⠏⠁⠗⠁⠛⠗⠁⠏⠓⠀⠺⠊⠞⠓⠀⠎⠑⠧⠑⠗⠁⠇⠀⠺⠕⠗⠙⠎


Paragraph: This is a paragraph with several words

Braille: ⠠⠞⠓⠊⠎⠀⠊⠎⠀⠁⠀⠏⠁⠗⠁⠛⠗⠁⠏⠓⠀⠺⠊⠞⠓⠀⠎⠑⠧⠑⠗⠁⠇⠀⠺⠕⠗⠙⠎


@KariRudjord
Author

It is in all kinds of text, also paragraphs. And it would save me a lot of meaningless work :-)

@josteinaj
Member

How about punctuation, numbers and other characters? Do they affect this? Can there be multiple sentences between ⠠⠠⠠ and ⠠?

@KariRudjord
Author

A whole paragraph or a group of paragraphs can appear in capital letters. Then the group of paragraphs is surrounded by only one 6-6-6 before and one 6 after (the one after comes in addition to any ordinary punctuation). This should not be stopped by a number or other sign inside the text.
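
To make the "not stopped by a number or other sign" condition concrete, here is a small Python sketch of run detection in which tokens without any letters are neutral: they neither start nor end a run of capitals. The helper name `find_caps_runs` and the sample inputs are hypothetical. As an assumption of this sketch, each run ends at its last all-caps word, so trailing numbers are left outside the run.

```python
from typing import List, Tuple


def is_caps(token: str) -> bool:
    """True if the token contains letters and all of them are upper case."""
    letters = [c for c in token if c.isalpha()]
    return bool(letters) and all(c.isupper() for c in letters)


def find_caps_runs(tokens: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, with two or more caps words."""
    runs = []
    start = last = None  # first and last all-caps word of the current run
    count = 0            # number of all-caps words in the current run
    for i, tok in enumerate(tokens):
        if is_caps(tok):
            if start is None:
                start = i
            last = i
            count += 1
        elif any(c.isalpha() for c in tok):
            # A lower-case or mixed-case word ends the current run.
            if count >= 2:
                runs.append((start, last + 1))
            start = last = None
            count = 0
        # Tokens without letters (numbers, signs) leave the run untouched.
    if count >= 2:
        runs.append((start, last + 1))
    return runs


print(find_caps_runs("Please READ PAGE 15 NOW thanks".split()))
```

In the sample, the number 15 sits inside the run, so the three capital words form a single span rather than two.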

@josteinaj
Member

Ok. I'll update the tests in liblouis. Do these look right?

  • Input: Jeg bare MÅ snakke med deg!
  • Output: ⠠⠚⠑⠛ ⠃⠁⠗⠑ ⠠⠠⠍⠡ ⠎⠝⠁⠅⠅⠑ ⠍⠑⠙ ⠙⠑⠛⠖

  • Input: DEN BESTE BURSDAGEN jeg har hatt var da jeg fylte åtte.
  • Output: ⠠⠠⠠⠙⠑⠝ ⠃⠑⠎⠞⠑ ⠃⠥⠗⠎⠙⠁⠛⠑⠝⠠ ⠚⠑⠛ ⠓⠁⠗ ⠓⠁⠞⠞ ⠧⠁⠗ ⠙⠁ ⠚⠑⠛ ⠋⠽⠇⠞⠑ ⠡⠞⠞⠑⠄

  • Input: EN ROSE MED TORNER
  • Output: ⠠⠠⠠⠑⠝ ⠗⠕⠎⠑ ⠍⠑⠙ ⠞⠕⠗⠝⠑⠗⠠

  • Input: EN ROSE! MED 15 TORNER.
  • Output: ⠠⠠⠠⠑⠝⠀⠗⠕⠎⠑⠖⠀⠍⠑⠙⠀⠼⠁⠑⠀⠞⠕⠗⠝⠑⠗⠄

  • Input: SAK 27/07 SØKNAD OM ØKONOMISK STØTTE FOR 2007
  • Output: ⠠⠠⠠⠎⠁⠅ ⠼⠃⠛⠌⠼⠚⠛ ⠎⠪⠅⠝⠁⠙ ⠕⠍ ⠪⠅⠕⠝⠕⠍⠊⠎⠅ ⠎⠞⠪⠞⠞⠑ ⠋⠕⠗ ⠼⠃⠚⠚⠛⠠

@josteinaj
Member

@bertfrees: liblouis/liblouis#687

@KariRudjord
Author

EN ROSE! MED 15 TORNER.
It should be 2 x dot 6 at the end of the sentence. The first to close the capital letter thing, the other a punctuation.

If it was: EN ROSE! MED 15 TORNER? it should be the same at the end: first a dot 6 to close the capital letter thing, then a 26 (question mark)
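
The ordering described here (closing sign first, then the punctuation mark) can be sketched in Python as follows. The function name `close_caps_run` and the set of punctuation characters are assumptions made for illustration. The punctuation is kept as print characters, so the sketch takes no position on which braille cell represents it.

```python
DOT6 = "\u2820"  # braille pattern dots-6


def close_caps_run(last_word: str) -> str:
    """Put the closing dot 6 between the last word and its punctuation."""
    core = last_word.rstrip(".,;:!?")  # the word without trailing punctuation
    tail = last_word[len(core):]       # the punctuation that was stripped
    return core.lower() + DOT6 + tail


print(close_caps_run("TORNER?"))
```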

@KariRudjord
Author

Didn't mean to close ...

@josteinaj
Member

Ok, so:

  • Input: EN ROSE! MED 15 TORNER.
  • Output: ⠠⠠⠠⠑⠝⠀⠗⠕⠎⠑⠖⠀⠍⠑⠙⠀⠼⠁⠑⠀⠞⠕⠗⠝⠑⠗⠠⠠

  • Input: EN ROSE! MED 15 TORNER?
  • Output: ⠠⠠⠠⠑⠝⠀⠗⠕⠎⠑⠖⠀⠍⠑⠙⠀⠼⠁⠑⠀⠞⠕⠗⠝⠑⠗⠠⠢

Right?

@josteinaj
Member

@KariRudjord also, how about this case:

SAK 27/07 SØKNAD OM ØKONOMISK STØTTE FOR 2007

Should the dot 6 be after "2007" or after "FOR"?

@josteinaj
Member

@KariRudjord also: has OUP documented this rule anywhere?

@KariRudjord
Author

ROSE = correct
SAK = dot after 2007 (in en of sentence)

@KariRudjord
Author

To eller flere ord med store bokstaver-pf.pdf

@josteinaj
Member

@bertfrees liblouis/braille-specs#7

@bertfrees
Collaborator

EN ROSE! MED 15 TORNER.
It should be 2 x dot 6 at the end of the sentence. The first to close the capital letter thing, the other a punctuation.

If it was: EN ROSE! MED 15 TORNER? it should be the same at the end: first a dot 6 to close the capital letter thing, then a 26 (question mark)

Shouldn't it be dot 3 for the punctuation??

SAK = dot after 2007 (in en of sentence)

What do you mean with "in en(d) of sentence"?

I can't find more details about these cases in the PDF. I wonder if much thought has been put into this? I find it a bit illogical that the number (2007) at the end of the capital block is included (closing mark comes after it), but the punctuation (. or ?) is not included (closing mark comes before it) even though it is more "connected" to the capital block because there is no space between it and the last word in capitals (TORNER).

@josteinaj added the "M Medium size job" and "Punktskrift Braille" labels on Jan 7, 2020
@josteinaj
Member

Fixed in liblouis repo on August 25th: liblouis/liblouis@fe5ce6f. Will presumably be included in the next release of liblouis, and when NLB has moved from NLB-PIP to DAISY-PIP, we'll be able to use this change.

@josteinaj
Member

Shouldn't it be dot 3 for the punctuation??

Yes you're probably right. Will add a fix to the PR branch here: liblouis/liblouis#687

I see here that @KariRudjord points out that the "end of capital letters string" marker should be after the year in the string "SAK 27/07 SØKNAD OM ØKONOMISK STØTTE FOR 2007". I don't know how important this is, and maybe we'll find more corner cases where this shouldn't apply if we try fixing it in liblouis. So I'd say that the current behavior with the marker after "FOR" is fine for now, and then we can do some testing to see how it works.
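
The two placements under discussion can be compared with a short Python sketch. It annotates print text only, and the function name `mark_run` and its `include_trailing` flag are hypothetical, not liblouis options. It assumes the input line consists entirely of words in capitals plus numbers, and that at least one token contains a letter.

```python
from typing import List

DOT6 = "\u2820"  # braille pattern dots-6


def mark_run(tokens: List[str], include_trailing: bool) -> str:
    """Mark one all-caps line, closing after the last word or the last token."""
    # Index of the last token that contains a letter ("FOR" in the sample).
    last_word = max(i for i, t in enumerate(tokens)
                    if any(c.isalpha() for c in t))
    # Close after trailing numbers as well, or right after the last word.
    end = len(tokens) - 1 if include_trailing else last_word
    out = [t.lower() for t in tokens]
    out[0] = DOT6 * 3 + out[0]  # opening 6-6-6
    out[end] = out[end] + DOT6  # closing 6
    return " ".join(out)


line = "SAK 27/07 SØKNAD OM ØKONOMISK STØTTE FOR 2007".split()
print(mark_run(line, include_trailing=False))  # marker after "FOR"
print(mark_run(line, include_trailing=True))   # marker after "2007"
```

With `include_trailing=False` the closing sign lands after "FOR", which matches the behavior described as current; with `include_trailing=True` it lands after "2007", as requested earlier in the thread.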
