display of equations and super and subscripts. #41

Open
LucasHorseshoeBend opened this issue May 11, 2017 · 12 comments
Assignees
Labels
Ingestion presentation Relates to how information should be presented in the website

Comments

@LucasHorseshoeBend
Collaborator

LucasHorseshoeBend commented May 11, 2017

See as an example footnote 2 in
http://vmcp.conaltuohy.com/xtf/view?docId=tei/Mentions/1860-9/M64-02-22-draft.xml

What is displayed at the end of the line should be in the form of a fraction, 51 over 9187.
There are other examples. In this case and some others it could probably be written as "51/9187", but that is less representative of the document.

In
http://vmcp.conaltuohy.com/xtf/view?docId=tei/Mueller%20letters/1860-9/1864/64-02-00a-final.xml
we have used an equation format to display a T with a bar over it as a fraction. This is a case where the form "/T" does not represent the chemical symbolism at all. The same line should have the first numeral 3 as a superscript and the second as a subscript.

Any ideas?

@Conal-Tuohy
Owner

I take it the issue with the first example is that you wish to capture the typographical nature of the fraction (i.e. that it is "upright", with the numerator above and the denominator below a horizontal bar).

My feeling is the simplest way to do this would be to encode it as a solidus fraction, but style it to indicate that it was rendered in an upright form. In TEI, this might look like:
<seg rend="upright">51/9187</seg>. So in the Word file, encode the text as 51/9187 and apply a character style of "upright"; I will then change the conversion pipeline to convert it to the above TEI. The display system would also need a tweak in order to render the text correctly.
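As a rough illustration of the conversion step described above, here is a minimal Python sketch that wraps a solidus fraction in the suggested TEI element. The function name `upright_fraction_to_tei` is hypothetical, the TEI namespace is omitted for brevity, and the real pipeline may well do this differently (for example in XSLT).

```python
import xml.etree.ElementTree as ET


def upright_fraction_to_tei(text: str) -> str:
    """Wrap a solidus fraction such as '51/9187' in a TEI seg marked as upright.

    Hypothetical helper: it only illustrates the target markup, not the
    actual conversion pipeline.
    """
    seg = ET.Element("seg", {"rend": "upright"})
    seg.text = text.strip()
    return ET.tostring(seg, encoding="unicode")


print(upright_fraction_to_tei("51/9187"))
# <seg rend="upright">51/9187</seg>
```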

In the second case, probably the most "correct" semantic encoding is to use a "combining macron" character in combination with the T. The combining macron is a character which sticks to the character which it follows, combining to effectively form a single character. Try this: T̄.

In the event of any problems, I would just format the T with the "overbar" (or whatever it's called) character formatting, and I can easily add a stage to the conversion pipeline to replace those with the combining macron character.
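A pipeline stage of the kind mentioned above could be as small as the following Python sketch. It assumes the overbar-formatted letter has already been extracted as plain text; the function name `with_macron` is made up for illustration. Unicode has no precomposed "T with macron", so the result stays as two code points, whereas a letter such as A composes into a single character.

```python
import unicodedata

COMBINING_MACRON = "\u0304"  # U+0304 COMBINING MACRON


def with_macron(base: str) -> str:
    """Append a combining macron to `base` and normalise to NFC.

    Hypothetical helper for a conversion stage; NFC composes the pair into
    a single character only where Unicode defines one.
    """
    return unicodedata.normalize("NFC", base + COMBINING_MACRON)


print(with_macron("T"))       # T followed by U+0304 (two code points)
print(len(with_macron("T")))  # 2
print(len(with_macron("A")))  # 1, composed to U+0100
```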

The subscript and superscript are correctly encoded in the Word file; however, the formatting is not being captured in the TEI conversion. I will need to fix this in the conversion script. I think in fact this is the same bug as #3 and #9 and #34.
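To make the missing step concrete, here is a hedged Python sketch of how superscript and subscript formatting could be carried from WordprocessingML into TEI. It assumes the Word content is available as .docx-style XML, where vertical alignment is recorded in `w:vertAlign`; the choice of `<hi rend="superscript">` and `<hi rend="subscript">` as the TEI target, the function names, and the sample text are all assumptions, not what the conversion script actually does.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def run_to_tei(run: ET.Element) -> str:
    """Convert one w:r run to TEI, keeping superscript/subscript formatting."""
    text = "".join(t.text or "" for t in run.findall("w:t", NS))
    align = run.find("w:rPr/w:vertAlign", NS)
    value = align.get(f"{{{W}}}val") if align is not None else None
    if value in ("superscript", "subscript"):
        hi = ET.Element("hi", {"rend": value})  # assumed TEI encoding
        hi.text = text
        return ET.tostring(hi, encoding="unicode")
    return escape(text)


def paragraph_to_tei(paragraph_xml: str) -> str:
    """Convert the runs that are direct children of a w:p paragraph."""
    paragraph = ET.fromstring(paragraph_xml)
    return "".join(run_to_tei(r) for r in paragraph.findall("w:r", NS))


# Made-up sample paragraph: plain A, superscript 3, plain B, subscript 3.
SAMPLE = (
    f'<w:p xmlns:w="{W}">'
    '<w:r><w:t>A</w:t></w:r>'
    '<w:r><w:rPr><w:vertAlign w:val="superscript"/></w:rPr><w:t>3</w:t></w:r>'
    '<w:r><w:t>B</w:t></w:r>'
    '<w:r><w:rPr><w:vertAlign w:val="subscript"/></w:rPr><w:t>3</w:t></w:r>'
    '</w:p>'
)

print(paragraph_to_tei(SAMPLE))
# A<hi rend="superscript">3</hi>B<hi rend="subscript">3</hi>
```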

@LucasHorseshoeBend
Collaborator Author

Thanks. Your interpretation is correct. These are cases where the typography is important for the logic.
I will try your suggestions. I hope to do so before the next run at 18:00, inserting your combining macron character and encoding the fraction.
We will wait and see how you get on with coding the subscripts and superscripts.

@LucasHorseshoeBend
Collaborator Author

One step forward and one step back.
The T now shows as T̄ in http://vmcp.conaltuohy.com/xtf/view?docId=tei/1860-9/1864/64-02-00a-proofed.xml (while we are playing with this I have set the file name back to proofed, so the link is how it will appear after the midnight update). But with the version of Word I am using, it doesn't display correctly in that format. That may not be a long-term problem, depending on what is finally decided for downloads as PDFs.

So that we can easily see what happens, I have for the moment placed side by side in the text both the workaround, which displays OK in my Word, and the combining macron T, which doesn't. I will try your alternative suggestion of creating a character style toward the end of next week, after we get back from some time in London archives.

@Conal-Tuohy
Owner

Looking at the issue of T with an overbar or macron again, I can't see where we are up to with this. I can't actually see the character used in the file. http://vmcp.conaltuohy.com/xtf/view?docId=tei/Mueller%20letters/1860-9/1864/64-02-00a-final.xml;chunk.id=main;toc.depth=1;toc.id=;brand=default

Perhaps it would be simpler to work with a sample document, in the "quarantine" folder?

If I understand it correctly, it was possible to insert a T with either a combining macron or a combining overline, i.e. T̄ or T̅, into the Word file. Although it didn't display correctly in Word, it did end up OK on the website; is that correct? Is it still the case that it doesn't display in your current version of Word?
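Because the two candidates look almost identical on screen, it can help to check which code points a file actually contains. A minimal Python sketch follows; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(text: str) -> list:
    """List each code point in `text` with its official Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(describe("T\u0304"))
# ['U+0054 LATIN CAPITAL LETTER T', 'U+0304 COMBINING MACRON']
print(describe("T\u0305"))
# ['U+0054 LATIN CAPITAL LETTER T', 'U+0305 COMBINING OVERLINE']
```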

@LucasHorseshoeBend
Collaborator Author

LucasHorseshoeBend commented Feb 19, 2021 via email

@LucasHorseshoeBend
Collaborator Author

I forgot on Friday to respond about the other sort of equations, such as that displayed as an asterisk in fn 3 of
http://vmcp.conaltuohy.com/xtf/view?docId=tei/Mueller letters/Mentions/Selected Mentions letters/M64-02-22-final.xml.
The screen shot attached shows that it is an embedded object in Word.
Can you identify as a facet the files where an embedded object exists (it will also pick up files with images)? I think that we can most easily solve the problem by writing the file reference number, which is what this is in this case, either as "51 over 9187" or as "51/9187". I will discuss with Rod. Using an image is really overkill for a faithful rendition when the information can be conveyed in another way.
Screenshot 2021-02-21 at 11.17.pdf

@Conal-Tuohy
Owner

Conal-Tuohy commented Feb 25, 2021

I would avoid using images for fractions; what I'd suggest for upright fractions would be to encode the numerator and denominator as text, with a solidus separator, e.g. 51/9187 and then select the entire fraction and format it with a character style called upright. The pipeline can then recognise the upright style, and convert the fraction into equivalent TEI markup, and finally we can display it in the HTML in the desired form (i.e. as an actual upright fraction). If you could create a document in the "Quarantine and problematic" folder with such a fraction, and let me know, I can do the rest. It will be easy.
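The final display step could look something like the Python sketch below, which turns the TEI element into an upright fraction for HTML. Using MathML `mfrac` is only one possible rendering and is an assumption here (the display system might equally use CSS); the function name `render_upright` is hypothetical, and anything that is not a simple digits/digits fraction is passed through as plain text.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def render_upright(seg_xml: str) -> str:
    """Render <seg rend="upright">N/D</seg> as a MathML fraction.

    Falls back to the escaped text when the element is not marked upright
    or does not contain exactly one digits/digits fraction.
    """
    seg = ET.fromstring(seg_xml)
    text = (seg.text or "").strip()
    parts = [part.strip() for part in text.split("/")]
    is_fraction = len(parts) == 2 and all(part.isdigit() for part in parts)
    if seg.get("rend") != "upright" or not is_fraction:
        return escape(text)
    numerator, denominator = parts
    return f"<math><mfrac><mn>{numerator}</mn><mn>{denominator}</mn></mfrac></math>"


print(render_upright('<seg rend="upright">51/9187</seg>'))
# <math><mfrac><mn>51</mn><mn>9187</mn></mfrac></math>
```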

@Conal-Tuohy added the labels Ingestion and presentation (relates to how information should be presented in the website) on Feb 25, 2021

@LucasHorseshoeBend
Collaborator Author

I will try that with the example file.
There will be the problem of identifying the files concerned.
Can you select as a facet those files that have the unresolved "objects"? This will, at least for now, include those with the drawings, but I have a list of those, so I could identify the other problem files by elimination.
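Outside the website's facets, the same elimination could be sketched in Python. This assumes the source files are available as .docx packages, in which embedded OLE objects are stored as parts under word/embeddings/; legacy .doc files would need a different approach. The function names and the shape of the inputs are hypothetical.

```python
import zipfile


def has_embedded_object(source) -> bool:
    """True if the .docx package has any part under word/embeddings/.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        return any(name.startswith("word/embeddings/") for name in package.namelist())


def unresolved_object_files(documents: dict, known_drawing_files: set) -> list:
    """Names of files with embedded objects that are not on the drawings list.

    `documents` maps a file name to a path or file-like object;
    `known_drawing_files` is the existing list of files containing drawings.
    """
    flagged = {name for name, source in documents.items() if has_embedded_object(source)}
    return sorted(flagged - set(known_drawing_files))
```

Calling `unresolved_object_files` with the full set of documents and the existing list of drawing files would leave only the files whose embedded objects still need attention.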

@LucasHorseshoeBend
Collaborator Author

I have placed a test file in the quarantine folder: 21-10-25.doc, with the correspondent line "Test file for upright fractions".

@LucasHorseshoeBend
Collaborator Author

I have discussed the "upright" issue with Rod. He thinks that there are likely to be inconsistencies in the way these file registry annotations were transcribed, with a large number of them of the form 51/9187.

So it will be better to leave them like that, as it will be impossible to distinguish such cases without going back to the holding archive.

So all we need is to be able to identify the files with embedded objects! Can it be done by creating a facet?

@LucasHorseshoeBend
Collaborator Author

I have found a satisfactory symbol to represent the T with the overbar.
The remaining issue is the representation of superscripts and subscripts; see the test file in the quarantine folder:
http://vmcp.conaltuohy.com/xtf/view?docId=tei/Mueller letters/Quarantine folder for problem files/Test File re sub and superscripts.xml
I have created a new issue #50 for the discovery of embedded objects to separate out the distinct issues.

@LucasHorseshoeBend
Collaborator Author

I will need to check this in the XProc version, but I think we have handled this editorially.


2 participants