Frontend for mupdf / kopt update #6738
bd39922 to 46a7686
Mhhh, many comments upcoming :)
A) Dunno how NimbusSans looks, and how it will change our feeling of the UI. Can you provide a few screenshots (menu with checkmarks and default/fallback symbols, wikipedia lookup result...)?
B) We have thought quite a bit about the order of our fallback fonts, and it might matter in various contexts (symbols used in the UI, chapter leading symbol in Wikipedia lookup results...).
C) We might need to check how many things look in the UI, like checkmarks in menus: a small shift in the font's line height or baseline might screw up the good look we managed to have.
D1) I'm not fond of depending on MuPDF for the choice and update of fonts. I'd rather depend on @NiLuJe :) We might have tweaked some font files in the past - we would now not be able to do so as easily.
F1) We have a small set of fallback fonts that was good enough. FreeSerif has a lot of coverage and was good enough as the last one.
G1) crengine would be fed all these fonts, which would be given back and listed in the Fonts> menu - which would be too crowded. We prevent that on Android, which has many such fonts (we have a toggle to use or not use them).
koreader/frontend/document/credocument.lua Lines 33 to 56 in 6b41880
So, maybe we could ship the fonts provided by MuPDF (@NiLuJe might know more if it's a good idea or not), but for how/what we use in frontend, it might be safer to just keep using what we have :)
Screens: https://imgur.com/a/LFbLuoX

Overall, what little aesthetic difference there is with nimbus/charis I'd describe as "edgy 2000s, windows/android 10 era", whereas noto/freefont is more like "round 90s/android 6 era".

I'm aware of the ordering and of the issue of fonts overriding symbols (in fact most of the language-specific fallback noto fonts do this, that's why they're always last and why there's a need for a high/low prio order). What's done here is technically not mandatory for the UI. I left it in simply because it seems to work out of the box, but it's hardly an issue if people would like to keep noto sans+serif in bold/italic as well (they can be kept in just for ui & cre, but mupdf won't use any of it unless specifically asked for as a family in an epub).

What I'm more worried about aesthetically is consistency. I'd prefer to have a single set of serif+sans fallback fonts (oldschool noto+freeserif, or newer nimbus+charis), instead of mixing both unless there's a good reason to.

As for freesans, I'm somewhat tempted to coerce mupdf into it somehow, but that can soon go full circle into having a 2000+ line patch against mupdf once more (no, naive symlinking won't do - some glyphs are expected to be mapped during specific fallbacks!). It's either this nightmare again, or 10MB vs 2MB uncompressed, so I'm opting to just keep both.
I was somewhat confused by it. Of course we can add b/i variants for devanagari, hebrew and ... the 30 most commonly used scripts. It just struck me as odd why arabic in particular: is there a readability problem with what the line-pitched autobold renderer does there (apart from being ugly, but so it is for all scripts)?
@ezdiy A kind reminder that FreeSerif supports more locales/regional variations of certain alphabets. Case in point: Noto Sans does not support what is colloquially known as Bulgarian Cyrillic, while FreeSerif does. For those interested, other varieties are Serbian and Macedonian, while standard Cyrillic follows, if memory serves well, the Russian reforms from a long time ago, which became mainstream. @poire-z On a separate note, do we have any way to check supported glyphs versus languages in an automated manner? I'm only thinking of the UI stuff from translations.
@roshavagarga Freeserif is still a second choice right after latin based scripts. Noto should be relegated to a scope of far more exotic things. In mupdf it will select charis sil though (also serif).
Nope, we don't (but well, might as well leave that to users of these languages to report it :)
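For anyone who wants to try the automated check asked about above, here is a minimal sketch of the idea. It is not something KOReader ships. The font coverage sets and translation strings are hypothetical stand-ins; in practice the covered codepoints would come from each font's cmap table. Note that a codepoint check like this cannot detect locale-specific glyph forms such as the Bulgarian Cyrillic shapes mentioned above, since those share codepoints with standard Cyrillic.

```python
def missing_chars(text, covered):
    """Return the sorted characters of `text` whose codepoint is not covered."""
    return sorted({ch for ch in text
                   if not ch.isspace() and ord(ch) not in covered})


def coverage_report(translations, fonts):
    """Map each locale to the characters that none of `fonts` can render.

    translations: {locale: [ui strings]}
    fonts: {font name: set of covered codepoints}
    """
    covered = set()
    for codepoints in fonts.values():
        covered |= codepoints
    report = {}
    for locale, strings in translations.items():
        missing = missing_chars("".join(strings), covered)
        if missing:
            report[locale] = missing
    return report


# Hypothetical data: one UI font covering ASCII and the Cyrillic block.
FONTS = {"ui-sans": set(range(0x20, 0x7F)) | set(range(0x0400, 0x0500))}
TRANSLATIONS = {
    "en": ["Settings", "Fonts"],
    "bg": ["Настройки"],
    "he": ["הגדרות"],
}

report = coverage_report(TRANSLATIONS, FONTS)
for locale, chars in report.items():
    print(locale, "is missing", "".join(chars))
```

Only locales with at least one uncovered character show up in the report, so an empty result means the fallback list covers every translated string at the codepoint level.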
Somehow, the bold looks like the synthesized bold we get with xtext/harfbuzz (advances of the regular glyphs kept, just the shapes are emboldened, so the space between glyphs is a bit constrained - but personally, I like it with our Noto Sans). Lines 30 to 32 in 6b41880
Well, I can't really get any impression from your screenshots :) Could you package a zip file with the new fonts, the updated font.lua (and anything else needed) that we could just unzip over a nightly on our devices, to quickly get a feel of how it would look - so if it looks really nice, you'll gain more followers :)
Well, as a first step, I'd rather not touch UI and cre. And I don't care much about MuPDF, but having it behave more like upstream MuPDF is a target goal I guess (its currently behaving differently is not intended, just a side effect of shipping fewer fonts).
I don't know anything about nimbus/charis. Are they carefully and regularly updated, like Google may care about Noto?
It's just that we had more feedback/tests from Arabic users when we added support for RTL one year ago. As it was quite some work and we got nice results, we might as well add the necessary fonts (SansUI for the UI, Naskh variants for CRE cause they prefer it for text), so it's perfect (and so the Bold, so it doesn't even have to be "a little ugly" on this side).
The list of our fonts is just best left hardcoded. Technically we have like what, a mere 10 faces? - charis serif, nimbus roman, nimbus sans, noto sans, noto serif, freesans, freeserif, han serif, droid sans, droid mono "serif". Listing files in that menu is of course barbaric, it must be grouped by typeface prefix.
Ah, that makes sense. Guess I'll just try to franken the fonts together and roll back ui to original and see how much bloat ensues. From what I can see it won't be all that bad, just a few MB more compared to my original cutthroat approach of just ensuring cmap coverage with little to no regard for the font itself.
We don't list files. We give files to crengine, which sorts/categorizes them to put r/b/i/bi in font family slots, and we get back font family names from it, which we put in the menu (cf #4174 (comment), #6107 (comment))
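To illustrate the flow described here (files in, family names out), this is a rough Python sketch and not crengine's actual code. The face tuples, file names and helper names are made up for the example; a real implementation would read family and style from each font file.

```python
from collections import defaultdict

# Slot name for each (bold, italic) combination.
SLOTS = {
    (False, False): "r",
    (True, False): "b",
    (False, True): "i",
    (True, True): "bi",
}


def group_faces(faces):
    """Group faces into per-family r/b/i/bi slots.

    faces: iterable of (path, family, bold, italic).
    The first file seen for a given slot wins.
    """
    families = defaultdict(dict)
    for path, family, bold, italic in faces:
        families[family].setdefault(SLOTS[(bold, italic)], path)
    return dict(families)


def menu_entries(families):
    """The menu lists family names only, never individual files."""
    return sorted(families)


# Hypothetical input: five files that collapse into two menu entries.
FACES = [
    ("NotoSans-Regular.ttf", "Noto Sans", False, False),
    ("NotoSans-Bold.ttf", "Noto Sans", True, False),
    ("NotoSans-Italic.ttf", "Noto Sans", False, True),
    ("NotoSans-BoldItalic.ttf", "Noto Sans", True, True),
    ("FreeSerif.ttf", "FreeSerif", False, False),
]

families = group_faces(FACES)
print(menu_entries(families))
```

This also shows why feeding many more font files to the engine crowds the menu: every distinct family name becomes its own entry, regardless of how few faces it contributes.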
OK, safer first step :) But I would still like to see how a Nimbused KOReader would look in real life:
Btw, does that mean that MuPDF, which if I remember right was making .o files from font files to have them statically linked, does not do that anymore and just ships its font files as real untouched font files?
Will do with the franken attempt + put some toggle in there to flip between cut/uncut ordering.
Soooo ... the whole story with mupdf caring about fonts starts kinda falling apart. I was wondering why its fonts in the ui look kinda smudged; turns out they're distributing unhinted fonts. Which kinda makes sense, it's a fallback after all. So it might turn out that koreader will carry all the noto fonts it already has authoritatively - overriding a select few mupdf overlaps with higher quality hinted ones.
Yeah, that's a hard veto on touching the UI on my end ;). The symbols font also happens to be very specifically tailored for Noto Sans, and hand-tweaked in a couple places. And a softer veto on touching CRe, as, as was mentioned, that would bloat the font list, and mess with user's font choices (e.g., Charis is fairly popular, and often hand-tweaked). I also hate Nimbus' look with a fiery passion, which doesn't help ;).
Ye, the affinity or fiery hate for Helvetica is a very religious issue in typesetting ^_^
Yea, figured as much in the end. Mupdf can be made mostly happy with a franken mongrel of noto from koreader, though I'll try to keep em all as unhinted opentype and see how horrible it looks (faster to load, 6MB->3MB, and on high DPI e-ink I can't tell the difference; on LCD it can be seen). The big guy in town here is CJK though. Koreader uses ancient Noto Sans CJK SC. This is good for menus, but not all that helpful for books. Whereas mupdf uses modern Source Han Serif, which is great for documents, but awkward for UI. On one hand, it's nice to have both CJK sans and serif now, with mupdf running with fonts it expects. On the other, this results in maybe +12MB of bloat in the release zip. Whether this is a reasonable default ultimately depends on how many cn/kr/jp users we have (I have no idea). The changes to font.lua are now a nop for the majority of scripts, except the oddballs that only noto knows. Only in those cases is it the noto fonts that mupdf brings.
I generally don't have an issue with unhinted fonts (I'm running unhinted CRe settings on anything > 250DPI), but it might be more noticeable for UI stuff? (I mean, it certainly was on those low-dpi Nimbus screenshots ;p). The old Noto CJK is on purpose, newer ones have positioning issues with some punctuation marks, IIRC. IIRC, the last time we asked, Serif for CJK made zero practical sense, so we dropped the idea entirely.
The current set of fonts is very much designed as a compromise of decent coverage without too much bloat, mainly aimed at UI and fallback. Actual Indic/Arabic/CJK users are expected to use their own preferred fonts for reading ;).
Meaning I'd keep at the very least Noto Sans, and the existing UI variants hinted if they already are. FreeSerif/FreeSans is already CFF, I kept the old file extension to avoid duplicate files after an update and because I couldn't be arsed to update the patch ;p.
@NiLuJe Some are hinted (like the arabic ones), some aren't, some were even outright poorly compressed with unshared glyphs (I just looked at fc-cache output, not the actual glyphs, so not really sure). The Noto CJK font should work with hopefully only light messing with mupdf to look elsewhere. The point is to not make this a daily occurrence and end up with that insane patch it was before.
I do indeed use my own font for Hebrew: SBL Hebrew.
I'd possibly blacklist the bundled Charis from CRe, but this looks much better, yeah :).
@NiLuJe I'm still not really a fan of gutting full language support to basic 8 scripts or so, and then slapping SC-cjk next to it that's 10x larger than all other languages combined, yet isn't actually the OTC CJK mupdf is expecting. As it stands, "weird" scripts (such as hangul) pdfs will open cause we fool mupdf by giving it bogus fonts, but will display wrong. But that's for another day and religious discussion - since the route is "screw what mupdf thinks is best" the defaults should be at least reasonable :>
IIRC, we somewhat recently considered the OTC, and decided against it, I don't really remember why? Quite possibly as unnecessary? (ping @poire-z).
I'm not really familiar with all this terminology.
Nope. We use the main font from the aliases (cfont, tfont...), and the ones from the hardcoded fallback font list.
The thing is that both crengine and our UI (via xtext) don't know much about languages and fonts, and don't decide to use an appropriate font per language. Just so you know: So, we expect fonts to be fine and not too dissimilar (regarding line height and baseline), and if they have opentype features, HarfBuzz may use them - if not, no worry, it's just a simple font. If there are buggy opentype features, we'll see the bugs. Having a collection of bugless fonts but with buggy opentype features won't do us any good :)
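As a rough picture of what "main font plus hardcoded fallback list, with no per-language logic" means in practice, here is a small Python sketch. It assumes fallback is decided character by character in list order; the font names and coverage sets are hypothetical, and the real xtext/crengine code is considerably more involved (shaping, clusters, and so on).

```python
def pick_font(ch, main_font, fallbacks, coverage):
    """Return the first font in [main, *fallbacks] that covers `ch`, else None."""
    for name in [main_font, *fallbacks]:
        if ord(ch) in coverage.get(name, ()):
            return name
    return None


def split_runs(text, main_font, fallbacks, coverage):
    """Split `text` into (font, substring) runs using list-order fallback."""
    runs = []
    for ch in text:
        font = pick_font(ch, main_font, fallbacks, coverage)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


# Hypothetical coverage: the main font has ASCII, one fallback has Arabic.
COVERAGE = {
    "main": set(range(0x20, 0x7F)),
    "fallback-arabic": set(range(0x0600, 0x0700)),
}

runs = split_runs("ab\u0633\u0644c", "main", ["fallback-arabic"], COVERAGE)
print(runs)
```

Because the order of the fallback list alone decides which font wins, a font early in the list that happens to contain a symbol will shadow every later font for that symbol, which is why the ordering matters so much.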
This is odd. Family name should be same for "Noto Sans" and "Noto Sans Arabic" - both are just "Noto Sans", but each font announcing different languages. Edit: My bad, I worked with typesetting only in abstract, but don't know much about the harsh reality out there when dealing with freetype/harfbuzz directly. So it turns out fonts typically specify their face mutation as family - which is really just face. So in practice the selection algorithm peels from condensed -> bold -> italics -> family (noto sans arabic) -> superfamily (noto sans) -> superfamily (sans). This indicates that "real families" are just hardcoded somewhere, and "font_family" field in the font is just matched to that and categorized. Note that font selection is hot mess everywhere, and usually works something like this:
Just yolo selecting fonts on a single line by glyph is normal, though the heuristics tend to be more involved. First, it's to deal with stuff like some silly font putting in fallback glyphs (rectangle) and such. But more importantly, it plays mayhem with glyph positioning the more you distance yourself in terms of family, which is why the algorithm tries real hard to stay within one. The algorithm also gets supremely confused when you give it "amalgam" fonts, but it depends how it implements face selection internally - some can discern the internal cmap tables of amalgams, but most don't.
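The peeling order mentioned earlier (condensed -> bold -> italics -> family -> superfamily -> generic) can be written down as a short Python sketch. This only illustrates that ordering; it is not the selection code of any particular library, and the function name and tuple layout are invented for the example.

```python
def peel_chain(family, superfamily, generic,
               condensed=False, bold=False, italic=False):
    """List lookup candidates from most to least specific.

    Style attributes are dropped one at a time (condensed, then bold,
    then italic) before widening to the superfamily and the generic class.
    Each candidate is (family name, style dict).
    """
    style = {"condensed": condensed, "bold": bold, "italic": italic}
    chain = [(family, dict(style))]
    for attr in ("condensed", "bold", "italic"):
        if style[attr]:
            style[attr] = False
            chain.append((family, dict(style)))
    for wider in (superfamily, generic):
        chain.append((wider, dict(style)))
    return chain


chain = peel_chain("Noto Sans Arabic", "Noto Sans", "sans",
                   bold=True, italic=True)
for name, style in chain:
    print(name, style)
```

A selector would walk this list and stop at the first candidate that both exists and covers the glyph, which is how it manages to stay within one family for as long as possible.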
(Not sure which one you mean by "The algorithm..." here and there - ours, or any other good one :) Anyway, FreeType is quite limited in the info it gives us (we can't know if a font is Sans or Serif, Noto Sans and Noto Sans Arabic are different families - and we would have to do some heuristics on the font family name strings to classify them further) - and we don't want to bring fontconfig to the party :) But I was just stating how our current algorithms (quite similar in crengine and in the UI) work currently, so you know what we expect from/can do with fonts (and don't go assuming they work differently and expect different things from fonts - or that we could easily adapt said algorithms to comply with changes in fonts).
One is ordinarily used to selecting fonts for typesetting with font access at a high level (web browsers). This stuff is usually abstracted by the likes of fontconfig, so I never realized what horrors lie behind.
It's much simpler, neither UI nor crengine can deal with arbitrary supplied fonts, they must be given only extremely limited subset to work consistently (I guess this explains the massive font blacklist of system fonts :). On that note, I'm not sure if loading system fonts is a wise idea at all if there's no effort to actually select said fonts.
That's for Kindle only - maybe because the fonts are obfuscated, or have some twisted hinting or something (@NiLuJe ?)
It's an option on Android: a user can select and use Noto Sans Balinese that way :) or other user font he has set globally on Android (dunno if that's possible).
It says in the comment. ;-) (So a little bit of both I'd say.) koreader/frontend/fontlist.lua Lines 9 to 11 in c7f77de
Yes, manually. It will render Balinese fine then, but will fall back to builtin koreader fonts for anything else. That's why the heuristic of the font selector is done in the first place - the user (or document) selects "one font", but what that really does is set a wide filter for all available fonts.
Yes, but I'm still just describing what we can do and do :) and what a user can do to help himself with how our stuff works - and why, though not perfect, it is good enough:
@poire-z I'm quite fine with the approach, as the simplicity is probably worth it (for small amount of fonts). Doesn't make sense to impose the limit anywhere else though (ie mupdf can be given all fonts just fine and its own crappy heuristic can do something with it). It makes sharing the fonts between the two kinda awkward though, see the most recent patch here.
Apart from some minor issues, this thing should be now buildable/useable for common devices (ie not osx, not android :) You may need to clone from scratch due to multiple stomping on git submodules.
8723f91 to 7320e65
* Make UI font handling file-suffix agnostic
* PDF use new annotations API
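One plausible reading of "file-suffix agnostic" font handling is matching fonts by base name regardless of extension, which would also cover the earlier remark about CFF fonts keeping their old extension. A minimal Python sketch of that idea follows; the function name, the suffix set and the paths are assumptions for illustration, not what the commit actually does.

```python
from pathlib import PurePosixPath

# Extensions treated as font files in this sketch (an assumption).
FONT_SUFFIXES = {".ttf", ".otf", ".ttc", ".otc"}


def find_font(requested, available):
    """Return the first available font whose base name matches `requested`.

    Directory and extension of `requested` are ignored, so asking for
    'NotoSans-Regular.ttf' also finds 'NotoSans-Regular.otf'.
    """
    want = PurePosixPath(requested).stem.lower()
    for path in available:
        candidate = PurePosixPath(path)
        if (candidate.suffix.lower() in FONT_SUFFIXES
                and candidate.stem.lower() == want):
            return path
    return None


# Hypothetical directory listing.
AVAILABLE = ["fonts/noto/NotoSans-Regular.otf", "fonts/README.txt"]
print(find_font("noto/NotoSans-Regular.ttf", AVAILABLE))
```

Matching on the stem keeps the hardcoded font list stable even if a bundled file switches container format between releases.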
Pinging @ezdiy :(
Side remark: Ever thought of using the Kurinto font families? Licensing is SIL OFL, and they have an excellent Unicode coverage. I typically use the "Main" variant of "Kurinto Text", "Kurinto Sans" and "Kurinto Mono" (Narrow), and, if needed, the CJK variants. Also a great plus: All faces and scripts can be intermixed freely without side effects like baseline shift or the like.
Not quite the right place to discuss this ;). And, as usual with fonts, the issue is package size. Given the size of even their "lite" package: that's probably a nope. (Plus, Noto has the advantage of matching our stock UI font).
Relating to:
koreader/koreader-base#1203 and https://github.com/ezdiy/koreader-fonts
This is a starting point of a "branch" that builds koreader with new mupdf.
Regarding font stuff:
The idea here is to delegate to mupdf for a near-universal set of fallback fonts. But why? New mupdf comes without the brutalist font rerouting the old patch used to do, meaning fallback noto sans fonts for exotic languages are now expected to actually exist. This isn't as heavy as it looks (adds about 5M, almost nothing compared to ginormous CJK).

Too much rocking of the boat. The approach taken now is to make mupdf try to use our own fonts, and complain about it in the log, so that one can troubleshoot what's wrong (mupdf tends to happily fall on its face as it expects builtin fonts to be, well, builtin). This means things need to be fixed up on a case-by-case basis, but we can avoid the bloat of carrying all fonts.
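The "use our own fonts and complain in the log" approach can be pictured with a small Python sketch. This is an illustration only: the mapping, the paths and the function name are hypothetical, and the real hook lives in the mupdf glue code rather than in Python.

```python
import logging

log = logging.getLogger("fontmap")

# Hypothetical mapping from what the renderer asks for to a bundled file.
OUR_FONTS = {
    "sans": "fonts/noto/NotoSans-Regular.ttf",
    "serif": "fonts/freefont/FreeSerif.ttf",
}


def resolve_builtin(requested, our_fonts=OUR_FONTS):
    """Return our bundled font for `requested`, logging a warning on a miss."""
    path = our_fonts.get(requested.lower())
    if path is None:
        # The miss is recorded so each gap can be fixed case by case.
        log.warning("no bundled font for %r requested by the renderer",
                    requested)
    return path


print(resolve_builtin("Sans"))
print(resolve_builtin("Noto Sans Balinese"))
```

The point of the warning is that a missing mapping shows up as a log line to investigate rather than as a silent rendering failure.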