feature request: use of @font-face #2

Open
mgieseki opened this issue Jul 20, 2014 · 27 comments
Labels
feature feature request

Comments

@mgieseki
Owner

Hi, this is not a bug, but a feature request. However, I did not find a special forum for posting feature requests, so I put this here...

I would like to request that future versions of dvisvgm support the @font-face feature of CSS (http://www.w3.org/TR/2008/REC-CSS2-20080411/fonts.html#font-descriptions) to embed a link to the original OpenType / Type1 / whatever fonts into the generated SVG files alongside the embedded <font> element.

The idea would be that

\documentclass{article}
\usepackage{lmodern}
\begin{document}
Hallo Welt.
\end{document}

produces something like this:

<?xml version='1.0' encoding='ISO-8859-1'?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- This file was generated by dvisvgm 1.2.2 (x86_64-apple-darwin10.8.0) -->
<!-- Fri Aug 23 12:45:33 2013 -->
<svg height='630.635pt' version='1.1' viewBox='-46.3263 27.9508 405.479 630.635' width='405.479pt' xmlns='http://www.w3.org/2000/svg' xmlns:xlink='http://www.w3.org/1999/xlink'>
<defs>
<font horiz-adv-x='0' id='rm-lmr10'>
<font-face ascent='1127' descent='-290' font-family='rm-lmr10' units-per-em='1000'/>
<missing-glyph d=''/>
<glyph d='M192 53...' glyph-name='period' horiz-adv-x='278' unicode='.'/>
<glyph d='M419 0...' glyph-name='one' horiz-adv-x='500' unicode='1'/>
...
</font>
</defs>
<style type='text/css'><![CDATA[
@font-face {
  font-family:rm-lmr10;
  src: url("lmroman10-regular.otf");
}
text.f0 {font-family:rm-lmr10;font-size:10px}
]]>
</style>
<g id='page1' transform='matrix(0.996264 0 0 0.996264 0 0)'>
<text class='f0' x='77' y='63'>Hallo<tspan x='103.389'>W</tspan>
<tspan x='112.833'>elt.</tspan>
<tspan x='232' y='633'>1</tspan>
</text>
</g>
</svg>

Note the added @font-face. At least on Safari, this causes the fonts to be rendered using the real OpenType font, if the font is available at the specified URL and, if not, the embedded font is used as a fallback.
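As a rough illustration of the rule added to the SVG above, here is a minimal Python sketch that builds such an @font-face block from a font family and a URL. The helper name font_face_rule is hypothetical; the family and file name are taken from the example, and how the URL should be chosen remains the open question.

```python
def font_face_rule(family: str, url: str) -> str:
    """Return a CSS @font-face rule that links `family` to the font file at `url`."""
    # Escape backslashes and double quotes so the quoted URL stays valid CSS.
    escaped = url.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "@font-face {\n"
        f"  font-family:{family};\n"
        f'  src: url("{escaped}");\n'
        "}"
    )


if __name__ == "__main__":
    print(font_face_rule("rm-lmr10", "lmroman10-regular.otf"))
```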

I know this is kind of tough to implement since it is hard to "tell" where the actual fonts are and it is not clear what the URLs should look like (absolute? relative? should the fonts be copied?). However, if one wishes to use SVG as a replacement for PDF during a presentation or for longer text to be read online, high-quality fonts would be a real plus.

Launchpad Details: #LP1215875 Till Tantau - 2013-08-23 12:52:29 +0200

@mgieseki
Owner Author

Thanks for your suggestion. I have to think about it a bit more before I assess whether it works in a generic way.
In particular, one problem might be the fact that there's not always a simple relation between the DVI character code and the corresponding Unicode code point. All recent versions of dvisvgm create SVG files that use the DVI char codes to mark and reference the characters. Thus, if some of them differ from the Unicode code points, you get wrong results when directly accessing the font files.
However, I'm currently switching the code to correct UTF-8 encoding which requires a lot of font table magic. I'll see if an acceptable support of @font-face descriptions is possible.

Launchpad Details: #LPC Martin Gieseking - 2013-08-23 15:02:50 +0200

@pnsaevik

This would indeed be an awesome addition. Especially since it allows us to use tex directly to produce vector images of formulas, which could be embedded on a web page.

The real challenge would be unencoded glyphs, such as big math operators, which cannot be accessed directly from SVG. The easiest way of solving this problem is to create a slightly modified copy of the font, where all the glyphs are encoded (possibly with unused glyphs removed). The font license could be a problem, but I guess most math fonts are free to use and modify.

@thomas001

Hi, I just started playing with dvisvgm and came across the same idea as presented here.
Is the problem with dvi character codes vs unicode also present when using fontspec?
@pnsaevik why not just keep glyphs with no direct correspondence to a glyph in the font file as svg paths as dvisvgm -n does?

@pnsaevik

@thomas001 You might be interested in the tex.sx question http://tex.stackexchange.com/questions/282340/how-to-create-non-outlined-svg-files-from-latex-formulae where Martin and I discuss the matter.

Some main points:

  • Using dvisvgm -n works well enough. The drawback is that the browser must load each path individually, instead of using a native font file where glyphs are defined once and for all. Besides, rendering from a native font file is both faster and produces better quality than rendering from svg paths.
  • The dvi encoding depends on whether the engine is XeTeX or LuaTeX. The former enumerates all glyphs according to the glyph position in the font file, the latter enumerates glyphs using unicode mappings. If LuaTeX needs some glyphs that are not unicode mapped, it maps the glyphs to the PUA unicode area.
  • Glyphs that are not mapped to Unicode points cannot be used by any major browser (Opera is a possible exception). The only way of accessing these glyphs is therefore to edit the font file and provide a custom mapping.
  • The dvi file may also contain information about stretching/shrinking the glyphs, and other advanced transformation stuff. As I understand it, this is part of the reason why Martin would have to spend some time on this. For my purposes, I would be happy with a simple --use-native-fonts option, which doesn't try to embed the font in the .svg file at all, and which ignores any font transformation commands present in the .dvi file.
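To make the point about PUA-mapped glyphs concrete, here is a small Python sketch that tells ordinary Unicode code points apart from those in the Private Use Areas. The ranges are the ones defined by Unicode; the function names are hypothetical, and whether a given LuaTeX run actually places unmapped glyphs in these ranges is taken from the description above, not verified here.

```python
from typing import Tuple

# Unicode Private Use Areas: one block in the BMP plus planes 15 and 16.
PUA_RANGES = (
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
)


def is_private_use(code_point: int) -> bool:
    """Return True if the code point lies in a Private Use Area."""
    return any(low <= code_point <= high for low, high in PUA_RANGES)


def split_by_mapping(text: str) -> Tuple[str, str]:
    """Split text into (regularly encoded characters, PUA characters)."""
    regular = "".join(ch for ch in text if not is_private_use(ord(ch)))
    private = "".join(ch for ch in text if is_private_use(ord(ch)))
    return regular, private
```

Characters in the second group would need either a modified font or outlined paths, as discussed in the points above.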

@tantau

tantau commented Feb 23, 2016

I would like to make the following suggestions:

  1. There should be an option such as the suggested --use-native-fonts that does exactly as suggested and simply adds links to external font files (using @font-face). It would be really, really nice if this worked both with luatex and also xelatex (though I really only need it for luatex...).
  2. For the issue with math glyphs that do not have a Unicode point assigned to them: Why not simply remove them from the <text> elements and insert them using <use> to render them with their outlines in the <defs> section? After all, this works very well for all characters anyway (as --no-fonts shows, it renders nicely on all browsers). The reason we do not want to use --no-fonts all the time is that it slows down the rendering and that it makes searching and selecting text impossible. Neither seems to be an issue with glyphs that do not have a Unicode point assigned to them, anyway.

@pnsaevik

@tantau Good to hear that others are interested in this feature as well :-)
Regarding your suggestion 2: Nonencoded glyphs are typically big operators (integral, sum) and parentheses. I like your suggestion of leaving these as outlined glyphs, although it's more difficult to implement for poor Martin. But it would make it easier to use existing fonts without requiring any modifications.

@mgieseki
Owner Author

I agree that a feature to support native font references would be useful. However, it requires a lot of code refactorings to implement it properly. Because of other commitments I don't have enough free time to work on it at the moment. So please don't expect this feature to be available soon.

Here is a first list of things to be considered when implementing the new option --native-fonts (or the like):

  • Check if the current font format is supported by @font-face (only OTF and TTF seem to work correctly).
  • Check if the font assigns a Unicode point to the referenced character.
    • If so, add the character to the current text/tspan element but don't add it to the list of characters to be embedded
    • Otherwise, handle the character as before, i.e. create text or path elements depending on option --no-fonts (the former requires different font IDs for the native and embedded portions of the font)
  • Since CSS doesn't provide properties to stretch, embolden, and slant the glyphs by arbitrary values, these XeTeX and font map features must be implemented by applying transform and stroke-width attributes to the affected text elements.
  • Option --native-fonts should allow an optional parameter to specify a "prefix" for the file name in the @font-face rule. It can be used to give a path to the font file, for example.
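The first two checks in the list above can be sketched as a small decision function. This is only an illustration in Python, not dvisvgm code: the function name, the return labels, and the set of accepted formats are assumptions derived from the bullet points, and the transform handling and file name prefix are left out.

```python
from typing import Optional

# Formats reported above as working with @font-face (an assumption, not a tested list).
NATIVE_FORMATS = {"otf", "ttf"}


def placement(font_format: str, code_point: Optional[int], no_fonts: bool = False) -> str:
    """Decide how one referenced character could be written to the SVG.

    code_point is the Unicode point the font assigns to the character,
    or None if the font provides no mapping.
    Returns 'native', 'embedded-font', or 'path'.
    """
    if font_format.lower() in NATIVE_FORMATS and code_point is not None:
        # Put the character into the text/tspan element and skip embedding it.
        return "native"
    # Fall back to the existing behavior, which depends on --no-fonts.
    return "path" if no_fonts else "embedded-font"
```

For example, placement("otf", 0x2E) yields 'native', while a display-style integral without a Unicode mapping would be passed as placement("otf", None) and fall back to the embedded font.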

The support of LuaTeX is a different story. Unfortunately, there's no easy way to determine the glyph indexes of the unmapped characters from the LuaTeX DVI files. Some stuff of the LuaTeX code had to be duplicated to get the proper information. Maybe it's possible to add further information about the characters to the DVI file (via specials, for example). But this should be implemented by the LuaTeX team first.

@pnsaevik

@mgieseki I can easily imagine that it takes a lot of effort to do this properly, especially because of the stretch/slant/embolden properties that might be present in the dvi file.

In the meantime, would you consider adding descriptive names to the fonts that you embed in the svg file? At the moment, they are all named 'nf0', 'nf1' etc. If I could only get hold of the font name, I could post-process the SVG so that it only uses native fonts. I'm not going to need the stretch/slant/embolden features anyway.

@mgieseki
Owner Author

@pnsaevik Would it help to add a comment with the file names to each CSS font rule (see this patch)? I can't use the font family name as ID because it's not unique across font files and would require some fiddling to derive a valid ID.

@thomas001

What about a slightly different proposal:
pdf2htmlEX is a tool to convert PDF files to HTML. In order to deal with fonts embedded in the PDF file, it generates custom font files (.woff) and references these in CSS/HTML.
Would it be possible to generate custom .woff font files and reference these in the SVG?
These font files would basically replace the badly supported SVG fonts.
It is possible to generate .woff files from .sfd files using FontForge. Sfd files look like they can be easily generated.

@mgieseki
Owner Author

As far as I know, pdf2htmlEX uses FontForge to handle all the font stuff. I don't know much about the FontForge internals and can't estimate yet how much work it would be to add WOFF font generation. Probably much more than I can currently afford. But if you're aware of a C/C++ library or FontForge wrapper that can easily be used to re-encode OTF fonts and wrap them into WOFF files, let me know.

@pnsaevik

@mgieseki Yeah, that looks like precisely what I need! I haven't tried to build your code though, is it hard to set up? Or could you possibly release a new version with the proposed patch...?

@thomas001 As long as we have code to generate custom font files, it is a secondary issue whether the font is .woff or .otf (they are quite similar, although .woff is better for web usage). I already have a Python script that creates custom .otf files, and could probably extend it to .woff as well. It uses the font library FontTools/ttx. I don't know how easy it would be to integrate with Martin's code.

@thomas001

I hacked together some font support using FontForge. I don't know if the generation is really correct, but maybe it is a good starting point. You can have a look at https://github.com/thomas001/dvisvgm .
You can see an example here (tex code, dvi file).

@pnsaevik

@thomas001 This looks like awesome stuff indeed 👍 A couple of questions:

  1. Have you tested it with unicode-math and opentype math fonts, e.g., XITS Math? My guess is that you would run into problems with the display style integral with your current code.
  2. Do you know if the fontforge binary (for win/mac/unix) can legally be bundled with dvisvgm?

@mgieseki
Owner Author

@thomas001 Thank you for taking the time to dig into the details of FontForge and for patching the dvisvgm code to create WOFF files. Since FontForge is a big font editor application that doesn't work well on Windows systems (mostly because of its X11 dependencies), I'm a bit reluctant to rely on this external application. I'd rather link only a portion of the required FontForge code, e.g. as done by LuaTeX and pdf2htmlEX. Maybe pdf2htmlEX' FontForge wrapper ffw can be used here. I have to investigate this further.

@pnsaevik I'll add a new option --comments that enables the creation of additional SVG/CSS comments containing the font information. I guess, it's better to make the comments optional because many users prefer smaller files -- even if it's only a few bytes. 😄

@pnsaevik

@mgieseki Thanks Martin, the new option works like a charm and is exactly what I need. I'll be sure to make good use of it :-)

@thomas001 I can understand Martin's reluctance to rely on FontForge being correctly installed on the system. Still, your solution with base64-encoded woff fonts is exactly what is needed here. Would it be an idea to rewrite your patch as a standalone tool, which parses Martin's svg files, finds the embedded svg font information, and converts it to base64 woff? This would be equally useful to end users, and possibly have a broader domain of application as well. And Martin can keep his code clean and lean :-)
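For readers unfamiliar with base64-embedded fonts, the following Python sketch shows the general shape of such an @font-face rule. It assumes the WOFF bytes already exist (the font conversion itself is the hard part and is not shown); the function name is hypothetical, 'nf0' is one of the font IDs mentioned earlier in the thread, and font/woff is the registered media type for WOFF.

```python
import base64


def woff_data_uri_rule(family: str, woff_bytes: bytes) -> str:
    """Return an @font-face rule that embeds the given WOFF data as a data URI."""
    payload = base64.b64encode(woff_bytes).decode("ascii")
    return (
        "@font-face {"
        f"font-family:{family};"
        f"src:url(data:font/woff;base64,{payload}) format('woff');"
        "}"
    )


if __name__ == "__main__":
    # b"wOFF" is only the 4-byte WOFF signature, used here as placeholder data.
    print(woff_data_uri_rule("nf0", b"wOFF"))
```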

@mgieseki
Owner Author

Yesterday, I had a closer look at the FontForge sources. It seems that it shouldn't be too complicated to remove all the GUI and most unused gnulib stuff from libfontforge and create a stripped-down variant. I already have a first version that compiles successfully on Windows as well. It needs some more testing, though.

@mgieseki
Owner Author

I finally found some time to add support for embedding the fonts in TTF, WOFF, or WOFF2 format rather than SVG (new option --font-format). This feature relies on the FontForge library so that I could use some of the code by @thomas001. Thanks again for taking the time to implement the experimental version. Since I managed to create a drastically reduced version of the FontForge library that natively builds on Windows, the WOFF feature doesn't rely on the FontForge application with all the GUI stuff. I'll publish that code in a separate repository.
Please feel free to test the latest commits. I need some more time to create a new release which will be version 2.0. There are also some more things that changed, e.g. a switch to C++11.

@pnsaevik

This is great news! I'm very much looking forward to seeing the new release ;-)

@mgieseki
Owner Author

dvisvgm 2.0.1 is already available in the Release section. :-) Some more information on how to use the new option can be found on the news page and in the manual.

@mgieseki mgieseki self-assigned this Sep 12, 2016
@pnsaevik

Martin, this is truly marvelous :-) I now intend to write an extension to Python-Markdown, which would allow us to write Markdown with LaTeX math, and convert it to HTML/SVG using dvisvgm. I believe this would produce superior results compared to MathJax and KaTeX for static documents, since these alternatives render math on the fly. But with upcoming conferences and submission deadlines, I'll have to wait a couple of weeks...

@pnsaevik

I couldn't let go of this... so I've spent some time fiddling around with it anyway. I now have a python script that grabs <eq> tags in the html, reads the latex within it, feeds it to xetex and then to dvisvgm, and replaces the eq tags with a link to the newly created svg.
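A script of the kind described above might be structured roughly as follows. This is a hedged sketch, not the actual script: the rendering step (running xetex and dvisvgm) is passed in as a callback so that the tag replacement can be shown on its own, and the use of an <img> element and all names are assumptions.

```python
import re
from typing import Callable

# Non-greedy match so that several equations on one line are handled separately.
EQ_TAG = re.compile(r"<eq>(.*?)</eq>", re.DOTALL)


def replace_equations(html: str, render: Callable[[str], str]) -> str:
    """Replace each <eq>...</eq> with an <img> that points to a rendered SVG.

    `render` receives the LaTeX source of one equation and returns the
    path of the SVG file it created.
    """
    def substitute(match):
        svg_path = render(match.group(1).strip())
        return f'<img src="{svg_path}" alt="equation">'

    return EQ_TAG.sub(substitute, html)


if __name__ == "__main__":
    # Stand-in renderer that only invents file names; a real one would call the TeX tools.
    counter = []

    def fake_render(latex: str) -> str:
        counter.append(latex)
        return f"eq{len(counter)}.svg"

    print(replace_equations("Einstein: <eq>E = mc^2</eq>.", fake_render))
```

Keeping the renderer separate makes it possible to test the HTML handling without a TeX installation.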

I have a problem though: It seems that dvisvgm doesn't like math that is created with the unicode-math package. The svg images seem to be cropped to the wrong size, for some reason.

To reproduce, use the following MWE:

\documentclass{standalone}
\usepackage{fontspec}
\usepackage{unicode-math}
\begin{document}
$E = mc^2$
\end{document}

@mgieseki
Owner Author

I'm glad to hear that you find the new additions helpful and that you're trying them out.
The clipping issue is related to the TFM data of the used font. Obviously, the actual glyphs exceed the box extents given in the TFM file. In order to avoid this behavior, just call dvisvgm with option --exact.

@pnsaevik

Thanks, this certainly solves the problem with the bounding box. But is there a way to extract the baseline of the text from the svg? For an inline equation, I need to shift the svg slightly downwards in order to make it align with the baseline of the surrounding text.

@mgieseki
Owner Author

mgieseki commented Sep 16, 2016

OK, I see. The easiest way to get height/depth values is to use the preview package, e.g.

\documentclass{article}
\usepackage{fontspec}
\usepackage{unicode-math}
\usepackage[dvips,active,tightpage]{preview}
\begin{document}
\begin{preview}
$E = mc^2$
\end{preview}
\end{document}

Since a plain DVI/XDV file doesn't contain any explicit information about the baselines, it's hard to extract reliable values from it. The preview package adds some more data that helps to compute the proper width, height and depth of the graphics. If dvisvgm finds preview data in a DVI file, it prints two additional lines of output to the console, e.g.

computing extents based on data set by preview package (version 11.89)
width=41.6pt, height=8.28621pt, depth=1.43125pt

You can parse them and use the depth value to adjust the vertical position of the graphics to line them up with surrounding text. The pt unit denotes TeX points (1in = 72.27pt).
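Parsing those lines could look like the following Python sketch. The regular expression is modeled only on the sample output shown above, so other dvisvgm versions may print something different. The conversion to CSS pixels assumes 96 px per inch and an SVG displayed at its natural size; the function names are hypothetical.

```python
import re

TEX_PT_PER_INCH = 72.27  # TeX points, as stated above
CSS_PX_PER_INCH = 96.0   # CSS reference pixels (assumption about the target page)

# Matches e.g. "width=41.6pt, height=8.28621pt, depth=1.43125pt".
EXTENTS = re.compile(
    r"width=(-?[\d.]+)pt, height=(-?[\d.]+)pt, depth=(-?[\d.]+)pt"
)


def parse_extents(output: str):
    """Return (width, height, depth) in TeX points, or None if not found."""
    match = EXTENTS.search(output)
    if match is None:
        return None
    return tuple(float(value) for value in match.groups())


def depth_to_css(depth_pt: float) -> str:
    """Return a vertical-align declaration that lowers the image by the depth."""
    depth_px = depth_pt * CSS_PX_PER_INCH / TEX_PT_PER_INCH
    return f"vertical-align:{-depth_px:.3f}px"


if __name__ == "__main__":
    sample = (
        "computing extents based on data set by preview package (version 11.89)\n"
        "width=41.6pt, height=8.28621pt, depth=1.43125pt\n"
    )
    width, height, depth = parse_extents(sample)
    print(depth_to_css(depth))
```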

@pnsaevik

pnsaevik commented Sep 16, 2016

A very good suggestion, but unfortunately this does not seem to work when --exact is used (I get a depth of zero..). On the other hand, when --exact is used, it seems like the viewBox attribute of <svg> contains the information I need! Apparently, the y=0 coordinate is located at the baseline in this case... which means that the top coordinate plus the total height equals the depth.
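The viewBox arithmetic described here can be written down in a few lines of Python. This is a sketch under the assumption stated above, namely that y=0 is the baseline when --exact is used; the viewBox value in the usage example is made up for illustration, and the result is in SVG user units rather than TeX points.

```python
def depth_from_viewbox(view_box: str) -> float:
    """Return the depth below the baseline from a viewBox 'min-x min-y width height'.

    Assumes the baseline sits at y = 0, so min-y is negative for glyphs
    rising above the baseline and min-y + height is the extent below it.
    """
    # The viewBox numbers may be separated by whitespace and/or commas.
    min_x, min_y, width, height = (
        float(value) for value in view_box.replace(",", " ").split()
    )
    return min_y + height


if __name__ == "__main__":
    # Hypothetical viewBox: 8.5 units above the baseline, 10 units tall in total.
    print(depth_from_viewbox("0 -8.5 41.6 10"))
```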

@mgieseki
Owner Author

I just had a closer look at the problem and found a stupid sign bug introduced with version 2.0. It leads to negative depth values for native fonts. That's why the equation is cropped without --exact. So it's not related to the TFM data. I will fix this soon.
Nonetheless, with option --exact you should get a tight bounding box for E=mc² that sits directly on the baseline because the descenders of all characters are 0. That's why the resulting depth value is 0 too. When adding a factor "g" to the equation, the depth is non-zero and the SVG must be vertically adjusted.

4 participants