
TextToPath cache #66

Open
andersmelander opened this issue Apr 11, 2019 · 5 comments

@andersmelander
Member

As mentioned in issue #65, it would be good for performance if some sort of cache were involved in the text rendering pipeline.

Well, here you go:

GR32_Text_VCL.zip

The attached modifications implement a cache in the VCL backend TextToPath functions.
With the cache enabled, the performance of the text rendering test in the benchmark example has more than doubled. Where do I collect my paycheck? :-)

The cache has been inserted into the existing implementation using conditional defines so it can be enabled or disabled at compile time.

The cache is implemented with the generic TDictionary<>, using LOGFONT as the primary key into the dictionary and the glyph character value as a secondary key.
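To illustrate the two-level key scheme described above, here is a minimal Python sketch. The actual implementation is Delphi using generics TDictionary<>; the names FontKey, GlyphData, and GlyphCache below are hypothetical stand-ins, and the fields are only representative of what a LOGFONT record and cached glyph entry might carry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FontKey:
    """Stand-in for the LOGFONT record used as the primary key."""
    face_name: str
    height: int
    weight: int
    italic: bool

@dataclass
class GlyphData:
    """Cached per-glyph data: metrics and TT outline polygons."""
    advance_width: float
    polygons: list

class GlyphCache:
    def __init__(self):
        # Primary key: font; secondary key: glyph character value.
        self._fonts = {}

    def get(self, font, ch):
        return self._fonts.get(font, {}).get(ch)

    def put(self, font, ch, data):
        self._fonts.setdefault(font, {})[ch] = data
```

A frozen dataclass is used here so the font key is hashable, mirroring how a LOGFONT record can serve as a dictionary key once it has a suitable hash and equality comparison.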

I'm caching the font metrics, the individual glyph metrics and the individual TT glyph polygon data. It would be trivial to add kerning data as well. As could be expected, only the caching of TT polygon data seems to have a significant impact on performance.

I have not tried to profile or optimize it in any way.

@AngusJohnson
Contributor

While I haven't yet looked at the code, I've given it a quick test drive and it seems to work just fine.

Where do I collect my paycheck? :-)

I recommend we double your salary :).

@andersmelander
Member Author

Changes are now in the TextToPath-cache branch.

@andersmelander
Member Author

I tried enhancing the cache replacement policy to make it more resistant to thrashing. Since I store the hit count on each individual glyph, there's plenty of data available to make this possible.

My results show that the gain is nominal. First, one has to make the cache really small before items start being replaced. The size of a single glyph is only 400-500 bytes, so with a default maximum cache size of 256 KB there's room for approximately 500 different glyphs. When the polygon benchmark example is executed it uses about 32 KB, caching ~60 glyphs. In other words: with typical use the cache will probably never thrash, so the added complexity (and overhead) of a more intelligent cache replacement policy is simply not worth it.

FTR, the improved algorithm I used considers the "worth" of an item before purging it from the cache.
When the cache must be trimmed to make room for a new entry, I look at the LRU item. If the hit count of that item is greater than the sum of the hit counts of all the remaining items, then the item is considered too valuable to be purged and I move on to the next item in the LRU list. A downside of this is that once an entry has become very valuable it is almost impossible to get it out of the cache, no matter how old and irrelevant it becomes - a bit like Russian politicians. To solve this problem I would need to add some sort of aging mechanism that devalues items with age.
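The "worth"-aware eviction walk could be sketched roughly as follows (a Python sketch, not the Delphi code from the branch; the class and method names are hypothetical, and the aging mechanism mentioned above is deliberately left out):

```python
from collections import OrderedDict

class WorthAwareLRUCache:
    """LRU cache that refuses to evict items whose hit count exceeds
    the combined hit count of all other cached items."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._items = OrderedDict()  # key -> [value, hit_count], LRU first

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        entry[1] += 1                  # record the hit
        self._items.move_to_end(key)   # mark as most recently used
        return entry[0]

    def put(self, key, value):
        if key in self._items:
            self._items[key][0] = value
            self._items.move_to_end(key)
            return
        while len(self._items) >= self.max_items:
            self._evict_one()
        self._items[key] = [value, 0]

    def _evict_one(self):
        # Walk from least to most recently used; skip any item whose hit
        # count is greater than the sum of all the other items' hits.
        total_hits = sum(hits for _, hits in self._items.values())
        for key, (_, hits) in self._items.items():
            if hits <= total_hits - hits:
                del self._items[key]
                return
        # Every item is "too valuable": fall back to plain LRU eviction.
        self._items.popitem(last=False)
```

This also makes the described failure mode visible: once an item's hit count dominates the total, only the final fallback can ever remove it, which is why some aging/devaluation step would be needed.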

@andersmelander
Member Author

The current cache implementation is, by design, not thread safe.
It would be relatively simple to make it thread safe, but since it is only used by the existing TextToPath functions, which are themselves not thread safe, I didn't see any point in doing so.
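For illustration, the "relatively simple" way to retrofit thread safety would be to guard the cache with a lock, roughly like this Python sketch (the Delphi equivalent would use e.g. a TCriticalSection; ThreadSafeGlyphCache and the inner cache interface are hypothetical names):

```python
import threading

class ThreadSafeGlyphCache:
    """Serializes all access to a non-thread-safe inner cache."""

    def __init__(self, inner):
        self._inner = inner            # the existing, unsynchronized cache
        self._lock = threading.Lock()

    def get(self, font, ch):
        with self._lock:
            return self._inner.get(font, ch)

    def put(self, font, ch, data):
        with self._lock:
            self._inner.put(font, ch, data)
```

The cost is a lock acquisition per lookup, which is one reason not to add it while the callers are single-threaded anyway.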

The question, however, is whether it is by design that TextToPath isn't thread safe?

@andersmelander
Member Author

I've done a bit of quick benchmarking.

  • Caching the raw font outline data, as returned by GDI, improves text rendering performance by 95-150%, depending on the renderer used (VPR or VPR2).
  • Caching the path (TArrayOfArrayOfFloatPoint) instead improves performance by 135-220%.

A performance increase of 100% means that rendering takes half the time it did before, or that twice as much text can be rendered in the same time. So cached text is roughly two to three times faster than uncached text.
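The percentage-to-factor arithmetic used above can be written out explicitly; this tiny helper is only an illustration of the conversion, not part of the patch:

```python
def speedup_factor(improvement_pct):
    """Convert a throughput improvement in percent into a speedup factor.

    A +100% improvement means twice the throughput, i.e. rendering
    takes half the time it did before.
    """
    return 1.0 + improvement_pct / 100.0
```

Applying it to the measured range for path caching, 135-220% corresponds to a factor between roughly 2.4x and 3.2x.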

The variance in gain between VPR and VPR2 is due to the performance of the two rasterizers: the faster the rasterizer, the smaller the rasterization overhead and the greater the relative benefit of the cache.

None of the cache implementations has been optimized yet, but I expect any optimization gains to be dwarfed by the performance improvements already achieved by this proof-of-concept implementation.

@andersmelander andersmelander added this to To do in 3.x (New features) via automation Apr 18, 2019