Hacker News
State of Text Rendering 2024 (behdad.org)
204 points by behdad 6 months ago | 66 comments



Wait, so the idea is in the future, fonts will contain arbitrary full-color vector and bitmap images, they will contain WebAssembly code that needs to be run, and they will be streamed over the network? I’m probably missing a few other proposals, as I only skimmed.

Did anyone stop to consider if this is really necessary? The author makes it sound like he has used his influence steadily over the years to make fonts more complicated. “In year X, I proposed that fonts be able to do Y, because why not?” I get that text shaping is so complex, that in terms of open source, there is just Harfbuzz. I’m not an expert in this area. But I don’t think it’s a good thing if “font standards” are constantly getting new features, like web standards, and font renderers are like mini browser engines, where the sheer scope and number of features and rate of new features keeps everyone using the same codebases.


Well, we started this journey with bitmap fonts that wouldn't scale up, then most computers have been Western-oriented and didn't include full character sets for most of the people living on this planet. Then it turned out to be useful to be able to render languages and glyphs from dead ancient languages. And most languages other than English use all sorts of weird combining systems and accenting. When you type English it looks just like the keys you press. But type Arabic, especially with vowels, and it looks nothing like the keys you're hitting due to all the joining.
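To make the Arabic point concrete, here is a small sketch using only Python's standard `unicodedata` module. Unicode carries legacy "presentation form" code points for each positional shape of a letter, and all four shapes of BEH fold back to the single letter that was typed. The helper name `typed_letter` is made up for this illustration; real shapers select these shapes through the font's OpenType tables rather than through these legacy code points.

```python
import unicodedata

# The four legacy Unicode presentation forms of the Arabic letter BEH (U+0628).
BEH_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}

def typed_letter(presentation_form: str) -> str:
    """Fold a positional form back to the letter that was typed (via NFKC)."""
    return unicodedata.normalize("NFKC", presentation_form)

# One key press, four different shapes depending on the neighbouring letters.
for position, form in BEH_FORMS.items():
    print(position, unicodedata.name(form), "->", unicodedata.name(typed_letter(form)))
```

The keyboard only ever sends U+0628; which of the four shapes appears on screen is decided later, from context, by the text stack.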

But the real kicker was emojis which threw a real spanner in the works. Prior to this text rendering had been universally mono, but we really had to add color then.

It's really about being inclusive. Writing (historically) was always something very analog, varying wildly between people, with all sorts of unbelievably arcane rules. Tech is just finally catching up with 5,000 years of history.



> Moreover, the fontations platform, which is the Rust framework Oxidize is producing, will unify font compilation and consumption, reducing the number of places new font-format features need to be implemented from three (FontTools, FreeType, and HarfBuzz) to one (Fontations), which would reduce development cost and overhead.

Will FreeType and HarfBuzz remain supported as C/C++ projects long-term, I wonder? Asking as someone who depends on these and doesn't want to introduce a dependency on a Rust compiler :)

Anecdotally, I notice a lot of game developers avoid FreeType and Harfbuzz entirely, instead opting for much worse text rendering in the form of stb_truetype.h only (Dear Imgui uses this, for example) - which 'is nice because it is a single header C file' but sucks with international languages; many people use SDFs for similar reasons.

I think the proposed move to WASM fonts, if done right, could make it easier to reduce the amount of code people need to render fonts (if the WASM font does the heavy lifting, and a small C program could render it) and alleviate this trend of people not using a good text rendering stack


I wouldn't worry about it. I don't see evidence that these Rust libraries have the kind of uptake that is alleged in the article. I think we'll be stuck with Freetype and HarfBuzz for a long time.

I have been hacking on a Rust program recently and I am using Freetype and Harfbuzz via FFI because the Rust packages he names don't appear to be mature yet.


> doesn't want to introduce a dependency on a Rust compiler

you can link against pre-built .dll/.so/.dylib from your C++ code base.


That only helps you if all your targets are also actively supported by upstream. It also limits optimization opportunities, mainly removing parts of the library your code doesn't need.


I'm confused. When have a wasm interpreter and a small C program ever been the same thing?


I think parent is saying the choice is between OpenType+large_C_library and font_WASM+small_library+generic_WASM_engine.

Interpreting OpenType is very complex.


> [It is ironic indeed, that a text about text rendering, is presented in such an inaccessible and badly-typed environment. This is a Google Docs Preview page. I am still yet to find a solution that provides the same features (collaboration, commenting, live edits) and is presented better. Suggestions are appreciated.]

Have you tried Observable? Their online notebook has live team editing built in, and the option to comment on cells, as well as fork and merge notebooks with suggested edits. However, it uses markdown instead of a WYSIWYG editor (although I did create some tagged template wrappers for djot and markdeep as possible alternatives). On the plus side it's really easy to write interactive demos!

They're kind of pivoting their product at the moment though, so I'm not sure how easy it is to get into the notebook part these days.

[0] https://observablehq.com/@jobleonard/djot

[1] https://observablehq.com/@jobleonard/wrapping-markdeep-into-...


Tangent, but I just found Observable. How are they pivoting their product?


Before they tried to build a business with their notebooks alone, but they recently released Observable Framework[0], which is "a static site generator for data apps, dashboards, reports, and more" that you can install and run offline via npm[1]. IIUC the business plan is to make money by hosting the generated sites on their website, and offering tight integration with their existing notebook tech to continue to let people edit those cooperatively.

I really hope they succeed because I'm a big fan of both Observable notebooks and the new Observable Framework (and their Observable Plot library is pretty good too. Really, they have tons of good stuff).

But basically, they now have two products, as mentioned in the "overview" section of their docs[2].

[0] https://observablehq.com/framework/

[1] https://github.com/observablehq/framework

[2] https://observablehq.com/documentation/learn/overview


This is a lovely write up, but oh boy... when I see:

> Finally, I proposed that the future of font shaping and drawing/ painting will involve a fully-programmable paradigm.

> Two tongue-in-cheek mis-uses of the HarfBuzz Wasm shaper appeared recently: the Bad Apple font, and llama.ttf. Check them out!

See... the thing about solving problems is that, eventually you realize that any kind of problem can be solved by simply making a platform that can execute arbitrary code.

...and you see it again and again.

Declarative compile definition in make? Why not just make it code and use cmake?

Declarative infra in terraform? Why not just make it code and use pulumi?

Declarative ui? Why not just make it code and use javascript?

Database that holds data? Why not just use stored procedures and do it in code?

The list just goes on and on, and every time you have to roll back to a few important questions:

- Is the domain so complicated you really need arbitrary code execution?

- Are you inventing a new programming language for it?

- How will people program and test for the target?

- How will you manage the security sandbox around it?

It's just such a persuasive idea; can't figure out how to deal with the complexity in configuration? Make it code and push the problem to the user instead!

Sometimes it makes sense. Sometimes it works out.

To be fair, I get it, yes, font layouts are harder than they look, and yes, using WASM as a target solves some of those issues, but I look at llama.ttf and I really have to pause and go...

...really? Does my font need to be able to implement a LLM?

I don't really think it does.

...and even if the problem is really so complex that rendering glyphs requires arbitrary code (I'm not convinced, but I can see the argument), I think you should be doing this in shaders, which already exist, already have a security sandbox and already have an ecosystem of tooling around it.

I think inventing a new programmable font thing is basically inventing a new less functional form of shaders, and it's a decision that everyone involved will come to regret...


I don't disagree. But I have kind of the opposite take.

I can't count how many times I've seen simple code get turned into a hideously complex declarative language that has serious gaps.

Simple UI library code? Turn it into a custom declarative syntax that is now limited!

Simple build system that works and is debuggable? Turn it into a declarative syntax that can't be debugged and can't handle all the edge cases!

And so on and so forth.

I will admit that the idea of a font programming language sounds genuinely awful to me. So I don't really disagree with your premise. But I'm increasingly annoyed with declarative systems when vanilla code is often simpler, more flexible, and more powerful (by necessity). :)


programmable hinting was already a thing. it's just switching to wasm from a bespoke language


It's still just doing exactly what shaders do, which is crazy.

Explain to me exactly why, other than 'I guess someone already implemented some kind of basic version of it' that you would have to have custom CPU code rendering glyphs instead of a shader rendering SDF's like literally everyone does with shaders already?

It's not a good solution. It's a bad, easy solution.

We have a solution for running arbitrary GPU accelerated graphics instructions; it has a cross platform version with webGPU.

This font thing... looks a lot like 'not invented here' syndrome to me, as an uninvolved spectator.

Why would you choose, or want, not to use GPU acceleration to render your glyphs?

What 'arbitrary code' does a font need to do that couldn't be implemented in a shader?

Maybe the horse has already bolted, yes, I understand programmable fonts already exist... but geez, it's incomprehensible to me, at least from what I can see.


> Explain to me exactly why, other than 'I guess someone already implemented some kind of basic version of it' that you would have to have custom CPU code rendering glyphs instead of a shader rendering SDF's like literally everyone does with shaders already?

Shaping is different compared to rendering glyphs themselves. SDF renderers (and other GPU text renderers like Slug) still do shaping on the CPU, not in shaders. Maybe some experiments have been done in this area, but I doubt anyone shapes text directly in the GPU in practice.

Think of it like a function that takes text as input, and returns positions as output. Shaders don't really know anything about text. Sure you could probably implement it if you wanted to, but why would you? I think it would add complexity for no benefit (not even performance).
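A toy version of that "text in, positions out" function might look like the sketch below. Everything in it is invented for illustration: the glyph names, advance widths, ligature table and kerning table are hypothetical, and a real shaper such as HarfBuzz reads this data from the font and handles far more (scripts, direction, mark attachment, and so on).

```python
from dataclasses import dataclass

@dataclass
class PositionedGlyph:
    glyph: str    # glyph name (hypothetical; real fonts use numeric glyph IDs)
    x: int        # pen position, in font units, where the glyph is placed
    advance: int  # how far the pen moves after this glyph

# Hypothetical font data, made up for this example.
ADVANCES = {"A": 600, "V": 600, "f": 300, "i": 250, "f_i": 520}
LIGATURES = {("f", "i"): "f_i"}   # substitution: two characters become one glyph
KERNING = {("A", "V"): -80}       # positioning: tighten this specific pair

def shape(text: str) -> list[PositionedGlyph]:
    # Step 1: map characters to glyphs, applying ligature substitution.
    glyphs = []
    i = 0
    while i < len(text):
        pair = tuple(text[i:i + 2])
        if pair in LIGATURES:
            glyphs.append(LIGATURES[pair])
            i += 2
        else:
            glyphs.append(text[i])
            i += 1

    # Step 2: walk left to right, applying pair kerning and advances.
    positioned = []
    x = 0
    for idx, glyph in enumerate(glyphs):
        if idx > 0:
            x += KERNING.get((glyphs[idx - 1], glyph), 0)
        advance = ADVANCES[glyph]
        positioned.append(PositionedGlyph(glyph, x, advance))
        x += advance
    return positioned

print(shape("AfiV"))
```

Note that the output is a list of glyphs and coordinates, with no pixels anywhere; whatever renders the glyphs (CPU rasterizer, SDF shader, Slug) consumes this list afterwards.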


lengyel told me he has implemented some sort of hinting on the gpu for slug (i suspect it's not programmable, but didn't ask)


Very interesting. Honestly I don't know much about hinting, but I suspect the whole shaping stack that Slug supports:

> kerning, ligature replacement, combining diacritical mark placement, and character composition. Slug also supports a number of OpenType features that include stylistic alternates, small caps, oldstyle figures, subscripts, superscripts, case-sensitive punctuation, and fractions.

Probably still uses the CPU.


> CPU code rendering glyphs instead of a shader rendering SDF's

1) Because SDFs suck badly (and don't cover the whole field) when you want to render sharp text. SDFs are fine when used in a game where everything is mapped to textures and is in motion at weird angles. SDFs are not fine in a static document which is rendered precisely in 2D.

2) Because GPUs handle "conditional" anything like crap. GPUs can apply a zillion computations as long as those computations apply to everything. The moment you want some of those computations to only apply to these things GPUs fall over in a heap. Every "if" statement wipes out half your throughput.

3) Because "text rendering" is multiple problems all smashed together. Text rendering is vector graphics--taking outlines and rendering them to a pixmap. Text rendering is shaping--taking text and a font and generating outlines. Text rendering is interactive--taking text and putting a selection or caret on it. None of these things parallelize well except maybe vector rendering.
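Point 1 does not say why SDFs struggle with sharp text, but one commonly cited reason is that a sampled distance field is reconstructed by bilinear interpolation, which rounds off corners. The sketch below is an assumption-laden illustration of that effect: the shape is a single right-angle corner (the quarter-plane x <= 0, y <= 0), the field is sampled once per unit texel, and the function names are made up.

```python
import math

def true_sdf(x: float, y: float) -> float:
    """Exact signed distance to the quarter-plane x <= 0, y <= 0 (negative inside)."""
    if x > 0 and y > 0:
        return math.hypot(x, y)   # nearest point is the corner itself
    return max(x, y)              # nearest point is on one of the two edges

def sampled_sdf(x: float, y: float) -> float:
    """Bilinear interpolation of true_sdf stored at texel centers (k + 0.5)."""
    x0 = math.floor(x - 0.5) + 0.5
    y0 = math.floor(y - 0.5) + 0.5
    u, v = x - x0, y - y0
    f00 = true_sdf(x0, y0)
    f10 = true_sdf(x0 + 1, y0)
    f01 = true_sdf(x0, y0 + 1)
    f11 = true_sdf(x0 + 1, y0 + 1)
    return ((1 - u) * (1 - v) * f00 + u * (1 - v) * f10
            + (1 - u) * v * f01 + u * v * f11)

# A point just inside the corner is classified as outside by the sampled field.
corner_point = (-0.1, -0.1)
print(true_sdf(*corner_point), sampled_sdf(*corner_point))

# Along a straight edge, far from the corner, the sampled field is exact.
edge_point = (-3.2, -0.1)
print(true_sdf(*edge_point), sampled_sdf(*edge_point))
```

Straight edges survive the sampling, but the tip of the corner is shaved off, which matters little on a moving game texture and a lot on small static text. Multi-channel variants (MSDF) exist to reduce this, at the cost of more preprocessing.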


I feel like, looking at the complexity of the programs that can be implemented in shaders (eg. https://dev.epicgames.com/documentation/en-us/unreal-engine/...) that it's unreasonable, bordering on disingenuous to suggest that the GPU pipeline is not capable enough to handle those workloads, or produce pixel perfect outputs.

Be really specific.

What exactly is it that you can't do in a shader, that you can do in a CPU based sandbox, better and faster?

(There are things, sure, like IO, networking, shared memory but I'm struggling to see why you would want any of them in this context)

I'll accept the answer, 'well, maybe you want to render fonts on a toaster with no GPU'; sure... but that having a GPU isn't good enough for you, yeah... nah. I'm not buying that.


Vector graphics are really hard to do on a GPU in an efficient manner. The way the data is stored as individual curve segments makes it difficult to parallelize the coverage problem, it's equivalent to a global parse; the best approaches all do some form of parsing of curve data on the CPU, either rasterizing fully on the GPU, or putting it in a structure the GPU can chew on.

But again, this has nothing to do with HarfBuzz or wasm.


https://sluglibrary.com/ - game library used in many engines to do vector graphics and fonts directly on the GPU


Slug preprocesses font curve data into something without the need for the global parse with the .slug file format.


what exactly do you mean by 'global parse'? it's very usual, i think, when operating on data stored in files, to parse them into in-memory structures before operating on them? but it feels like you are talking about something specific to vector rendering

slug builds acceleration structures ahead of time. the structures are overfit to the algorithm in a way that ttf should be but which is economical for video games. that doesn't seem like an interesting concern and nothing about it is specific to the gpu


I'm referring to needing to traverse all path segments to determine the winding order for an individual pixel. You can't solve this problem locally, you need global knowledge. The easiest way to do this is to build an acceleration structure to contain the global knowledge (what Slug does), but you can also propagate the global knowledge across (Hoppe ravg does this).
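As a rough sketch of why this is a global problem: the standard non-zero winding test for one sample point has to look at every segment of every contour before it can answer. The code below assumes contours are plain polygons (real glyph outlines use quadratic or cubic curves), and the function names are made up for illustration.

```python
Point = tuple[float, float]

def is_left(a: Point, b: Point, p: Point) -> float:
    """> 0 if p is left of the directed line a->b, < 0 if right, 0 if on it."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1])

def winding_number(p: Point, contours: list[list[Point]]) -> int:
    """Non-zero winding rule: p is inside the shape iff the result is != 0."""
    wn = 0
    for contour in contours:
        n = len(contour)
        for i in range(n):                  # every segment must be visited
            a, b = contour[i], contour[(i + 1) % n]
            if a[1] <= p[1]:
                if b[1] > p[1] and is_left(a, b, p) > 0:
                    wn += 1                 # upward crossing to the right of p
            elif b[1] <= p[1] and is_left(a, b, p) < 0:
                wn -= 1                     # downward crossing to the right of p
    return wn

# An "O"-like shape: counter-clockwise outer square, clockwise inner hole.
outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(3, 3), (3, 7), (7, 7), (7, 3)]

print(winding_number((1, 1), [outer, hole]))   # in the stroke of the "O"
print(winding_number((5, 5), [outer, hole]))   # in the hole
print(winding_number((20, 5), [outer, hole]))  # outside entirely
```

Doing this naively per pixel costs one pass over the whole outline per pixel, which is why practical renderers either precompute an acceleration structure or propagate the crossing information between neighbouring pixels.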


It's more about the nature of the problem, not that you can't do it in shaders. After all, I think you can do pretty much anything in shaders if you try hard enough.

Even if you already have a GPU renderer for glyphs and any other vector data, you still want to know where to actually position the glyphs. And since this is highly dependent on the text itself and your application state (that lies on the CPU), it would actually be pretty difficult to do it directly on the GPU. The shader that you would want should emit positions, but the code to do that won't be easily ported to the GPU. Working with text is not really what shaders are meant for.


It has nothing to do with shaders? Despite the name, shaping is not the same thing as a shader, shaping selects and places individual glyphs given a collection of code points.

No part of the rasterizer or renderer is configurable here. As mentioned above, the rasterizer is already programmable with up to two different bespoke stack bytecode languages, but that has nothing to do with shaping through wasm.


I agree shaders would be a terrible choice for this.

However, the article clearly states there are intentions to move towards much more than just shaping in wasm:

> I proposed that the future of font shaping and drawing/ painting will involve a fully-programmable paradigm.

> Bad Apple will become much easier and faster when we introduce the draw API in Wasm.

> Drawing and painting API will eventually come to HarfBuzz, probably in 2025.


This is still not rasterization, but a way to modify glyph outlines on the fly. How they are rasterized eventually should be mostly unchanged.


Hi. As mentioned, I'll expand on my motivations in a future paper. -behdad


While this presentation is extremely interesting, it would have been far more useful if you had exported this view into a downloadable PDF file, instead of giving access to just this ephemeral preview.


You have to shape the text even if you render the glyphs with an SDF or MSDF. You're conflating various things.


You are missing the point. Finally, it'll be possible to exploit side-channel attacks without using Javascript in the browser. Static HTML+CSS+Fonts websites won't be safe anymore. Yippieee! Finally, there is no reason to allow disabling Javascript anymore.

And you could add backdoors to hacked fonts that are activated by magic spells. Isn't it great.


What else are we supposed to do with all these massively multicore CPU's and gigabytes of RAM?


I really have to install an old system with the Infinality patches again, I remember them to be quite excellent, but that was a few years, monitors and contact lens upgrades ago…

Maybe I'll put RiscOS on a Raspberry Pi at the same time, which (IIRC) had one of the first antialiased font rendering engines ever.

(I do have some old Macs running currently, and weirdly enough still prefer some of the old "blurry" font renderings to a lot of modern ones, at least on regular displays)


RustyBuzz is quite limited when compared with HarfBuzz. The Rust fonts scene also seems to lack the necessary momentum to drive things through. I’m not sure whether it’s because the Rust community is mostly interested in webtech or whether Rust itself makes it hard to solve such complex problems. But I don’t see the sands shifting in less than 10 years to come.


>RustyBuzz is quite limited when compared with HarfBuzz.

As the main author of rustybuzz I'm surprised to hear this. If you need a text shaper, rustybuzz is mostly a drop-in replacement for harfbuzz.

Text shaping and TrueType parsing are hard problems, but Rust does not make them more complicated. Quite the opposite. In fact, rustybuzz is written in 100% safe Rust. I would even go further and say that Rust is the only sane option for solving text-related tasks.


I can't quite see why Rust would make it harder, it's always a breeze compared to working on C++ projects, especially as the project matures. The compiler catches a lot of issues early, which otherwise slow down C++ projects as they accumulate silently, because they should have fixed all warnings, should have used ASan, TSan, MSan, religiously, and should have agreed on a manageable subset of C++, ... but of course didn't.

But I think the power of legacy and a bigger community is not to be underestimated.


Servo is tagged as dead, but the revival seems to have somewhat worked. I guess there are not nearly as many full-time people as when it was under the Mozilla umbrella, but it's not dead. And by the look of it, neither is Pathfinder, the rasteriser.


This is mentioned later in the document:

> An experimental engine Servo originally launched by Mozilla as a successor to Gecko, is implemented in Rust. It was eventually abandoned by Mozilla when Mozilla Corporation announced laying off a quarter of its staff in 2020 and transferred to The Linux Foundation, then in 2023 to TLF Europe. While still experimental, it has been under active development again since 2023. Servo currently uses Rust bindings to Harfbuzz.

I'd say Servo is going pretty well. The git is fairly active and monthly updates on their blog paint a positive picture of the rate of progress. I try the engine out roughly monthly after the blog posts drop and when I last did a few days ago I was impressed to see a lot of my most used websites being displayed correctly. At this rate, I think it could become viable much sooner than we think. However, the project is still critically underfunded, currently only getting a monthly $2229USD according to their website.


On rasterization & basic typesetting (no ligatures/gsub lookups): check out pixie https://github.com/treeform/pixie


Hinting? No.


To everybody complaining about the awful text rendering, it's literally in the first paragraph

> [It is ironic indeed, that a text about text rendering, is presented in such an inaccessible and badly-typed environment. This is a Google Docs Preview page. I am still yet to find a solution that provides the same features (collaboration, commenting, live edits) and is presented better. Suggestions are appreciated.]


Acknowledging the irony doesn't remove it. It's yet another case of putting developer convenience over end user experience. Which for something provided for free is more than you paid for but still a subpar result that can be criticized.


Why do you need those features for a published long document? Do the editing and collaboration in an ugly platform, but publish in something more readable?


I'm not sure how this gets rendered, but the lack of hinting makes it a strain to read. What irony that an article about progress in text rendering has such awful rendering quality.

PS: That is in Firefox. In Chrome it uses what appears to be a bitmap font, which is much worse.


I think the author could've used "Publish to web" instead of preview mode, for a more accessible article view with better font rendering and keyboard control (though it changes image size and breaks right-aligned text).


I believe this is a Google doc preview. On my iPad it had terrible scrolling performance, and the shortcut to scroll to the top of the page (tap the top edge of the screen) doesn't work :(


indeed it is, as meanwhile confirmed at the top of the document.


For me in Chromium, I got the error message "Some fonts could not be loaded" and the body text was in an actually readable font, unlike in Firefox.

I got annoyed that the page hijacked right click and key navigation, so I wanted to print to PDF — which didn't work. Chrome printed a single blank page. Firefox managed to print, but also only a single page, and when zoomed in the font got interpolated (= blurry), instead of being more readable.


This page breaks zoom and all scrolling keys (page up/down and space bar) in Firefox too on top of being ugly :/


I'm on Firefox and zoom/page up/down work but space doesn't. Also, using my own font (disabling "Allow pages to choose their own fonts") results in a broken page.


If you click the Google Docs link at the top of the document, you can then export it to html/txt/pdf.


Nice article but sometimes I don’t know what’s worse - the state of linux audio drivers, or the state of linux text rendering.

It’s very easy to see when FreeType is used because it just looks off in a few, but significant ways. I’ve used it with and without Harf. DirectWrite has been a joy by comparison.


I never had particular issues with Linux or Windows, but I have to say that I was surprised at how bad my MacBook looks when connected to an external 1080p monitor. I know Apple wants you to buy one of their fancy high resolution monitors or whatever, but it was so odd that one with a 1920x1080 resolution would look so bad. Even toggling the hi-dpi setting in BetterDisplay didn't help much.

Here's a quick picture from a few days ago: https://imgur.com/a/GLohlj1


I believe they don't even try to do hinting or sub-pixel rendering, which were key to Windows' crisp font rendering on low resolutions all the way back on Windows XP.


They don't indeed. Subpixel rendering was removed from macOS a few years ago.


I’m resigned to the fact that I’ll be dead before Linux fonts look good. That’s true of many Linux features, actually. “Public” software is the quality of most public infrastructure, sadly, and smells vaguely of piss.


making your article force a tiny, fixed, unchangeable font size seems like a good way to convince people that you should be nowhere near text rendering infrastructure


The author is actually one of the most prominent and respected people in this space.


Browser zoom works fine for me. Does it not for you?


The page fights against zooming by reducing the text size to compensate.

I can zoom in somewhat only if I press Ctrl + fast enough many times in succession, but then it is easy to overshoot.


Interesting, when I zoom in to 300% the T in TLDR is exactly 3x taller than when at 100%. Seems to be an embedded Google Doc though so the canvas does a weird recentering jerk on viewport size changes.


Zoom works until the page width is about the browser width. After that it starts fighting your zoom level.


actually read the very beginning, where it's clear why he's using this google doc thingy with the bad text...



