In 2005 Loop & Blinn [0] found a method to decide whether a sample/pixel is inside or outside a Bézier curve (independently of other samples, and thus possible in a fragment shader) using only a few multiplications and one subtraction per sample:
- Integral quadratic curve: One multiplication
- Rational quadratic curve: Two multiplications
- Integral cubic curve: Three multiplications
- Rational cubic curve: Four multiplications
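To make the integral quadratic case above concrete, here is a minimal sketch in Python rather than shader code. It assumes the canonical coordinates (0, 0), (1/2, 0), (1, 1) assigned to the three control points, under which the curve becomes v = u^2. The function names are made up for illustration, and which sign counts as "inside" depends on the curve's orientation in the real technique.

```python
def loop_blinn_quadratic(u, v):
    """Implicit function of the canonical quadratic v = u^2.

    Zero on the curve, negative on the chord side, positive on the
    control-point side. One multiplication and one subtraction.
    """
    return u * u - v


def interpolate_uv(bary):
    """Interpolate canonical (u, v) from barycentric weights (w0, w1, w2).

    Assumes the control points carry (0, 0), (0.5, 0), (1, 1); on a GPU the
    rasterizer would do this interpolation before the fragment shader runs.
    """
    w0, w1, w2 = bary
    u = w1 * 0.5 + w2 * 1.0
    v = w2 * 1.0
    return u, v


def is_inside(bary):
    """Treat the chord side of the curve as inside (assumed convention)."""
    u, v = interpolate_uv(bary)
    return loop_blinn_quadratic(u, v) < 0.0


# Midpoint of the chord between the two end points: inside.
chord_mid = is_inside((0.5, 0.0, 0.5))
# Close to the off-curve control point: outside.
near_control = is_inside((0.1, 0.8, 0.1))
```

Each sample is evaluated on its own, which is why the test fits in a fragment shader.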
It's referenced in Slug's algorithm description paper [1]. The main disadvantage of Loop-Blinn is the triangulation step that it requires, and at small text sizes you lose a bit of performance. Slug only needs to render a quad for each glyph. That is not to say that one method is better than the other, though! They both have advantages and disadvantages. I think the two most advanced techniques for rendering vector graphics on the GPU are "Massively Parallel Vector Graphics" [2] and "Efficient GPU Path Rendering Using Scanline Rasterization" [3], though I don't know of any well-known usage of them. Maybe that's because they are very hard to implement: the sources attached to them are not trivial to understand, even if you've read the papers. They also use OpenCL/CUDA, if I remember correctly.
EDIT: I've only now seen that [2] and [3] are already mentioned in the article.
EDIT2: To compensate for my ignorance, I will add that one of the authors of MPVG has a course on rendering vector graphics: http://w3.impa.br/~diego/teaching/vg/
If I understand correctly, the second link is basically an extension of Loop-Blinn's implicit curve approach with vector textures in order to find the winding number for each fragment in one pass.
>> Slug only needs to render a quad for each glyph.
I don't know how many glyphs you want to render (to the point that there are so many that you can't read them anymore), but modern GPUs are heavily optimized for triangle throughput, so 2 or 20 triangles per glyph make only a little difference. The bigger problem is usually the sample fill rate and memory bandwidth (especially if you have to write to pixels more than once).
I have been eying the scanline-intersection-sort approach (your third link) too. Sadly, they have no answer to path stroking (same as everybody else), and the approach also requires an efficient sorting algorithm for the GPU (implementations of such are hard to come by outside of CUDA, as you mentioned).
Indeed, most techniques that target the GPU have no answer to stroking; they recommend generating paths beforehand so that the result looks stroked.
And yes, the number of triangles doesn't really make a difference in general, but in Slug's paper they say:
"At small font sizes, these triangles can become very tiny and decrease thread group occupancy on the GPU, reducing performance"
I'm not experienced enough to say how true that is/how much of a difference it makes.
> If I understand correctly, the second link is basically an extension of Loop-Blinn's implicit curve approach with vector textures in order to find the winding number for each fragment in one pass.
I've read the paper, but to be honest it's a bit over my head right now. AFAIK, though, MPVG is an extension of this [1], which itself looks like an extension of Loop-Blinn, so I think you're right.
You can always render the text to a texture offline as a signed distance field and just draw out quads as needed at render time. This will always be faster than drawing from the curves, and rendering from an SDF (especially multi-channel variants) scales surprisingly well if you choose the texture/glyph size well.
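As a rough illustration of the render-time half of the SDF approach, here is a minimal sketch in Python rather than shader code. It assumes a single-channel field stored in [0, 1] with 0.5 at the glyph edge and larger values inside; that convention, the `smoothing` width, and the function names are all assumptions made for the example.

```python
def smoothstep(edge0, edge1, x):
    """Hermite interpolation between two edges, clamped to [0, 1]."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def bilinear(tex, x, y):
    """Sample a 2D grid of distances at continuous texel coordinates.

    Assumes 0 <= x < width and 0 <= y < height; a GPU texture unit would
    do this filtering in hardware.
    """
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(tex[0]) - 1)
    y1 = min(y0 + 1, len(tex) - 1)
    fx, fy = x - x0, y - y0
    top = tex[y0][x0] * (1.0 - fx) + tex[y0][x1] * fx
    bottom = tex[y1][x0] * (1.0 - fx) + tex[y1][x1] * fx
    return top * (1.0 - fy) + bottom * fy


def sdf_coverage(sample, smoothing=0.05):
    """Turn a sampled distance into an alpha value around the 0.5 edge."""
    return smoothstep(0.5 - smoothing, 0.5 + smoothing, sample)


# Tiny 2x2 field: outside on the left column, inside on the right column.
field = [[0.0, 1.0],
         [0.0, 1.0]]
edge_alpha = sdf_coverage(bilinear(field, 0.5, 0.5))
```

The per-pixel work is one filtered texture fetch plus a smoothstep, independent of how many curves the glyph has.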
Is there a serious risk of patent enforcement in common open source repositories ranging from GitHub to PPAs and Linux package repositories located outside any relevant jurisdictions?
Does that imply it's possible to implement 2D font/vector graphics rendering on a GPU and end up getting burned by patent law? I am having a hard time imagining they were awarded such a generic patent.
Anyway, I will adjust my question based on your feedback.
Slug isn't great for lots of shapes, since it does the winding number scanning per pixel in the pixel shader. It does have a novel quadratic root-finder. Put simply, it's better suited to fonts than to large vector graphics.
I once implemented the basic idea behind the algorithm used in Slug (described in the paper [1], though without the 'band' optimization; I just wanted to see how it works), and I agree with you: the real innovation is in that quadratic root-finder. It can tell you whether you are inside or outside just by manipulating the three control points of a curve, and it's very fast. What remains to be done is to use an acceleration data structure so that you don't have to check every curve. That works very well for quadratic Bézier curves. The paper says that it can easily be extended to cubics, though no example is provided (and I doubt it's trivial). What I think would be hard with Slug's method is extending it to draw gradients, shadows, and basically general vector graphics, like you say. Eric Lengyel showed a demo [2] on his Twitter using Slug to render general vector graphics. I'm not sure how many features it supports, but it definitely supports cubic Bézier curves. I'd also like to add that the algorithm didn't impress me with how the text looks at small sizes, which I think is very important in general, though maybe not so much for games (maybe I just didn't implement it correctly).
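For readers who want to see the general shape of a per-pixel winding test on quadratic curves, here is a minimal sketch in Python. To be clear, this is a plain ray-crossing count that solves the quadratic directly; it is not the root-eligibility classification from the Slug paper, it has no band acceleration structure, and it makes no attempt at robustness when the ray passes exactly through a shared end point. All names are made up for the example.

```python
import math


def quad_roots(a, b, c, eps=1e-12):
    """Real roots of a*t^2 + b*t + c = 0, falling back to the linear case."""
    if abs(a) < eps:
        return [] if abs(b) < eps else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    return [(-b - s) / (2.0 * a), (-b + s) / (2.0 * a)]


def winding_number(curves, px, py):
    """Count signed crossings of a +x ray from (px, py) with quadratic curves.

    Each curve is ((x0, y0), (x1, y1), (x2, y2)). An upward crossing adds 1
    and a downward crossing subtracts 1 (an arbitrary sign convention).
    """
    winding = 0
    for (x0, y0), (x1, y1), (x2, y2) in curves:
        # y(t) - py written as a*t^2 + b*t + c.
        a = y0 - 2.0 * y1 + y2
        b = 2.0 * (y1 - y0)
        c = y0 - py
        for t in quad_roots(a, b, c):
            if not (0.0 <= t < 1.0):
                continue  # half-open range so shared end points count once
            mt = 1.0 - t
            x = mt * mt * x0 + 2.0 * mt * t * x1 + t * t * x2
            if x <= px:
                continue  # crossing is behind the ray origin
            dy = 2.0 * a * t + b  # derivative of y(t) at the crossing
            if dy > 0.0:
                winding += 1
            elif dy < 0.0:
                winding -= 1
    return winding


# A lens-shaped closed contour made of two quadratic arcs.
lens = [((0.0, 0.0), (2.0, -2.0), (4.0, 0.0)),
        ((4.0, 0.0), (2.0, 2.0), (0.0, 0.0))]
inside = winding_number(lens, 2.0, 0.5) != 0
outside = winding_number(lens, -1.0, 0.5) != 0
```

Because every curve is checked for every pixel, the cost grows with the number of curves, which is the motivation for an acceleration structure such as the bands mentioned above.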
0. http://sluglibrary.com/
1. http://jcgt.org/published/0006/02/02/paper.pdf