Why 2D Vector Graphics Are Much More Complex Than 3D

Original author: Jasper St. Pierre
Recently, a lot of fantastic research on 2D rendering has appeared. Petr Kobalicek and Fabian Yzerman are working on Blend2D, one of the fastest and most accurate CPU rasterizers on the market, built on innovative JIT techniques. Patrick Walton of Mozilla has explored not one but three different approaches in Pathfinder, culminating in Pathfinder v3. Raph Levien has built a compute-based pipeline using the techniques described in Gan et al.'s 2014 paper on vector textures. And a lot of promising work centers on distance fields: Adam Simmons and Sarah Frisken have each been working there independently.

One might ask: why all the fuss about 2D? Surely it can't be harder than 3D, right? 3D is a whole extra dimension! We have real-time ray tracing with accurate lighting just around the corner, and yet we can't master plain 2D graphics with solid colors?

For those not intimately familiar with the details of modern GPUs, this really is surprising! But 2D graphics comes with plenty of unique constraints that make it extremely difficult, and it does not lend itself well to parallelization. Let's take a stroll down the historical path that brought us here.

The rise of PostScript


In the beginning, there was the plotter. The first graphics device that could interact with a computer was the "plotter": one or more pens that could move across paper. Everything worked with a pen-down command, after which the drawing head moved in some way, possibly along a curve, until a pen-up command arrived. HP, maker of some of the earliest plotters, used a BASIC variant called AGL on the host computer, which then sent the plotter commands in another language, such as HP-GL. In the 1970s, graphics terminals became cheaper and more popular, starting with the Tektronix 4010. It displayed images on a CRT, but don't be fooled: this was not a pixel display. Tektronix came out of the analog oscilloscope industry, and these machines worked by steering an electron beam along a specific path. So the Tektronix 4010 had no pixel output. Instead, you sent it commands in a simple graphics mode that could draw lines, but again in a pen-down, pen-up fashion.
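The pen-up/pen-down model is simple enough to sketch in a few lines. Below is a toy interpreter in the spirit of HP-GL's `PU` (pen up), `PD` (pen down), and `PA x,y` (move to an absolute position) commands; it is an illustrative subset for this article, not a faithful HP-GL parser.

```python
# A toy pen-up/pen-down interpreter: only moves made while the pen is
# down leave a mark on the paper.

def run_plotter(commands):
    """Return the line segments the pen would draw on paper."""
    x, y = 0.0, 0.0
    pen_down = False
    segments = []
    for cmd in commands:
        op, *args = cmd.split()
        if op == "PU":
            pen_down = False
        elif op == "PD":
            pen_down = True
        elif op == "PA":
            nx, ny = (float(v) for v in args[0].split(","))
            if pen_down:
                segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
    return segments

# Draw one edge, then reposition with the pen lifted:
print(run_plotter(["PD", "PA 10,0", "PU", "PA 10,10"]))
# [((0.0, 0.0), (10.0, 0.0))]
```

Note how the second move produces no segment: the pen travels, but nothing is drawn.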

As in so many other fields, everything changed with the inventions of Xerox PARC. Researchers there began to develop a new kind of printer, one more computationally expressive than a plotter. This new printer ran a small, stack-based, Turing-complete programming language similar to Forth, and it was called... Interpress! Xerox, obviously, could not find a worthy application for it, so the inventors jumped ship and founded a small startup called Adobe. They took Interpress with them, and as they fixed and improved it, it changed beyond recognition, so they gave it a new name: PostScript. Besides the sweet, Turing-complete stack language, the fourth chapter of the original PostScript Language Reference describes the Imaging Model, which is almost identical to modern programming interfaces. Example 4.1 from the manual contains sample code that can be translated almost line by line into the HTML5 canvas API.

/box {                  function box() {
    newpath                 ctx.beginPath();
    0 0 moveto              ctx.moveTo(0, 0);
    0 1 lineto              ctx.lineTo(0, 1);
    1 1 lineto              ctx.lineTo(1, 1);
    1 0 lineto              ctx.lineTo(1, 0);
    closepath               ctx.closePath();
} def                   }
gsave                   ctx.save();
72 72 scale             ctx.scale(72, 72);
box fill                box(); ctx.fill();
2 2 translate           ctx.translate(2, 2);
box fill                box(); ctx.fill();
grestore                ctx.restore();

This is not a coincidence.

Apple's Steve Jobs had met the Interpress engineers during his visit to PARC. Jobs thought the printing business would be profitable and tried to buy Adobe outright at birth. Adobe made a counteroffer instead and eventually sold Apple a five-year PostScript license. The third pillar of Jobs's plan was funding a small startup, Aldus, which made a WYSIWYG application for creating PostScript documents. It was called PageMaker. In early 1985, Apple introduced the first PostScript-compatible printer, the Apple LaserWriter. The combination of the Macintosh, PageMaker, and the LaserWriter instantly turned the printing industry upside down, and the new hit, "desktop publishing", cemented PostScript's place in history. Apple's main competitor, Hewlett-Packard, eventually purchased a PostScript license as well for its rival LaserJet printer series; this happened in 1991, after pressure from consumers.

PostScript slowly moved from a printer control language to a file format in its own right. Clever programmers observed the PostScript commands being sent to their printers and started writing PostScript documents by hand, inserting charts, graphs, and drawings into their documents, with PostScript used to render the graphics. There was demand for graphics outside the printer! Adobe noticed and quickly released the Encapsulated PostScript format, which was little more than a few specially formatted PostScript comments carrying metadata about image size, plus restrictions on the use of printer commands such as "page feed". That same year, 1985, Adobe began developing Illustrator, an application that let artists author Encapsulated PostScript files in a convenient UI. Those files could then be placed into a word processor, which created... PostScript documents to be sent to PostScript printers. The whole world had switched to PostScript, and Adobe could not have been happier. When Microsoft was working on Windows 1.0 and wanted to create its own graphics API for developers, a primary goal was compatibility with existing printers, so that graphics could be sent to a printer as easily as to the screen. That API was eventually released as GDI, a core component used by every engineer during Windows's meteoric rise in the '90s. Generations of Windows programmers began to unknowingly equate 2D vector graphics with the PostScript imaging model, cementing its status as the de facto standard.

The only major problem with PostScript was its Turing completeness: viewing page 86 of a document means first running the script for pages 1-85. And that can be slow. Adobe heard this complaint from users and decided to create a new document format without that restriction, called the "Portable Document Format", or "PDF" for short. The programming language was thrown out, but the graphics technology stayed the same. To quote the PDF specification, chapter 2.1, "Imaging Model":

At the heart of PDF is its ability to describe the appearance of sophisticated graphics and typography. This is achieved through the use of the Adobe imaging model, the same high-level, device-independent representation used in the PostScript page description language.

When the W3C considered candidates for a 2D graphics markup language for the web, Adobe championed the XML-based PGML, which was based on the PostScript graphics model:

PGML should include the PDF/PostScript imaging model to guarantee scalable 2D graphics that meet the needs of both casual users and graphics professionals.

Microsoft's competing VML format was based on GDI, which, as we now know, was based on PostScript. The two competing proposals, both essentially PostScript at heart, were merged, and the W3C adopted the "Scalable Vector Graphics" (SVG) standard we know and love today.

Even if it is old, let's not pretend that the innovations PostScript brought into the world are anything less than a technological miracle. Apple's LaserWriter had a CPU twice as powerful as the Macintosh that drove it, just to interpret PostScript and rasterize vector paths into dots on paper. That might sound excessive, but if you were already buying a fancy laser printer, the expensive CPU inside shouldn't surprise you. In its first incarnation, PostScript invented a fairly sophisticated imaging model, with all the features we take for granted today. But the most powerful, killer feature? Fonts. At the time, fonts were drawn by hand with ruler and protractor and cast onto film for photochemical printing. In 1977, Donald Knuth showed the world what his METAFONT system, introduced together with his TeX typesetting system, was capable of, but it never caught on. It required the user to describe fonts mathematically, with brushes and curves, and most font designers did not want to learn that. Worse, the intricate curves turned into a mess at small sizes: the printers of the day did not have sufficient resolution, so the letters smudged and ran into each other. PostScript proposed a new solution: an algorithm to "snap" outlines to the coarser grids that printers used. This is known as grid-fitting. To keep the geometry from distorting too much, fonts were allowed to provide "hints" about which parts of the geometry were most important and should be preserved.

Adobe's original business model was to sell this font technology to printer manufacturers, and to sell publishers specially recreated fonts with hints added, which is why Adobe still sells its own versions of Times and Futura. This is possible, incidentally, because fonts, or, more formally, "typefaces", are one of five things explicitly excluded from US copyright law, having originally been deemed too plain or utilitarian to be creative works. What is copyrightable instead is the digital program that reproduces the font on the screen. So that people could not copy Adobe's fonts and add their own hints, the Type 1 Font format was originally proprietary to Adobe and contained "font encryption" code. Only Adobe's PostScript could interpret Type 1 fonts, and only Type 1 fonts carried the proprietary hinting technology that delivered crispness at small sizes.

Grid-fitting, by the way, became so popular that when Microsoft and Apple got tired of paying Adobe licensing fees, they invented an alternative method for their alternative TrueType font format. Instead of declarative "hints", TrueType gives the font author a full, Turing-complete stack language so the author can control every aspect of grid-fitting (thereby sidestepping Adobe's patents on declarative hints). For years, a format war raged between Adobe's Type 1 and TrueType, with font designers stuck in the middle, shipping both formats to users. Eventually, the industry reached a compromise: OpenType. But instead of actually crowning a winner, they simply flopped both specifications into a single file format. Adobe was by then making its money from Photoshop and Illustrator rather than from selling Type 1 fonts, so it removed the encryption, polished up the format, and introduced CFF / Type 2 fonts, which went into OpenType wholesale as the cff table. TrueType, on the other hand, went in as the glyf table and several others. OpenType, while ugly, seemed to get the job done for users, mostly by demanding everything of everyone: just require all software to support both kinds of fonts, because OpenType requires you to support both kinds of fonts.

Of course, we are forced to ask: if not PostScript, then what? Other contenders are worth considering. The aforementioned METAFONT did not use filled outline paths at all. Instead, in his typically Knuthian way, Knuth proposed, in his paper "Mathematical Typography", a mathematical notion of the "most pleasing curve" for typography. You specify several points, and an algorithm finds the correct "most pleasing" curve through them. You can layer these strokes on top of each other: define one as a "pen", and then "drag the pen" along some other line. Knuth, a computer scientist at heart, even added recursion. His student John Hobby developed and implemented the algorithms for computing the "most pleasing curve", offsetting nested paths, and rasterizing such curves. For more on METAFONT, curves, and the history of typography in general, I highly recommend the book Fonts & Encodings, as well as John Hobby's papers.

Fortunately, the renewed interest in 2D graphics research means Knuth's and Hobby's splines are not entirely forgotten. While definitely arcane and unconventional, they recently made their way into the Apple iWork suite, where they are the default spline type.

The rise of triangles


Without wandering too deep into the mathematical weeds, at a high level we call approaches like Bézier curves and Hobby splines implicit curves, because they are specified as a mathematical function that generates the curve. They look good at any resolution, which is exactly what you want for 2D images meant to be scaled.
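To make "a mathematical function that generates the curve" concrete, here is a minimal sketch of evaluating a cubic Bézier curve with De Casteljau's algorithm: repeated linear interpolation between control points yields the point at any parameter t, at any resolution you like.

```python
# De Casteljau's algorithm: evaluate a cubic Bezier curve by repeated
# linear interpolation between its four control points.

def lerp(a, b, t):
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)

def cubic_bezier(p0, p1, p2, p3, t):
    """Point on the cubic Bezier with control points p0..p3, t in [0, 1]."""
    a, b, c = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    d, e = lerp(a, b, t), lerp(b, c, t)
    return lerp(d, e, t)

# Endpoints are interpolated exactly; intermediate points lie in the hull.
p0, p1, p2, p3 = (0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)
print(cubic_bezier(p0, p1, p2, p3, 0.0))   # (0.0, 0.0)
print(cubic_bezier(p0, p1, p2, p3, 0.5))   # (0.5, 0.75)
```

Because the curve is a function of t, a renderer can sample it as finely as the output device demands; nothing about the representation is tied to a pixel grid.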

2D graphics built up momentum around these implicit curves, which are practically mandatory when modeling glyphs. The hardware and software needed to compute these paths in real time was expensive, but the printing industry gave vector graphics a big push, and most of the other equipment in an industrial print shop already cost far more than a laser printer with a fancy CPU.

3D graphics, however, went a completely different route. From very early on, the almost universal approach was polygons, which were often labeled by hand and typed into the computer manually. The approach was not entirely universal, though. The 3D equivalent of an implicit curve is an implicit surface, built out of basic geometric primitives such as spheres, cylinders, and cubes. A perfect sphere with infinite resolution can be represented by a simple equation, so in the early days of 3D, implicit surfaces were a clear win over polygons for geometry. One of the few companies doing graphics with implicit surfaces was MAGI. Combined with clever artistic use of procedural textures, this won them the contract with Disney to design the light cycles for the 1982 film Tron. Unfortunately, the approach soon died out. As CPUs got faster and problems like "hidden surface removal" were studied, the number of triangles you could display in a scene grew rapidly, and for complex shapes it was much easier for artists to think in terms of polygons and vertices they could click and drag than in combinations of cubes and cylinders.

This is not to say that implicit surfaces were absent from the modeling process. Techniques like the Catmull-Clark algorithm had become a standard industry tool by the early '80s, letting artists create smooth, organic shapes from simple geometry. Curiously, it was not until the early 2000s that Catmull-Clark was even defined as an "implicit surface" that could be computed with an equation; until then, it was thought of as an iterative algorithm: a way of splitting polygons into more polygons.
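The "splitting into more polygons" idea is easiest to see in one dimension lower. Here is a minimal sketch of Chaikin's corner-cutting scheme, the curve analogue of mesh subdivision: Catmull-Clark itself operates on polygon meshes and is considerably more involved, so Chaikin stands in here purely to illustrate how repeated refinement of a coarse shape converges to a smooth limit.

```python
# Chaikin's corner-cutting: each edge (P, Q) of an open polyline is
# replaced by the two points 3/4 P + 1/4 Q and 1/4 P + 3/4 Q. Repeating
# this converges to a smooth (quadratic B-spline) curve -- subdivision
# as iteration, the same spirit as Catmull-Clark on meshes.

def chaikin(points, iterations=1):
    for _ in range(iterations):
        refined = []
        for (px, py), (qx, qy) in zip(points, points[1:]):
            refined.append((0.75 * px + 0.25 * qx, 0.75 * py + 0.25 * qy))
            refined.append((0.25 * px + 0.75 * qx, 0.25 * py + 0.75 * qy))
        points = refined
    return points

corner = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(chaikin(corner, 1))
# [(0.25, 0.0), (0.75, 0.0), (1.0, 0.25), (1.0, 0.75)] -- the sharp
# corner at (1, 0) has been cut off
```

Each pass roughly doubles the point count, which is exactly the trade-off the article describes: the smooth limit surface is only ever approximated by more and more polygons.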

Triangles conquered the world, and the tools for creating 3D content followed. New generations of video game developers and film effects artists were trained exclusively on polygon-mesh modeling programs like Maya, 3DS Max, and Softimage. When "3D graphics accelerators" (GPUs) arrived on the scene, they were designed to accelerate the content that already existed: triangles. Early GPU designs like the NVIDIA NV1 included limited hardware support for curves, but it was buggy and quickly dropped from the product line.

That culture largely persists to this day. The dominant 2D imaging model, PostScript, began with a product that could render curves in "real time". The 3D industry, meanwhile, wrote curves off as too hard to work with, and instead relied on offline tools to convert curves to triangles ahead of time.

The return of implicit surfaces


But why could implicit 2D curves be computed in real time on a printer in the '80s, while the same implicit 3D curves were still buggy in the early 2000s? Well, Catmull-Clark is a lot more complicated than a Bézier curve. The 3D counterparts of Bézier curves, B-splines, are computationally tractable, but they have the drawback of constraining how the mesh can be connected. Surfaces like Catmull-Clark and NURBS allow arbitrarily connected meshes, expanding what artists can do, but this can produce polynomials of degree higher than four, which in general have no analytic solution. Instead, you get approximations based on subdividing polygons, as in Pixar's OpenSubdiv. If anyone ever finds an analytic way to compute the roots of Catmull-Clark or NURBS surfaces, Autodesk will pay them handsomely. By comparison, triangles look much friendlier: just compute three linear equations in the plane, and you have an easy answer.
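Those "three linear equations" can be sketched directly. The following edge-function test, the core of most triangle rasterizers, decides whether a point is inside a triangle by checking that it lies on the same side of all three (consistently wound) edges.

```python
# Point-in-triangle via three edge functions: each edge contributes one
# linear equation; a point is inside iff all three agree on its side.

def edge(ax, ay, bx, by, px, py):
    """Signed area term: > 0 means P is to the left of the edge A->B."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def inside_triangle(a, b, c, p):
    w0 = edge(*a, *b, *p)
    w1 = edge(*b, *c, *p)
    w2 = edge(*c, *a, *p)
    # Same sign for all three edges => inside (counter-clockwise winding).
    return w0 >= 0 and w1 >= 0 and w2 >= 0

tri = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))   # counter-clockwise
print(inside_triangle(*tri, (1.0, 1.0)))     # True
print(inside_triangle(*tri, (3.0, 3.0)))     # False
```

No root-finding, no iteration: three multiplies and subtracts per edge, trivially parallel across pixels, which is exactly why GPUs embraced triangles.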

...But what if we don't need an exact solution? That is the question graphics programmer Íñigo Quílez asked while researching implicit surfaces. The answer? Signed distance fields (SDFs). Instead of giving you the exact point of intersection with a surface, they tell you how far away from it you are. Much like the difference between an analytically computed integral and Euler integration, if you know the distance to the nearest object, you can "march" through the scene, asking at each point how far away you are and stepping forward by that distance. Such surfaces have breathed new life into the industry through the demoscene and communities like Shadertoy. This hack on MAGI's old modeling technique gives us incredible finds like Quílez's Surfer Boy, computed with infinite precision as an implicit surface. You never find the algebraic roots of Surfer Boy; you just feel your way through the scene.
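The marching loop itself is tiny. Here is a minimal sphere-tracing sketch against a one-sphere scene; a real SDF scene composes many such primitives (typically with min/max), but the stepping logic is the same.

```python
# Sphere tracing: march a ray through a scene described only by a signed
# distance function. Scene: a sphere of radius 1 centered at (0, 0, 3).
import math

def sdf_sphere(x, y, z):
    # Distance to the sphere's surface: negative inside, positive outside.
    return math.sqrt(x * x + y * y + (z - 3.0) ** 2) - 1.0

def march(ox, oy, oz, dx, dy, dz, max_steps=64, eps=1e-4):
    """Step along the ray by the distance to the nearest surface until we
    hit it (distance ~ 0) or give up. Returns distance travelled, or None."""
    t = 0.0
    for _ in range(max_steps):
        d = sdf_sphere(ox + t * dx, oy + t * dy, oz + t * dz)
        if d < eps:
            return t          # close enough: call it a hit
        t += d                # safe to step this far without overshooting
        if t > 100.0:
            break
    return None               # ray escaped the scene

print(march(0, 0, 0, 0, 0, 1))   # ~2.0: hits the front of the sphere
print(march(0, 0, 0, 0, 1, 0))   # None: this ray misses entirely
```

The key property is that the distance value is always a safe step size: the ray can never tunnel through a surface, no matter how complicated the scene's equation gets.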

Of course, the problem is that only a genius like Quílez can build a Surfer Boy. There are no real tools for SDF geometry yet; all the code is written by hand. Still, given the exciting revival of implicit surfaces and the natural curved shapes they allow, there is now a lot of interest in the technique. Media Molecule's Dreams on the PS4 is a content creation kit built on composing implicit surfaces, tearing down and rebuilding most of the traditional graphics pipeline along the way. It is a promising approach, and the tools are intuitive and fun. Oculus Medium and unbound.io have also done good research here. It is definitely a promising glimpse of what the future of 3D graphics and next-generation tools might look like.

But some of these approaches are less suited to 2D than you might think. Typical 3D game scenes tend to have rich materials and textures but relatively little geometric detail, as many critics and vendors of dubious products are quick to point out. That means less need for antialiasing, because silhouettes matter less. Approaches like 4x MSAA may be fine for plenty of games, but for small fonts in solid colors, instead of 16 fixed sample locations you will likely want to compute the exact area under the curve for each pixel, which gives you as much resolution as you could ask for.
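The difference between fixed sample locations and exact area is easy to demonstrate on a toy case. The edge below (y < 0.3 + 0.6x, chosen hypothetically so the exact area is a one-line integral) crosses a unit pixel; a 16-sample grid mis-estimates the coverage that analytic integration gets exactly. Real rasterizers do this per curve segment, per pixel.

```python
# Fixed-grid sampling vs. the exact area for a half-plane edge
# y < 0.3 + 0.6 * x crossing the unit pixel [0,1] x [0,1].

def sampled_coverage(n):
    """Estimate coverage with an n x n grid of fixed sample points."""
    hits = 0
    for i in range(n):
        for j in range(n):
            x = (i + 0.5) / n
            y = (j + 0.5) / n
            if y < 0.3 + 0.6 * x:
                hits += 1
    return hits / (n * n)

# Exact area under the edge inside the pixel:
# integral of (0.3 + 0.6 x) dx from 0 to 1 = 0.3 + 0.3 = 0.6.
exact = 0.6

print(sampled_coverage(4), exact)   # 16 fixed samples: 0.5625 vs. 0.6
```

Sixteen samples quantize coverage to multiples of 1/16, which is visible as ragged edges on thin glyph stems; the analytic area has no such quantization at any zoom level.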

Rotating the camera in a 3D game produces an effect similar to saccadic suppression, as the brain readjusts to the new view. In many games this helps hide artifacts from post-processing effects like temporal antialiasing, which Dreams and unbound.io lean on heavily to get good performance out of their scenes. In a typical 2D scene, by contrast, we don't have that perspective luxury, so trying to use these techniques makes glyphs and shapes boil and shimmer with artifacts in plain view. 2D simply looks different, and expectations are higher. Stability under zooming, panning, and scrolling matters.

None of this means these effects cannot run on a GPU, but it does show a radical departure from "3D" content, with different priorities. Ultimately, rendering 2D graphics is hard because it is about shapes, exact letters and symbols, rather than materials and lighting; the fills are mostly solid colors. As graphics accelerators evolved, they chose not to compute implicit geometry like curves in real time, and focused instead on everything that happens inside those curves. Perhaps if PostScript hadn't won, we would have a 2D imaging model without Bézier curves as a core real-time requirement. Perhaps in such a world better geometric representations would have beaten triangles, content creation tools would have centered on 3D splines, and GPUs would have supported curves in hardware, in real time. In the end, it's always fun to dream.
