SVG From the Ground Up — Along the right path

<svg xmlns="http://www.w3.org/2000/svg" width="22" height="22" viewBox="0 0 22 22">
    <path d="M20.658,9.26l0.006-0.007l-9-8L11.658,1.26C11.481,1.103,11.255,1,11,1 
    c-0.255,0-0.481,0.103-0.658,0.26l-0.006-0.007l-9,8L1.342,9.26
    C1.136,9.443,1,9.703,1,10c0,0.552,0.448,1,1,1 c0.255,0,0.481-0.103,0.658-0.26l0.006,0.007
    L3,10.449V20c0,0.552,0.448,1,1,1h5v-8h4v8h5c0.552,0,1-0.448,1-1v-9.551l0.336,0.298 
    l0.006-0.007C19.519,10.897,19.745,11,20,11c0.552,0,1-0.448,1-1C21,9.703,20.864,9.443,20.658,9.26z 
    M7,16H5v-3h2V16z M17,16h-2 v-3h2V16z"/>
</svg>

In the last installment (It’s XML, How hard could it be?), we got as far as being able to scan XML, and generate a bunch of XmlElement objects. That’s a great first couple of steps, and now the really interesting parts begin. But, first, before we get knee deep into the seriousness of the rest of SVG, we need to deal with the graphics subsystem. It’s one thing to ‘parse’ SVG, and even build up a Document Object Model (DOM). It’s quite another to actually do the rendering of the same. To do both, in a compact form, with speed and agility, that’s what we’re after.

This time around I’m going to introduce blend2d, which is the graphics library that I’m using to do all my drawing. I stumbled across blend2d a few years ago, and I don’t even remember how I found it. There are a couple of key aspects to it that are of note. One is that the API is really simple to use, and the library is easy to build. The other part, is more esoteric, but perfect for our needs here. The library was built around support for SVG. So, it has all the functions we need to build the typical kinds of graphics that we’re concerned with. I won’t go into excruciating detail about the blend2d API here, as you can visit the project on github, but I will take a look at the BLPath object, because this is the true workhorse of most SVG graphics.

The little house graphic above is typical of the kinds of little vector based icons you find all over the place. In your apps as icons, on Etsy as laser cuttable images, etc. Besides the opening ‘<svg…’, you see the ‘<path…’. SVG images are comprised of various geometry elements such as rect, circle, polyline, polygon, and path. If you really want to get into the nitty gritty details, you can check out the full SVG Specification.

The path geometry is used to describe a series of movements a pen might make on a plotter. MoveTo, LineTo, CurveTo, etc. There are a total of 20 commands you can use to build up a path, and they can used in almost any combination to create as complex a figure as you want.

// Shaper contour Commands
// Origin from SVG path commands
// M - move       (M, m)
// L - line       (L, l, H, h, V, v)
// C - cubic      (C, c, S, s)
// Q - quad       (Q, q, T, t)
// A - ellipticArc  (A, a)
// Z - close        (Z, z)
enum class SegmentCommand : uint8_t
{
    INVALID = 0
    , MoveTo = 'M'
    , MoveBy = 'm'
    , LineTo = 'L'
    , LineBy = 'l'
    , HLineTo = 'H'
    , HLineBy = 'h'
    , VLineTo = 'V'
    , VLineBy = 'v'
    , CubicTo = 'C'
    , CubicBy = 'c'
    , SCubicTo = 'S'
    , SCubicBy = 's'
    , QuadTo = 'Q'
    , QuadBy = 'q'
    , SQuadTo = 'T'
    , SQuadBy = 't'
    , ArcTo = 'A'
    , ArcBy = 'a'
    , CloseTo = 'Z'
    , CloseBy = 'z'
};

A single path has a ‘d’ attribute, which contains a series of these commands strung together. It’s a very compact description for geometry. A single path can be used to generate something quite complex.

With the exception of the blue text, that entire image is generated with a single path element. Quite complex indeed.

Being able to parse the ‘d’ attribute of the path element is super critical to our success in ultimately rendering SVG. There are a few design goals we have in doing this.

  • Be as fast as possible

  • Be as memory efficient as possible

  • Do not introduce intermediate forms if possible

No big deal right? Well, as luck would have it, or rather by design, the blend2d library has an object, BLPath, which was designed for exactly this task. You can checkout the API documentation if you want to look at the details, but it essentially has all those ‘moveTo’, ‘lineTo’, etc, and a whole lot more. It only implements the ‘to’ forms, and not the ‘by’ forms, but it’s easy to get the last vertex and implement the ‘by’ forms ourselves, which we’ll do.

So, our implementation strategy will be to read a command, and read enough numbers to make a call to a BLPath object to actually create the geometry. The entirety of the code is roughly 500 lines, and most of it is boilerplate, so I won’t bother listing it all here, but you can check it out online in the parseblpath.h file.

Let’s look at a little snippet of our house path, and see what it’s doing.

M20.658,9.26l0.006-0.007l-9-8

It is hard to see in this way, so let me write it another way.

M 20.658, 9.26
l 0.006, -0.007
l -9, -8

Said as a series of instructions (and it’s hard to tell between ‘one’ and ‘el’), it would be:

Move to 20.658, 9.26
Line by 0.006, -0.007
Line by -9, -8

If I were to do it as code in blend2d, it would be

BLPath path{};
BLPoint lastPt{};
 
path.moveTo(20.658, 9.26);
path.getLastVertex(&lastPt);
 
path.lineTo(lastPt.x + 0.006, lastPt.y+ -0.007);
path.getLastVertex(&lastPt);
 
path.lineTo(lastPt.x + -9, lastPt.y + -8);

So, our coding task is to get from that cryptic ‘d’ attribute to the code connecting to the BLPath object. Let’s get started.

The first thing we’re going to need is a main routine that drives the process.

tatic bool parsePath(const ByteSpan& inSpan, BLPath& apath)
{
    // Use a ByteSpan as a cursor on the input
    ByteSpan s = inSpan;
    SegmentCommand currentCommand = SegmentCommand::INVALID;
    int iteration = 0;
 
    while (s)
    {
        // ignore leading whitespace
        s = chunk_ltrim(s, whitespaceChars);
 
        // If we've gotten to the end, we're done
        // so just return
        if (!s)
            break;
 
        if (commandChars[*s])
        {
            // we have a command
            currentCommand = SegmentCommand(*s);
             
            iteration = 0;
            s++;
        }
 
        // Use parseMap to dispatch to the appropriate
        // parse function
        if (!parseMap[currentCommand](s, apath, iteration))
            return false;
 
 
    }
 
    return true;
}

Takes a ByteSpan and a reference to a BLPath object, and returns ‘true’ if successful, ‘false’ otherwise. There are design choices to be made at every step of course. Why did I pass in a reference to a BLPath, instead of just constructing one inside the routine, and handing it back? Well, because this way, I allow something else to decide where the memory is allocated. This way also allows you to build upon an existing path if you want.

Second choice is, why a const ByteSpan? That’s a harder one. This allows a greater number of choices in terms of where the ByteSpan is coming from, such as you might have been passed a const span to begin with. But mainly it’s a contract that says “this routine will not alter the span.

OK, so following along, we make a reference to the input span, which does NOT copy anything, just sets up a couple of pointers. Then we use this ‘s’ span to do our movement. The main ‘while’ starts with a ‘trim’. XML, and thus SVG, are full of optional whitespace. I can say that for almost every routine, the first thing you want to do is eliminate whitespace. the ‘chunk_ltrim()’ function is very short and efficient, so liberal usage of that is a good thing.

Now we’re sitting at the ‘M’, so we first check to see if it’s one of our command characters. If it is, then we use it as our current command, and advance our pointer. The ‘iteration = 0’ is only useful for the Move commands, but we need that, as we’ll soon see.

Last, we have that cryptic function call thing

if (!parseMap[currentCommand](s, apath, iteration))
    return false;

All set! Easy peasy, our task is done here…

That last little bit of function call trickery is using a dispatch table to make a call to a function. So let’s look at the dispatch table.

// A dispatch std::map that matches the command character to the
// appropriate parse function
static std::map<SegmentCommand, std::function<bool(ByteSpan&, BLPath&, int&)>> parseMap = {
    {SegmentCommand::MoveTo, parseMoveTo},
    {SegmentCommand::MoveBy, parseMoveBy},
    {SegmentCommand::LineTo, parseLineTo},
    {SegmentCommand::LineBy, parseLineBy},
    {SegmentCommand::HLineTo, parseHLineTo},
    {SegmentCommand::HLineBy, parseHLineBy},
    {SegmentCommand::VLineTo, parseVLineTo},
    {SegmentCommand::VLineBy, parseVLineBy},
    {SegmentCommand::CubicTo, parseCubicTo},
    {SegmentCommand::CubicBy, parseCubicBy},
    {SegmentCommand::SCubicTo, parseSmoothCubicTo},
    {SegmentCommand::SCubicBy, parseSmoothCubyBy},
    {SegmentCommand::QuadTo, parseCubicTo},
    {SegmentCommand::QuadBy, parseCubicBy},
    {SegmentCommand::SQuadTo, parseSmoothCubicTo},
    {SegmentCommand::SQuadBy, parseSmoothCubyBy},
    {SegmentCommand::ArcTo, parseArcTo},
    {SegmentCommand::ArcBy, parseArcBy},
    {SegmentCommand::CloseTo, parseClose},
    {SegmentCommand::CloseBy, parseClose}
};

Dispatch tables are the modern day C++ equivalent of the giant switch statement typically found in such programs. I actually started with the giant switch statement, then said to myself, “why don’t I just use a dispatch table”. They are functionally equivalent. In this case, we have a std::map, which uses a single SegmentCommand as the key. Each element is tied to a function, that takes the same set of parameters, namely a ByteSpan, a BLPath, and an int. As you can see, there is a function for each of our 20 commands.

I won’t go into every single one of those 20 commands, but looking at a couple will be instructive. Let’s start with the MoveTo

static bool parseMoveTo(ByteSpan& s, BLPath& apath, int& iteration)
{
    double x{ 0 };
    double y{ 0 };
 
    if (!parseNextNumber(s, x))
        return false;
    if (!parseNextNumber(s, y))
        return false;
 
    if (iteration == 0)
        apath.moveTo(x, y);
    else
        apath.lineTo(x, y);
 
    iteration++;
 
    return true;
}

This has a few objectives.

  • Parse a couple of numbers

  • Call the appropriate function on the BLPath object

  • Increment the ‘iteration’ parameter

  • Advance the pointer, indicating how much we’ve consumed

  • Return false on failure, true on success

This pattern is repeated for every other of the 19 functions. One thing to know about all the commands, and why the main loop is structured the way it is, you can have multiple sets of numbers after the initial set. In the case of MoveTo, the following is a valid input stream.

	
M 0,0 20,20 30,30 40,40

The way you treat it, in the case of MoveTo, is the initial numbers set an origin (0,0), all subsequent number pairs are implied LineTo commands. That’s why we need to know the iteration. If the iteration is ‘0’, then we need to call moveTo on the BLPath object. If the iteration is greater than 0, then we need to call lineTo on the BLPath. All the commands behave in a similar fashion, except they don’t change based on the iteration number.

Well gee whiz, that seems pretty simple and straightforward. Don’t know what all the fuss is about. Hidden within the parseMoveTo() is parseNextNumber(), so let’s take a look at that as this is where all the bugs can be found.

// Consume the next number off the front of the chunk
// modifying the input chunk to advance past the  number
// we removed.
// Return true if we found a number, false otherwise
        static inline bool parseNextNumber(ByteSpan& s, double& outNumber)
        {
            static charset whitespaceChars(",\t\n\f\r ");          // whitespace found in paths
 
            // clear up leading whitespace, including ','
            s = chunk_ltrim(s, whitespaceChars);
 
            ByteSpan numChunk{};
            s = scanNumber(s, numChunk);
 
            if (!numChunk)
                return false;
 
            outNumber = chunk_to_double(numChunk);
 
            return true;
        }

The comment gives you the flavor of it. Again, we start with trimming ‘whitespace’, before doing anything. This is very important. In the case of these numbers, ‘whitespace’ not only includes the typical 0x20 TAB, etc, but also the COMMA (‘,’) character. “M20,20” and “M20 20” and “M 20 20” and “M 20, 20” and even “M,20,20” are all equivalent. So, if you’re going to be parsing numbers in a sequence, you’re going to have to deal with all those cases. The easiest thing to do is trim whitespace before you start. I will point out the convenience of the charset construction. Super easy.

We trim the whitespace off the front, then call ‘scanNumber()’. That’s another workhourse routine, which is worth looking into, but I won’t put the code here. You can find it in the bspanutil.h file. I will put the comment associated with the code here though, as it’s informative.

// Parse a number which may have units after it
//   1.2em
// -1.0E2em
// 2.34ex
// -2.34e3M10,20
// 
// By the end of this routine, the numchunk represents the range of the 
// captured number.
// 
// The returned chunk represents what comes next, and can be used
// to continue scanning the original inChunk
//
// Note:  We assume here that the inChunk is already positioned at the start
// of a number (including +/- sign), with no leading whitespace

This is probably the most singularly important routine in the whole library. It has the big task of figuring out numbers from a stream of characters. Those numbers, as you can see from the examples, come in many different forms, and things can get confusing. Here’s another example of a sequence of characters it needs to be able to figure out: “M-1.7-82L.92 27”. You save yourself a ton of time, headache, and heartburn by getting this right.

The next choice you make is how to convert from the number that we scanned (it’s still just a stream of ASCII characters) into an actual ‘double’. This is the point where most programmers might throw up their hands and reach for their trusty ‘strtod’ or ye olde ‘atof’, or even ‘sprintf’. There’s a whole science to this, just know that strtod() is not your friend, and for something you ‘ll be doing millions of times, it’s worth investigating some alternatives. I highly recommend reading the code for fast_double_parser. If you want to examine what I do, checkout the chunk_to_double() routine within the bspanutil.h file.

We’re getting pretty far into the weeds down here, so let’s look at one more function, the LineTo

static bool parseLineTo(ByteSpan& s, BLPath& apath, int& iteration)
{
    double x{ 0 };
    double y{ 0 };
 
    if (!parseNextNumber(s, x))
        return false;
    if (!parseNextNumber(s, y))
        return false;
 
    apath.lineTo(x, y);
 
    iteration++;
 
    return true;
}

Same as MoveTo, parse a couple of numbers, apply them to the right function on the path object, return true or false. Just do the same thing 18 more times for the other functions, and you’ve got your path ‘parser’.

To recap, parsing the ‘d’ parameter is one of the most important parts of any SVG parser. In this case, we want to get from the text to an actual object we can render, as quickly as possible. A BLPath alone is not enough to create great images, we still have a long way to go until we start seeing pretty pictures on the screen. Parsing the path is critical to getting there though. This is where you could waste tons of time and memory, so it’s worth considering the options carefully. In this case, we’ve chosen to represent the path in memory using a data structure that can be a part of a graphic elements tree, as well as being handed to the drawing engine directly, without having to transform it once again before actually drawing.

There you have it. One step closer to our beautiful images.

Next time around, we need to look at what kind of Document Object Model (DOM) we want to construct, and how our SVG parser will construct it.

Previous
Previous

SVG From the Ground Up — Can you imaging that?

Next
Next

SVG From the Ground Up — It’s XML, How hard could it be?