Primitive Image: Approximating Images with Basic Shapes

Sunday, March 10, 2019 - 7 mins

Before you start, head to my gallery of photos from China, press ‘g’ and click on some images. These are some examples of “primitive images”.

Scalable Vector Graphic (SVG) is a vector image format, meaning that the picture is described by a series of lines, paths, and other features. This is usually good for graphics, not photographs. However, a while ago I stumbled upon a tool called Primitive, written in Go, which approximated photographs using hundreds or thousands of primitive shapes - triangles, rectangles, ellipses, and bezier paths. This results in pretty cool, abstract representations of the original.

While they are art in their own right, these primitive representations can also be useful when displaying a large number of photographs online, especially large ones. Photos, even in JPEG form, are large and can take a while to load, which can frustrate users. One solution to that problem is to load a placeholder, which is what I used the original Primitive program for when I built the gallery. When you click on a thumbnail there (press ‘g’ again if you already did), you’ll probably notice what looks like a blurred version of that picture for a second or two. As described in my previous post, I generated a 100 shape SVG approximation of each image, added a blur filter to it, and had the page put the SVG in the real image’s place until it can load. The ‘g’ command I described at the beginning turns off the blur and does’t load the original image (just for fun, I think they look cool).

After I used the original Primitive program for these images, I decided that I wanted to understand how it worked, wanted to practice my Rust skills, and perhaps most importantly, had too much free time. If you don’t want to read this whole thing, the code is available on Github, including executables for Windows, Linux, and MacOS, so you can mess around with it yourself. If you do, please share your results (and of course, any bugs)!

I started by focusing on triangles. They seemed like the simplest shape to recreate, and I really like the output from the original when it used them. This is the image I used for most of my early tests:

And this is the output using 100 triangles from the original program:

I started with a basic single file Rust program, which I don’t have a copy of anymore (I didn’t have it on Git yet unfortunately). The general concept is fairly simple, but as always, the devil is in the details. The basic concept is that you generate a random shape, add it to the approximation, and see if it made an improvement. Like the original, I used the root-mean-square error (RMSE) to compare the original and the approximation. This is essentially taking the square root of the average of the squared difference between each corresponding pixel. Anyways, if a shape reduced the RMSE, it was considered and improvement and kept.

After a certain number of mutations that fail to improve on a shape, the shape is added to the final approximation. That process is repeated until you get to the number of shapes the user specified.

My first attempt ended up looking like this:

You can see it clearly has the darker region that it’s supposed to have, but that’s about it. Other than the fairly obvious color issues, there’s the problem of the border areas not being properly covered. Some of the early mistakes I made include:

Attempting to mutate the color of the triangles. It turns out that what the original project did is flat out better, which is using the average color of the original image in the location of the triangle.
Not allowing the vertices of triangles to extend beyond the edges of the image. This relatively simple change resulted in much better results.
Efficiency. Not really one single mistake, but the thing took a while to run.

I eventually fixed the first two, mostly. After these changes and other tweaks, I started getting results like this after 100 triangles:

The same run after nearly 1,000 triangles:

While the end result is a decent approximation of the original image, it used far more images to get there. The triangles were rather small and sparse and I it took overnight to generate it.

After this, I messed with Python for a bit, thinking that OpenCV and Numpy might be able to help, which they did in their own way. I realized that processing the full image took way to long, especially on the RMSE step, which would run at least 10,000 times per image (100 shapes, 100+ mutations per shape). So I went back to Rust and had it automatically shrink the input image down to 100 pixel on the largest dimension. This really sped it up, which allowed me to tweak the randomization parameters better. Eventually, I started getting results like this:

I worked on rewriting the Rust program into a better organized mess, which included providing an opportunity for other shapes, which I did end up adding.

I started with bezier curves (basically curves defined by, in my case, 3 or 4 points). These were tough and still aren’t exactly how I wanted them to end up. Using various resources (of which this was by far the most useful: A Primer on Bezier Curves), I learned that the curves are basically parametric equations (for SVGs, either quadratic or cubic in order) from t=0 to t=1. The hard part with them compared to triangles is that it is harder to define which pixels are on the line. The original project only used quadratic curves (1 control point), but because of the image processing library I used, I started with cubic curves (2 control points), before adding quadratics). Long story short, I got mixed results. Some of the time, they turned out okay, like this cubic bezier (I think 1,000 of them) approximation:

Most of the time however, they either looked like I embedded a toddler in my code, or, when combined with triangles, like a toddler kept interrupting:

Both of these were trying to approximate the same image. For comparison, here’s the same image approximated using Bezier curves with the original project:

With these mixed results, I moved on to rectangles and ellipses, which after some struggles with the rotational aspect, turned out fairly well. Here are some of these, and plenty of others too:

What I Learned

Language

I still like Rust! This is only my second project with it, but I’m already feeling pretty comfortable with the language. The only real downside seems to be the lack of libraries available for it. For example, making a cross platform GUI for this project would be fairly difficult right now. But other than that, it has really worked out well. There have been several times when its borrow checker has saved me from myself, and the fact that it doesn’t allow integer overflows by defaults definitely helped on numerous occasions. I do feel like my code is a bit cluttered from all of the type casting, but overall, I think it’s a good language.

Building and Releasing

Towards the end of the project, I decided I wanted to try and build binaries for other platforms - supposed that’s something Rust is good at. I tried without success to build a Linux binary on my Windows machine before discovering that Travis CI can build for all three major OSs and release them via Github’s Releases feature. After pretty much an entire day, I managed to learn quite a bit about Travis CI (and some more about Git along the way). The result is binaries that everyone should be able to use to mess with this project without installing Rust!