It's MUCH harder if you want adjacent borders to line up w/o gaps. You have to find all shared verts, run the smoothing algo on them, do the same for the coastlines / islands, and then put all the pieces back together. I think 10% of my code was cleaning dirty data (removing duplicate points, detecting and fixing self-intersections), 80% was breaking everything up and putting it back together, and 10% was the Visvalingam–Whyatt algo itself.
The biggest time sink was dealing w/ messy data. Between countries, states, counties, zip codes, neighborhoods etc, we hit A LOT of crappy corner cases. If anyone else is endeavoring to do the same, contact me and I can probably save you a week of work.
I'll probably package it up as a Node module so that it's easy to enable dynamic simplification of GeoJSON files. (A more efficient representation of shapes would also be helpful, though I like the convenience of plain text.) Or possibly a D3 plugin; it might be useful to extend the d3.geo.path shape generator to support more efficient dynamic filtering of points.
Have you experimented with topologically consistent simplification? In my experience, that's the hard part - otherwise when simplifying (e.g.) state borders, the borders no longer match up.
No, but that's next on my list! I'm familiar with at least one approach—identify borders shared by multiple features, simplify those independently, then stitch everything back together. I'm curious as to whether you can identify shared points and then use the average area of the associated triangles to prioritize removal.
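Roughly the shape I have in mind, as an untested sketch (it glosses over corner cases like enclaves; rings are arrays of [x, y] pairs closed with a repeated first point, and simplifyArc stands in for any point-reduction routine that keeps an arc's endpoints fixed):

    const key = ([x, y]) => x + "," + y;

    // A point is a junction if the rings through it disagree about its neighbours
    // (a tripoint where three states meet, or where a shared border reaches the
    // coast). Interior points of a border shared by exactly two rings see the same
    // two neighbours from both sides, so the whole border survives as one arc.
    function findJunctions(rings) {
      const seen = new Map(), junctions = new Set();
      for (const ring of rings) {
        const pts = ring.slice(0, -1), n = pts.length; // drop duplicated closing point
        for (let i = 0; i < n; i++) {
          const k = key(pts[i]);
          const nb = [key(pts[(i + n - 1) % n]), key(pts[(i + 1) % n])].sort().join("|");
          if (!seen.has(k)) seen.set(k, nb);
          else if (seen.get(k) !== nb) junctions.add(k);
        }
      }
      return junctions;
    }

    // Cut a closed ring into arcs that start and end at junctions; a ring touching
    // no junction (an island) stays a single closed arc.
    function cutIntoArcs(ring, junctions) {
      const pts = ring.slice(0, -1);
      const start = pts.findIndex(p => junctions.has(key(p)));
      if (start < 0) return [pts.concat([pts[0]])];
      const rot = pts.slice(start).concat(pts.slice(0, start + 1)); // rotate to start at a junction
      const arcs = [];
      let arc = [rot[0]];
      for (let i = 1; i < rot.length; i++) {
        arc.push(rot[i]);
        if (i === rot.length - 1 || junctions.has(key(rot[i]))) {
          arcs.push(arc);
          arc = [rot[i]];
        }
      }
      return arcs;
    }

    // Simplify each distinct arc exactly once (sharing the result with the ring on
    // the other side, even if it traverses the arc backwards), then stitch the
    // rings back together from the simplified arcs.
    function simplifyPreservingTopology(rings, simplifyArc) {
      const junctions = findJunctions(rings);
      const cache = new Map();
      const simplified = arc => {
        const k = arc.map(key).join(";"), rk = arc.map(key).reverse().join(";");
        if (cache.has(k)) return cache.get(k);
        if (cache.has(rk)) return cache.get(rk).slice().reverse();
        const s = simplifyArc(arc); // must keep arc[0] and arc[arc.length - 1]
        cache.set(k, s);
        return s;
      };
      return rings.map(ring =>
        cutIntoArcs(ring, junctions)
          .map(simplified)
          .flatMap((a, i) => (i === 0 ? a : a.slice(1))) // drop repeated joint points
      );
    }

The important property is that a border shared by two features becomes a single arc, gets simplified once, and is reused verbatim by both rings, so the simplified neighbours still meet edge for edge.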
I would be impressed to see an implementation of this. Topology is absolutely critical for serious GIS applications. Of course, ESRI makes it look easy within their software, but I haven't seen what a basic algorithm for this task looks like. Good luck!
This is a great alternative to Ramer-Douglas-Peucker. However, it is not quite what I need for my application (where I'm still searching for a good algorithm, maybe I have to develop my own).
The problem is that the area always shrinks (for a convex curve). This is simply because all those polygon simplifiers work by selecting a subset of the points, and removing points (from a convex curve) means that you always "cut corners".
For my application, this is totally undesirable. I'm starting with a convex polygon, and the simplified polygon should fully enclose the original polygon. Thus, the simplified polygon has to consist of points outside the original polygon.
Is there any existing algorithm which accomplishes that? That is, which adds to the (convex) polygon rather than removing from it?
Given segments AB, BC, CD such that the angle between AB and CD is less than 180 degrees, add the candidate point E where AB and CD meet when extended. This point is always outside a convex polygon. Choose the next candidate to minimize the area of triangle BEC. The resulting polygon is also convex. Iterate.
Once you reach a rectangle or a triangle there will be no more candidates.
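In code, roughly (a quick, untested sketch; it assumes the polygon is counter-clockwise, given as an array of [x, y] pairs with no repeated closing point, and the helper names are mine):

    // Intersection of the infinite lines AB and CD, or null if (near) parallel.
    function lineIntersection([ax, ay], [bx, by], [cx, cy], [dx, dy]) {
      const rx = bx - ax, ry = by - ay, sx = dx - cx, sy = dy - cy;
      const denom = rx * sy - ry * sx;
      if (Math.abs(denom) < 1e-12) return null;
      const t = ((cx - ax) * sy - (cy - ay) * sx) / denom;
      return [ax + t * rx, ay + t * ry];
    }

    function triangleArea(a, b, c) {
      return Math.abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2;
    }

    // One step: for every pair of consecutive vertices (B, C), intersect the
    // neighbouring edges AB and DC extended beyond B and C; replacing B and C
    // with that intersection E adds only triangle BEC and keeps the polygon
    // convex. Greedily take the pair whose triangle is smallest.
    function removeOneVertex(poly) {
      const n = poly.length;
      let best = null;
      for (let i = 0; i < n; i++) {
        const a = poly[(i + n - 1) % n], b = poly[i], c = poly[(i + 1) % n], d = poly[(i + 2) % n];
        const e = lineIntersection(a, b, d, c);
        if (!e) continue; // parallel edges: no candidate
        // E must lie beyond B along AB, and beyond C along DC, to be outside.
        const beyondB = (e[0] - b[0]) * (b[0] - a[0]) + (e[1] - b[1]) * (b[1] - a[1]) >= 0;
        const beyondC = (e[0] - c[0]) * (c[0] - d[0]) + (e[1] - c[1]) * (c[1] - d[1]) >= 0;
        if (!beyondB || !beyondC) continue;
        const added = triangleArea(b, e, c);
        if (!best || added < best.added) best = { i, e, added };
      }
      if (!best) return null; // e.g. a rectangle or a triangle
      const out = poly.slice();
      if (best.i === n - 1) { // the pair (B, C) wraps around the end of the array
        out.splice(n - 1, 1, best.e);
        out.shift();
      } else {
        out.splice(best.i, 2, best.e); // replace B and C with E
      }
      return out;
    }

    // Keep removing vertices until we reach the target count or run out of candidates.
    function circumscribeDown(poly, target) {
      while (poly.length > target) {
        const next = removeOneVertex(poly);
        if (!next) break;
        poly = next;
      }
      return poly;
    }

The stopping behaviour falls out naturally: a rectangle only offers parallel opposite edges, and a triangle's extended edges meet behind B or C, so the "beyond" checks reject every candidate.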
Nice to see that others have the same ideas! I also thought about that option at first, but this adds too much when the angle between AB and CD is small.
So my current approach is to take ABCDE and replace BC and CD with whichever of the (infinitely many) tangents at C gives the minimal resulting area. B' is then the intersection of the ray AB with that tangent, and D' is the corresponding intersection with the ray ED. ABCDE is then replaced with AB'D'E (i.e. B->B', C removed, D->D').
Although I haven't yet done the math to find the optimal tangent at each point, it shouldn't be too hard.
Note that the simple ABCD approach is a degenerate case of this new approach: you just always choose the tangent to be BC. So I expect the new approach to produce strictly better results.
However, is there anything more elaborate than that? I don't believe I'm the first one to formulate this problem...
Yes, there is other work on this problem. My google-fu is strong.
http://archive.org/details/minimumareacircu00agga
"We show that the smallest k-gon circumscribing a convex n-gon can be computed in O(n^2 log n log k) time"
This is elaborate enough... the paper isn't written in anything like pseudocode.
I'm presuming that that's not the result referred to below though (it's different authors, but this was the same year as the paper above):
http://www.sciencedirect.com/science/article/pii/0734189X859...
"A recent paper by Dori and Ben-Bassat presented an algorithm for finding a minimal area k-gon circumscribing a given convex n-gon, where k < n. An infinite class of polygons for which their algorithm fails to find the minimal area circumscribing k-gon is presented."
http://www.springerlink.com/content/11373157378222m3/
"Given any plane strictly convex region K and any positive integer n >= 3, there exists an inscribed 2n-gon Q_2n and a circumscribed n-gon P_n such that Area(P_n)/Area(Q_2n) <= secant(pi/n)."
(i.e. this bounds how much the area can grow when you halve the number of edges; for n = 6, sec(pi/6) is about 1.155, so at most around 15.5% extra area)
If nothing else, knowing that "minimum circumscribing polygons" is the right phrase to google for is useful (not just minimum-area - I presume minimum-perimeter solutions would be useful to you too)
I don't see how this would help. Visvalingam selects a subset of the inverted points, so when inverting back, I'll still end up with a subset of my original points. So the algorithm still cuts corners instead of adding areas.
If you add a midpoint along each edge Visvalingam creates, invert the reduced poly then remove the original points, you'll get a containing polygon. I think. But we're heading rapidly into hackish bodge territory :-)
This doesn't work. Consider a polygon with all points on the unit circle - your transformation is the identity (both before and after Visvalingam's algorithm is applied).
can anyone clarify "There is no guarantee that removing a point increases the area of the adjacent triangles; thus, the algorithm considers the effective area as the larger of the associated triangle and the previously-removed triangle"? it doesn't appear in the original paper (afaict), which seems to emphasise that the method is point-based (implying no such "memory"), and i can't see why it's needed.
it's clearly there in the source, but i don't understand the comment there, either. what am i missing?! (i understand the lack of a guarantee of increasing area - what i don't understand is why the algorithm should "care"; once the point is gone, it is gone...)
The line from the original paper is in section 3.3: "If its calculated area is less than that of the last point to be eliminated, use the latter's area instead. (This ensures that the current point cannot be eliminated without eliminating previously eliminated points.)"
ah, thanks. and i think i understand now - it doesn't affect filtering, but it makes the precalculated list have a monotonic value that can be used to select a "degree of smoothing". maybe.
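for anyone else puzzling over it, this is roughly how i read it (a toy sketch of my own, not the article's source; quadratic for clarity - real implementations use a heap plus a linked list):

    // effective area per point of an open polyline; endpoints get Infinity so
    // they are never dropped.
    function triangleArea(a, b, c) {
      return Math.abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2;
    }

    function effectiveAreas(points) {
      const area = points.map(() => Infinity);
      const alive = points.map((p, i) => i); // indices still present
      let lastArea = 0;
      while (alive.length > 2) {
        let best = -1, bestArea = Infinity;
        for (let j = 1; j < alive.length - 1; j++) {
          const a = triangleArea(points[alive[j - 1]], points[alive[j]], points[alive[j + 1]]);
          if (a < bestArea) { bestArea = a; best = j; }
        }
        // the rule in question: never let the recorded value decrease.
        lastArea = Math.max(bestArea, lastArea);
        area[alive[best]] = lastArea;
        alive.splice(best, 1); // neighbours' areas are recomputed on the next pass
      }
      return area;
    }

    // dynamic simplification is then a single threshold filter; because the
    // stored values are monotone in removal order, keeping everything >= t
    // removes a prefix of that order rather than an arbitrary subset.
    const simplify = (points, areas, threshold) =>
      points.filter((p, i) => areas[i] >= threshold);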
I've had an interest in polyline simplification for the last couple of weeks, and I eventually stumbled upon the Ramer–Douglas–Peucker algorithm and a JavaScript [1] and Python [2] implementation of it.
However, one thing it doesn't do (and Visvalingam's algorithm doesn't either) is remove loops. In many cases you wouldn't want that, but sometimes you just want to draw a line that looks right, and being able to remove redundant parts of the polyline would be very helpful. For example: a GPS track of a run where you run to a nearby oval, do ten laps, and then run home, and you want the simplest polyline that still looks right.
Of course, I am sure many realise that this is applicable to visualising anything of a fractal nature, e.g. plotting stock price history at various zoom levels.
These are cultural vectors, and so the sovereign territory includes the Great Lakes. Though, come to think of it, I should have picked the physical vectors if I wanted to stick to the coastline paradox theme!
This is a surprisingly different algorithm from Ramer-Douglas-Peucker. Visvalingam's algorithm gradually removes detail, while Peucker gradually restores removed detail.
I'm really curious, too, how Visvalingam performs when the variance of the heights of the triangles is large. Intuitively, it seems like it'd be much more aggressive than Peucker about removing spikes in data, which, depending on what you were aiming at, could be good or bad. Imagine processing a polyline whose coordinates are:
[0,0],[4,2],[8,0],[9,-9],[10,0],[14,1],[18,0]
Peucker would preserve [9,-9] until the very last. It seems like Visvalingam would remove that point relatively early.
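A quick sanity check of that intuition (on the first pass, the "importance" Visvalingam assigns each interior point is just the area of the triangle it forms with its two neighbours):

    const pts = [[0,0],[4,2],[8,0],[9,-9],[10,0],[14,1],[18,0]];
    const area = (a, b, c) =>
      Math.abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2;
    for (let i = 1; i < pts.length - 1; i++) {
      console.log(JSON.stringify(pts[i]), area(pts[i - 1], pts[i], pts[i + 1]));
    }
    // prints: [4,2] 8, [8,0] 17, [9,-9] 9, [10,0] 17.5, [14,1] 4 (one per line)

The spike's triangle is tall but thin (base 2, height 9), so its area is only 9, less than either neighbour's triangle, and it does get dropped within the first few passes. Peucker keys on perpendicular distance from the baseline instead, and at 9 units the spike is the farthest point of all, so it is the first one retained.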
edit: Either way, it's a nice algorithm to add to the quiver.
This is used to great effect with the Alchemy software synthesizer's "Detail" knob, allowing you to edit sounds with surgical precision or to make broad strokes. Hand-editing additive sounds is almost impossible without it.
Alchemy in general is a jaw dropping piece of kit. It would give the UX types here a seizure though...