Handmade pathfinding mesh for games

February 23, 2025

Last year I posted about how I implemented Sloan's Fast CDT (Constrained Delaunay Triangulation) from a computer science paper. I used it for experiments with pathfinding, but the system I ended up with wouldn't scale. It could only take a list of points and constraints and turn it into a triangulation mesh all at once, or not at all. That isn't suitable for dynamic pathfinding systems where obstacles are created and removed.

Early this year I spent quite a bit of time thinking about how to slice the original algorithm and invent separate addPoint() and removePoint() operations that would allow me to keep and maintain a mesh with minimal changes, without having to rebuild it on every update.

When I finally did some online research, I found good news and bad news. The good news was that if I had worked on this 25 years ago, it would have qualified as state-of-the-art computational geometry research, as I had accidentally connected the right pieces and rediscovered some cool techniques. The bad news was that I didn't really have to go through all of this myself, as Kallmann et al. solved this problem in 2003 and their approach is used to this day.

Anyway, I proceeded to implement the full set of dynamic CDT algorithms, and this blog post explains how they work and how to implement them effectively.

As for the title, it's a bit of a bait. This blog post discusses how to maintain a dynamic triangulation that spans all scene obstacles, which is arguably the hard part. For more about pathfinding itself, see another of my blog posts, Open Problems of A* for Any Angle Pathfinding.

Why do we even need triangulations for pathfinding

First, what is a triangulation? A mesh consisting of triangles. For games with dynamic maps we need a way to generate a graph that will be used by the pathfinding algorithm. Triangulations work great for this use case because there's a lot of research and many algorithms for generating them from points.

Why DT (Delaunay triangulation)? It has a nice property: it maximizes the minimum angle across all triangles. In other words, we want to avoid long, narrow triangles whenever possible. That's not only better looking but also leads to better navigation graphs.

A triangulation is Delaunay if, for every triangle, its circumcircle does not contain any other point. There is a mathematical proof that a Delaunay triangulation exists for every set of points on a plane (provided they are not all collinear).
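
To make the circumcircle test concrete, here is a minimal sketch of the classic determinant check. The Point struct and function name are illustrative rather than taken from my implementation, and a production version needs robust predicates (e.g. Shewchuk's adaptive arithmetic) that this naive floating-point version lacks:

const Point = struct { x: f64, y: f64 };

// Returns true if `d` lies strictly inside the circumcircle of the
// counterclockwise triangle (a, b, c), i.e. the triangle violates the
// Delaunay property with respect to `d`.
fn inCircumcircle(a: Point, b: Point, c: Point, d: Point) bool {
    const ax = a.x - d.x;
    const ay = a.y - d.y;
    const bx = b.x - d.x;
    const by = b.y - d.y;
    const cx = c.x - d.x;
    const cy = c.y - d.y;
    const det = (ax * ax + ay * ay) * (bx * cy - cx * by) -
        (bx * bx + by * by) * (ax * cy - cx * ay) +
        (cx * cx + cy * cy) * (ax * by - bx * ay);
    return det > 0;
}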

What does constrained mean? We wouldn't need pathfinding if there weren't any obstacles. Constrained simply means that we expect some edges to exist in the triangulation and be marked as constrained (the word fixed is also used). This is essential to make the mesh useful for us.

At this point you might ask yourself: if we force some edges to be in the triangulation, will it always be fully Delaunay? Well, most often – absolutely not. In fact, many systems, for various reasons, introduce points only as parts of constrained edges, which means that if there are any 'nice' triangles, they are in between the constrained shapes. There are also some interesting properties of Delaunay triangulations that make certain things easier, so keep in mind that they no longer necessarily apply to what we have (though some still might).

How is it useful for pathfinding? That's a completely different problem. Usually people use centroids (centers of triangles) or edge midpoints and connect them into a navigation graph. It often involves some additional tweaking, as the triangulation mesh is not a perfect representation of the walkable surface. But that's beyond the scope of this article.

Algorithms explained

Now I will go over the algorithms I settled on. This is mostly a mix of the work of Kallmann, Shewchuk, and earlier papers from the early 2000s.

Again, the goal was to create a dynamic CDT system that stays maintainable as the triangulation grows considerably. Constant-time O(1) execution is unlikely to be possible, but avoiding O(n) is key to preventing the system from struggling as the mesh grows.

Half-edge

I found the half-edge data structure to be incredibly versatile and helpful for implementing all the algorithms for this problem. A half-edge combines an origin point, a pointer to the next half-edge, a pointer to the twin half-edge, and a flag indicating whether it is fixed (constrained). Traditionally it also points to a face, the adjacent polygon, but in my case I don't need to store any additional information in triangles, so I skipped it.

I use additional assumptions which help greatly with mesh traversal (a sketch of the structure follows the list).

  • All edges belong to triangles, meaning edge.next.next.next == edge.
  • Only outer boundary edges don't have twins (which means the twin relationship always has to be kept up to date).
  • All triangles are counterclockwise (NOTE: the choice doesn't matter as long as you are consistent; depending on the handedness of your coordinate system and whether your Y axis points up or down, the winding will be flipped).
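
For reference, the record itself is tiny. A simplified sketch of roughly what my half-edge looks like (field names are illustrative; Point is the struct from the circumcircle sketch above):

const HalfEdge = struct {
    // Vertex this half-edge starts at.
    origin: Point,
    // Next half-edge of the same triangle, in CCW order,
    // so edge.next.?.next.?.next.? == edge always holds.
    next: ?*HalfEdge = null,
    // Opposite half-edge in the neighboring triangle;
    // null only on the outer boundary of the mesh.
    twin: ?*HalfEdge = null,
    // True if this edge is constrained and must never be flipped.
    fixed: bool = false,
};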

Mesh

Since I am trying to represent RTS-like maps, I will be operating in a fixed rectangle. All inserted points must be in the pre-defined rectangle.

Insert point

There are four possible outcomes when adding a point to the triangulation.

  1. Point is not in any triangle. We don't care about this as we only allow points within the defined rectangle.
  2. Point is inside a triangle. We need to split it into 3 triangles.
  3. Point is precisely on an edge. We need to split the edge in two, which turns 2 triangles into 4. If the edge was on the outer boundary of the mesh, we turn 1 triangle into 2.
  4. Point is on another point. We don't need to insert a point that already exists. It is safe to skip.

After inserting the point, I run a very common procedure to improve the triangulation. I add the outer edges of the new triangles (the ones opposite the inserted point) to a stack and check whether they break the Delaunay property. If so, I flip them and add the two outer edges of the post-flip triangles to the stack. This can sometimes cause a wave of flips spreading outwards from the inserted point, but in practice it is usually limited to just a few flips, which greatly improve the quality of the triangulation.
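
Here is a minimal sketch of that flip loop, building on the Point, HalfEdge and inCircumcircle sketches above. It assumes a stack like the StaticStack shown later in the post and a hypothetical flip() helper that swaps the diagonal of the two triangles sharing an edge while reusing the existing half-edge objects; my actual code differs in the details:

// Restore the Delaunay property around a freshly inserted point `p`.
// `stack` initially holds the edges opposite `p` in the new triangles.
fn restoreDelaunay(stack: *Stack, p: Point) !void {
    while (stack.pop()) |edge| {
        // Constrained edges and outer boundary edges are never flipped.
        if (edge.fixed or edge.twin == null) continue;

        const twin = edge.twin.?;
        // Apex of the neighboring triangle, across from `edge`.
        const apex = twin.next.?.next.?.origin;

        // If the neighbor's apex falls inside the circumcircle of the
        // triangle formed with `p`, the edge is not locally Delaunay.
        if (inCircumcircle(edge.origin, twin.origin, p, apex)) {
            // Grab the neighbor's two outer edges before the flip rewires
            // the `next` pointers; after the flip they face `p` and need
            // to be re-checked.
            const outer_a = twin.next.?;
            const outer_b = twin.next.?.next.?;
            flip(edge);
            try stack.push(outer_a);
            try stack.push(outer_b);
        }
    }
}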

Enforce edge

As I mentioned before, the Delaunay triangulation itself only optimizes the angle properties of the triangles. But in practice we need specific edges to exist in the triangulation. We can do that by enforcing them.

NOTE

First, it is worth noting that quite often the initial DT already contains the desired edges and there's no need to enforce them. We only need to detect them and mark them as fixed; in those cases nothing else is required.

To enforce an edge we need to get rid of all the edges that intersect it, end up with an edge connecting the two points we want, and then mark it as fixed.

There's a brilliant algorithm that accomplishes this by creating a queue of the intersecting edges. If a given edge forms a convex quad (together with its two triangles), we can flip it. If not, it goes back to the queue (if you think about it, flipping an edge inside a non-convex quad would put the new diagonal outside the quad, breaking the triangulation).

By flipping other edges first, each non-convex quad will eventually become convex and the flip will be possible. I don't remember exactly whether I saw a proof that this always works, but in practice I haven't encountered a situation where it didn't. So while I remain a tiny bit skeptical, I think it works.

Very often the desired edge is the last one to be flipped, so after every flip I check whether the flipped edge is the one we were looking for.
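
Below is a sketch of that loop under the same assumptions as before. Queue stands in for the static queue described later; isQuadConvex(), crossesSegment(), connects() and flip() are hypothetical helpers (convexity test of the quad around an edge, proper segment/edge intersection, endpoint comparison, and the diagonal swap):

// Flip away every edge crossing the segment e1-e2 until the edge
// connecting e1 and e2 exists, then mark it as fixed.
fn enforceEdge(queue: *Queue, e1: Point, e2: Point) !void {
    while (queue.pop()) |edge| {
        // The two triangles around `edge` form a quadrilateral; the flip
        // is only valid when that quad is convex.
        if (!isQuadConvex(edge)) {
            try queue.push(edge); // retry once its neighborhood has changed
            continue;
        }

        flip(edge);

        // Check whether the flip just produced the edge we are after.
        if (connects(edge, e1, e2)) {
            edge.fixed = true;
            edge.twin.?.fixed = true;
            return;
        }

        // The new diagonal may still cross e1-e2; if so, keep processing it.
        if (crossesSegment(edge, e1, e2)) try queue.push(edge);
    }
}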

Kallmann et al. use a more complex method where they remove all intersecting edges and fill the gap back in. They mention the edge-flipping method as a potential direction for future exploration. Well, I did explore it, and it feels so much easier that I am not even considering implementing the original method.

Remove point

To remove a point we need to remove all edges connected to it. While doing that, we collect the chain of edges that formed the boundary of the star-shaped hole left by the removed point. Then we fill the gap using the ear clipping algorithm, which runs in O(k^2) in the number of boundary edges we have to deal with.

NOTE

Points on the outer boundary of the mesh should be removed if they are collinear with neighboring points and don't belong to an enforced edge.

Collecting the boundary is a step where the half-edge data structure is particularly useful. We can use it to carefully loop around a point.

We need to find an edge that starts at the point. Then we search, either CCW or CW, until we complete a full loop or encounter the outer boundary, which we recognize by the missing twin. In that case we cannot continue the traversal, but we have no guarantee that we collected all the edges, so we start another loop going in the opposite direction, looking to hit the boundary again.

Finally, we need to destroy edges that were going in and out of the point being removed.

My full implementation of this looks like this:

pub fn collectBoundary(edges: *EdgeContext, boundary: *Ring, p: Point) !void {
    const e = locatePoint(p, try edges.any()) orelse return error.EdgeNotFound;
    const start = getVertex(p, e) orelse return error.NotVertex;
    var current: ?*HalfEdge = start;

    // Unless it's an extremely degenerate case, it should never need
    // more iterations.
    const LIMIT = 128;
    var i: usize = 0;
    var continue_cw = false;
    var to_destroy = Stack{};

    // Search CCW.
    while (i < LIMIT) : (i += 1) {
        _ = try boundary.append(current.?.next.?);

        try to_destroy.push(current.?);
        try to_destroy.push(current.?.next.?.next.?);

        // Reached boundary, switch to CW search.
        if (current.?.next.?.next.?.twin == null) {
            _ = try boundary.append(current.?.next.?.next.?);
            continue_cw = true;
            break;
        }

        current = current.?.next.?.next.?.twin.?;
        if (current == start) break; // We completed a full loop.
    }

    if (continue_cw) {
        if (start.twin == null) _ = try boundary.prepend(start);

        current = start.twin;
        while (current != null) : (i += 1) {
            assert(i < LIMIT);
            _ = try boundary.prepend(current.?.next.?.next.?);

            try to_destroy.push(current.?);
            try to_destroy.push(current.?.next.?);

            // Reached boundary.
            if (current.?.next.?.twin == null) {
                _ = try boundary.prepend(current.?.next.?);
                break;
            }
            current = current.?.next.?.twin.?;
        }
    }

    // Only destroy edges with twins (no twin means it's boundary).
    while (to_destroy.pop()) |edge| {
        if (edge.twin) |_| {
            edges.destroy(edge);
        }
    }
}
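
Once the boundary is collected and the interior edges are destroyed, the star-shaped hole gets filled back in with ear clipping. The sketch below works on plain points instead of half-edges to keep it short: it takes the hole boundary in CCW order and emits index triples for the new triangles. The output type reuses the StaticStack shown later in the post; names and the 128-vertex limit are illustrative:

fn cross(o: Point, a: Point, b: Point) f64 {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// True if the corner (prev, curr, next) of the CCW polygon is an "ear":
// it turns left and no other remaining boundary vertex lies inside the
// candidate triangle.
fn isEar(poly: []const Point, alive: []const usize, prev: usize, curr: usize, next: usize) bool {
    const a = poly[prev];
    const b = poly[curr];
    const c = poly[next];
    if (cross(a, b, c) <= 0) return false; // reflex or collinear corner

    for (alive) |idx| {
        if (idx == prev or idx == curr or idx == next) continue;
        const p = poly[idx];
        // Conservative point-in-triangle test: points on the border also
        // block the ear.
        if (cross(a, b, p) >= 0 and cross(b, c, p) >= 0 and cross(c, a, p) >= 0) return false;
    }
    return true;
}

// Triangulate the hole boundary `poly` (at least 3 points, CCW order)
// in O(k^2), pushing index triples into `out`.
fn earClip(poly: []const Point, out: *StaticStack([3]usize, 128)) !void {
    var indices: [128]usize = undefined;
    if (poly.len > indices.len) return error.TooManyVertices;

    var n: usize = poly.len;
    var k: usize = 0;
    while (k < n) : (k += 1) indices[k] = k;

    while (n > 3) {
        var clipped = false;
        var i: usize = 0;
        while (i < n) : (i += 1) {
            const prev = indices[(i + n - 1) % n];
            const curr = indices[i];
            const next = indices[(i + 1) % n];
            if (!isEar(poly, indices[0..n], prev, curr, next)) continue;

            try out.push([3]usize{ prev, curr, next });
            // Drop `curr` from the shrinking polygon.
            var j = i;
            while (j + 1 < n) : (j += 1) indices[j] = indices[j + 1];
            n -= 1;
            clipped = true;
            break;
        }
        // A simple (non-self-intersecting) polygon always has an ear.
        if (!clipped) return error.DegeneratePolygon;
    }
    try out.push([3]usize{ indices[0], indices[1], indices[2] });
}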

Walking algorithm

There are several operations where we need to locate a triangle. We need it to insert or remove a point.

The brute-force way is to search in O(n), which unfortunately makes creating a mesh of n points O(n^2). It is possible to use spatial search with a QuadTree or an R-tree to bring the search down to O(log n), which might be the fastest method. I initially tried going in this direction but found both R-trees and QuadTrees hard to apply to this problem. The former's effectiveness depended heavily on insertion order. The latter produced very shallow trees, as edges often straddled quadrant boundaries, leading to crowded nodes that were impossible to subdivide. On top of that, there was the extra memory cost and complexity of maintaining another data structure.

But there's also another way: the so-called jump-walk algorithm. We start in any triangle and simply walk towards the point, picking the edge to cross by checking whether the point lies in the half-plane formed by that edge (there's more than one method for picking the edge to cross, but this seems to be the most common). If we step into the triangle that contains the point we are looking for, we are done.
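
A minimal sketch of that walk, assuming the HalfEdge sketch from earlier. isRightOf() is the half-plane test, the iteration limit is just a safety net, and real implementations also need care around points landing exactly on an edge, which I'm glossing over here:

// True if `p` lies strictly to the right of the directed edge a -> b.
// For a CCW triangle the interior is on the left of each edge, so a
// point to the right is outside, across that edge.
fn isRightOf(a: Point, b: Point, p: Point) bool {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x) < 0;
}

// Walk from `start` towards `target`; returns a half-edge of the
// triangle that contains the target, or null if we leave the mesh.
fn walk(start: *HalfEdge, target: Point) ?*HalfEdge {
    var edge: *HalfEdge = start;
    const LIMIT = 10_000; // guard against cycling on degenerate input
    var i: usize = 0;
    while (i < LIMIT) : (i += 1) {
        var crossed = false;
        var e = edge;
        var k: usize = 0;
        // Test the three edges of the current triangle.
        while (k < 3) : (k += 1) {
            if (isRightOf(e.origin, e.next.?.origin, target)) {
                // The target is on the other side of this edge: cross it.
                if (e.twin == null) return null; // walked out of the mesh
                edge = e.twin.?;
                crossed = true;
                break;
            }
            e = e.next.?;
        }
        // No edge separates us from the target, so this triangle has it.
        if (!crossed) return edge;
    }
    return null;
}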

I found a claim in Kallmann's paper that the jump-walk algorithm runs in expected O(n^(1/3)) time. I haven't seen a proof of it, and it's hard to guess how exactly a cube root comes into play, but in practice it runs impressively fast.

Finding intersecting edges

Here again I tried using spatial data structures until it struck me that it is also possible to use a variation of the walking algorithm to follow the segment and collect the crossed edges along the way. This is more involved but still practical.

  1. Given a seed edge, a queue to put intersecting edges on, and the points e1 and e2, I find the triangle containing point e1.
  2. From there, find an edge that starts at e1.
  3. Then, among the triangles around e1, find the one where the segment e1-e2 crosses the opposite edge.
  4. Execute the jump-walk algorithm, adding edges that cross e1-e2 to the queue.

It might sound a bit complex (and it is), but this search is local by nature and does not depend on the size of the whole triangulation, which makes execution really fast.
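
A sketch of that search, again with hypothetical helpers: segmentsCross() is a proper segment/edge intersection test, containsPoint() checks whether the triangle a half-edge belongs to contains a point, and Queue is the static queue. The seed edge first is the edge crossed by the segment in the triangle incident to e1 (steps 1-3 above). Cases where the segment passes exactly through a vertex need extra handling not shown here:

// Collect every edge crossed by the segment e1-e2 into `queue`, walking
// triangle by triangle starting from `first`.
fn collectCrossingEdges(first: *HalfEdge, e1: Point, e2: Point, queue: *Queue) !void {
    var edge: *HalfEdge = first;
    const LIMIT = 256; // matches the static queue capacity
    var i: usize = 0;
    while (i < LIMIT) : (i += 1) {
        try queue.push(edge);

        // Both endpoints stay inside the mesh rectangle, so every crossed
        // edge has a twin in the next triangle along the segment.
        const twin = edge.twin.?;
        if (containsPoint(twin, e2)) return; // reached the destination

        // Exactly one of the next triangle's two other edges is crossed
        // by e1-e2; step through it.
        const side_a = twin.next.?;
        const side_b = twin.next.?.next.?;
        edge = if (segmentsCross(e1, e2, side_a)) side_a else side_b;
    }
    return error.SegmentWalkOverflow;
}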

Fixed data structures

While the majority of the speed gain I observed on larger meshes came from algorithmic optimizations (the search algorithms mentioned above), I decided to go for maximum performance and rewrote the whole thing in Zig with a big focus on avoiding dynamic memory allocations. Instead of using a dynamic ArrayList to store pointers to half-edges intersecting an edge, I asked myself: how many edges can possibly lie under an edge I might want to insert? I don't know, 256?

Then I proceeded to implement a static queue working on a 256-element array. Similarly, I implemented a ring structure for the boundary of the star-shaped polygon left after point removal, and a stack for flipping non-Delaunay edges.

pub fn StaticStack(comptime T: type, comptime capacity: usize) type {
    return struct {
        const Self = @This();
        values: [capacity]T = undefined,
        size: usize = 0,

        pub fn push(stack: *Self, value: T) !void {
            if (stack.size == capacity) return error.OutOfMemory;
            stack.values[stack.size] = value;
            stack.size += 1;
        }

        pub fn pop(stack: *Self) ?T {
            if (stack.size == 0) return null;
            stack.size -= 1;
            return stack.values[stack.size];
        }
    };
}

Full implementation of the stack to demonstrate how simple it is.

Custom memory pool

When dealing with only one type of object that is frequently created and destroyed, a go-to memory allocation pattern is a pool. There is already a memory pool in Zig, but it doesn't support iteration/retrieval of all objects, so I had to write my own. First I wondered whether it would be possible to just use an array, but the problem is that an unknown set of other objects point to a given half-edge as next or twin, and there would be no fast way to update those pointers if I wanted to do a classic swap remove (remove an object and put the last array element in its place to keep the array contiguous).

I asked myself: what operations do I need?

  • any() – for finding seed triangle for walk-search.
  • all() – for exporting all points for pathfinding/rendering.
  • create() – add a new half-edge.
  • destroy() – remove a half-edge.

There's a very handy module in Zig called BitSet. It uses an array of u64 integers and treats each bit as a flag. Since every integer holds 64 flags and CPUs have fast instructions for scanning bits in an integer or comparing with a mask, it works a bit like SIMD, checking many slots at once.

I used a BitSet to mark all free elements in a static array. The BitSet.findFirstSet() method finds a spot to place a new item when calling .create().

pub fn create(pool: *Self, value: T) !*T {
    const index = pool.free.findFirstSet() orelse return error.OutOfMemory;
    pool.free.unset(index);
    pool.items[index] = value;
    return &pool.items[index];
}

A consequence of this design is that free gaps closest to the beginning of the array are the first to be filled, unlike Zig's linked-list-based memory pool, where allocations follow the free list instead. I am not sure, though, whether this has any practical significance.
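
For completeness, here is a sketch of the other two operations under the same layout assumptions (an items array plus a free BitSet where a set bit means the slot is free). The error name and iterator options are just how I would write it, not necessarily what the final pool does:

// Return a slot to the pool. The memory is not cleared, only marked free.
pub fn destroy(pool: *Self, item: *T) void {
    // The slot index follows from pointer arithmetic on the backing array.
    const index = (@intFromPtr(item) - @intFromPtr(&pool.items[0])) / @sizeOf(T);
    pool.free.set(index);
}

// Any live element, used e.g. as the seed triangle for the walk search.
pub fn any(pool: *Self) !*T {
    // Iterate over unset bits, i.e. slots that are currently in use.
    var it = pool.free.iterator(.{ .kind = .unset });
    const index = it.next() orelse return error.Empty;
    return &pool.items[index];
}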

Stunning results

The mix of methods I described leads to extremely fast execution times. My code never needs to ask the operating system for more memory. All data is co-located and fits in the CPU cache. Together, this makes it extremely easy for the CPU to just do its work and run this code at the insane speed of a modern CPU clock.

But how fast are we talking? While working on this problem I made several implementations and the difference was staggering.

I used a 100x100 grid with 3000 tiny 1x1 squares inserted into it (30% of space used) as a benchmark.

  • Brute-force JS algorithm – 300s.
  • Slightly optimized JS – 30s.
  • Optimized JS with Walk algorithm for searching points and QuadTree for finding intersecting edges – 900ms (30x faster).
  • Zig implementation using Walk algorithms and no dynamic memory allocations – 25ms (30x faster again).

The brute-force variant just means that whenever I had to search for the triangle containing a point, or for the edges crossing an edge, I did an O(n) search.

Since there's a difference in computational complexity, I could make the gap look much bigger by pumping the numbers as high as I want, but 12,000 points is roughly the maximum number of points I could need in a StarCraft-like map.

I find this difference quite incredible. The brute-force version took 25ms per point, while the highly optimized version completed the whole mesh in that time, at roughly 2µs per point. That takes it from something that wouldn't be usable at all in a game that needs to complete a frame simulation within 16.6ms, to something almost free.

Edge cases

When you are working on a computational geometry problem using the half-edge data structure, every problem is an edge case (I am sorry). And yet, there are some extreme situations we need to consider:

  • Inserting a point inside a fixed edge. Since obstacles often touch, it's important to handle situations where a shorter edge neighbors a longer one.
  • Crossing fixed edges. Not necessarily common but map design might necessitate having two overlapping obstacles.
  • Adding and removing points on the boundary. I already mentioned this, but it's important to make sure we don't leave dead points (ones not used by any fixed edge) when removing from the triangulation.
  • Unmarking long fixed edges. We need to be careful, since there might be neighbors sharing part of a long edge that is now composed of several sub-edges. How do we know which of them become non-fixed and which remain?

It depends a lot on how you are going to use the mesh, so I won't go deeper into each of those problems. It often involves storing information outside the triangulation about the inserted chains of constrained edges. Kallmann et al. also suggest counting how many times each point is used by constraints and removing it once that count drops to zero.
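
One possible way to implement that bookkeeping, sketched under the assumption that each vertex record can carry a counter; the Mesh type and the function names are hypothetical:

const Vertex = struct {
    point: Point,
    // How many constrained edges currently use this vertex.
    constraint_uses: u32 = 0,
};

// Called whenever a constrained edge that uses `v` is inserted.
fn retainConstraintVertex(v: *Vertex) void {
    v.constraint_uses += 1;
}

// Called whenever a constrained edge that uses `v` is removed. Once no
// constraint references the vertex, it can be dropped from the
// triangulation with the removePoint operation described earlier.
fn releaseConstraintVertex(mesh: *Mesh, v: *Vertex) !void {
    v.constraint_uses -= 1;
    if (v.constraint_uses == 0) try mesh.removePoint(v.point);
}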

Summary

And that's it. This relatively short blog post took a lot of time to make. Behind every illustration there were five sheets of paper filled with scribbled triangles from when I was trying to understand these algorithms. I am very happy with the result. It's a very specific but complete project.

I made a system for dynamic insertion of points, enforcement of fixed edges, and removal of points in a triangulation. Due to the unusual properties of walk algorithms it's hard to calculate their exact computational complexity, but in practice they run significantly faster than O(n), which would be the theoretical upper bound of visiting every triangle. And that was exactly the initial goal.

If you have ever thought about implementing your own pathfinding and navigation system from scratch, I hope this write-up was useful. It certainly would have been for me a year ago. If you get stuck you can always reach out to me on BlueSky or X (links in the footer).

You can visit /cdt for an interactive live demo.

NOTE

For demo purposes I am compiling the WASM binary using the ReleaseSmall preset in Zig. It's 2x slower but 40x smaller than ReleaseFast (582kB vs 14kB).

Example triangulation from the demo:

Useful reading
