One night I got a random thought: I had never compared my fast CDT (Constrained Delaunay Triangulation) Zig algorithm (article) to a baseline JS implementation to see how much difference all the manual optimizations and assumptions actually made.
Fortunately, in late 2025 it’s easier than ever before: I just asked an LLM to generate a TypeScript version of my code. That approach turned out to be surprisingly effective. With a proper test suite to also copy and compare, I had a working JS implementation running within 30 minutes. While going from a high-level to a lower-level language requires making many decisions about things that would otherwise stay dynamic, going the other way usually produces straightforward, C-like JavaScript, and that was the case here.
I did a bunch of optimizations and it got fast. Very fast. Too fast, I would say. In my tests it was even faster than running the Zig implementation in WASM. Could WASM, with its overhead, actually lose to straightforward C-style JS that avoids dynamic memory allocation?
I started digging deeper and here are my findings.
Optimizing JS
First, I will go through how I initially optimized the JS version.
For all benchmarks in this blog post I used the same example as rendered by default on /cdt. It is a very simple game-style map (combination of terrain obstacles and walkable areas) which matches the intended use cases I had in mind when designing the algorithm.
I tried `robust-predicates` – a JS port of the famous numerically stable library – but it was actually a 40% slowdown. As expected, since it performs many more math operations to ensure numerical stability.
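To make the cost difference concrete, here is a rough sketch of the kind of orientation predicate involved – illustrative only, not the actual functions from my code. The plain double-precision determinant is a handful of multiplications, while `robust-predicates` falls back to adaptive exact arithmetic when the result is too close to call, which is where the extra work comes from.

```typescript
// Illustrative only – not the functions from my implementation, just the
// general shape of an orientation predicate.
import { orient2d } from "robust-predicates";

// Plain double-precision orientation test: one determinant over three points.
// Fast, but the sign can come out wrong for nearly collinear points.
function orient2dFast(
  ax: number, ay: number,
  bx: number, by: number,
  cx: number, cy: number,
): number {
  return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
}

// robust-predicates computes the same sign exactly, escalating to adaptive
// extended-precision arithmetic when the fast estimate is ambiguous –
// more work per call, hence the measured slowdown.
const robust = orient2d(0, 0, 1e-12, 1e-12, 1, 1);
const fast = orient2dFast(0, 0, 1e-12, 1e-12, 1, 1);
console.log(robust, fast);
```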
Compared to the Zig version I was able to inline custom data structures, and that even caused performance gains. I ended up running all methods directly on numbers, passing around indices in arrays. No objects in sight.
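To give a feel for that style, here is a hypothetical sketch – the names are invented for this post, not taken from the real implementation:

```typescript
// Hypothetical sketch of the "no objects" style: a triangle is just three
// slots in flat typed arrays, and functions take and return indices.
const MAX_TRIANGLES = 4096;

// Vertex indices of each triangle, 3 slots per triangle.
const triVerts = new Int32Array(MAX_TRIANGLES * 3);
// Neighbouring triangle indices, 3 slots per triangle, -1 = no neighbour.
const triNeighbors = new Int32Array(MAX_TRIANGLES * 3).fill(-1);

// Everything is passed around as plain numbers – no Triangle class,
// no per-call allocations.
function getVertex(tri: number, corner: number): number {
  return triVerts[tri * 3 + corner];
}

function getNeighbor(tri: number, edge: number): number {
  return triNeighbors[tri * 3 + edge];
}

function setNeighbor(tri: number, edge: number, other: number): void {
  triNeighbors[tri * 3 + edge] = other;
}
```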
A massive win came from simplifying assertions and null checks. I was converting the `-1` array index to `null` and checking for that value in an `if`. After switching to just using `-1` I saw a 50% improvement.
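Roughly, the change looked like this – a hypothetical before/after with invented names:

```typescript
// Hypothetical before/after of the null-check removal (names invented).

// Before: the -1 sentinel from the array was converted to null and branched on.
function neighborOrNull(neighbors: Int32Array, slot: number): number | null {
  const idx = neighbors[slot];
  return idx === -1 ? null : idx;
}

// After: -1 stays as the sentinel and the return type stays a plain number,
// which likely keeps the values monomorphic for the JIT.
function neighbor(neighbors: Int32Array, slot: number): number {
  return neighbors[slot];
}

// Caller side:
//   before: if (neighborOrNull(tris, i) !== null) { ... }
//   after:  if (neighbor(tris, i) !== -1) { ... }
```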
At the start of the optimizations (of already fast code operating on static arrays) the blessed example was executing at 4.5k runs/s. Afterwards it was 13k runs/s.
What if I went too far?
I started getting very confusing test results. Running in dev mode in the browser, refreshing the page multiple times, and eyeballing the ‘average’ result, I started regularly seeing the JS version beat the WASM one. I assumed it was the overall overhead of running WASM. This made me wonder: if the WASM version was seemingly no better than heavily-optimized JS, what if… what if a JS version that doesn’t even try to be smart would also prove that there’s no benefit to the troublesome manual memory management?
I went back to the LLM of choice and asked for a version that doesn’t try to be too smart. Don’t skip using JS `class` instances in favor of breaking everything down into `number` arguments, don’t build a fancy manual memory management structure, just use `new` freely.
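For contrast, the “simple” style looks roughly like this – again a hypothetical sketch, not the generated code itself:

```typescript
// Hypothetical sketch of the "don't be smart" style: plain classes,
// object references, and `new` everywhere. Names are invented here.
class Point {
  constructor(public x: number, public y: number) {}
}

class Triangle {
  neighbors: (Triangle | null)[] = [null, null, null];
  constructor(public a: Point, public b: Point, public c: Point) {}
}

class Edge {
  constructor(public a: Point, public b: Point, public constrained = false) {}
}

// Every operation just allocates what it needs and lets the GC clean up.
const t = new Triangle(new Point(0, 0), new Point(1, 0), new Point(0, 1));
const e = new Edge(t.a, t.b, true);
console.log(e.constrained, t.neighbors.length);
```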
It was slower (fortunately, for my sanity), but only by 20–30%. This was too much – I decided to do proper benchmarks.
In the end I found results in the browser too unreliable and brittle. Plus, they required manually refreshing the page, which was not making me any faster.
Benchmark
I ended up writing a benchmark script that compares:
- the smart/fast and simple versions
- running JS vs WASM vs native Zig code
Fortunately, it was easy to run WASM in Bun from the CLI, now that it supports WASM execution.
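For reference, loading a WASM module in Bun takes only a few lines – a minimal sketch with a placeholder file path and assumed export names, not the real module’s API:

```typescript
// Minimal sketch of loading a WASM build in Bun – the file path and the
// export names below are placeholders, not the real module's API.
const bytes = await Bun.file("zig-out/bin/cdt.wasm").arrayBuffer();
const { instance } = await WebAssembly.instantiate(bytes, {});

// Exported Zig functions show up as plain callable properties.
// Assuming (hypothetically) the module exposes these exports:
const cdt = instance.exports as unknown as {
  insertPoint(x: number, y: number): void;
  enforceEdge(a: number, b: number): void;
};
cdt.insertPoint(10, 20);
```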
Methodology
- Machine: MacBook Air M3
- Engines: Bun (`JSC`), Node (`V8`)
- Iterations: 10,000 per variant (JS uses an additional 1,000 warmup runs before counting starts)
- Compilation flags: the Zig builds (both WASM and native) use `-Doptimize=ReleaseFast`
- EdgeContext capacity: 3,200 – static arrays are sized to handle at most this many edges (see the sketch after this list)
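Since the fast version preallocates everything up to that cap, here is a rough sketch of what a fixed-capacity edge store can look like – names invented, not the real `EdgeContext`:

```typescript
// Hypothetical sketch of a fixed-capacity edge context: all storage is
// allocated once up front, so individual runs never allocate.
const EDGE_CAPACITY = 3200;

const edgeA = new Int32Array(EDGE_CAPACITY);     // first endpoint index
const edgeB = new Int32Array(EDGE_CAPACITY);     // second endpoint index
const edgeFlags = new Uint8Array(EDGE_CAPACITY); // e.g. a "constrained" bit
let edgeCount = 0;

function addEdge(a: number, b: number): number {
  if (edgeCount >= EDGE_CAPACITY) {
    throw new Error("EdgeContext capacity exceeded");
  }
  const i = edgeCount++;
  edgeA[i] = a;
  edgeB[i] = b;
  edgeFlags[i] = 0;
  return i; // the edge is referred to by its index from now on
}
```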
Each run builds the triangulation for the same input – the previously mentioned example map. It’s important to note that since the algorithm is not linear, the numbers are ultimately made up – I could just keep making the example bigger and bigger to increase the difference. To ground it in reality, I opted for a realistic example.
I was measuring total clock time from start to finish of the algorithm operations – `insertPoint()` and `enforceEdge()` – omitting resets.
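The measurement loop itself boils down to something like this – a condensed sketch, not the actual script from the repo:

```typescript
// Condensed sketch of the measurement approach (not the actual benchmark
// script): warm up, time many runs, report the median.
const WARMUP = 1_000;
const ITERATIONS = 10_000;

function benchmark(run: () => void): number {
  for (let i = 0; i < WARMUP; i++) run(); // let the JIT settle on hot paths

  const samples: number[] = [];
  for (let i = 0; i < ITERATIONS; i++) {
    const start = performance.now();
    run();
    samples.push(performance.now() - start);
  }

  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)]; // p50 in milliseconds
}

// Usage (hypothetical): console.log(benchmark(() => buildTriangulation(exampleMap)));
```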
You can run it and compare yourself: GitHub.
Results
Here are the results:
| Version | p50 (ms) | Speedup |
| --- | --- | --- |
| Default JS (Bun/JSC) | 0.097375 | 1.00x |
| Default JS (Node/V8) | 0.082459 | 1.18x |
| Fast JS (Node/V8) | 0.073584 | 1.32x |
| Fast JS (Bun/JSC) | 0.058959 | 1.65x |
| WASM (Bun/JSC) | 0.032292 | 3.02x |
| Zig Native | 0.025291 | 3.85x |
The results (finally) match reasonable expectations. Simple JS is the slowest (although not massively so). Interestingly, there’s no clear performance winner between `V8` and `JSC` – reasonable, as they both have strengths and weaknesses. The difference between the JS implementations could be explained by dynamic memory allocation and possibly more cache misses from dynamically allocated objects instead of reads from the same arrays.
The WASM version is slower than native Zig – partially due to overhead, partially because WASM, while it uses static memory allocation from a single buffer, still runs in a VM, and I assume there’s some micro-overhead involved. I did not measure the startup overhead of the WASM module. If the thing I was optimizing for was the total time from starting JS execution to rendering the triangulation on the screen, WASM would likely lose.
Observations
I have a couple takeaways from the experiments.
Only comparable benchmarks that measure thousands of executions are reliable for small and fast code. All I got from wrapping a single execution in `performance.now()` was confusion.
So many things impact execution. JS code warmup. The JS engine itself (`V8` vs `JSC`). WASM overhead. Other things happening on the CPU. Background system load. Unless comparisons happen in the same environment, they are not worth much.
Optimizing JS to use statically allocated number arrays is not necessarily a great improvement. It did bring a 20% positive change, but the code lost quite a bit of readability. There were a lot of optimizations that followed – switching away from `null` to `-1`, getting rid of objects. They brought measurable improvements to that implementation, but note that the simple version, where all optimizations of that kind were left up to the compiler, was just 15% slower.
I was also quite unsure about the impact of V8 doing its magic and swapping interpreted JS for a JIT-compiled fast version on the hot paths – JS engines, particularly V8, are known for detecting and optimizing hot code after a few runs, meaning the early runs can be much slower. I tried looking into possible flags for Node, but all I found was the `node --jitless` option, which consistently made the code 30x slower. At least there’s that.
I found that, expectedly, Node (`V8`) and Bun (modified `JSC`) differ in performance. There was also no clear winner – the simple version ran faster in Node, but Bun gave a more impressive speedup to the ‘faster’ one.
At the same time, all of this is just tweaking code that is already very, very fast and built on mathematical research. See my previous articles about CDT [1][2] for comparisons with other methods.
Final thoughts
Finally, it’s always important to consider optimizations in the context where they are needed.
Quick reminder: after designing my first CDT algorithm, I realized that in practice, for my desired use case, I can’t afford recreating the pathfinding map from scratch; instead I need a dynamic system that supports adding points, removing them, and enforcing constrained edges (the ones that need to remain in the triangulation).
The example code I used creates 942 edges and enforces tens of fixed edges. Usually, in a single game update loop pass, there won’t be more than a couple of enforced edges at once. The game update loop has a budget of 16.6 ms. The slowest version of the algorithm I analyzed runs in 0.1 ms – not for a single loop pass, but for recreating the whole system from scratch.

This means that in practice the operation fits into the loop not 166 times, but probably well above 1,000 times. Which means that, unless I decide to write the whole game in Zig, the JS version is more than enough, thanks to the other optimizations and the design of the algorithm.
Worth noting, this is not a typical web application example. It’s a high-performance game or simulation environment. None of the considerations would apply if I was writing a web server – I would be solving different problems with different priorities.
The Zig/WASM implementation helped me design CPU-friendly code that compilers can optimize aggressively, but I can ultimately stick with the JS version. That might look like a JavaScript win, but to me it’s mainly a win for careful algorithm/design work and for disciplined benchmarking – and asking whether optimizations actually make things faster or just make me feel better.