Making Our Own Tiny Google Maps

I wanted to write this post for a long time. In fact, I made a draft of it half a year ago and never managed to publish it. But it has finally happened, and in this one I will combine knowledge from literally every post I've written on this blog to date. That's why I've posted those seemingly useless or unrelated tricks and algorithms (which you should definitely check out!).

What makes a map engine like Google Maps?

First, it involves converting geographic coordinates using Web Mercator. Then the data is divided into square tiles (which works perfectly with the fact that Mercator makes the world a giant square). Each tile can be assigned different data with different complexity that can be either precalculated or queried dynamically from a special database. The rest is about determining which tiles the user is seeing, downloading them from the API and rendering them.

This time I will skip tiling as it isn't necessary to see the result of the renderer. We will simply start with a dataset that is small enough to render as a whole and make it zoomable and draggable (by the way, the name for such a thing is a slippy map).

Data preparation

I will download a bunch of buildings surrounding my university. I've used Get Lat+Lon to find the coordinates of the center point and then calculated a 200x200 m bounding box around it.

(You can get degreesToMeters and metersToDegrees from the Web Mercator projection article.)

const center = degreesToMeters(50.068, 19.913);
const ne = metersToDegrees(center.x + 100, center.y + 100);
const sw = metersToDegrees(center.x - 100, center.y - 100);
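
If you don't have the two helpers at hand, here is a minimal sketch of what they might look like. It assumes the spherical Web Mercator (EPSG:3857) formulas and the argument order and return shapes used above ({ x, y } and { lat, lng }); the versions in the original article may differ in details.

```javascript
// Radius of the sphere used by Web Mercator (EPSG:3857), in meters.
const EARTH_RADIUS = 6378137;

// Latitude/longitude in degrees -> projected Mercator meters.
function degreesToMeters(lat, lng) {
  const x = (EARTH_RADIUS * lng * Math.PI) / 180;
  const y =
    EARTH_RADIUS * Math.log(Math.tan(Math.PI / 4 + (lat * Math.PI) / 360));
  return { x, y };
}

// Projected Mercator meters -> latitude/longitude in degrees.
function metersToDegrees(x, y) {
  const lng = (x / EARTH_RADIUS) * (180 / Math.PI);
  const lat =
    (2 * Math.atan(Math.exp(y / EARTH_RADIUS)) - Math.PI / 2) * (180 / Math.PI);
  return { lat, lng };
}
```

Keep in mind that Mercator meters are stretched by roughly 1/cos(latitude), so 200 projected meters near 50° N cover only about 128 m on the ground. For picking a handful of buildings that does not matter much.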

Remember the bounding box format? It was [bbox:south,west,north,east]. We build it with:

const boundingBox = `${sw.lat},${sw.lng},${ne.lat},${ne.lng}`;

And now the query. To start with something simple to build upon, I will go with just the buildings for now. Try it in overpass turbo:

const query = `
  [out:json][bbox:${boundingBox}];
  (
    way[building];
    relation[building];
  );
  (._;>;);
  out;`;

Downloading

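// This query contains no whitespace that matters, so strip all of it.
const cleared = query.replace(/\s+/g, "");
// Quote the whole URL so the shell does not interpret ?, ;, ( or >.
const command = `wget -O data.json "http://overpass-api.de/api/interpreter?data=${cleared}"`;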

I am skipping a bit of detail here, but hope you won't mind. Run the content of command either by pasting it into the terminal or by wrapping it in a Node.js script.
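
For the Node.js route, a sketch like the one below could replace wget entirely. It assumes Node 18 or newer (for the built-in fetch); buildOverpassUrl and downloadOsm are hypothetical helper names, not part of any library.

```javascript
// Public Overpass endpoint, the same host as in the wget command.
const OVERPASS_URL = "https://overpass-api.de/api/interpreter";

// Build a GET URL with the query percent-encoded, so no manual
// whitespace stripping or shell quoting is needed.
function buildOverpassUrl(query) {
  return `${OVERPASS_URL}?data=${encodeURIComponent(query.trim())}`;
}

// Fetch the query result and save it to disk.
async function downloadOsm(query, path = "data.json") {
  const { writeFile } = await import("node:fs/promises");
  const response = await fetch(buildOverpassUrl(query));
  if (!response.ok) {
    throw new Error(`Overpass responded with HTTP ${response.status}`);
  }
  await writeFile(path, await response.text());
}
```

Calling downloadOsm(query) with the query built earlier should leave you with the same data.json that the wget command produces.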

Conversion to GeoJSON

Convert it to GeoJSON with osmtogeojson, which makes rendering more convenient.

osmtogeojson data.json > data.geojson

Normalization

Before proceeding, I will convert and normalize the data a bit.

let minX = Infinity;
let minY = Infinity;

// Store each polygon's outer ring as an array of [x, y] points in meters.
const coordinates = data.features
  .filter((f) => f.geometry.type === "Polygon")
  .map((f) =>
    f.geometry.coordinates[0].map((p) => {
      const { x, y } = degreesToMeters(p[1], p[0]);
      if (x < minX) minX = x;
      if (y < minY) minY = y;
      return [x, y];
    }),
  );

// Takes the previous array, 'normalizes' coordinates by subtracting 
// the minimum value on both axes and flattens each polygon's array. 
const offseted = coordinates.map((polygon) =>
  polygon.flatMap((p) => [p[0] - minX, p[1] - minY]),
);

Triangulation

Then I am triangulating the whole thing:

// Map each polygon to the array of vertices produced by triangulating it.
const shapes = offseted.map((polygon) => {
  const node = linkedList(polygon);
  const triangles = earCut(node);
  const vertices = triangles.flatMap((i) => [
    polygon[2 * i],
    polygon[2 * i + 1],
  ]);
  return vertices;
});

// Join everything together for rendering. Because it's just triangles,
// as long as we use the same set of shaders, it can be one big blob
// sent to the GPU.
const vertices = new Float32Array(shapes.flat());
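
The linkedList and earCut helpers come from an earlier post. If you want something self-contained to experiment with, here is a rough stand-in: a naive O(n²) ear-clipping sketch that takes the same flat [x0, y0, x1, y1, ...] array and returns triangle indices. It is not the implementation used above; it assumes a simple polygon without holes and gives up on degenerate input. The names triangulate, cross and insideTriangle are made up for this example.

```javascript
// Twice the signed area of triangle (a, b, c); positive when counter-clockwise.
function cross(flat, a, b, c) {
  const abx = flat[2 * b] - flat[2 * a];
  const aby = flat[2 * b + 1] - flat[2 * a + 1];
  const bcx = flat[2 * c] - flat[2 * b];
  const bcy = flat[2 * c + 1] - flat[2 * b + 1];
  return abx * bcy - aby * bcx;
}

// True when vertex p lies inside or on the edge of CCW triangle (a, b, c).
function insideTriangle(flat, p, a, b, c) {
  return (
    cross(flat, a, b, p) >= 0 &&
    cross(flat, b, c, p) >= 0 &&
    cross(flat, c, a, p) >= 0
  );
}

function triangulate(flat) {
  let count = flat.length / 2;
  const last = count - 1;
  // GeoJSON rings repeat the first point at the end; drop the duplicate.
  if (count > 3 && flat[0] === flat[2 * last] && flat[1] === flat[2 * last + 1]) {
    count -= 1;
  }
  const ring = Array.from({ length: count }, (_, i) => i);

  // Shoelace formula: a negative area means clockwise, so flip the order.
  let area = 0;
  for (let i = 0; i < count; i++) {
    const j = (i + 1) % count;
    area += flat[2 * i] * flat[2 * j + 1] - flat[2 * j] * flat[2 * i + 1];
  }
  if (area < 0) ring.reverse();

  const triangles = [];
  while (ring.length > 3) {
    let clipped = false;
    for (let i = 0; i < ring.length && !clipped; i++) {
      const a = ring[(i + ring.length - 1) % ring.length];
      const b = ring[i];
      const c = ring[(i + 1) % ring.length];
      if (cross(flat, a, b, c) <= 0) continue; // reflex or collinear corner
      const blocked = ring.some(
        (p) => p !== a && p !== b && p !== c && insideTriangle(flat, p, a, b, c),
      );
      if (blocked) continue; // another vertex is in the way, not an ear
      triangles.push(a, b, c);
      ring.splice(i, 1); // clip the ear tip
      clipped = true;
    }
    if (!clipped) break; // degenerate input: stop instead of looping forever
  }
  if (ring.length === 3) triangles.push(ring[0], ring[1], ring[2]);
  return triangles;
}
```

With this stand-in, the body of the map callback above would shrink to const triangles = triangulate(polygon), with the flatMap step unchanged.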

Renderer

In case something goes wrong, I am showing wireframes for now, as in the Wireframes with barycentric coordinates article. The whole renderer is basically the same, so I won't cover it here in depth.

Making it scrollable and draggable

I am using some variables to help with calculations. Width is gl.canvas.width divided by the pixel ratio of the device. Height... you know. The zoom is changed by scrolling. The factor controls how fast the zoom changes. The offset stores how far the view has been panned; it is updated from the mouse position and movement.

Setup

let width = 0;
let height = 0;
let zoom = 0;
let offset = { x: 0, y: -500 };
const factor = 5;

Scroll event

When the user scrolls, a bit of magic happens. First of all, zoom is updated by a constant step (using only the sign of deltaY normalizes it across different browsers; you never know).

Then the offset is updated. This is necessary to make scrolling behave as expected, i.e. when we are scrolling over some point, that point stays under the cursor.

The key here is to notice that when we scale the world after zooming, the distance between the mouse cursor and the screen edge is scaled too. The solution is to offset the world back by exactly the difference between the scaled mouse position after and before the zoom. In this case: o = o - m(1.01^z' - 1.01^z), where o is the offset, m is the mouse position in pixels, and z and z' are the zoom before and after the scroll.

function handleScroll(event) {
  const sign = event.deltaY >= 0 ? 1 : -1;
  const previous = Math.pow(1.01, zoom);
  zoom += sign * factor;
  const delta = Math.pow(1.01, zoom) - previous;
  offset.x -= event.offsetX * delta;
  offset.y -= event.offsetY * delta;

  event.preventDefault();
  requestAnimationFrame(render);
}
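
To convince yourself that the formula keeps the point under the cursor in place, here is a small pure-function sketch of the same update. It assumes one screen pixel spans 1.01^zoom view units and ignores the Y flip applied later in the model matrix; scaleAt, screenToWorld and zoomAt are names invented for this example.

```javascript
// View units per screen pixel, matching Math.pow(1.01, zoom) in the handlers.
const scaleAt = (zoom) => Math.pow(1.01, zoom);

// View-space coordinate currently shown under a screen pixel.
function screenToWorld(pixel, view) {
  const s = scaleAt(view.zoom);
  return { x: pixel.x * s + view.offset.x, y: pixel.y * s + view.offset.y };
}

// Same math as handleScroll, but returning a new view instead of
// mutating the zoom and offset globals.
function zoomAt(view, mouse, step) {
  const zoom = view.zoom + step;
  const delta = scaleAt(zoom) - scaleAt(view.zoom);
  return {
    zoom,
    offset: {
      x: view.offset.x - mouse.x * delta,
      y: view.offset.y - mouse.y * delta,
    },
  };
}

const view = { zoom: 0, offset: { x: 0, y: -500 } };
const mouse = { x: 120, y: 80 };
const before = screenToWorld(mouse, view);
const after = screenToWorld(mouse, zoomAt(view, mouse, 5));
// before and after agree up to floating-point rounding.
```

Under these assumptions the coordinate under the cursor is the same before and after the zoom step, which is exactly what the offset correction is for.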

Setting cursors

To make it look nicer, I am setting the cursor so that it tells the user the map can be dragged.

// Somewhere close to the top of the code
document.body.style.cursor = "grab";

// ...

function handleMouseDown() {
  document.body.style.cursor = "grabbing";
}

function handleMouseUp() {
  document.body.style.cursor = "grab";
}

Mouse move

When the user moves the mouse, we want to apply an offset. I am doing it only when the left mouse button is pressed.

NOTE
Thanks to event.buttons, we don't need mousedown and mouseup to track dragging. The button state is available on every mouse event.

The offset must also be scaled to keep the point the user is dragging under the cursor. Without that, the user will intuitively notice something is wrong. Try it.

function handleMouseMove(event) {
  if (event.buttons !== 1) return; // If mouse is not pressed
  offset.x -= event.movementX * Math.pow(1.01, zoom);
  offset.y -= event.movementY * Math.pow(1.01, zoom);
  requestAnimationFrame(render);
}

Rerendering

Now a bit about why I've put requestAnimationFrame in two of the previous handlers. This handy function tells the browser to execute the given callback before the next repaint of the screen. This way we make sure that any change in the state we care about makes the browser refresh our map. And thanks to that, we don't have to constantly rerender the map just to heat up the user's GPU.

Keep in mind that each call to requestAnimationFrame queues another request to run the function. While scrolling, we can therefore fire events faster than the browser can render them. But it's not a noticeable problem and I am leaving it for now.

I am also wrapping resize in a similar function:

function handleResize() {
  requestAnimationFrame(render);
}
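
If the pile-up of requestAnimationFrame calls ever does become a problem, one common way around it is to remember that a frame is already queued. This is only a sketch and not what the code above does; createRenderScheduler is a made-up name, and the schedule parameter exists solely so the function can be tried outside a browser.

```javascript
// Wraps a render function so that at most one frame is pending at a time.
function createRenderScheduler(
  render,
  schedule = globalThis.requestAnimationFrame,
) {
  let pending = false;
  return function requestRender() {
    if (pending) return; // a frame is already queued, nothing to do
    pending = true;
    schedule(() => {
      pending = false; // allow the next request once this frame runs
      render();
    });
  };
}
```

The handlers would then call a shared requestRender() (created once with createRenderScheduler(render)) instead of calling requestAnimationFrame(render) directly.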

Draw function

I am using a 4x4 matrix as in the 3D example. You can do as you want. I put the scale in the projection. The typical camera for me is about translating the view by the offset. You could of course write it as inverse(translation(offset.x, offset.y, 0)), but here I am saving CPU cycles for better times. Then, because the whole scene is flipped on the Y axis, I am scaling that axis by -1 and moving all triangles one screen down.

const pX = width * Math.pow(1.01, zoom);
const pY = height * Math.pow(1.01, zoom);
const p = projection(pX, pY, 1);
const v = translation(-offset.x, -offset.y, 0);
const m = multiply(scaling(1, -1, 1), translation(0, -height, 0));
const matrix = multiply(p, v, m);

Event handlers

The last missing part is adding the event listeners. Just a bunch of them.

window.addEventListener("resize", handleResize, false);
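// "wheel" is the standard event ("mousewheel" is non-standard); passive: false lets handleScroll call preventDefault().
window.addEventListener("wheel", handleScroll, { passive: false });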
window.addEventListener("mousedown", handleMouseDown, false);
window.addEventListener("mouseup", handleMouseUp, false);
window.addEventListener("mousemove", handleMouseMove, false);

NOTE
Because I am embedding my maps here, I am using the canvas HTML element instance instead of window in the listeners.

Result

CTRL or ⌘ + scroll to zoom. Drag to move around.

What's next?

You've set out on an endless road to creating a perfect map service. There is no limit to the hours you can sink into it. Here's what you can do:

Add roads and rivers

Add different types of objects fetched from the API and rendered. Roads, rivers, parks and forests are the way to go. You need to come up with a way to render them differently. You can go with separate shaders for each type, which gives the most flexibility if you want to add more sophisticated effects to them. Or you can go with a simple solution and sneak color information into the third and fourth dimension of points (that's a trick that graphics programmers love to do).

Simplify shapes

Have you noticed that some shapes are unnecessarily complicated while they are just some tiny background buildings? There are algorithms for simplifying shapes that could be used here.
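
One well-known option is the Ramer-Douglas-Peucker algorithm. Below is a minimal sketch of it for an array of [x, y] points; the tolerance is in the same units as the points (meters after normalization). The function names are mine, and the sketch makes no attempt to stop a tiny closed ring from collapsing into a line, which a real map would have to handle.

```javascript
// Distance from point p to the segment a-b (all are [x, y] arrays).
function distanceToSegment(p, a, b) {
  const dx = b[0] - a[0];
  const dy = b[1] - a[1];
  const lengthSquared = dx * dx + dy * dy;
  // Degenerate segment (e.g. a closed ring): distance to point a.
  if (lengthSquared === 0) return Math.hypot(p[0] - a[0], p[1] - a[1]);
  const t = Math.max(
    0,
    Math.min(1, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / lengthSquared),
  );
  return Math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy));
}

// Ramer-Douglas-Peucker: drop points closer than `tolerance` to the
// line between the kept neighbours.
function simplify(points, tolerance) {
  if (points.length < 3) return points.slice();
  const first = points[0];
  const last = points[points.length - 1];
  let maxDistance = 0;
  let index = 0;
  for (let i = 1; i < points.length - 1; i++) {
    const d = distanceToSegment(points[i], first, last);
    if (d > maxDistance) {
      maxDistance = d;
      index = i;
    }
  }
  if (maxDistance <= tolerance) return [first, last];
  // Keep the farthest point and simplify both halves recursively.
  const left = simplify(points.slice(0, index + 1), tolerance);
  const right = simplify(points.slice(index), tolerance);
  return left.slice(0, -1).concat(right);
}
```

A tolerance tied to the current zoom level (for example, the size of one pixel in meters) would be a natural choice, so distant buildings lose detail that nobody can see anyway.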

Add tiling

This point is actually a bit of a trap. It involves two parts: a client for displaying tiles and a server that provides them. The client needs to figure out which tiles the user can currently see and fetch them from the API. The API has to find a way to serve them to the user. So it needs to either fetch them from some third party (but this would make it painfully slow), be limited to a map that fits into memory, or use a database (probably one with good support for geospatial data).

There are also other trade-offs that you would have to balance:

binary data, or JSON that is trivial to load but heavier to transfer
triangulating on the server side (with a cache), or making the client do it on the fly
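
For the client part, figuring out the visible tiles could look roughly like this. The sketch assumes the common slippy map tile scheme (x/y/z, with y growing southward and 2^z tiles per axis) and a bounding box given as two { lat, lng } corners; lngLatToTile and visibleTiles are hypothetical names.

```javascript
// Tile column and row containing a coordinate at a given zoom level.
function lngLatToTile(lng, lat, zoom) {
  const n = 2 ** zoom; // number of tiles along each axis
  const latRad = (lat * Math.PI) / 180;
  const x = Math.floor(((lng + 180) / 360) * n);
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n,
  );
  // Clamp so that coordinates on the far edge still map to a valid tile.
  const clamp = (v) => Math.min(n - 1, Math.max(0, v));
  return { x: clamp(x), y: clamp(y) };
}

// All tiles overlapping the box between the south-west and north-east corners.
function visibleTiles(sw, ne, zoom) {
  const min = lngLatToTile(sw.lng, ne.lat, zoom); // top-left tile
  const max = lngLatToTile(ne.lng, sw.lat, zoom); // bottom-right tile
  const tiles = [];
  for (let x = min.x; x <= max.x; x++) {
    for (let y = min.y; y <= max.y; y++) {
      tiles.push({ x, y, z: zoom });
    }
  }
  return tiles;
}
```

The sw and ne corners from the data preparation step can be passed in directly, and the resulting { x, y, z } triples are what the client would request from the tile API.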

Put labels on the map

One more thing that makes your map different from Google Maps (maybe not the only one, but this one will soon be very noticeable) is the lack of text.

There's one huge hidden trap that came with taking the path of a WebGL renderer: there is no text here.

Just as we had to transform data into triangles ourselves, we have to do the same with text. Unfortunately, this time it will be a much harder and broader topic.

If you want to get started with it, be advised that there are several approaches to the problem, but one that seems to work quite OK for maps is SDF text rendering (it is used for example by Mapbox).

Go 3D

Now that you've created a WebGL renderer, nothing stops you from going 3D. This involves fetching any useful data from OSM and using it to generate 3D shapes. You can find more in the article about Simple 3D buildings on the OSM wiki. There have already been many attempts (more or less successful) that you can take inspiration from.
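
As a starting point, the walls of a building can be generated by extruding its footprint. The sketch below reads the height and building:levels tags described on that wiki page; the 3 m per level and the single-storey fallback are my assumptions, and buildingHeight and extrudeWalls are made-up names.

```javascript
// Assumed height of one storey when only the level count is tagged.
const METERS_PER_LEVEL = 3;

// Pick a height in meters from OSM tags, falling back to a guess.
function buildingHeight(tags = {}) {
  const height = parseFloat(tags.height);
  if (Number.isFinite(height)) return height;
  const levels = parseFloat(tags["building:levels"]);
  if (Number.isFinite(levels)) return levels * METERS_PER_LEVEL;
  return METERS_PER_LEVEL; // assumption: treat untagged buildings as one storey
}

// Turn a closed ring of [x, y] points into wall triangles (x, y, z triples).
// Each edge of the footprint becomes a quad made of two triangles.
function extrudeWalls(ring, height) {
  const vertices = [];
  for (let i = 0; i + 1 < ring.length; i++) {
    const [x0, y0] = ring[i];
    const [x1, y1] = ring[i + 1];
    vertices.push(
      x0, y0, 0, x1, y1, 0, x1, y1, height,
      x0, y0, 0, x1, y1, height, x0, y0, height,
    );
  }
  return vertices;
}
```

The roof can reuse the 2D triangulation from earlier, with every vertex placed at z equal to the building height.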

osmtogeojson – converter between OSM JSON and GeoJSON.

Map label placement in Mapbox GL – more on the topic of labels.