TTF File Parsing

You've probably noticed that WebGL is really low level and it doesn't even support rendering text. It is mostly triangles with textures all the way down.

Text rendering involves a lot of trade-offs that have to be made to get your string to the screen: how many characters your alphabet has, whether the text can be rotated or scrolled without losing quality, whether different font sizes can be generated, how much text the GPU can handle, and so on.

To postpone those hard decisions, I will start by showing you how to parse a TTF file. Why would you want to do that? Whichever rendering method you pick later on, it is highly likely that you will need information about the size and spacing of each character (unless you go with a monospace font, but let's keep the challenge real). Of course, the font file also contains everything needed to render the font itself: there is a table describing each glyph as a combination of Bézier curves, but I will not go into that since it would be a much deeper dive.

Prepare for some terse writing and bigger than usual blocks of code. I want to cover each line needed to go from having a TTF file on your hard drive to a JSON containing spacing information allowing us to render beautiful text.

What is a TTF file?

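TTF is a binary file format meant to be as lightweight as possible. It comes as a sequence of numbers with no labels attached: you know which value you are looking at only by knowing the order (and the byte size) in which the specification says they appear. Multi-byte numbers are stored big-endian, most significant byte first.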
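To make the "order defines meaning" idea concrete, here is a minimal sketch, separate from the parser built later, that decodes two values from four hand-written bytes. The byte values are made up for illustration; the only thing taken from the format is that numbers are stored most significant byte first.

```javascript
// Four hypothetical bytes: a uint16 followed by an int16, both big-endian.
const bytes = new Uint8Array([0x01, 0x00, 0xff, 0xfe]);
const view = new DataView(bytes.buffer);

// Passing `false` as the second argument asks DataView for big-endian order.
const first = view.getUint16(0, false); // bytes 0x01 0x00
const second = view.getInt16(2, false); // bytes 0xff 0xfe, read as signed

console.log(first, second); // 256 -2
```

Nothing in the bytes says "this is unsigned" or "this is signed"; that knowledge comes entirely from the reading code, which is why the parser has to follow the spec field by field.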

Parsing TTF in NodeJS

I will follow with a step-by-step diary of how I managed to read a TTF file (specifically: font spacing information) in a bunch of NodeJS scripts.

Reading from binary file

Inspired by this article about parsing TTF files in JavaScript, I got the parser below to work. It can navigate around the Buffer with the content of the *.ttf file and read some meaningful values.

Opening the file

Classic fs module, promisified a bit.

import * as fs from "fs";

function readFile(fileName: string): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    fs.readFile(fileName, (error, data) => {
      if (error) {
        reject(error);
        return;
      }
      resolve(data);
    });
  });
}

Then I am just reading the file and providing it to my custom binary module.

const buffer = await readFile(input);
const reader = binaryFile(buffer);

Which looks like this:

function binaryFile(buffer) {
  const data = new Uint8Array(buffer);
  let position = 0;
  const getUint8 = () => data[position++];
  const getUint16 = () => ((getUint8() << 8) | getUint8()) >>> 0;
  const getUint32 = () => getInt32() >>> 0;
  const getInt16 = () => {
    let number = getUint16();
    if (number & 0x8000) number -= 1 << 16;
    return number;
  };
  const getInt32 = () =>
    (getUint8() << 24) | (getUint8() << 16) | (getUint8() << 8) | getUint8();
  const getFWord = getInt16;
  const getUFWord = getUint16;
  const getOffset16 = getUint16;
  const getOffset32 = getUint32;
  const getF2Dot14 = () => getInt16() / (1 << 14);
  const getFixed = () => getInt32() / (1 << 16);
  const getString = (length) => {
    let string = "";
    for (let i = 0; i < length; i++) {
      string += String.fromCharCode(getUint8());
    }
    return string;
  };
  const getDate = () => {
    const macTime = getUint32() * 0x100000000 + getUint32();
    // Months are 0-based in Date.UTC, so 0 is January (the epoch is 1904-01-01).
    const utcTime = macTime * 1000 + Date.UTC(1904, 0, 1);
    return new Date(utcTime);
  };
  const getPosition = () => position;
  const setPosition = (targetPosition) => (position = targetPosition);
  return {
    getUint8,
    getUint16,
    getUint32,
    getInt16,
    getInt32,
    getFWord,
    getUFWord,
    getOffset16,
    getOffset32,
    getF2Dot14,
    getFixed,
    getString,
    getDate,
    getPosition,
    setPosition,
  };
}
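The less obvious getters are the fixed-point ones and the date. As a quick sanity check, here is a standalone sketch of the same conversions applied to raw integers picked by hand; the helper names are hypothetical and exist only for this example.

```javascript
// Fixed is a signed 32-bit number with 16 fractional bits.
const fixedToNumber = (raw) => raw / (1 << 16);
// F2Dot14 is a signed 16-bit number with 14 fractional bits.
const f2dot14ToNumber = (raw) => raw / (1 << 14);
// Font dates count seconds from 1904-01-01 00:00 UTC (month 0 is January).
const macTimeToDate = (seconds) =>
  new Date(seconds * 1000 + Date.UTC(1904, 0, 1));

console.log(fixedToNumber(0x00010000)); // 1
console.log(fixedToNumber(0x00018000)); // 1.5
console.log(f2dot14ToNumber(0x4000)); // 1
console.log(f2dot14ToNumber(-0x4000)); // -1
console.log(macTimeToDate(0).toISOString()); // 1904-01-01T00:00:00.000Z
```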

Tables

TrueType font files are structured in tables. The file starts with a master table (the table directory) defining how many tables there are and where each of them starts; the rest of the file is those tables, each following a strictly specified format.

Microsoft's specification of OpenType format was a priceless help to me when I was finding out what meant what.

As we can read in the section about file organization, the file starts with 5 numbers. The one we need is numTables which tells us how many tables we can look for.

reader.getUint32(); // sfntVersion (Apple's docs call it "scaler type")
const numTables = reader.getUint16();
reader.getUint16(); // searchRange
reader.getUint16(); // entrySelector
reader.getUint16(); // rangeShift

Note that I am 'reading' all numbers to put the cursor in the correct place. After the initial header, we can read information about each table in the file.

const tables = {};
for (let i = 0; i < numTables; i++) {
  const tag = reader.getString(4);
  tables[tag] = {
    checksum: reader.getUint32(),
    offset: reader.getUint32(),
    length: reader.getUint32(),
  };
}
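Each record gives a byte offset from the beginning of the file, so before parsing any table the cursor has to be moved there. A minimal sketch of a lookup helper follows; `tableRange` and the example entries are hypothetical, only the shape of the `tables` object comes from the loop above.

```javascript
// Hypothetical helper: find a table record and return its byte range.
function tableRange(tables, tag) {
  const record = tables[tag];
  if (!record) {
    throw new Error(`Table "${tag}" is missing from the font.`);
  }
  return { start: record.offset, end: record.offset + record.length };
}

// Made-up directory entries shaped like the `tables` object.
const exampleTables = {
  head: { checksum: 0, offset: 300, length: 54 },
  maxp: { checksum: 0, offset: 356, length: 32 },
};

console.log(tableRange(exampleTables, "head")); // { start: 300, end: 354 }
```

With the reader from this article, the returned `start` would be passed to `reader.setPosition()` right before reading the table's first field.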

head

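The first table that will tell us a lot about the rest of the file is called head. Reading it is pretty straightforward as we just need to follow the docs. Before reading, the cursor has to be moved to the beginning of the table with reader.setPosition(tables["head"].offset); the same applies to every table parsed below.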

const head = {
  majorVersion: reader.getUint16(),
  minorVersion: reader.getUint16(),
  fontRevision: reader.getFixed(),
  checksumAdjustment: reader.getUint32(),
  magicNumber: reader.getUint32(),
  flags: reader.getUint16(),
  unitsPerEm: reader.getUint16(),
  created: reader.getDate(),
  modified: reader.getDate(),
  xMin: reader.getFWord(),
  yMin: reader.getFWord(),
  xMax: reader.getFWord(),
  yMax: reader.getFWord(),
  macStyle: reader.getUint16(),
  lowestRecPPEM: reader.getUint16(),
  fontDirectionHint: reader.getInt16(),
  indexToLocFormat: reader.getInt16(),
  glyphDataFormat: reader.getInt16(),
};

We will especially need unitsPerEm to convert FUnits (font units, a measuring system used in TTF) to pixels on the screen.

Then indexToLocFormat will tell us which format to use when reading information about glyphs.
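A cheap way to catch a misplaced cursor early is to check a few head fields against the values the spec allows. The sketch below is an optional addition, not something the rest of the parser depends on; `validateHead` is a hypothetical name, while the magic number and the allowed ranges come from the OpenType spec.

```javascript
// The spec fixes magicNumber to this value for every font.
const HEAD_MAGIC_NUMBER = 0x5f0f3cf5;

// Returns a list of problems; an empty list means the fields look sane.
function validateHead(head) {
  const problems = [];
  if (head.magicNumber !== HEAD_MAGIC_NUMBER) {
    problems.push("unexpected magicNumber");
  }
  if (head.unitsPerEm < 16 || head.unitsPerEm > 16384) {
    problems.push("unitsPerEm out of range");
  }
  if (head.indexToLocFormat !== 0 && head.indexToLocFormat !== 1) {
    problems.push("unknown indexToLocFormat");
  }
  return problems;
}

// Made-up values for illustration.
console.log(validateHead({ magicNumber: 0x5f0f3cf5, unitsPerEm: 2048, indexToLocFormat: 0 })); // []
```

If the magic number does not match, the most likely cause is that the table was read from the wrong offset.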

Figuring out what we need

Going around the spec, we can find out several things about the tables:

maxp will tell us how many glyphs there are in the file
cmap tells us the mapping between character codes and glyph indices used throughout the font file
glyf provides xMin, yMin, xMax, yMax
loca knows offsets of glyphs in the glyf table
hmtx contains information about leftSideBearing (which is how far each character wants to be from the previous one) and advanceWidth which is how much horizontal space it claims for itself
hhea will tell us how many horizontal metrics there are defined in the hmtx table (it doesn't have to be one for each character)

How each table depends on the others will partially force the order of parsing.

maxp

This one is another table that is easy to parse, as it is just a sequence of values.

const maxp = {
  version: reader.getFixed(),
  numGlyphs: reader.getUint16(),
  maxPoints: reader.getUint16(),
  maxContours: reader.getUint16(),
  maxCompositePoints: reader.getUint16(),
  maxCompositeContours: reader.getUint16(),
  maxZones: reader.getUint16(),
  maxTwilightPoints: reader.getUint16(),
  maxStorage: reader.getUint16(),
  maxFunctionDefs: reader.getUint16(),
  maxInstructionDefs: reader.getUint16(),
  maxStackElements: reader.getUint16(),
  maxSizeOfInstructions: reader.getUint16(),
  maxComponentElements: reader.getUint16(),
  maxComponentDepth: reader.getUint16(),
};

hhea

This one is necessary just for the numOfLongHorMetrics value.

const hhea = {
  version: reader.getFixed(),
  ascent: reader.getFWord(),
  descent: reader.getFWord(),
  lineGap: reader.getFWord(),
  advanceWidthMax: reader.getUFWord(),
  minLeftSideBearing: reader.getFWord(),
  minRightSideBearing: reader.getFWord(),
  xMaxExtent: reader.getFWord(),
  caretSlopeRise: reader.getInt16(),
  caretSlopeRun: reader.getInt16(),
  caretOffset: reader.getFWord(),
};

// Skip 4 reserved places.
reader.getInt16();
reader.getInt16();
reader.getInt16();
reader.getInt16();

hhea.metricDataFormat = reader.getInt16();
hhea.numOfLongHorMetrics = reader.getUint16();

hmtx

Thanks to the hhea, we now know how many hMetrics are there.

const hMetrics = [];
for (let i = 0; i < hhea.numOfLongHorMetrics; i++) {
  hMetrics.push({
    advanceWidth: reader.getUint16(),
    leftSideBearing: reader.getInt16(),
  });
}

const leftSideBearing = [];
for (let i = 0; i < maxp.numGlyphs - hhea.numOfLongHorMetrics; i++) {
  leftSideBearing.push(reader.getFWord());
}

const hmtx = {
  hMetrics,
  leftSideBearing,
};
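Because the table is split into two arrays, looking up the metrics of a glyph takes one extra step: per the spec, glyphs past the end of hMetrics reuse the advanceWidth of the last hMetrics entry and take their bearing from the second array. A minimal sketch of that lookup, assuming the hmtx object built above (the helper name and the sample numbers are made up):

```javascript
// Hypothetical helper resolving metrics for any glyph index.
function getHorizontalMetrics(hmtx, glyphIndex) {
  const { hMetrics, leftSideBearing } = hmtx;
  if (glyphIndex < hMetrics.length) {
    return hMetrics[glyphIndex];
  }
  // Remaining glyphs share the last advanceWidth (typical for monospace fonts).
  const last = hMetrics[hMetrics.length - 1];
  return {
    advanceWidth: last.advanceWidth,
    leftSideBearing: leftSideBearing[glyphIndex - hMetrics.length],
  };
}

// Made-up table: two full entries, then two glyphs with bearings only.
const exampleHmtx = {
  hMetrics: [
    { advanceWidth: 500, leftSideBearing: 10 },
    { advanceWidth: 600, leftSideBearing: 20 },
  ],
  leftSideBearing: [30, 40],
};

console.log(getHorizontalMetrics(exampleHmtx, 1)); // { advanceWidth: 600, leftSideBearing: 20 }
console.log(getHorizontalMetrics(exampleHmtx, 3)); // { advanceWidth: 600, leftSideBearing: 40 }
```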

loca

Translates index to location, so it basically tells us how much we should advance in glyf table.

The interesting thing we should note here is that loca comes in two flavours.

It is either the short version, with Offset16 values that store the actual offset divided by 2, or the long version, with Offset32 values that store the real offset.

Both contain numGlyphs + 1 values: the extra entry at the end marks where the last glyph's data ends, so the length of every glyph can be computed from two neighbouring offsets.

const getter =
  head.indexToLocFormat === 0 ? reader.getOffset16 : reader.getOffset32;

const loca = [];
for (let i = 0; i < maxp.numGlyphs + 1; i++) {
  loca.push(getter());
}
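With numGlyphs + 1 entries, the byte length of each glyph is the difference between two neighbouring offsets, and a length of zero means the glyph has no outline at all (a space usually looks like this). The sketch below shows the arithmetic on a made-up short-format loca array; `glyphByteRange` is a hypothetical helper.

```javascript
// Hypothetical helper: where a glyph starts inside glyf and how long it is.
function glyphByteRange(loca, indexToLocFormat, glyphIndex) {
  // Short format stores offsets divided by 2, long format stores them as is.
  const multiplier = indexToLocFormat === 0 ? 2 : 1;
  const start = loca[glyphIndex] * multiplier;
  const end = loca[glyphIndex + 1] * multiplier;
  return { start, length: end - start };
}

// Made-up short-format table describing 3 glyphs (so 4 entries).
const exampleLoca = [0, 50, 50, 120];

console.log(glyphByteRange(exampleLoca, 0, 0)); // { start: 0, length: 100 }
console.log(glyphByteRange(exampleLoca, 0, 1)); // { start: 100, length: 0 }
console.log(glyphByteRange(exampleLoca, 0, 2)); // { start: 100, length: 140 }
```

Checking for a zero length before reading a glyph header avoids picking up the bounding box of whatever glyph happens to come next in the file.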

glyf

Now we can use the loca table along with the indexToLocFormat to read glyphs.

const glyf = [];

for (let i = 0; i < loca.length - 1; i++) {
  const multiplier = head.indexToLocFormat === 0 ? 2 : 1;
  const locaOffset = loca[i] * multiplier;

  // loca offsets are relative to the beginning of the glyf table.
  reader.setPosition(tables["glyf"].offset + locaOffset);

  glyf.push({
    numberOfContours: reader.getInt16(),
    xMin: reader.getInt16(),
    yMin: reader.getInt16(),
    xMax: reader.getInt16(),
    yMax: reader.getInt16(),
  });
}

cmap

Cmap is probably the hardest one to read here. It contains information about mapping unicode char code -> glyph index which means that finally, we will be able to know which character is represented by what values.

const cmap = {
  version: reader.getUint16(),
  numTables: reader.getUint16(),
  encodingRecords: [],
  glyphIndexMap: {},
};

We will stick to the version 0.

if (cmap.version !== 0) {
  throw new Error(`cmap version should be 0 but is ${cmap.version}`);
}

The file starts with the definition of encodings.

for (let i = 0; i < cmap.numTables; i++) {
  cmap.encodingRecords.push({
    platformID: reader.getUint16(),
    encodingID: reader.getUint16(),
    offset: reader.getOffset32(),
  });
}

Now we can use that information to find out if it is something we want to parse. There are so many formats that it doesn't make sense to support them all, especially since even libraries specializing in that usually focus on one or two variants.

let selectedOffset = -1;
for (let i = 0; i < cmap.numTables; i++) {
  const { platformID, encodingID, offset } = cmap.encodingRecords[i];
  const isWindowsPlatform =
    platformID === 3 &&
    (encodingID === 0 || encodingID === 1 || encodingID === 10);

  const isUnicodePlatform =
    platformID === 0 &&
    (encodingID === 0 ||
      encodingID === 1 ||
      encodingID === 2 ||
      encodingID === 3 ||
      encodingID === 4);

  if (isWindowsPlatform || isUnicodePlatform) {
    selectedOffset = offset;
    break;
  }
}

if (selectedOffset === -1) {
  throw new Error(
    "The font doesn't contain any recognized platform and encoding.",
  );
}

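So basically what we did here is pick a subtable with a platform and encoding we recognize. Its offset is relative to the beginning of the cmap table, so we move the cursor there, read the subtable's format and abort if it is not format 4.

reader.setPosition(tables["cmap"].offset + selectedOffset);
const format = reader.getUint16();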
if (format === 4) {
  cmap.glyphIndexMap = parseFormat4(reader).glyphIndexMap;
} else {
  throw new Error(`Unsupported format: ${format}. Required: 4.`);
}

Format 4

This one is a standard character-to-glyph-index mapping table for the Windows platform for fonts that support Unicode BMP characters, as Microsoft's spec says.

Let's just start parsing and later on I will describe what does what.

const parseFormat4 = (reader) => {

Starting with a function. It takes the reader, already positioned right after the format field read above.

const format4 = {
  format: 4,
  length: reader.getUint16(),
  language: reader.getUint16(),
  segCountX2: reader.getUint16(),
  searchRange: reader.getUint16(),
  entrySelector: reader.getUint16(),
  rangeShift: reader.getUint16(),
  endCode: [],
  startCode: [],
  idDelta: [],
  idRangeOffset: [],
  glyphIndexMap: {}, // This one is my addition, contains final unicode->index mapping
};

For some reason, the segment count is stored doubled. That's why there's X2 appended to its name.

const segCount = format4.segCountX2 >> 1;

for (let i = 0; i < segCount; i++) {
  format4.endCode.push(reader.getUint16());
}

reader.getUint16(); // Reserved pad.

for (let i = 0; i < segCount; i++) {
  format4.startCode.push(reader.getUint16());
}

for (let i = 0; i < segCount; i++) {
  format4.idDelta.push(reader.getInt16());
}

const idRangeOffsetsStart = reader.getPosition();

for (let i = 0; i < segCount; i++) {
  format4.idRangeOffset.push(reader.getUint16());
}

Now that we've read all the information, the hard part comes.

So to introduce it a bit, cmap table is based on segments. Segments are contiguous ranges of character codes to allow the font to define only a subset of Unicode characters.

Each segment is described by startCode and endCode. It also has a corresponding idDelta and idRangeOffset that are used for mapping character codes in the given segment to glyph indices.

The table was designed to help search inside it, so, for example, the last segment is a special one with 0xFFFF as both its start and end code. But we will not use that since we are just producing JS object mapping.

for (let i = 0; i < segCount - 1; i++) {
  let glyphIndex = 0
  const endCode = format4.endCode[i]
  const startCode = format4.startCode[i]
  const idDelta = format4.idDelta[i]
  const idRangeOffset = format4.idRangeOffset[i]

So for each segment, we are looking for endCode - startCode + 1 mappings, since both ends of the range are inclusive.

NOTE
Follow this part of the spec: Format 4 segment mapping to delta values to get a detailed description of what is about to happen.

As the specification states, if idRangeOffset in the given segment was 0, our job becomes trivial and glyphIndex equals (c + idDelta) & 0xffff, where & 0xffff is simply modulo 65536 as the spec requires.

If idRangeOffset is not 0, the spec provides the following formula for calculating the places:

glyphId = *(idRangeOffset[i]/2
            + (c - startCode[i])
            + &idRangeOffset[i])

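It roughly translates to our JavaScript code, with a few exceptions. The formula relies on C pointer arithmetic over 16-bit values, while our reader works with byte positions. So we need to multiply (c - startCode) by two since all values here are two bytes big, and for the same reason idRangeOffset, which is already expressed in bytes, is used as is instead of being divided by two.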

  for (let c = startCode; c <= endCode; c++) {
    if (idRangeOffset !== 0) {
      const startCodeOffset = (c - startCode) * 2
      const currentRangeOffset = i * 2 // 2 because the numbers are 2 byte big.

      let glyphIndexOffset =
        idRangeOffsetsStart + // where all offsets started
        currentRangeOffset + // offset for the current range
        idRangeOffset + // offset between the id range table and the glyphIdArray[]
        startCodeOffset // gets us finally to the character

      reader.setPosition(glyphIndexOffset)
      glyphIndex = reader.getUint16()
      if (glyphIndex !== 0) {
        // & 0xffff is modulo 65536.
        glyphIndex = (glyphIndex + idDelta) & 0xffff
      }
    } else {
      glyphIndex = (c + idDelta) & 0xffff
    }
    format4.glyphIndexMap[c] = glyphIndex
  }
}

And finally returning the format4 for usage in cmap.

  return format4;
};
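To see the segment arithmetic in isolation, here is a small sketch that looks up single characters in hand-made segments. It only covers the idRangeOffset === 0 branch, and both the segment data and the `lookupGlyphIndex` helper are made up for this example.

```javascript
// Made-up segments: "A".."Z" map to glyphs 1..26, plus the 0xFFFF terminator.
const segments = {
  startCode: [65, 0xffff],
  endCode: [90, 0xffff],
  idDelta: [-64, 1],
  idRangeOffset: [0, 0],
};

function lookupGlyphIndex(seg, charCode) {
  for (let i = 0; i < seg.endCode.length; i++) {
    if (charCode > seg.endCode[i]) continue; // segment ends before this code
    if (charCode < seg.startCode[i]) return 0; // code falls into a gap
    if (seg.idRangeOffset[i] !== 0) {
      throw new Error("glyphIdArray lookups are not covered by this sketch");
    }
    // & 0xffff is modulo 65536, as in the parser above.
    return (charCode + seg.idDelta[i]) & 0xffff;
  }
  return 0;
}

console.log(lookupGlyphIndex(segments, 65)); // 1
console.log(lookupGlyphIndex(segments, 90)); // 26
console.log(lookupGlyphIndex(segments, 97)); // 0
```

Glyph index 0 is the "missing character" glyph, which is why unmapped codes resolve to it. This is also how the terminating segment behaves: (0xFFFF + 1) modulo 65536 is 0.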

Producing spacing information

Now that we have everything parsed, we can gather spacing information.

The code for that is trivial, maybe with one exception: rightSideBearing is not stored in the file, so it is calculated by subtracting leftSideBearing and the glyph's width from advanceWidth.

function spacing(ttf) {
  const alphabet =
    " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~";

  const map = {};
  alphabet.split("").forEach((char) => {
    const index = ttf.glyphIndexMap[char.codePointAt(0) || 0] || 0;
    const glyf = ttf.glyf[index];
    const hmtx = ttf.hmtx.hMetrics[index];

    map[char] = {
      x: glyf.xMin,
      y: glyf.yMin,
      width: glyf.xMax - glyf.xMin,
      height: glyf.yMax - glyf.yMin,
      lsb: hmtx.leftSideBearing,
      rsb: hmtx.advanceWidth - hmtx.leftSideBearing - (glyf.xMax - glyf.xMin),
    };
  });
  return map;
}

For example, let's have a look at Q character.

"Q": {
  "x": 168,
  "y": -192,
  "width": 1808,
  "height": 2268,
  "lsb": 168,
  "rsb": 168
},

What you might find interesting is the range of these values. They are quite high compared to pixels. Simply speaking, glyphs are described in a unit called FUnit (font unit), on a grid whose coordinates are stored as two-byte numbers. To translate them to pixels, we must know how many units per em are used in the font we are looking at. You can read it from the head table (unitsPerEm).

The formula looks like this:

const scale = (1 / unitsPerEm) * fontSizeInPixels;

Multiplying any of the values above by this scale factor converts it from FUnits to pixels.
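As a worked example, take the width of Q from the output above. The sketch assumes unitsPerEm is 2048 and a font size of 16 pixels; both numbers are assumptions picked for illustration, not values read from the font used in this article.

```javascript
// Converts a value in FUnits to pixels for a given font size.
function fUnitsToPixels(value, unitsPerEm, fontSizeInPixels) {
  const scale = (1 / unitsPerEm) * fontSizeInPixels;
  return value * scale;
}

// Width of "Q" (1808 FUnits), assumed unitsPerEm of 2048, rendered at 16px.
console.log(fUnitsToPixels(1808, 2048, 16)); // 14.125
```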

Calculating shape of the text

To test the results of our work, we can render shapes of the characters in canvas. Calculating the rectangles containing glyphs looks as follows:

let positionX = 0;
let rectangles = [];
for (let i = 0; i < text.length; i++) {
  // Assuming spacing is an object containing information like 
  // presented above. 
  const { x, y, width, height, lsb, rsb } = spacing[text[i]];
  rectangles.push({
    x: positionX + (x + (i !== 0 ? lsb : 0)) * scale,
    y: 48 - (y + height) * scale,
    width: width * scale,
    height: height * scale,
  });
  positionX += ((i !== 0 ? lsb : 0) + width + rsb) * scale;
}

Note that I am skipping leftSideBearing when i is 0 (as I've noticed looking at the text rendered by canvas). And positionX advances by the written character as well as rightSideBearing.

Getting it to the screen:

rectangles.forEach((r) => context.fillRect(r.x, r.y, r.width, r.height));
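The same advance rule can be reused to measure a string without drawing it. This is a sketch under the same assumptions as the loop above (leftSideBearing skipped for the first character); `measureWidth` and the sample spacing values are made up.

```javascript
// Sums up how far positionX would travel for the given text.
function measureWidth(text, spacing, scale) {
  let positionX = 0;
  for (let i = 0; i < text.length; i++) {
    const { width, lsb, rsb } = spacing[text[i]];
    positionX += ((i !== 0 ? lsb : 0) + width + rsb) * scale;
  }
  return positionX;
}

// Made-up spacing data in FUnits, shaped like the output of spacing().
const exampleSpacing = {
  a: { width: 100, lsb: 10, rsb: 10 },
  b: { width: 80, lsb: 20, rsb: 5 },
};

console.log(measureWidth("ab", exampleSpacing, 0.5)); // 107.5
```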

Is it all?

Note that we have only touched a small subset of possible TTF tables. It is sufficient to render many fonts, for example Inter, but might not be enough for others.

We didn't touch kerning or GPOS (glyph positioning table) so some fonts might be calculated totally wrong. But you get the idea of how it works and you should be able to add anything you need with the help of Microsoft OTF spec.

Cool resources

Microsoft OpenType spec – great source of knowledge about data types used within the format.
Let's read a Truetype font file from scratch – great tutorial on reading TTF files that allowed me to get started.
