How We Internationalized The Crisp Chatbox At Scale

Here at Crisp, we love technology, especially when it is bleeding edge and can help improve the quality and speed of our service.

Recently, we made a tiny optimization to the chatbox code using a recent CSS feature, which had a tremendous impact at scale: it reduced the total size of the chatbox on the network by a factor of 5.

Internationalized Fonts Are The Bottleneck

The Crisp chatbox is served millions of times a day. When a file is loaded millions of times, every byte you save can have a tremendous impact.

The bottleneck for Crisp is international support. Crisp is definitely an international service (the chatbox supports more than 30 languages from all over the world, and counting - we do accept translations from the community). The world writes in a lot of different alphabets, with even more variants for some of them (eg: Simplified Chinese and Traditional Chinese).

A font file is made of what we call glyphs. A glyph is the visual representation of a character in the font. Each character is referenced by its Unicode code point: for instance, "A" is "U+0041".

Any letter you see on the Crisp chatbox is thus rendered by looking up its Unicode code point in the font. Unicode is the standard character set used on most websites today (usually encoded as UTF-8), as it can map to virtually all alphabets.

Crisp uses a beautiful custom font, called Noto Sans. Loading a custom font ensures the chatbox looks consistent on a variety of devices. Indeed, we cannot rely on built-in system fonts as they vary a lot, and may not always be available for use. Thus, every time a website visitor accesses a website using Crisp, their browser loads the Noto Sans font file from Crisp servers (if not in cache).

Before the optimization, the Noto Sans font files that were loaded weighed almost 200KB each for the regular and bold variants. This sums up to a total of almost 400KB. That's pretty large for the Web, especially on slow connections.

The large size of the font is due to the fact that it contains glyphs for alphabets from all over the world. However, a typical US/Europe user doesn't need the Japanese glyphs. So why bother loading all the glyphs when you only need 5% of them?

Now, an obvious thing to do is to split the font into parts, containing regional alphabets. That's just what we did; we explain how below.

Saving Bandwidth With Font Subsetting

Enter the world of Unicode font subsetting.

A good practice when you serve a font to international users is to split it into subsets, each covering a Unicode range. In theory, it's rather simple: you define multiple Unicode ranges, and you associate one font file with each range.

Then, the browser will dynamically load only the font files it needs at runtime, when it needs them. This means that if the Crisp chatbox is loaded in French, only the Latin subset will be fetched. If, however, a few seconds later the user sends a Russian character, the Cyrillic font subset will be asynchronously loaded upon message display. It's neat and it saves a lot of bandwidth.

At Crisp, on a basic Latin-1 subset, we managed to reduce the Noto Sans regular and bold font weights from 200KB to 10KB each! This is a 20x improvement!

Enough said: how can you implement this in your code?

How To Subset Fonts In Your CSS?

Quite simple: first of all, split your font into the subsets you need to serve. Grab a list of subsets from this page: Unicode Character Ranges.

There's no need to map all subsets, as it's a tedious task. Pick only the ones that are relevant for your use case (eg: if you only serve Russian and European users, pick only the Latin and Cyrillic ranges).

Make a list of those, then go to the Font Squirrel Webfont Generator. This tool helps you easily convert your source font into the font format of your choice, and also to subset it.

Prepare your font settings:

  1. Upload your font files
  2. Uncheck all form options to disable all transforms (eg: keep existing TrueType hinting, do not adjust vertical metrics, etc)
  3. Select the custom subsetting option, and make sure all subsetting checkboxes are unchecked

Then, for each subset you have, repeat the following process:

  1. In the Unicode Ranges text input, append the current range (eg: 0020-007F for Basic Latin)
  2. Submit the form for processing. Wait a few seconds (it can take longer), and you'll get an archive of the transformed font(s)

Once you're done, pick the font files and organize them in sub-folders, one per range (sub-folders are especially useful if you serve multiple font formats, such as WOFF and WOFF2 - which we recommend).

Now, in your CSS, update your @font-face definitions so that they look like the following (in Crisp's case):

@font-face {
  font-family: "Noto Sans Regular";
  src: url('/static/fonts/noto_sans/0020-007F/noto_sans_regular.woff2') format('woff2'), url('/static/fonts/noto_sans/0020-007F/noto_sans_regular.woff') format('woff');
  unicode-range: U+0020-007F;
}

@font-face {
  font-family: "Noto Sans Regular";
  src: url('/static/fonts/noto_sans/00A0-00FF/noto_sans_regular.woff2') format('woff2'), url('/static/fonts/noto_sans/00A0-00FF/noto_sans_regular.woff') format('woff');
  unicode-range: U+00A0-00FF;
}

Notice that you define multiple @font-face for the same font-family, with different unicode-range values.

The unicode-range value must start with U+, followed by either a single character code (eg: U+00A0) or a range (eg: U+00A0-00FF).
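
To illustrate the accepted value forms, here's a minimal sketch. Beyond the two forms mentioned above, the CSS specification also allows wildcard ranges and comma-separated lists. The "Example Font" family name and the /fonts/example/ paths are placeholders made up for this illustration, not Crisp's actual files.

```css
/* Single code point: U+00A0 (no-break space) only */
@font-face {
  font-family: "Example Font";
  src: url('/fonts/example/single.woff2') format('woff2');
  unicode-range: U+00A0;
}

/* Explicit range: U+00A0 through U+00FF */
@font-face {
  font-family: "Example Font";
  src: url('/fonts/example/range.woff2') format('woff2');
  unicode-range: U+00A0-00FF;
}

/* Wildcard range: each "?" stands for any hex digit, so this is U+0400-04FF */
@font-face {
  font-family: "Example Font";
  src: url('/fonts/example/wildcard.woff2') format('woff2');
  unicode-range: U+4??;
}

/* Comma-separated list: several ranges served by the same file */
@font-face {
  font-family: "Example Font";
  src: url('/fonts/example/list.woff2') format('woff2');
  unicode-range: U+0020-007F, U+00A0-00FF;
}
```

The comma-separated form can be handy if you'd rather group a few small ranges into one file than ship one file per range.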

You can repeat the @font-face definitions for as many font subsets as you have (at Crisp, we have 122 subsets!).
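
For instance, adding the Cyrillic block (U+0400-04FF) could look like the sketch below. The file paths simply follow the folder naming pattern used above and are an assumption, as is the .chatbox selector, which is only there to show how the font gets used.

```css
/* Cyrillic subset (assumed paths, following the per-range folder pattern) */
@font-face {
  font-family: "Noto Sans Regular";
  src: url('/static/fonts/noto_sans/0400-04FF/noto_sans_regular.woff2') format('woff2'),
       url('/static/fonts/noto_sans/0400-04FF/noto_sans_regular.woff') format('woff');
  unicode-range: U+0400-04FF;
}

/* Hypothetical selector: the family is referenced once, whatever the number of subsets */
.chatbox {
  font-family: "Noto Sans Regular", sans-serif;
}
```

Your regular CSS rules don't change: they reference the font-family once, and the browser picks the subset files it needs based on the characters actually displayed.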

What About Older Browsers?

One caveat: the unicode-range CSS property is fully supported in recent versions of Firefox and Chrome, while it is only partially supported in Edge and Safari 9.3. However, the upcoming Safari 10 will bring full support for it.

When not supported, the unicode-range property is tolerated, yet silently ignored. This means that while it won't break anything in older browsers, it will definitely add a network overhead, as the browser will load all font subsets at once - even if most are never used.

Because it supports a wide range of languages, the Crisp chatbox just could not let legacy browsers fetch all 122 font files on load - handling all those little files makes everything slower than ever! This is why we selectively serve a legacy CSS to those pesky browsers, which only includes the full Noto Sans font (remember: the 200KB font). This way, older browsers degrade gracefully: they get the pre-optimization performance, not something worse. If you don't have as many font subsets as Crisp, you probably won't even need such a fallback (loading eg: 10 font files at once is tolerable, since those are small - loading 122 font files is definitely not).
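
A legacy stylesheet of that kind could be as small as the sketch below. The path to the full font file is an assumption (it isn't given here); the point is that there's a single @font-face per family, with no unicode-range.

```css
/* Legacy fallback: one full font file, no unicode-range (assumed path) */
@font-face {
  font-family: "Noto Sans Regular";
  src: url('/static/fonts/noto_sans/full/noto_sans_regular.woff2') format('woff2'),
       url('/static/fonts/noto_sans/full/noto_sans_regular.woff') format('woff');
}
```

Keeping the same font-family name in both stylesheets means the rest of your CSS doesn't need to know which variant was served.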

Based on our research, here's the list of browsers and versions in which unicode-range is ignored:

  • Internet Explorer (all versions)
  • Edge (all versions - as of summer 2016)
  • Chrome (29 and below)
  • Firefox (46 and below - we found no record, so we set it to the previous version)
  • Safari macOS / iOS (9.3 and below)
  • Opera Mini (all versions)

Hopefully, this new shiny CSS technique will make the Web a little bit faster for users and lighter for network infrastructure.

Author: Valerian Saliou

Valerian is the co-founder and CTO of Crisp. Perfectionist by nature, he hates the status quo. He wants to master every subject he encounters, his current one being customer interaction.