How Next.js I18n Works at scale

This is one of those posts that started out as something completely different, then switched objectives several times, and in the end led me to challenge my views on the whole topic. Initially, I wanted to give an overview of my two preferred JavaScript i18n (internationalization) libraries and why I chose one over the other for the scale.at website. Halfway into my research, I realized that the advantages I wanted to highlight were not as pronounced as I thought and were partly based on false assumptions. Consequently, this post is about the lessons learned while writing the (never published) last post, while giving an overview of how we work with i18n and static site generation (SSG) at scale.

Requirements

Naturally, scale’s own website should set a good example of our quality of work, including priorities like performance and accessibility (it might be due to our teaching and consulting background, but often when we want to learn more about a company, we check their website source code 😅). For a content-focused website including a (small-scale) blog, static site generation is our technology of choice. Also, due to our international clientele, we wanted the marketing part of our website to be available in German and English (having a multi-language blog is a noble endeavor, but for now we only write in English due to the larger audience). So in theory, the way to go is quite simple:

create some HTML template files
compile them once for each language
deploy 🎉
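To make the three steps above concrete, here is a minimal sketch of "compile once per language" in plain JavaScript. The catalogs, the renderPage template, and the output paths are all made up for illustration; this is not how scale.at is actually built, and the strings are assumed to be trusted (no HTML escaping).

```javascript
// Hypothetical per-language catalogs; a real project would keep these in separate files.
const catalogs = {
  en: { title: "Welcome", intro: "We build web applications." },
  de: { title: "Willkommen", intro: "Wir bauen Webanwendungen." },
};

// A "template" here is just a function from a locale and its strings to an HTML document.
// The strings are interpolated as-is, so they are assumed to be trusted.
function renderPage(locale, t) {
  return [
    "<!doctype html>",
    `<html lang="${locale}">`,
    `<head><meta charset="utf-8"><title>${t.title}</title></head>`,
    `<body><h1>${t.title}</h1><p>${t.intro}</p></body>`,
    "</html>",
  ].join("\n");
}

// Compile the template once per language; the result maps output paths to HTML.
function buildSite(allCatalogs) {
  const pages = {};
  for (const [locale, strings] of Object.entries(allCatalogs)) {
    pages[`${locale}/index.html`] = renderPage(locale, strings);
  }
  return pages;
}

const site = buildSite(catalogs);
console.log(Object.keys(site));
```

Writing the entries of site to disk and uploading the folder would be the "deploy" step.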

Static site generation

However, that’s not what we did. In an alternate universe, scale.at might use a dead-simple solution like 11ty, but in this universe, we live and breathe React, so we wanted to use React component-based templating. There are currently two common projects that support this:

Actually, neither of those two is a static site generator; both are general application frameworks offering static export as a feature (foreshadowing: this is where a lot of problems will come from 🌩️).

I18n

Neither of the two frameworks comes with any sort of string translation out of the box.

Personal opinion: My experience with many kinds of libraries for the web is that translation is at best mentioned as an advanced feature in the docs, or is an afterthought for which (sometimes half-baked) third-party solutions may or may not exist. I’m not wishing for opinionated i18n journeys in all products, I just noticed that multi-language content on the web is less widespread than it should be from a UX and accessibility perspective 🙄.

This basically means that any i18n library can be used, but I wanted to choose one that supports React and that I’ve already worked with, which gave me two options:

react-intl

This is a more extensive project, also offering things like date and number formatting. It comes with tooling to extract translations and replace them in code, which enables advanced features like adding comments for professional translators and automatically generating translation IDs. This sounds a bit abstract, so let’s see it in action:

intl.formatMessage({
  defaultMessage: "Hello World",
  description: "Greet the user when visiting the app"
})

Through the magic of Babel, the above gets replaced at build time by this:

l.formatMessage({id:"ys8N9Y"})

The extracted translation catalogs can be sent to a translation agency and compiled back into the app when they’re ready. One drawback of this approach is that the source language has to be edited right in the source code, which means that either the developer has to do UX writing or the UX writer has to develop (at least a little bit).
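To illustrate the idea of automatically generated translation IDs, here is a rough sketch that derives a short ID from the message content. The generateMessageId helper and the hashing scheme (SHA-512 over message and description, first six base64 characters) are assumptions for illustration; the actual scheme used by the react-intl tooling may differ, and this sketch is not expected to reproduce the ID shown above.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: derive a short, stable ID from the message content.
// Hashing both fields means that changing either one yields a new ID.
function generateMessageId({ defaultMessage, description = "" }) {
  const content = `${defaultMessage}#${description}`;
  return createHash("sha512").update(content).digest("base64").slice(0, 6);
}

// Hypothetical extraction step: collect messages into a catalog keyed by ID.
function extractCatalog(messages) {
  const catalog = {};
  for (const message of messages) {
    catalog[generateMessageId(message)] = message;
  }
  return catalog;
}

const helloWorld = {
  defaultMessage: "Hello World",
  description: "Greet the user when visiting the app",
};

const catalog = extractCatalog([helloWorld]);
console.log(catalog);
```

Because the ID is a pure function of the content, the same message used in two places ends up as a single catalog entry, and editing the source string automatically invalidates its old translations.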

react-i18next

This is the React library for i18next, which in itself is a large ecosystem of integrations, but with somewhat less integrated tooling. There is i18next-parser, which can extract translations from code, but I haven’t found a suitable alternative to react-intl’s build-time code replacement feature. This isn’t too bad for smaller projects, as long as you like reading things like this:

i18n.t("greeting.helloWorld")

Interestingly, react-intl strongly recommends against using manually assigned IDs like this, for the sake of consistency.

This way of defining translations does not offer an option to add translator comments, but adding keys by hand also has its advantages. For instance, it’s easier to avoid duplicates and use the same translation in multiple places in the project. Also, in contrast to react-intl, it allows for namespaces, which makes it possible to use separate translation files for different parts of the project, thus facilitating code-splitting. For example, "index:greeting.helloWorld" would refer to a key in a dedicated namespace for the index page.
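As a rough sketch of how a namespaced key such as "index:greeting.helloWorld" can be resolved, consider the following plain JavaScript lookup. The namespaces object and the translate function are hypothetical stand-ins, not the i18next implementation; they only mimic the "namespace:nested.key" convention described above.

```javascript
// Hypothetical catalogs, one object per namespace (e.g. loaded from index.json and common.json).
const namespaces = {
  index: { greeting: { helloWorld: "Hello World" } },
  common: { nav: { contact: "Contact" } },
};

// Resolve "namespace:path.to.key"; keys without a prefix use the default namespace.
function translate(key, defaultNamespace = "common") {
  const separatorIndex = key.indexOf(":");
  const namespace = separatorIndex === -1 ? defaultNamespace : key.slice(0, separatorIndex);
  const path = separatorIndex === -1 ? key : key.slice(separatorIndex + 1);

  // Walk the nested catalog one path segment at a time.
  let node = namespaces[namespace];
  for (const segment of path.split(".")) {
    if (node === undefined || node === null) break;
    node = node[segment];
  }

  // Fall back to the key itself when no string is found, so gaps stay visible.
  return typeof node === "string" ? node : key;
}

console.log(translate("index:greeting.helloWorld"));
console.log(translate("nav.contact"));
```

Since each namespace is a separate object, a page only needs the catalogs it actually references, which is what makes per-page translation files possible.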

Choice of technology

For about one and a half years now (at the time of writing), the scale website has been using Next.js and react-i18next in one form or another. There are currently “only” about 90 translated strings across 20 files, which makes this website comparatively small and the translations easy to keep track of. Moreover, we are in the fortunate position to be able to translate all our languages ourselves so we don’t need extensive translation management tooling.

Over the last few projects, it became clear that some sort of translation tooling like automated extraction is never a bad idea. We’ve always tidily kept track of added and removed strings, but once we added tooling, we were surprised about some of the inconsistencies it detected.

However, while working on another project, I came to appreciate the development experience of react-intl—writing source language strings directly in code increased development speed considerably while reducing context switching between writing code and maintaining translations in a separate app. This learning (and apparently too much time on my hands) inspired me to try and implement this workflow in our own website.

React Hydration

While fiddling around with Babel settings and trying to roll my own namespace-based code splitting, I made an ugly discovery: 15.5% of the resulting HTML file consisted of a script tag containing a JavaScript representation of all translation strings. Translation strings that were in the HTML already anyway. Like, a second time 🤦‍♂️.

You may go ahead and say “Duplicate strings, you say? That sounds like it could be compressed fantastically!” and you would be right, but the markup and JavaScript versions are different enough to still result in a 9.5% overhead when using Brotli compression.

After some research, I found out that this is due to the way Next.js and similar solutions work. When statically exporting a website from Next.js, content is pre-rendered to HTML, which React then hydrates at runtime: it takes the static markup and makes it interactive again by matching the pre-rendered HTML elements to its computed virtual DOM. Simply put, React expects the static markup to be exactly reproducible at runtime, and for this to work, all dynamic data (i.e. translated strings) has to be present in JavaScript land.
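The duplication can be reproduced in miniature. The sketch below builds a tiny page twice, once as pure markup and once with the same strings serialized into a script tag for hydration, and compares the overhead before and after Brotli compression. The strings, the markup, and the simplified payload shape are invented for illustration, so the resulting percentages will not match the figures measured on the real website.

```javascript
const { brotliCompressSync } = require("node:zlib");

// Hypothetical translated strings for one page.
const strings = {
  title: "Web applications that scale",
  intro: "We design, build and teach modern web development.",
  cta: "Get in touch with us",
};

// The pre-rendered markup already contains every string once.
const markup =
  `<main><h1>${strings.title}</h1><p>${strings.intro}</p>` +
  `<a href="/contact">${strings.cta}</a></main>`;

// For hydration, the same data is serialized into a script tag so that React
// can reproduce the identical output in the browser (simplified payload shape).
const hydrationScript =
  `<script id="__NEXT_DATA__" type="application/json">` +
  JSON.stringify({ props: { pageProps: { strings } } }) +
  `</script>`;

const staticOnly = `<body>${markup}</body>`;
const withHydration = `<body>${markup}${hydrationScript}</body>`;

// Relative size increase caused by the hydration data under a given encoding.
function overhead(encode) {
  const base = encode(staticOnly).length;
  const full = encode(withHydration).length;
  return (full - base) / base;
}

const rawOverhead = overhead((html) => Buffer.from(html));
const brotliOverhead = overhead((html) => brotliCompressSync(Buffer.from(html)));
console.log({ rawOverhead, brotliOverhead });
```

The JSON wrapper, property names, and script tag have no counterpart in the markup, which is one reason compression cannot remove the overhead entirely.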

After browsing several issues on GitHub and posts on StackOverflow, I found some other developers who were as concerned about this as I was, but for most purposes it’s actually really not that bad. Adding another 3 kB to my 7.8 kB HTML file is a lot relatively speaking, but on the other hand, the font we use for headlines is 29 kB 🤷‍♂️.

Going even more static

Having 15.5% of an HTML file taken up by hydration data (plus possibly some JavaScript parser overhead) would be okay on most websites, but for blog post pages, the hydration data was almost comically large (around 33% of the HTML file) because our posts are generated from Markdown files at build time; Next.js goes as far as putting the whole dynamically rendered HTML into the HTML a second time as JavaScript.

[Image: A meme featuring Xzibit laughing, captioned: I heard you like HTML so I put HTML in your HTML so you can parse while you parse]

De-Nexting Next.js

Needless to say, this discovery led me to question whether my setup fits our use case. Next.js is an amazing way to build a website or app. Even when using the static export, it still offers handy features like runtime React functionality and link preloading. However, scale.at does not even need to run JavaScript. Without it, the geometric animations on scroll will be dearly missed and the contact form will navigate to a success page instead of showing a dialog on submit, but apart from that, no JavaScript is used on the whole website. So why bother running a full React app? Enter de-Next-ification (don’t quote me on that):

export const config = {
  unstable_runtimeJS: false,
}

With this small code snippet in every page component, Next.js will completely disable all framework JavaScript at runtime. The only thing we have to forgo is the link preload, which, when repeatedly navigating back and forth between pages, made the website feel really snappy. However, with this small change, I was able to cut the website payload to a third of its initial size and, to be honest, I don’t notice a big difference between Next.js’s link preloading and navigating between HTML pages the good old way.

Adding back some JavaScript

For triggering animations on scroll as well as the dynamic form behavior, I simply added script tags containing non-minified VanillaJS to the runtime code. It may be a bit ugly, but it does the job:

<script
  dangerouslySetInnerHTML={{
    __html: `
const elements = document.getElementsByClassName("animate-me");
…
`
  }}
/>

Conclusion

From a development perspective this arguably looks like putting JavaScript into JavaScript which then gets compiled by JavaScript (no meme this time, sorry). But the result—the actual website content that gets delivered to the user—only contains the bare minimum of JavaScript.

I admit that this might not be the most beautiful setup of all time, and I would be lying if I said I hadn’t thought about porting everything to a nice lean 11ty setup 🤤. However, for now, the current setup works flawlessly, is flexible, and is fun to work with, so unless I unexpectedly get a lot of spare time on my hands, I’m going to leave it as-is for the time being.

My general takeaway regarding i18n is that choice of setup heavily depends on the use-case and that it’s often hard to get right, even on statically generated sites. However, from a frontend perspective, choice of i18n setup matters less when opting for de-Nexted Next.js (or any truly static generator), as long as it gets the translated strings right at build time.

In any case, the i18n workflow should not be underestimated and should ideally be supported with adequate tooling and infrastructure. I’d recommend using a basic i18n setup from the beginning even for single-language projects, since it’s never a bad idea to separate language strings from code (lest you end up with an unmaintainable jumble of random strings 🍝 (or you decide to add a second language later 😱)).

We’ve already used quite a few interesting i18n approaches in our projects, so stay tuned to see some of them in our future Blogfolio posts 🤩.