
ADR 14: Internationalization

Context

To provide study materials to all of our students, we need to translate them into a form that every student can understand. I18n can take advantage of the virtualized file system to filter input files.

Decision

Instead of the existing Python gettext module, we’ve decided to design and implement a custom solution driven by our needs. There are four levels of translation within it4kt-builder:

  1. Course files level, where every file in the content directory can be translated into multiple languages.
  2. Course configuration level, which holds translations for strings unique to the course, e.g. the course title, the names and URLs of additional links, and so on.
  3. Builder level, which consists of translations for common words used throughout the builder.
  4. Theme level, because every theme can have its own specific words.

The main translation settings are read from the course configuration and look like this:

translations:
  languages: [sk, en, ru]
  fallback: sk

From consultations with maintainers and users, we defined the following goals:

  • Support multiple languages.
  • Support a fallback language, so that a missing translation of a file can be filled in by the provided fallback language.
  • Support edge cases where some files don’t need to be translated and should be skipped during conversion for a specific language.

To achieve these goals, we’ve decided to use the following flow:

  • The first language in the list of available languages is the default language. Being the default language means:
      • A default language file has no language suffix (for example index.md instead of index.{LANG}.md).
      • Any file that has no translation with the requested language suffix is resolved using the default language file, if it exists. If it does not exist, i18n skips the file completely.
      • URLs for content in the default language have no suffix, and the content is located in the root of the course output/ directory.
  • The user is free to set fallback to any language from the list of available languages.
  • When fallback is set:
      • If a file with the currently set language suffix does not exist, the i18n manager tries to find a file with the fallback language suffix.
      • If a file with the fallback language suffix does not exist, the i18n manager tries to use the default language file.
  • When fallback is not set:
      • We search for a file with no suffix. If it doesn’t exist, we exclude the file from conversion.
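
The settings rules above can be sketched in Python as follows. This is only an illustration, not the builder’s actual code: the names TranslationSettings and load_translation_settings are hypothetical, and the sketch assumes the course configuration has already been parsed from YAML into a plain dictionary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TranslationSettings:
    """Hypothetical container for the `translations` section of the course config."""

    languages: tuple
    fallback: Optional[str] = None

    @property
    def default_language(self) -> str:
        # The first language in the list is treated as the default language.
        return self.languages[0]


def load_translation_settings(config: dict) -> TranslationSettings:
    """Read and validate the `translations` section of an already parsed config."""
    section = config.get("translations", {})
    languages = tuple(section.get("languages", ()))
    if not languages:
        raise ValueError("at least one language must be configured")

    # The fallback is optional, but when set it must be one of the listed languages.
    fallback = section.get("fallback")
    if fallback is not None and fallback not in languages:
        raise ValueError(f"fallback {fallback!r} is not one of {languages}")

    return TranslationSettings(languages, fallback)
```

With the configuration shown earlier, default_language would be sk, because sk is listed first.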

During file localization, the file translation flow can be summarized by the following priority order:

  1. Search for a file with the suffix of the language currently being processed (with the en language set, we search for index.en.md).
  2. If a fallback language is set, search for the fallback language file (with the sk fallback language, we search for index.sk.md).
  3. Search for the default language file (a file without any suffix, e.g. index.md).
  4. If no file has matched so far, skip the file.

These rules mean that the user can exclude files from conversion completely, or force a fallback file to take their place in the conversion instead.
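
A minimal sketch of this priority order, assuming the set of existing paths is known up front. The function names localized_name and resolve are hypothetical and are not taken from the builder.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def localized_name(path: str, language: str) -> str:
    """Insert a language suffix before the extension: index.md -> index.en.md."""
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{language}{p.suffix}"))


def resolve(path: str, language: str, available: Iterable[str],
            fallback: Optional[str] = None) -> Optional[str]:
    """Return the source file to use for `path` in `language`, or None to skip it."""
    existing = set(available)

    # 1. File with the suffix of the language currently being processed.
    candidates = [localized_name(path, language)]
    # 2. File with the fallback language suffix, only if a fallback is set.
    if fallback is not None:
        candidates.append(localized_name(path, fallback))
    # 3. Default language file, which carries no suffix.
    candidates.append(path)

    for candidate in candidates:
        if candidate in existing:
            return candidate
    # 4. Nothing matched, so the file is skipped.
    return None
```

Under these assumptions, a course that contains only about.sk.md would use that file for the en build when fallback is sk, and would skip about.md entirely when no fallback is set.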

We’ve also decided that every file in the conversion process can be translated, including images, in-app video and audio, etc.

We’ve decided that this process takes place in the file system virtualization layer. This means that the virtualized file system tree contains only the paths that passed through the localization algorithm.
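
One way to picture the filtered tree is as a mapping from each unsuffixed (logical) path to the source file chosen for a given language. The sketch below is an assumption about how such filtering could look, not a description of the builder’s virtual file system. The names logical_path and build_tree are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, Optional


def logical_path(path: str, languages: Iterable[str]) -> str:
    """Strip a language suffix if present: lectures/index.en.md -> lectures/index.md."""
    p = PurePosixPath(path)
    stem = PurePosixPath(p.stem)  # e.g. "index.en"
    if stem.suffix.lstrip(".") in set(languages):
        return str(p.with_name(stem.stem + p.suffix))
    return path


def build_tree(files: Iterable[str], language: str, languages: Iterable[str],
               fallback: Optional[str] = None) -> Dict[str, str]:
    """Map each logical path to the source file used when building `language`."""
    existing = set(files)
    languages = list(languages)
    tree: Dict[str, str] = {}
    for logical in sorted({logical_path(f, languages) for f in existing}):
        p = PurePosixPath(logical)
        # Priority: current language suffix, fallback suffix (if set), no suffix.
        suffixes = [language] + ([fallback] if fallback is not None else [])
        candidates = [str(p.with_name(f"{p.stem}.{s}{p.suffix}")) for s in suffixes]
        candidates.append(logical)
        source = next((c for c in candidates if c in existing), None)
        if source is not None:  # paths that matched nothing are left out of the tree
            tree[logical] = source
    return tree
```

Because the lookup works on paths only, the same rule applies to images and other media as it does to Markdown files.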

So far we have discussed file translations. For completeness, the following lines are dedicated to string translations. The i18n manager contains two methods for translating a string, chosen according to the string’s origin. In the course configuration file, we decided to define translations as shown below:

course:
  title:
    sk: Názov predmetu
    en: Course Name
    ru: Название предмета

folders:
  - path: lectures
    title:
      sk: Prednášky
      en: Lectures
      ru: Лекции

Builder and theme level translations, on the other hand, are defined by a language identifier, a key identifier, and the translated string. The translation file for the builder namespace looks like this:

sk:
  title: Názov
  author: Autor
  week: Týždeň
  publicationWeek: Publikačný týždeň
en:
  title: Title
  author: Author
  week: Week
  publicationWeek: Publication Week
ru:
  title: Название
  author: Автор
  week: Неделя
  publicationWeek: Неделя публикации

We’ve decided to use a different format for translations in the course configuration than in builder or theme translation files in order to keep the course configuration file as simple as possible.
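
The difference between the two layouts can be illustrated with two small lookup helpers. This is a sketch only: the function names course_string and namespace_string are hypothetical, and returning None for a missing translation is an assumption, since this record does not say how missing strings are handled.

```python
from typing import Mapping, Optional


def course_string(entry: Mapping[str, str], language: str) -> Optional[str]:
    """Course configuration layout: each field maps language -> text."""
    return entry.get(language)


def namespace_string(catalog: Mapping[str, Mapping[str, str]],
                     language: str, key: str) -> Optional[str]:
    """Builder or theme layout: language -> key -> text."""
    return catalog.get(language, {}).get(key)


# Data copied from the examples above, already parsed into dictionaries.
course_title = {"sk": "Názov predmetu", "en": "Course Name", "ru": "Название предмета"}
builder_catalog = {
    "sk": {"title": "Názov", "author": "Autor"},
    "en": {"title": "Title", "author": "Author"},
}

english_title = course_string(course_title, "en")
slovak_author = namespace_string(builder_catalog, "sk", "author")
```

In the course configuration the translations sit next to the field they belong to, while builder and theme files group every key under its language.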

Status

Accepted

Consequences

Internationalization is implemented so that it covers as many edge cases as possible while remaining fairly easy to reason about. Every user can choose which content files are translated, which fall back to the fallback language, and which use the default language instead.

The use of a global variable introduces an implicit dependency that can be hard to trace.