Top tips for Translating Open Education Resources

Posted by s.aragon on 26 May 2022 - 11:11am
Dictionary focusing on "technology" word
Photo by Joshua Hoehne

By David Pérez-Suárez – University College London.

Many educational resources have benefitted from volunteer contributions over the years. One brilliant example is The Carpentries curricula. However, in The Carpentries' case, all that great material was created in English, and as the community grew, many of us felt that such wonderful learning material could help many of our peers who were not confident in English. But how do you start translating a learning resource?

These top tips are based on experiences within The Carpentries and on what has been learnt from discussions with translation volunteers from other communities and with professionals. There are, however, many other communities that may have gone through a similar effort to translate Open Educational Resources (e.g., Wikipedia and the Wikimedia family, Programming Historian, FreeCodeCamp, Khan Academy, TED and many more). Their experiences may differ because of the changing or fixed nature of their materials, the medium used (text, videos, images, ...) and their community interests.

Please note that this guide is aimed at people who want to volunteer to translate open resources – with a permissive license – that are not available in a particular language. If you need to translate something official, then you should hire a professional translator.

1. Type of "translation"

This may be conditioned by the material that you want to translate. Is it a resource that is not going to change in the future (e.g., a novel), or is it material in continuous evolution, such as software documentation? The changing nature of the resource may determine the type of tools that help you in the process and the effort you need to put into it.

When the source is not fixed ("live" resources), you need to decide which type of translation you want to follow: a 1-to-1 translation (as when reading a translated book) or a free-form translation (as when visiting Wikipedia pages in different languages). Normally, with Open Educational Resources (OER), one tries to keep a 1-to-1 match to make the most of all the work put into the source. However, a 1-to-1 translation often doesn't make sense. For example, a lesson may rely on some cultural knowledge, such as the different types of rain, that would make no sense to learners in a dry country. This is the difference between internationalisation (abbreviated as i18n) and localisation (l10n). The first could be seen as a first step of translating the content "as it is", whereas the latter adapts it to a particular audience.

2. Define the rules of the game

If you've ever learnt a foreign language, you've probably discovered early on in your learning journey that languages don't have a one-to-one translation. Sometimes there's not even a way to translate the meaning properly (as with types of rain). However, there's much more than words to communicating. Other aspects of how sentences are composed make the final result look more or less formal, approachable, inclusive, or even acceptable. Defining the style of your translations early will help the community to achieve a more consistent (and hopefully inclusive) result. You don't need to come up with a full style guide from the beginning to start playing. As language is a living entity in continuous evolution, the rules of the game will need to adapt. However, you can start with the basics needed for your intended translation. These vary between languages, but aiming for simplicity, a friendly voice (i.e., not formal) and inclusivity is a good start.

Besides the style of the language, it's a good idea to create a glossary of common terms that appear in the source. Additionally, keep a list of terms that should be left untranslated, such as the names of institutions or communities, or technical terms that may not have an accepted translation. One of the main goals of these rules is to keep the work consistent. Without them, consistency will be harder to achieve when working with others.
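As an illustration only, a glossary and a "do not translate" list can be kept as plain data and checked automatically. The sketch below assumes an English-to-Spanish translation; the terms, the names `GLOSSARY` and `KEEP_UNTRANSLATED`, and the function `check_unit` are all hypothetical and not taken from any particular tool. It only matches exact whole words, so inflected forms (plurals, conjugations) would need extra handling.

```python
import re

# Hypothetical glossary: source term -> agreed translation.
GLOSSARY = {
    "lesson": "lección",
    "learner": "estudiante",
}

# Hypothetical list of terms that must stay exactly as in the source.
KEEP_UNTRANSLATED = ["The Carpentries", "Git"]


def contains_term(text, term):
    """Return True if `term` appears in `text` as a whole word, ignoring case."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def check_unit(source, translation):
    """Return a list of human-readable warnings for one translated unit."""
    warnings = []
    # A glossary term in the source should appear as its agreed translation.
    for term, expected in GLOSSARY.items():
        if contains_term(source, term) and not contains_term(translation, expected):
            warnings.append(f"'{term}' should be translated as '{expected}'")
    # Protected terms must be copied verbatim (case-sensitive).
    for term in KEEP_UNTRANSLATED:
        if term in source and term not in translation:
            warnings.append(f"'{term}' should be left untranslated")
    return warnings


for warning in check_unit("This lesson uses Git.", "Esta clase usa git."):
    print(warning)
```

A check like this will not judge style, but it can catch the most common consistency slips before a reviewer reads the text.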

Though this may look like a huge task, start small and don't forget to look at the rules other communities have created (Mozilla, LibreOffice, GNOME, KDE).

3. Set a translation workflow

Many open-source projects have the following workflow to accept contributions: first, a random person proposes some changes to a program; second, the changes are reviewed by the community and tested to check that they don't "break" anything; third, the review either approves the changes or proposes some modifications; fourth, a maintainer accepts the contributions and these are merged into the software. The translation workflow follows a very similar system, but with slightly different names. The contributors are known as translators, and the reviewers are called either reviewers or proofreaders. There is also tooling to test part of the style (mostly to check that words in a glossary are translated accordingly). In small teams, however, you may not want to follow this workflow or these roles strictly, and may instead allow everyone to approve each other's contributions. That's OK too! The example workflow in the documentation of Pontoon – Mozilla's localisation platform – describes how that process would work. In any case, it will probably take a couple of iterations to get a translated piece of text to sound right.

Another part of the workflow is whether to follow a "release" schedule. When translating software, this becomes more important, as you would like to make your program available to everyone as soon as a new version appears. Since most of the larger software projects have a defined release schedule for their products, the translators can plan when they will work on these projects. Though educational resources rarely require a fixed release schedule in multiple languages at once, it's a good plan – especially if the source is always evolving – to set a translation schedule. This requires you to "version" the source in some way. For textbooks, we normally refer to these versions as "editions". Using commit numbers or tags as provided by version control software – as we do to enable reproducibility in our software projects – is a good way to refer to the point in time that the translation is based on.
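One possible way to keep track of this, sketched here under assumptions, is to store next to each translated file the version of the source it was translated from. The class `TranslationRecord`, the function `outdated`, the file paths and the tag names are hypothetical; in a Git repository the version string could come from a tag or from a commit hash (for example, the output of `git rev-parse HEAD`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationRecord:
    """Which version of the source a translated file is based on."""
    path: str            # translated file, e.g. "es/01-intro.md"
    language: str        # language code of the translation
    source_version: str  # commit hash or tag of the source it was translated from


def outdated(records, current_source_version):
    """Return the records whose source version differs from the current one."""
    return [r for r in records if r.source_version != current_source_version]


records = [
    TranslationRecord("es/01-intro.md", "es", "v1.0"),
    TranslationRecord("es/02-loops.md", "es", "v2.0"),
]

# List the translations that were made against an older version of the source.
for record in outdated(records, "v2.0"):
    print(f"{record.path} is based on {record.source_version}")
```

This deliberately simple comparison flags a file even if its source did not change between the two versions, so a real workflow would probably also compare the source text itself.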

4. Find the right tool for the work

There are a variety of tools and formats available to help with translations. Most of them work on the same principle: the source text is broken into units to be translated, then each unit is translated and reviewed, and, finally, the translations are merged following the same structure as the source. A common tool for the first and last steps is gettext (there are also multiple tools that convert formats – like Markdown, LaTeX, reStructuredText – into Portable Object (PO) files, the gettext format). There are multiple online platforms that enable real-time collaboration on translations and provide the tooling to manage the translation workflow. Some of these online tools, like Weblate, Pontoon and Pootle, are open source and you can host them yourself, whereas others, such as Crowdin or Transifex, are free for open source projects. Of course, not everything needs to be done in real time! There are also excellent tools for working "off-line" on your desktop. OmegaT and Poedit are available for the three main operating systems, and KDE's Lokalize and GNOME Translation Editor are good options on the Linux desktop.
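The principle of splitting, translating and merging can be sketched in a few lines. This is not how gettext or any of the platforms above is implemented; it is a minimal illustration that assumes paragraphs (separated by blank lines) are the translation units, with hypothetical function names `split_units` and `merge_units` and made-up example text. Units without a translation fall back to the source text.

```python
def split_units(source):
    """Break the source into translation units, one per paragraph."""
    return [p.strip() for p in source.split("\n\n") if p.strip()]


def merge_units(units, translations):
    """Rebuild the document in the source's order, falling back to the
    source text for units that have no translation yet."""
    return "\n\n".join(translations.get(unit, unit) for unit in units)


source = "# Introduction\n\nWelcome to the lesson.\n\nLet's get started."

# Translations collected so far, keyed by the source unit.
translations = {
    "# Introduction": "# Introducción",
    "Welcome to the lesson.": "Te damos la bienvenida a la lección.",
}

units = split_units(source)
print(merge_units(units, translations))
```

Keying translations by the source unit means a unit that is unchanged in a new version of the source keeps its translation, while an edited unit shows up as untranslated again.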

No matter the tool, one thing to keep in mind is how it works. For example, many of the localisation tools were developed to easily provide translations for software user interfaces. In these, it's common to translate lots of individual words that form part of the graphical menus, or short messages that are shown to the users. This is different from translating documentation or tutorials, where you need to read a whole paragraph or section to understand the context. In that respect, be aware of how the conversion tool you are using is "chunking" the source text. For example, single new lines in the source text of markup languages – e.g., HTML or LaTeX – don't carry any meaning in the output. These, however, can be picked up by the conversion tools as a good place to split the text, making it harder for the translator to understand the context.
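To make the chunking problem concrete, the sketch below compares two ways of splitting the same hard-wrapped text. The example text and both functions (`chunk_by_line`, `chunk_by_paragraph`) are hypothetical and do not reproduce the behaviour of any specific conversion tool.

```python
source = (
    "Version control keeps a record\n"
    "of every change you make,\n"
    "so you can go back in time.\n"
    "\n"
    "It also helps you collaborate."
)


def chunk_by_line(text):
    """Treat every non-empty line as a unit (loses sentence context)."""
    return [line for line in text.splitlines() if line.strip()]


def chunk_by_paragraph(text):
    """Treat blank-line-separated paragraphs as units, re-joining
    the hard-wrapped lines inside each paragraph."""
    paragraphs = text.split("\n\n")
    return [" ".join(p.split()) for p in paragraphs if p.strip()]


print(chunk_by_line(source))       # four units, three of them sentence fragments
print(chunk_by_paragraph(source))  # two units, each a complete sentence
```

With line-based chunking the translator sees "of every change you make," on its own, which is hard to translate well in languages with a different word order. Paragraph-based chunking keeps the full sentence together.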

If you need to translate a document that's not in a format understood by gettext, such as a Word document, then try opening it with OmegaT. It will do the conversion and give you a nice interface for the translation. If you need to collaborate with others, OmegaT can work with Git repositories!

5. Community

Unless you are a professional translator, in which case you probably don't need these tips, you will be translating as a hobby and with a limited amount of time. Therefore, look for other translators! There are a lot of people out there happy to help with your translation project, so make calls on social media, through your network and in other translation groups – such as the French traduc. As with managing communities around open source projects, you will need to take care of your translator friends. For example: have a place to discuss and coordinate efforts (a simple mailing list or chat room can do!); organise translate@thons to work all at the same time (they can be online and cover multiple time zones); train your community to use the tools by hosting onboarding sessions; and discuss the style guide from time to time to check that it still applies.

Takeaway message

Doing translations is a fun game – and doing it with a group of people is more fun. Your brain may hurt at times, but in a good way! Translations can always be done better, but don't let aiming for perfection stop you from distributing your result. With the tools available, a style guide and the help of your community, you will manage to get a result good enough for consumption. Aiming for consistency will make the text more readable, and the flaws won't be too noticeable.


This guide hasn't covered some other interesting topics:

  • translation of non-text media (images, videos and audio resources),
  • the role of automatically generated translations, and
  • in the case of "live" resources, how to propagate contributions to the source coming from the translated material.

Maybe they will be covered in future guides! Keep an eye on the blog.

Want to discuss this post with us? Send us an email or contact us on Twitter @SoftwareSaved.  
