Skip the blurb: How do I add new language support?
An outstanding issue for a long time has been internationalising messages from kplex. The i18n branch now pushed to GitHub is the start of an effort to make kplex more friendly for those who prefer not to use English. For this I need some help.
What is being "internationalised"?
There are 3 things which would make kplex more friendly to those who prefer using languages other than English:
- Documentation in other languages
- Messages in other languages
- Configuration options in other languages
There is an unfortunate cross-dependency between configuration options and documentation. Running the documentation through Google Translate will result in some, but not all, option names being translated, and not necessarily in the most appropriate way. On the other hand, if users prefer to read the documentation in English, making configuration options language-dependent will result in the documentation not matching what a localised kplex recognises.
Consequently what is being addressed for now is purely the messages produced by kplex. Its configuration options remain in English. Contributions of documentation in other languages are always gratefully received but will fall out of date if not maintained.
How is this being done?
The standard approach to internationalising messages is to replace literal strings in the text of a program with a call to a function. That function renders a message in the language determined from the running program's environment if one is available, or a default message (often in English or the programmer's native language) if no message appropriate for the locale is available.
The function most commonly used for this is called "gettext()". However this is not part of the POSIX specification and whilst it is widely available it is not guaranteed to be installed everywhere kplex might be compiled. Notably gettext() is not a standard part of OS X.
The function used by kplex for internationalisation is catgets() which is considered by many to be clunkier than gettext() but has the advantage of being available on all POSIX compliant systems (including OS X).
How does catgets() work?
The GNU documentation for the catgets suite of functions is here: https://www.gnu.org/software/libc/manual/html_node/Message-catalogs-a-la-X_002fOpen.html#Message-catalogs-a-la-X_002fOpen
On startup kplex attempts to open a message catalogue appropriate for the computer's locale, as determined from the LANG or LC_MESSAGES environment variables. This is located in /usr/share/kplex/locale/%l/kplex.cat where "%l" is the language element from the LANG environment variable (e.g. "en", "de", "fr", etc.).
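To make the "%l" substitution concrete, here is a minimal sketch of how the language element could be cut out of a locale name and placed into the path above. The helper names language_element() and catalogue_path() are hypothetical and are not taken from the kplex source; in practice catopen() can perform this expansion itself, so this only illustrates the mapping.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the language element of a locale name such as
 * "de_DE.UTF-8" into out, i.e. everything before the first '_', '.' or '@'. */
void language_element(const char *locale, char *out, size_t len)
{
    size_t n = strcspn(locale, "_.@");

    if (len == 0)
        return;
    if (n >= len)
        n = len - 1;            /* truncate rather than overflow */
    memcpy(out, locale, n);
    out[n] = '\0';
}

/* Hypothetical helper: build the catalogue path described above for a
 * given locale name. Returns the value returned by snprintf(). */
int catalogue_path(const char *locale, char *out, size_t len)
{
    char lang[16];

    language_element(locale, lang, sizeof lang);
    return snprintf(out, len, "/usr/share/kplex/locale/%s/kplex.cat", lang);
}
```

With LANG set to "de_DE.UTF-8" this yields /usr/share/kplex/locale/de/kplex.cat.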
A message catalogue is built from a source file using the gencat program. The source file groups messages used in kplex into "sets", one set for each source file. Each message from each source file is assigned a numeric identifier in the relevant set. The catgets() function looks up the message with a given set and id in the message catalogue and returns it for kplex to print. If no message with the correct set and id is found in the catalogue, the default message supplied in the call is used instead.
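The lookup-with-fallback behaviour can be sketched with the standard catopen() and catgets() calls from <nl_types.h>. This is an illustration only, not the code kplex actually uses: the wrapper names open_catalogue() and lookup_msg() and the SET_KPLEX and MSG_NOMEM constants are assumptions made for the example.

```c
#include <locale.h>
#include <nl_types.h>

/* Hypothetical constants mirroring the set/id layout described above. */
#define SET_KPLEX 2
#define MSG_NOMEM 2

/* Open a message catalogue. A name containing '/' is treated as a path.
 * Returns (nl_catd)-1 if the catalogue cannot be opened. */
nl_catd open_catalogue(const char *name)
{
    setlocale(LC_ALL, "");      /* adopt the locale from the environment */
    return catopen(name, NL_CAT_LOCALE);
}

/* Return the translated message for (set, id), or dflt if the catalogue
 * is unavailable or holds no such message. */
const char *lookup_msg(nl_catd cat, int set, int id, const char *dflt)
{
    return catgets(cat, set, id, dflt);
}
```

A caller would then write something like fputs(lookup_msg(cat, SET_KPLEX, MSG_NOMEM, "failed to allocate memory"), stderr). Because the English default travels with every call, the program still produces sensible output when no catalogue is installed.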
Initially only an English message catalogue has been created. In isolation this is pointless: the default messages are in English and the message catalogue contains only a copy of these. However the purpose of this exercise is to create a template for others to translate.
The first few lines of the msg/en source file look like this:
$quote "
$set 1 error.c
1 ": Unknown Error"
$set 2 kplex.c
1 "Bad debug level \"%s\": Must be 0-9\n"
2 "failed to allocate memory"
3 "Usage: %s [-V] | [ -d ] [ -p ] [ -f ] [-o
The first line states that '"' is the quote character. Explicit occurrences of it inside a message must be escaped with a backslash.
The second line states that the following lines (until the next $set directive) are to be considered part of set "1". "error.c" is a comment noting that this set refers to the messages defined in the error.c source file.
The third line is message number 1 in set 1.
The fourth line starts a new set for kplex.c, which contains far, far more messages than the 4 reproduced here.
What languages do kplex messages exist in?
Initially only English. If people would like to add other languages please check out the i18n branch and submit a pull request on GitHub.
How to add support for other languages
- Check out the "i18n" branch from GitHub
- Create a message file named after the ISO 639-1 two-letter code for your chosen language in the "msg" directory, e.g. msg/eo.
- Always start the file with: $quote "
- Copy the $set lines from the en file
- For each message line you wish to translate, use the corresponding number from the "en" file
- The order of substitutions in formatting strings (e.g. "%s") may need to change for different languages. Consult the relevant source file to see the context of any message
- It is not necessary to translate every message. If catgets() does not find an appropriate message in its catalogue, the English default is printed. Any translation is better than no translation so every little helps
- Where another translator has translated only some messages, feel free to augment them
- There is no need to update the Makefile after adding a new language source file: the Makefile should generate a message catalogue and install it in the appropriate place when you do a "make install"
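On the point about substitution order: the printf family on POSIX systems accepts numbered conversions such as "%1$s" and "%2$s", which let a translated format string consume its arguments in a different order from the English one. The sketch below illustrates this. The format_msg() helper and the two-argument messages are hypothetical and do not come from the kplex catalogue, and whether a given kplex message can be reordered this way depends on how the source file calls it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a two-argument message with whatever format
 * string the catalogue supplied. The arguments are always passed in the
 * same order; a numbered format string decides where each one appears. */
int format_msg(char *buf, size_t len, const char *fmt,
               const char *name, const char *value)
{
    return snprintf(buf, len, fmt, name, value);
}
```

Calling format_msg() with "%1$s = %2$s" places name first, while a translation that needs the value first could use "%2$s <- %1$s" without any change to the calling code. Numbered and unnumbered conversions should not be mixed within one format string.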
The start of an Esperanto source file msg/eo might look like:
$quote "
$set 1 error.c
1 ": Nekonata eraro"
$set 2 kplex.c
1 "Malbona debug-nivelo \"%s\": devas esti 0-9\n"
2 "Ne atribuis memoron"
(with apologies to native Esperanto speakers everywhere...).