
Suleras:I18n navigation

This page gives a technical description of MediaWiki's internationalisation and localisation (i18n) system and offers hints that coders should be aware of.

Translation resources

Translatewiki.net

Translatewiki.net supports in-wiki translation of the complete interface. If you would rather avoid the technicalities of editing files, using Subversion, and creating patches, this is the place for you.

Finding messages

Manual:System message explains how to find a particular string you want to translate; in particular, note the new qqx trick.

Subversion and Bugzilla

Only a few languages (some Chinese languages) are maintained by translators who commit directly to the MediaWiki SVN repository. All new efforts should go through translatewiki.net, introduced above.

I18n mailing list

You can subscribe to the i18n list; at the moment it is very low-traffic.

Code structure

First, you have a Language object in Language.php. This object contains all the localisable message strings, as well as other important language-specific settings and custom behavior (uppercasing, lowercasing, printing dates, formatting numbers, etc.)

The object is constructed from two sources: subclassed versions of itself (classes) and Message files (messages).

There's also the MessageCache class, which handles input of text via the MediaWiki namespace. And there are the wfMsg*() functions; GlobalFunctions.php contains a large amount of message retrieval code.

General use (for developers)

Language objects

There are two ways to get a language object. You can use the globals $wgLang and $wgContLang for user interface and content language respectively. For an arbitrary language you can construct an object by using Language::factory( 'en' ), by replacing en with the code of the language. The list of codes is in languages/Names.php.

Language objects are needed for doing language specific functions, most often to do number, time and date formatting, but also to construct lists and other things. There are multiple layers of caching and merging with fallback languages, but the details are irrelevant in normal use.

Using messages

MediaWiki uses a central repository of messages which are referenced by keys in the code. This is different from, for example, Gettext, which just extracts the translatable strings from the source files. The key-based system makes some things easier, like refining the original texts and tracking changes to messages. The drawback is of course that the list of used messages and the list of source texts for those keys can get out of sync. In practice this isn't a big problem; sometimes extra messages which are no longer used still stay up for translation.

The message system in MediaWiki is quite complex, a bit too complex. One of the reasons for this is that MediaWiki is a web application. Messages can go through all kinds of processing. The three major ones covering almost all cases are:

  1. as-is, no processing at all
  2. light wiki-parsing, parserfunction references starting with {{ are replaced with their results
  3. full wiki-parsing

Case 1. is for processing, not really for user visible messages. Light wiki-parsing should always be combined with html-escaping.

Recommended ways

Longer messages that are not used hundreds of times on a page:

  • OutputPage::addWikiMsg
  • OutputPage::wrapWikiMsg
  • wfMessage()

OutputPage methods parse messages and add them directly to the output buffer. wfMessage can be used when a message should not be added to the output buffer. ->parse() removes enclosing HTML tags from the parsed result, usually <p>..</p>, but can generate invalid code if the parsed result has no single root tag, for example <p>..</p><p>..</p>. Usage examples:

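# formatNum() is a Language method, so call it on a Language object such as $wgLang
$out->addWikiMsg( 'foobar', $wgLang->formatNum( count( $items ) ) );
# double quotes so that \n is an actual newline; $1 stays literal
$out->wrapWikiMsg( "<div class='baz'>\n$1\n</div>", array( 'foobar', $user->getName() ) );
$text = wfMessage( 'foobar', $language->date( $ts ) )->parse();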

Other messages with light wiki-parsing can use wfMsg and wfMessage with ->text(). wfMessage should always be used if the message has parts that depend on linguistic information, like {{PLURAL:$1}}. Do not use wfMsg or wfMsgHtml for those kinds of messages! They seem to work but are broken.

$out = Xml::submitButton( wfMsg( 'foobar' ) ); # no linguistic information
$out = Xml::label( wfMessage( 'foobar', $wgLang->formatNum( $count ) )->text() ); # uses plural on $count

Some messages need mixed escaping and parsing, most commonly when raw links that should not be escaped are used in messages. The preferred way is to use wfMessage with ->rawParams() for the affected parameters. Be especially wary of using wfMsgHtml: it only escapes the message, not the parameters. This has caused at least one XSS vulnerability in MediaWiki.

Short list of functions to avoid:

  • wfMsgHtml (don't use unless you really want unescaped parameters)
  • wfMsgWikiHtml (breaks up linguistic functions, as does wfMsg)
  • OutputPage::parse and parseInline, addWikiText (if you know the message, use addWikiMsg or wrapWikiMsg)

Remember that almost all Xml:: and Html::-functions escape everything fed into them, so avoid double-escaping and parsed text with those.

Using messages in JavaScript

To use messages on the client side, we need to use ResourceLoader to make sure that the messages are available there first. To do this, list the messages to be exported to the client side in your ResourceLoader module definition.

Example:

$wgResourceModules['ext.foobar.core'] = array(
    'scripts' => array( 'resources/ext.foobar.js' ),
    'styles' => 'resources/ext.extension.css',
    'localBasePath' => $dir,
    'remoteExtPath' => 'FooBar',
    'messages' => array(
        'message-key-foo',
        'message-key-bar',
    ),
);

The messages defined in the above example, message-key-foo and message-key-bar, will be available on the client side and can be accessed with mw.msg( 'message-key-foo' ). See the example given below:

$( '<a>' ).prop( 'href', '#' ).text( mw.msg( 'message-key-foo') );

We can also pass dynamic parameters to the message (i.e. the values for $1, $2, etc.) as shown below.

$( '<a>' ).prop( 'href', '#' ).text( mw.msg( 'message-key-foo', value1, value2 ) );

In the above examples, note that the message should be defined in an i18n.php file. If the message key is not found in any i18n.php file, the result of mw.msg will be the message key in angle brackets, like <message-key-foo>.
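The substitution and fallback behaviour described here can be illustrated with a small stand-alone sketch. This is a toy model, not the real mw.msg: the `messages` store, the `msg` function and the message text are all hypothetical and exist only for this example.

```javascript
// Toy stand-in for the client-side message store; not the real mw.msg.
var messages = {
	'message-key-foo': 'Hello, $1! Welcome to $2.'
};

function msg( key ) {
	var params = Array.prototype.slice.call( arguments, 1 );
	if ( !Object.prototype.hasOwnProperty.call( messages, key ) ) {
		// Unknown keys come back wrapped in angle brackets.
		return '<' + key + '>';
	}
	// Replace $1, $2, ... with the matching positional parameter.
	return messages[ key ].replace( /\$(\d+)/g, function ( match, n ) {
		var value = params[ n - 1 ];
		return value === undefined ? match : String( value );
	} );
}

console.log( msg( 'message-key-foo', 'Alice', 'this wiki' ) );
console.log( msg( 'message-key-missing' ) );
```

The first call prints the message with both parameters filled in; the second prints the key in angle brackets because no such message was defined.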

When using localization messages, always make sure they are properly escaped, to prevent potential HTML injection as well as malformed markup caused by special characters.

  • If using jQuery's .html, use .text( mw.msg( ... ) ) instead of .html( mw.msg( ... ) ). jQuery will then set the element's inner text value instead of raw HTML. This is the best option and is also the fastest, because .text() goes almost straight into the browser and avoids escaping altogether.
  • If using jQuery's .append, escape manually .append( '<li>' + mw.message( 'example' ).escaped() + '</li>' );
  • If manually building an html string, escape manually by creating a message object and calling .escaped() (instead of the mw.msg shortcut, which does mw.message(key).plain() ):
    '<foo>' + mw.message( 'example' ).escaped() + '</foo>';
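To see why escaping matters when building an HTML string by hand, consider the following stand-alone sketch. The `escapeHtml` helper is hypothetical and only approximates what .escaped() does to the message text; the translated string is invented for the example.

```javascript
// Hypothetical helper approximating HTML escaping of message text.
function escapeHtml( s ) {
	return String( s )
		.replace( /&/g, '&amp;' )
		.replace( /</g, '&lt;' )
		.replace( />/g, '&gt;' )
		.replace( /"/g, '&quot;' )
		.replace( /'/g, '&#039;' );
}

// Imagine this text came from a translation edited on a wiki.
var translated = 'Save & <close>';

var unsafe = '<li>' + translated + '</li>'; // markup is now malformed
var safe = '<li>' + escapeHtml( translated ) + '</li>';

console.log( unsafe );
console.log( safe );
```

The unescaped variant injects a stray `<close>` element into the page, while the escaped variant renders the translation as plain text.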

PLURAL and GENDER support in JavaScript

From MediaWiki 1.19 onwards, messages for JavaScript can contain PLURAL and GENDER directives. This feature is optional, and extensions which require it should define an additional dependency on mediawiki.jqueryMsg in the ResourceLoader module definition.

If you have a message, say 'message-key-plural-foo' => 'There {{PLURAL:$1|is|are}} $1 {{PLURAL:$1|item|items}}', you can use it in JavaScript as follows:

mw.msg( 'message-key-plural-foo', count );
// returns 'There is 1 item' if count = 1
// returns 'There are 6 items' if count = 6

If you have a message, say 'message-key-gender-foo' => '{{GENDER:$1|he|she}} created an article', you can use it in JavaScript as follows:

mw.msg( 'message-key-gender-foo', 'male' ) ; // returns 'he created an article'
mw.msg( 'message-key-gender-foo', 'female' ) ; // returns 'she created an article'

Instead of passing the gender directly, we can pass a user object, i.e. a mw.User object with a gender attribute, to mw.msg. For example, the current user object:

var user = mw.user; //current user
mw.msg( 'message-key-gender-foo', user ) ; // The message returned will be based on the gender of the current user.

If the gender passed to mw.msg is invalid or unknown, gender neutral form will be used as defined for each language.

The keywords GENDER, PLURAL are case insensitive.
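How such switches are resolved can be sketched with a toy expander. This is an illustration only: it hard-codes the English plural rule, accepts the gender as a plain string, and is not the mediawiki.jqueryMsg parser. The function names `expandSwitches` and `render` are hypothetical.

```javascript
// Toy expander for {{PLURAL:...}} and {{GENDER:...}}; English rules only.
function expandSwitches( text ) {
	// The "i" flag makes the keywords case insensitive.
	return text.replace( /\{\{(plural|gender):([^|}]*)\|([^}]*)\}\}/gi,
		function ( match, keyword, value, rest ) {
			var forms = rest.split( '|' );
			if ( keyword.toLowerCase() === 'plural' ) {
				// English: first form for exactly 1, last form otherwise.
				return Number( value ) === 1 ? forms[ 0 ] : forms[ forms.length - 1 ];
			}
			// GENDER: male form, female form, then an optional neutral form.
			if ( value === 'male' ) {
				return forms[ 0 ];
			}
			if ( value === 'female' ) {
				return forms[ 1 ] !== undefined ? forms[ 1 ] : forms[ 0 ];
			}
			return forms[ 2 ] !== undefined ? forms[ 2 ] : forms[ 0 ];
		} );
}

function render( template, params ) {
	// Substitute $N first so the switches see the actual values.
	var filled = template.replace( /\$(\d+)/g, function ( match, n ) {
		var v = params[ n - 1 ];
		return v === undefined ? match : String( v );
	} );
	return expandSwitches( filled );
}

console.log( render( 'There {{PLURAL:$1|is|are}} $1 {{plural:$1|item|items}}', [ 6 ] ) );
console.log( render( '{{GENDER:$1|he|she|they}} created an article', [ 'unknown' ] ) );
```

Real implementations delegate the choice of form to language-specific rules, which is why the number or user must always be passed as a parameter.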

GRAMMAR in JavaScript

From MediaWiki 1.20 onwards, messages for JavaScript can contain the GRAMMAR directive. This feature is optional, and extensions which require it should define an additional dependency on mediawiki.language.data in the ResourceLoader module definition.

The static grammar form rules can be defined in the $wgGrammarForms global. The dynamic, language-specific grammar rules in PHP have been ported to JavaScript. Once the dependency mediawiki.language.data is added, the mw.msg method can be used as usual to parse messages containing {{GRAMMAR:N|word}}, where N is the name of the grammatical form needed and word is the word being operated on. For more about GRAMMAR, see #…on use context inside sentences via GRAMMAR.

Adding new messages

  1. Decide a name (key) for the message. Try to follow global or local conventions for naming. For extensions, use a standard prefix, preferably the extension name in lower case, followed by a hyphen ("-"). Try to stick to lower case letters, numbers and hyphens in message names; most other characters are either less practical or do not work at all. See also Manual:Coding conventions#Messages.
  2. Make sure that you are using suitable handling for the message (parsing, {{-replacement, escaping for HTML, etc.)
  3. Add it to languages/messages/MessagesEn.php (core) or your extension's i18n file under 'en'.
  4. Take a pause and consider the wording of the message. Is it as clear as possible? Can it be misunderstood? Ask for comments from other developers or from localizers if possible. Follow the #internationalization hints.
  5. Add documentation to MessagesQqq.php or your extension's i18n file under 'qqq'. Read more about #message documentation.
  6. If you added a message to core, add the message key also to maintenance/language/messages.inc (also add the section if you created a new one). This file will define the order and formatting of messages in all message files.
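The naming convention from step 1 can be checked mechanically. The following is a minimal sketch under the assumption that keys consist of lower case letters, digits and hyphens and start with the lower-cased extension name; `isConventionalKey` is a hypothetical helper, not part of MediaWiki.

```javascript
// Hypothetical check for the naming convention described in step 1.
function isConventionalKey( key, extensionName ) {
	var prefix = extensionName.toLowerCase() + '-';
	// Lower case letters and digits, in groups separated by single hyphens.
	var wellFormed = /^[a-z0-9]+(-[a-z0-9]+)*$/.test( key );
	return wellFormed && key.indexOf( prefix ) === 0;
}

console.log( isConventionalKey( 'foobar-save-button', 'FooBar' ) ); // true
console.log( isConventionalKey( 'FooBar_SaveButton', 'FooBar' ) ); // false
```

A check like this could run in a test suite over the keys of an extension's 'en' array to catch accidental upper case letters or underscores early.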

Removing existing messages

  1. Remove it from MessagesEn.php. Don't bother with other languages - updates from translatewiki.net will handle those automatically.
  2. Remove it from maintenance/language/messages.inc

Step 2 is not needed for extensions, so you only have to remove your English language messages from ExtensionName.i18n.php.

Changing existing messages

  1. Consider updating the message documentation (see Adding new messages).
  2. Change the message key if old translations are not suitable for the new meaning. This also includes changes in message handling (parsing, escaping). If in doubt, ask in #mediawiki-i18n or in the Support page at translatewiki.net.
  3. If the extension is supported by translatewiki, please only change the English source message and/or key. The internationalisation and localisation team will take care of updating the translations, marking them as outdated, cleaning up the file or renaming keys where possible. This also applies when you're only changing things like HTML tags that you could change in other languages without speaking those languages.

Localizing namespaces and special page aliases

Namespaces and special page names (e.g. RecentChanges in Special:RecentChanges) are also translatable.

Namespaces

To allow custom namespaces introduced by your extension to be translated, create a MyExtension.namespaces.php file that looks like this:

<?php
/**
 * Translations of the namespaces introduced by MyExtension.
 *
 * @file
 */

$namespaceNames = array();

// For wikis where the MyExtension extension is not installed.
if ( !defined( 'NS_MYEXTENSION' ) ) {
    define( 'NS_MYEXTENSION', 2510 );
}

if ( !defined( 'NS_MYEXTENSION_TALK' ) ) {
    define( 'NS_MYEXTENSION_TALK', 2511 );
}

/** English */
$namespaceNames['en'] = array(
    NS_MYEXTENSION => 'MyNamespace',
    NS_MYEXTENSION_TALK => 'MyNamespace_talk',
);

/** Finnish (Suomi) */
$namespaceNames['fi'] = array(
    NS_MYEXTENSION => 'Nimiavaruuteni',
    NS_MYEXTENSION_TALK => 'Keskustelu_nimiavaruudestani',
);

Then load the namespace translation file in MyExtension.php via $wgExtensionMessagesFiles['MyExtensionNamespaces'] = dirname( __FILE__ ) . '/MyExtension.namespaces.php';

When a user installs MyExtension on their Finnish (fi) wiki, the custom namespace will be translated into Finnish magically, and the user doesn't need to do a thing!
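The lookup that makes this work can be pictured as follows. This is a toy model in JavaScript that mirrors the $namespaceNames array above; falling back directly to English is a simplifying assumption, since MediaWiki's real fallback chains are more elaborate, and `namespaceName` is a hypothetical function.

```javascript
// Toy lookup mirroring the $namespaceNames array above.
var NS_MYEXTENSION = 2510;
var NS_MYEXTENSION_TALK = 2511;

var namespaceNames = {
	en: { 2510: 'MyNamespace', 2511: 'MyNamespace_talk' },
	fi: { 2510: 'Nimiavaruuteni', 2511: 'Keskustelu_nimiavaruudestani' }
};

function namespaceName( lang, id ) {
	// Assumption: languages without a translation fall back to English.
	var names = namespaceNames[ lang ] || namespaceNames.en;
	return names[ id ] !== undefined ? names[ id ] : namespaceNames.en[ id ];
}

console.log( namespaceName( 'fi', NS_MYEXTENSION ) );
console.log( namespaceName( 'de', NS_MYEXTENSION_TALK ) );
```

The Finnish wiki gets its translated name, while a language without an entry falls back to the English one.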

Special page aliases

Create a new file for the special page aliases in this format:

<?php
/**
 * Aliases for the MyExtension extension.
 *
 * @file
 * @ingroup Extensions
 */

$aliases = array();

/** English */
$aliases['en'] = array(
    'MyExtension' => array( 'MyExtension' )
);

/** Finnish (Suomi) */
$aliases['fi'] = array(
    'MyExtension' => array( 'Lisäosani' )
);

Then load it in the extension's setup file like this: $wgExtensionAliasesFiles['MyExtension'] = dirname( __FILE__ ) . '/MyExtension.alias.php';

When your special page code uses either SpecialPage::getTitleFor( 'MyExtension' ) or $this->getTitle() (in the class that provides Special:MyExtension), the localized alias will be used, if it's available.

Message parameters

Some messages take parameters. They are represented by $1, $2, $3, … in the (static) message texts, and replaced at run time. Typical parameter values are numbers ("Delete 3 versions?"), or user names ("Page last edited by $1"), page names, links, and so on, or sometimes other messages. They can be of arbitrary complexity.

Switches in messages…

Parameter values at times influence the exact wording or grammatical variations in messages. Rather than resorting to ugly constructs like "$1 (sub)page(s) of his/her userpage", we make switches depending on values known at run time. The (static) message text then supplies each of the possible choices in a list, preceded by the name of the switch and a reference to the value making a difference. This closely resembles the way parser functions are called in MediaWiki. Several types of switches are available. These only work if you do full parsing or {{-transformation for the messages.

…on numbers via PLURAL

MediaWiki supports plurals, which makes for a nicer-looking product. For example:

'undelete_short' => 'Undelete {{PLURAL:$1|one edit|$1 edits}}',

Language-specific implementations of PLURAL: are found in files such as LanguageFr.php (for French; uses singular for 0, code fr) or LanguageCs.php (for Czech, Polish and some other Slavic languages, code cs). You should not expect PLURAL to handle fractional numbers (like 44.6); see bugzilla:28128.

…on user names via GENDER

# languages/messages/MessagesEn.php
'blocklog-showlog' => 'This user has been blocked previously.'
 
# languages/messages/MessagesRu.php
'blocklog-showlog' => '{{GENDER:$1|Этот участник уже блокировался|Эта участница уже блокировалась}} ранее.'

If you refer to a user in a message, pass the user name as a parameter to the message and add a mention in the message documentation that gender is supported.

…on use context inside sentences via GRAMMAR

Grammatical transformations for agglutinative languages are also available. One example is Finnish, where it was an absolute necessity to make language files site-independent, i.e. to remove the Wikipedia references. In Finnish, "about Wikipedia" becomes "Tietoja Wikipediasta" and "you can upload it to Wikipedia" becomes "Voit tallentaa tiedoston Wikipediaan". Suffixes are added depending on how the word is used, plus minor modifications to the base. There is a long list of exceptions, but since only a few words needed to be translated, such as the site name, we didn't need to include it.

MediaWiki has grammatical transformation functions for over 20 languages. Some of these are just dictionaries for Wikimedia site names, but others have simple algorithms which will fail for all but the most common cases.

Even before MediaWiki had arbitrary grammatical transformation, it had a nominative/genitive distinction for month names. This distinction is necessary if you wish to substitute month names into sentences.

Filtering special characters in parameters and messages

The other (much simpler) issue with parameter substitution is HTML escaping. Despite being much simpler, MediaWiki does a pretty poor job of it. We have a plethora of poorly-named wfMsg*() functions, including the multitasking wfMsgExt(), with lots of ways to slip up and let through unescaped user input. There may be work done to clean this up at some stage in the future.

Message documentation

There is a pseudo-language with the code qqq (message documentation). It is one of the ISO 639 codes reserved for private use. There we do not keep translations of each message, but collect English sentences about each message: telling us where it is used, giving hints about how to translate it, enumerating and describing its parameters, linking to related messages, etc. In translatewiki.net, these hints are shown to translators when they edit messages.

Programmers must play a role in message documentation. Whenever a message is added to the software, a corresponding qqq entry must be added as well; revisions which don't do so are marked fixme until the documentation is added (if it's only a couple messages, it might be easier for the developer to add via the translatewiki.net translation interface). Useful information includes:

  1. message handling (parsing, escaping, plain text)
  2. type of parameters with example values
  3. where the message is used (pages, locations in the user interface)
  4. how the message is used where it is used (a page title, button text, etc.)
  5. what other messages are used together with this message, or which other messages this message refers to
  6. anything else which could be understood when the message is seen in context, but not when the message is displayed alone (which is the case when it is being translated)
  7. if the message has special properties, like is a page name, or if it should not be a direct translation
  8. parts of the message which must not be translated, such as generic namespace names or URLs
  9. You can link to other messages by using {{msg-mw|message key}}. Please do this if parts of the messages come from other messages (if it cannot be avoided), or if some messages are shown together or in same context.

Translatewiki provides some default templates for documentation:

  • {{doc-action|[...]}} for action- messages
  • {{doc-right|[...]}} for right- messages
  • {{doc-group|[...]|[...]}} for messages around user groups (group, member, page, js and css)
  • {{doc-accesskey|[...]}} for accesskey- messages

Have a look at the template page for more information.

Internationalization hints

Besides documentation, translators ask developers to consider some hints so as to make their work easier and more efficient and to allow an actual and good localisation for all languages. Even if only adding or editing messages in English, one should be aware of the needs of all languages. Messages are translated to more than 300 languages each, which should be done in the best possible way.

There are two main places where you can find assistance from experienced and knowledgeable people regarding i18n: the #mediawiki-i18n channel and the Support page at translatewiki.net.

Please do ask them.

Avoid message reuse

The translators encourage reuse avoidance. Although two concepts can be expressed with the same word in English, this doesn't mean they can be expressed with the same word in every language. "OK" is a good example: in English this is used for a generic button label, but in some languages they prefer to use a button label related to the operation which will be performed by the button. If you are adding multiple identical messages, please add message documentation to describe the differences in their contexts. Don't worry too much about the extra work for translators. Translation memory helps a lot in these while keeping the flexibility to have different translations if needed.

Avoid patchwork messages

Languages have varying word orders, and complex grammatical and syntactic rules. Messages put together from lots of pieces of text, possibly with some indirection, are very hard, if not impossible, to translate. It is better to make each message a complete sentence, with a full stop at the end. Several sentences can usually be combined into a text block much more easily, if needed.

Messages quoting each other

An exception to the rule may be messages referring to one another: Enter the original author's name in the field labelled "{{int:name}}" and click "{{int:proceed}}" when done. This remains correct when a wiki operator alters the messages "name" or "proceed". Without the int-hack, operators would have to be aware of all related messages needing adjustment when they alter one.

Be aware of PLURAL use on all numbers

See also Plural, #PLURAL and GENDER support in JavaScript, #…on numbers via PLURAL.

When a number has to be inserted into a message text, be aware that some languages will have to use PLURAL on it even if it is always larger than 1. The reason is that PLURAL in languages other than English can make very different and complex distinctions, comparable to English 1st, 2nd, 3rd, 4th, … 11th, 12th, 13th, … 21st, 22nd, 23rd, … etc.

Do not try to supply three different messages for cases like 0, 1, more items counted. Rather let one message take them all, and leave it to translators and PLURAL to properly treat possible differences of presenting them in their respective languages.

Always include the number as a parameter if possible. Always add {{PLURAL:}} syntax to the source messages if possible, even if it makes no sense in English. The syntax guides translators.

You should not expect PLURAL to handle fractional numbers (like 44.5), so it's probably a good idea to round the number to the nearest integer if PLURAL is necessary in the context (bugzilla:28128).
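To get a feel for how plural distinctions differ between languages, the standard ECMAScript Intl.PluralRules API can be used. It is shown here purely as an illustration and is not what MediaWiki's PLURAL is built on; the `categories` helper is hypothetical, and the output assumes a JavaScript runtime with full locale data.

```javascript
// Intl.PluralRules is a standard ECMAScript API; it only illustrates the
// plural categories a language distinguishes.
function categories( locale, numbers ) {
	var rules = new Intl.PluralRules( locale );
	return numbers.map( function ( n ) {
		return n + ':' + rules.select( n );
	} ).join( ' ' );
}

console.log( categories( 'en', [ 1, 2, 5, 21 ] ) );
console.log( categories( 'ru', [ 1, 2, 5, 21 ] ) );
```

With full locale data this should print `1:one 2:other 5:other 21:other` for English and `1:one 2:few 5:many 21:one` for Russian: Russian needs a distinct form for 2 and for 5, and treats 21 like 1, so a number "always larger than 1" still needs PLURAL.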

Pass number of list items as parameters to messages talking about lists

At least one language has to use grammar varying with the number of list items when expressing what is listed in a list visible to readers. Thus, whenever your code computes a list, include count($list) as parameter to headlines, lead-ins, footers and other messages about the list, even if the count is not used in English. There is a neutral way to talk about invisible lists, so you can have links to lists on extra pages without having to count items in advance.

Separate times from dates in sentences

Some languages have to insert something between a date and a time which grammatically depends on other words in a sentence. Thus they will not be able to use date/time combined. Others may find the combination convenient, thus it is usually the best choice to supply three parameter values (date/time, date, time) in such cases.
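Supplying three parameter values might look like the following sketch. The `substitute` helper and both message texts are hypothetical, and the pre-formatted date and time strings stand in for what a Language object would produce.

```javascript
// Toy substitution helper (hypothetical, not a MediaWiki function).
function substitute( template, params ) {
	return template.replace( /\$(\d+)/g, function ( match, n ) {
		var v = params[ n - 1 ];
		return v === undefined ? match : String( v );
	} );
}

// Pre-formatted strings stand in for language-aware formatting.
var date = '17 May 2011';
var time = '14:30';
var params = [ date + ', ' + time, date, time ]; // $1 combined, $2 date, $3 time

// One translation may use the combined form...
console.log( substitute( 'Last edited on $1.', params ) );
// ...while another puts its own words between date and time.
console.log( substitute( 'Last edited on $2 at $3.', params ) );
```

Because all three values are always passed, each translator can pick whichever form fits the grammar of the target language without any change to the code.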

Users have grammatical genders

See also Gender, #PLURAL and GENDER support in JavaScript, #…on user names via GENDER.

When a message talks about a user, or relates to a user, or addresses a user directly, the user name should be passed to the message as a parameter. Thus languages having to, or wanting to, use proper gender dependent grammar, can do so. This should be done even when the user name is not intended to appear in the message, such as in "inform the user on his/her talk page", which is better made "inform the user on {{GENDER:$1|his|her|their}} talk page" in English as well.

This doesn't mean that you're encouraged to "sexualize" messages' language: please use gender-neutral language where this can be done with clarity and precision.

Avoid {{SITENAME}} in messages

{{SITENAME}} has several disadvantages. It can be anything (acronym, word, short phrase, etc.) and, depending on language, may need {{GRAMMAR}} on each occurrence. No matter what, very likely in most wiki languages, each message having {{SITENAME}} will need review for each new wiki installed. When there is not a general GRAMMAR program for a language, as is almost always the case, sysops will have to add or amend PHP code so as to get {{GRAMMAR}} for {{SITENAME}} working. This requires both more skills and more understanding than otherwise. It is more convenient to have generic references like "this wiki". This does not keep installations from altering these messages to use {{SITENAME}}, but at least they don't have to, and they can postpone message adaptation until the wiki is already running and used.

Avoid references to screen layout and positions

What is rendered where depends on skins. Most often screen layouts of languages written from left to right are mirrored compared to those used for languages written from right to left, but not always, and for some languages and wikis, not entirely. Handheld devices, narrow windows, and so on show blocks underneath each other that appear side by side on large displays. Since user-selected and user-written JavaScript gadgets can, and do, hide parts or move things around in unpredictable ways, there is no reliable way of knowing the actual screen layout.

It is wrong to tie layout information to languages, since the user language may not be the wiki language, and layout is taken from wiki languages, not user languages, unless wiki operators choose to use their home-made layout anyway. Acoustic screen readers and other auxiliary devices do not even have a concept of layout. So, you cannot refer to layout positions in the majority of cases.

We do not currently have a way to branch on wiki directionality (bug 28997).

The upcoming browser support for East and North Asian top-down writing[1] will make screen layouts even more unpredictable.

Have message elements before and after input fields

This rule has yet to become de facto standard in MediaWiki development

While English allows efficient use of prompting in the form "item colon space input-field", many other languages don't. Even in English, you often want to use "Distance: ___ feet" rather than "Distance (in feet): ___". Leaving <textarea> aside, just think of each and every input field following the "Distance: ___ feet" pattern. So:

  • give it two messages, even if the 2nd one is most often empty in English, or
  • allow the placement of inputs via $i parameters.
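The second option, placing the input through a parameter, can be sketched as follows. `buildRow` and both message texts are hypothetical; in real code the message text would also need HTML escaping before the input markup is inserted.

```javascript
// Sketch: the translator decides where the input goes by placing $1.
function buildRow( template, inputHtml ) {
	// split/join avoids special handling of "$" in replacement strings.
	return template.split( '$1' ).join( inputHtml );
}

var input = '<input name="distance">';

console.log( buildRow( 'Distance: $1 feet', input ) );
console.log( buildRow( '$1 feet away', input ) );
```

The same code serves a translation that wants text on both sides of the field and one that wants the field first.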

Avoid untranslated HTML markup in messages

HTML markup not requiring translation, such as enclosing <div>s, rulers above or below, and similar, should usually not be part of messages. It unnecessarily burdens translators, increases message file size, and poses the risk of being accidentally altered in the translation process.

Messages are usually longer than you think!

Skimming foreign language message files, you find messages almost never shorter than Chinese ones, rarely shorter than English ones, and most usually much longer than English ones.

Especially in forms, in front of input fields, English messages tend to be terse and short. That is often not kept in translations. Especially genuinely un-technical third world languages, vernacular, medieval, or ancient languages require multiple words or even complete sentences to explain foreign, or technical, prompts. E.g. "TSV file:" may have to be translated as: "Please type a name here which denotes a collection of computer data that is comprised of a sequentially organized series of typewritten lines which themselves are organized as a series of informational fields each, where said fields of information are fenced, and the fences between them are single signs of the kind that slips a typewriter carriage forward to the next predefined position each. Here we go: _____ (thank you)". This is admittedly an extreme example, but you get the idea. Imagine this sentence in a column in a form where each word occupies a line of its own, and the input field is vertically centered in the next column. :-(

Avoid using very close, similar, or identical words to denote different things, or concepts

For example, pages may have older revisions (of a specific date, time, and edit), comprising past versions of said page. The words revision and version can be used interchangeably. A problem arises when versioned pages are revised, and the revision, i.e. the process of revising them, is being mentioned, too. This may not pose a serious problem when the two meanings of "revision" have different translations. Do not rely on that, however. It is better to avoid the use of "revision" aka "version" altogether in that case, so as to avoid it being misinterpreted.

Basic words may have unforeseen connotations, or not exist at all

There are some words that are hard to translate because of their very specific use in MediaWiki. Some may not be translated at all. For example "namespace", and "apartment", translate the same in Kölsch. There is no word "user" relating to "to use something" in several languages. Sticking to Kölsch, they say "corroborator and participant" in one word since any reference to "use" would too strongly imply "abuse" as well. The term "wiki farm" is translated as "stable full of wikis", since a single crop farm would be a contradiction in terms in the language, and not understood, etc.

Expect untranslated words

This rule has not yet become de facto standard in MediaWiki development

It is not uncommon that computerese English is not translated and taken as loanwords, or foreign words. In the latter case, technically correct translations mark them as belonging to another language, usually with appropriate HTML markup, such as <span lang="en"></span>. Thus make sure that your message output handler passes it along unmolested, even if you do not need it in English, or in your language.

Permit explanatory inline markup

This rule has yet to become de facto standard in MediaWiki development

Sometimes there are abbreviations, technical terms, or generally ambiguous words in target languages that may not be immediately understood by newcomers, but are obvious to experienced computer users. So as not to create lengthy explanations causing screen clutter, it may be advisable to have them as annotations shown by browsers when you move the mouse over them, such as in:

mḍwwer 90° <abbr title="Ĝks (ṫ-ṫijah) Ĝaqarib s-Saĝa">ĜĜS</abbr>

giving:

mḍwwer 90° ĜĜS

explaining the abbreviation for "counter clockwise" when needed. Thus make sure your output handler accepts them, even if the original message does not use them.

Symbols, colons, brackets, etc. are parts of messages

Many symbols are translated, too. Some scripts have other kinds of brackets than the Latin script has. A colon may not be appropriate after a label or input prompt in some languages. Having those symbols included in messages leads to better, less Anglo-centric translations and, incidentally, reduces code clutter.

Do not expect symbols and punctuation to survive translation

Languages written from right to left (as opposed to English) usually swap arrow symbols being presented with "next" and "previous" links, and their placement relative to a message text may, or may not, be inverted as well. Ellipsis may be translated to "etc." or to words. Question marks, exclamation marks, and colons do appear at other places than at the end of sentences, or not at all, or twice. As a consequence, always include all of those in your messages, never insert them programmatically.
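The difference between inserting punctuation in code and keeping it in the message can be shown with a small sketch. The `labels` table and both functions are hypothetical; the French entry follows the typographic convention of a space before the colon (a plain space is used here for simplicity).

```javascript
// Hypothetical message texts for a form label.
var labels = {
	en: { plain: 'Name', withColon: 'Name:' },
	fr: { plain: 'Nom', withColon: 'Nom :' }
};

// Fragile: the code decides the punctuation for every language.
function labelProgrammatic( lang ) {
	return labels[ lang ].plain + ':';
}

// Preferred: the punctuation is part of the translatable message.
function labelFromMessage( lang ) {
	return labels[ lang ].withColon;
}

console.log( labelProgrammatic( 'fr' ) );
console.log( labelFromMessage( 'fr' ) );
```

The programmatic variant forces English punctuation onto the French label, while the message-based variant lets the translator apply the target language's own conventions.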

Use full stops

Do terminate normal sentences with full stops. This is often the only indicator for a translator to know that they are not headlines or list items, which may need to be translated differently.

Link anchors

Wikicode of links

Link anchors can be put into messages in several technical ways:

  1. via wikitext: … [[a wiki page|anchor]] …
  2. via wikitext: … [some-url anchor] …
  3. the anchor text is a message in the MediaWiki namespace. Avoid it!

The last option is often hard or impossible for translators to handle; avoid patchwork messages here, too. Make sure that "some-url" does not contain spaces.
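One way to keep spaces out of the URL is to percent-encode it before it is substituted into the message. The following Python sketch assumes a message with a `$1` placeholder for the URL; the message text, the function name and the example URL are hypothetical.

```python
from urllib.parse import quote

# Hypothetical message: the whole sentence, link anchor included, is one unit.
UPLOAD_HINT = "You can [$1 upload a file] if you wish."


def fill_external_link(message, url):
    """Substitute $1 with a URL that is safe inside [url anchor] wikitext."""
    # Percent-encode spaces and other unsafe characters. A raw space would
    # end the URL early and push the rest of it into the anchor text.
    safe_url = quote(url, safe=":/?&=#%")
    return message.replace("$1", safe_url)
```

The translator still sees and translates the complete sentence, and only the URL is supplied by code.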

Use meaningful link anchors

Take care with your wording. Link anchors play an important role in how search engines assess pages, both the linking pages and the pages linked to. Make sure that the anchor describes the target page well. Avoid commonplace and generic words. For example, "Click here" is an absolute no-go, since target pages are never about "click here". Do not put that phrase in the sentence around a link either, because "here" is not the place to click. Use precise words telling users what they will get to when following the link, such as "You can upload a file if you wish."

Avoid jargon and slang

Avoid developer and power user jargon in messages. Try to use as simple language as possible.

One sentence per line

Try to put each sentence or similar block on its own line. This makes it easier to compare the messages in different languages, and may serve as a hint for segmentation and alignment in translation memories.

Be aware of whitespace and line breaks

MediaWiki's localized messages usually get edited within the wiki, either by admins on live wikis or by the translators on translatewiki.net. You should be aware of how whitespace, especially at the beginning or end of your message, will affect editors:

  • Newlines at the beginning or end of a message are fragile, and will be frequently removed by accident. Start and end your message with active text; if you need a newline or paragraph break around it, your surrounding code should deal with adding it to the returned text.
  • Spaces at the beginning or end of a message are also likely to be removed during editing, and should be avoided. If a space is required in the output, your code should usually append it; otherwise use a non-breaking space such as &nbsp; (in which case check your escaping settings!).
  • Try to use literal newlines rather than "\n" escape sequences in the message files; while \n works in double-quoted strings, the file will be formatted more consistently if you stay literal.
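The first two points can be sketched as follows. This Python illustration assumes that saving a message through an editor trims leading and trailing whitespace, as warned above; all names in it are hypothetical.

```python
def simulate_wiki_edit(text):
    # Assumption for this sketch: an edit round-trip trims outer whitespace.
    return text.strip()


# Fragile: the layout depends on newlines stored inside the message.
FRAGILE = "\nYour changes were saved.\n"

# Robust: the message starts and ends with active text.
ROBUST = "Your changes were saved."


def render_paragraph(message):
    # The surrounding code, not the message, supplies the paragraph break.
    return message + "\n\n"
```

After an edit, the fragile message has silently lost its newlines, while the robust one is unchanged and `render_paragraph` still produces the same layout.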

Use standard capitalization

Capitalization gives hints to translators as to what they are translating, such as single words, list or menu items, phrases, or full sentences. Correct (standard) capitalization may also play a role in search engine assessment of your pages. If you really need to emphasise something with capitals, use CSS styles to do so. For instance, the HTML attributes style="text-transform:uppercase" (uppercase) or style="font-variant:small-caps" (Small Caps) will do. Since these may be adjusted to something else during translation, most specifically for non-Latin scripts, they need to be part of the messages and must not be added programmatically.

Emphasis

In normal text, emphasis such as boldface or italics should be part of the message text. Local conventions on emphasis vary; some Asian scripts in particular have their own. Translators must be able to adjust emphasis to their target languages and regions.

Overview of the localisation system

Message sources

Messages are obtained from these sources:

  • The MediaWiki namespace. It allows wikis to adopt, or override, all of their messages, when standard messages do not fit or are not desired (see #Old local translation system).
    • MediaWiki:Message-name is the default message,
    • MediaWiki:Message-name/language-code is the message to be used when a user has selected a language other than the wiki's default language.
  • From message files.
    • MediaWiki itself, and a few extensions, use one file per language, called MessagesZxx.php, where zxx is the language code for the language.
    • Most extensions use a combined message file holding all messages in all languages, usually named after the extension, and having an .i18n.php ending.
    • Very few extensions use some other, individual scheme.
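The interplay of the two sources can be pictured roughly as below. This is a simplified Python sketch, not MediaWiki's actual lookup code: the dictionaries, the message keys and the function names are hypothetical, and the fallback goes straight to English, whereas the real system merges longer fallback chains and caches the results.

```python
# Hypothetical stand-ins for the two sources described above.
LOCAL_OVERRIDES = {  # pages in the MediaWiki namespace
    "MediaWiki:Greeting": "Welcome to our wiki.",
    "MediaWiki:Greeting/de": "Willkommen in unserem Wiki.",
}
MESSAGE_FILES = {  # contents of the per-language message files
    "en": {"greeting": "Welcome.", "search": "Search"},
    "de": {"search": "Suchen"},
}
SITE_LANGUAGE = "en"


def page_title(key, lang):
    """Build the MediaWiki namespace title that could override a message."""
    title = "MediaWiki:" + key[0].upper() + key[1:]
    return title if lang == SITE_LANGUAGE else title + "/" + lang


def get_message(key, lang):
    # 1. A local page in the MediaWiki namespace overrides everything else.
    override = LOCAL_OVERRIDES.get(page_title(key, lang))
    if override is not None:
        return override
    # 2. Otherwise use the message files (simplified fallback to English).
    for code in (lang, "en"):
        text = MESSAGE_FILES.get(code, {}).get(key)
        if text is not None:
            return text
    return None  # message unknown in this sketch
```

For example, `get_message("greeting", "de")` returns the local German override, while `get_message("search", "de")` comes from the German message file.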

Update of localisation

As said above, translation happens on translatewiki.net, and other systems are discouraged. Wikimedia projects and any other wikis can benefit immediately and automatically from localisation work thanks to the LocalisationUpdate extension, which works through the localisation cache; on Wikimedia projects, for instance, it updates the cache daily (see also the technical details about the specific implementation). Because changes on translatewiki.net are pushed to the code daily as well, each change to a message can potentially reach all existing MediaWiki installations within a couple of days, without any manual intervention or traumatic code update.

Caching

MediaWiki has lots of caching mechanisms built in, which make the code somewhat more difficult to understand. Since 1.16 there is a new caching system, which caches messages either in .cdb files or in the database. Customised messages are cached in the filesystem and in memcached (or alternative), depending on the configuration.

See also Manual:$wgLocalisationCacheConf.

License

Any edits made to the language must be licensed under the terms of the GNU General Public License (and GFDL?) to be included in the MediaWiki software.

Old local translation system

With MediaWiki 1.3.0 a new system was set up for localizing MediaWiki. Instead of editing the language file and asking developers to apply the change, users can edit the interface strings directly from their wikis. This is the system in use as of August 2005. People can find the message they want to translate in Special:AllMessages and then edit the relevant string in the MediaWiki: namespace. Once edited, these changes are live. There is no more need to request an update, and wait for developers to check and update the file.

The system is great for Wikipedia projects; however a side effect is that the MediaWiki language files shipped with the software are no longer quite up-to-date, and it is harder for developers to keep the files on meta in sync with the real language files.

As the default language files do not provide enough translated material, we face two problems:

  1. New Wikimedia projects created in a language which has not been updated for a long time, need a total re-translation of the interface.
  2. Other users of MediaWiki (including Wikimedia projects in the same language) are left with untranslated interfaces. This is especially unfortunate for the smaller languages which don't have many translators.

This is not such a big issue anymore, because translatewiki.net is advertised prominently and used by almost all translations. Local translations still do happen sometimes but they're strongly discouraged. Local messages mostly have to be deleted, moving the relevant translations to translatewiki.net and leaving on the wiki only the site-specific customisation; there's a huge backlog especially in older projects, this tool helps with cleanup.

Keeping messages centralized and in sync

English messages are very rarely out of sync with the code. Experience has shown that it's convenient to have all the English messages in the same place. Revising the English text can be done without reference to the code, just like translation can. Programmers sometimes make very poor choices for the default text.

Appendix

What can be localized

  • Namespaces (both core and extensions', plus gender-dependent user namespaces)
  • Weekdays (and abbrev)
  • Months (and abbrev)
  • Bookstores
  • Skin names
  • Math names
  • Date preferences
  • Date format
  • Default date format
  • Date preference migration map
  • Default user option overrides
  • Language names
  • Timezones
  • Character encoding conversion via iconv
  • UpperLowerCase first (needs casemaps for some)
  • UpperLowerCase
  • Uppercase words
  • Uppercase word breaks
  • Case folding
  • Strip punctuation for MySQL search
  • Get first character
  • Alternate encoding
  • Recoding for edit (and then recode input)
  • RTL
  • Direction mark character depending on RTL
  • Arrow depending on RTL
  • Languages where italics cannot be used
  • Number formatting (commafy, transform digits, transform separators)
  • Truncate (multibyte)
  • Grammar conversions for inflected languages
  • Plural transformations
  • Formatting expiry times
  • Segmenting for diffs (Chinese)
  • Convert to variants of language
  • Language specific user preference options
  • Link trails, e.g.: [[foo]]bar
  • Language code (RFC 3066)
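To illustrate the link trail entry in the list above: in [[foo]]bar, the letters directly after the closing brackets are drawn into the link text. The Python sketch below assumes an English-style rule in which the trail is a run of lowercase ASCII letters; other languages would need a different pattern, and the `/wiki/` path and the function name are hypothetical.

```python
import re

# Assumed English-style rule: the trail is a run of lowercase ASCII letters
# immediately after the closing brackets.
LINK_WITH_TRAIL = re.compile(r"\[\[([^\]|]+)\]\]([a-z]*)")


def render_links(wikitext):
    """Render [[target]]trail so that the trail becomes part of the link text."""

    def repl(match):
        target, trail = match.groups()
        # No escaping is done here; this is only a sketch of the trail rule.
        return '<a href="/wiki/{0}">{0}{1}</a>'.format(target, trail)

    return LINK_WITH_TRAIL.sub(repl, wikitext)
```

A space after the brackets ends the trail, so "[[foo]] bar" links only "foo".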

Neat functionality:

  • I18N sprintfDate
  • Roman numeral formatting

Missing

This page does not yet cover the changes to the i18n system related to extensions: the format was standardized and messages are now loaded automatically. See Message sources.

References

  1. http://dev.w3.org/csswg/css3-writing-modes/

See also

Suleras:Languages

A source reference lifted electronically from the internet from website:

http://www.mediawiki.org/w/index.php?title=Localisation&action=edit
