Bilingva Translation and Interpreting



Server-side scripts - a step up

We looked at the simplest localization solution for small static websites in the previous post. Now let's move on to a more complex and flexible solution involving server-side scripts. This approach is used by most sites running a custom CMS engine or using server-side scripts for building and laying out website pages dynamically.

For dynamic pages, we will use key-value pairs to hold the text we want to present on the website. For example, the title string will be stored under the key main_title, and the subtitle for a header on the page will be stored under the key main_subtitle. When a script is called to render a webpage, it assembles the output by looking up each key in the resource file for the requested language, so the same key returns the right text in every language. The user sets the language by picking it through the site UI.

These key-value pairs are stored in so-called resource files, which list every single string used on the website. The format could be either plain text or XML, depending on what facilities for reading files your scripting language provides. A resource file typically consists of key-value pairs, each accompanied by a comment with useful instructions for the translator.

It is important to use descriptive keys and to choose a format that allows comments inside the resource file, because both are extremely useful not only to the coders doing the work, but to translators as well. Remember that translators usually get just the resource file to work with; they don't always see the product being localized. Descriptive key names and comments help them come up with an accurate and relevant translation.

Bad example:

{codecitation style="brush: xml;"}
key1=Hello, World!
key2=My first localization attempt.
{/codecitation}

Good example:

{codecitation style="brush: xml;"}
; used in the title of the main page
main.title = Hello, World!
; this will show up on the main page in the subheader of the central column
main.subtitle = My first localization attempt
{/codecitation}

Good example using XML:

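{codecitation style="brush: xml;"}
<!-- element names are illustrative; adapt them to your parser -->
<resources>
  <string>
    <key>main_title</key>
    <value>Hello, World!</value>
    <comment>used in the title of the main page</comment>
  </string>
  <string>
    <key>main_subtitle</key>
    <value>My first localization attempt</value>
    <comment>this will show up on the main page in the subheader of the central column</comment>
  </string>
</resources>
{/codecitation}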
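To make the plain-text format concrete, here is a minimal Python sketch of a reader for it. It assumes one `key = value` pair per line and treats lines starting with `;` as translator comments attached to the next key. The function name `parse_resources` is made up for illustration; your scripting language may already ship a suitable parser.

```python
def parse_resources(text):
    """Parse 'key = value' lines; lines starting with ';' are translator comments."""
    strings = {}
    comments = {}
    pending = []  # comment lines seen since the last key
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(";"):
            pending.append(line[1:].strip())
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key-value pair: {raw!r}")
        key = key.strip()
        strings[key] = value.strip()
        comments[key] = " ".join(pending)
        pending = []
    return strings, comments


EN_US = """
; used in the title of the main page
main.title = Hello, World!
; this will show up on the main page in the subheader of the central column
main.subtitle = My first localization attempt
"""

strings, comments = parse_resources(EN_US)
print(strings["main.title"])   # Hello, World!
print(comments["main.title"])  # used in the title of the main page
```

Keeping the comments alongside the strings makes it easy to hand translators an export that still carries the instructions.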

Parametrizing resource strings

So far so good: we translated the static strings on your website, like the title, subtitle, paragraphs, and other content that does not change. What about dynamic strings? For example, you have a calendar application with a string that tells the user the day of the week:

{codecitation style="brush: xml;"}Today is: Sunday{/codecitation}

We shouldn't create all possible variations of the string that could occur on the website. Instead we add a parameter to the string, and when we localize it, we will include this parameter in the translated string as well. Use some kind of special symbol to denote parameters, for example: $1 for the first, $2 for the second, and so on. Now our resource file could look like this:

{codecitation style="brush: xml;"}
<resources>
  <string>
    <key>calendar_greeting</key>
    <value>Hello! Today is $1, month of $2</value>
    <comment>$1 is day of week: (Mon ... Sun), $2 is month of the year (Jan...Dec)</comment>
  </string>
  <string>
    <key>day_name</key>
    <value>Monday</value>
    <comment>day of the week</comment>
  </string>
  <string>
    <key>month_name</key>
    <value>January</value>
    <comment>month of the year</comment>
  </string>
</resources>
{/codecitation}

Now the translator knows what the string is supposed to look like and what parameters it takes, so the translated sentence structure makes sense, and the parameter values themselves are translated as well.
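On the script side, the parameters have to be substituted at render time. The following Python sketch shows one way to do it, assuming the `$1`, `$2` convention described above. The helper names `fill` and `greeting` and the Spanish strings are illustrative, not part of any particular library.

```python
import re

_PARAM = re.compile(r"\$(\d+)")


def fill(template, *args):
    """Replace $1, $2, ... in template with the matching positional argument."""
    def repl(match):
        index = int(match.group(1)) - 1
        if not 0 <= index < len(args):
            raise IndexError(f"no value for ${match.group(1)} in {template!r}")
        return str(args[index])
    return _PARAM.sub(repl, template)


EN = {
    "calendar_greeting": "Hello! Today is $1, month of $2",
    "day_name": "Monday",
    "month_name": "January",
}
ES = {
    "calendar_greeting": "¡Hola! Hoy es $1, mes de $2",
    "day_name": "lunes",
    "month_name": "enero",
}


def greeting(resources):
    # Both the sentence and its parameter values come from the same language.
    return fill(
        resources["calendar_greeting"],
        resources["day_name"],
        resources["month_name"],
    )


print(greeting(EN))  # Hello! Today is Monday, month of January
print(greeting(ES))  # ¡Hola! Hoy es lunes, mes de enero
```

Because the parameters are numbered rather than filled in by position in the sentence, a translator is free to reorder them when the target language calls for it.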

String length

String length is a very important aspect of localization. A translated sentence could be much shorter or longer than its original, so the UI has to adjust to it, or you have to provide guidelines on how long or short the translated strings should be. There's very little a translator can do with single words that can be translated in one way only (dates, geographical names, etc.), but with longer sentences there is some leeway in creating sentences of different lengths while trying to stay under the imposed UI limit.

The guidance for sentence length should be included in the comments inside the resource file.
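If the length guidance is written in a consistent way, translated files can be checked automatically. The sketch below is in Python and assumes a made-up convention where the comment contains a phrase like "max 20 characters". The names `max_length` and `too_long` are hypothetical. Character count is only a rough stand-in for how wide the text renders on screen.

```python
import re

# Assumed convention: the comment mentions "max N characters".
_LIMIT = re.compile(r"max(?:imum)?\s+(\d+)\s+char", re.IGNORECASE)


def max_length(comment):
    """Return the length limit stated in a comment, or None if there is none."""
    match = _LIMIT.search(comment)
    return int(match.group(1)) if match else None


def too_long(strings, comments):
    """List (key, actual length, limit) for every string over its limit."""
    problems = []
    for key, text in strings.items():
        limit = max_length(comments.get(key, ""))
        if limit is not None and len(text) > limit:
            problems.append((key, len(text), limit))
    return problems


comments = {"main.title": "used in the title of the main page; max 20 characters"}
original = {"main.title": "Hello, World!"}
translated = {"main.title": "Hallo, Welt! Willkommen!"}

print(too_long(original, comments))    # []
print(too_long(translated, comments))  # [('main.title', 24, 20)]
```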

Putting it all together

Now that your resource file is ready and you have had it translated, it's time to arrange the resource files and add the logic to your scripts to pick the right one. It doesn't really matter how you organize the resource files, but a good practice is to name them using standard language and country codes, like en_US, es_ES, and so on. Once the user selects a language, you can persist the choice in the session (in PHP, the $_SESSION variable) and insert the logic in the scripts to automatically pick the right resource file.
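The selection logic can be sketched in a few lines of Python. This is only an illustration under some assumptions: a plain dictionary stands in for the server-side session, the resources are kept in memory instead of being read from files named en_US, es_ES, and so on, and the names `current_locale` and `translate` are invented for the example.

```python
DEFAULT_LOCALE = "en_US"

# In a real site each entry would be loaded from its own resource file.
RESOURCES = {
    "en_US": {"main_title": "Hello, World!"},
    "es_ES": {"main_title": "¡Hola, mundo!"},
}


def current_locale(session):
    """Return the locale stored in the session, or the default if it is unknown."""
    locale = session.get("locale")
    return locale if locale in RESOURCES else DEFAULT_LOCALE


def translate(session, key):
    """Look up key in the user's language, falling back to the default language."""
    strings = RESOURCES[current_locale(session)]
    return strings.get(key, RESOURCES[DEFAULT_LOCALE].get(key, key))


session = {}  # stand-in for the per-user session storage
print(translate(session, "main_title"))  # Hello, World!

session["locale"] = "es_ES"  # the user picked Spanish in the site UI
print(translate(session, "main_title"))  # ¡Hola, mundo!
```

Falling back to the default language, and finally to the key itself, means that a missing translation shows up as visible but harmless text instead of an error page.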

That's all! In the next article we'll take a look at localizing CMS systems.

See also:

  • Step 1: localizing static pages