Being a conspiracy theorist

SM is again trying to turn Planet FLOSS India into Planet SM. <insert random conspiracy theory>.

Anyway, since he is on the subject of l10n processes, I think it will be nice to give a brief account of what needs to be done to get started with l10n for a particular locale.

  1. First, convince yourself that language != script (in human terms, that means that language and script are not the same). Bangla and Assamese are different languages, but they use the same script - Bengali. Then find out the script for your language.
  2. Once you have found out what your script is, find out whether it is encoded by the Unicode standard. To do so, go here and search for your script. If it is there, all is well and good.
  3. Now find out the two-letter ISO code for your language from here.
  4. Once the language code is available, find out whether the locale data for your region exists in the GNU libc. The data files are named in the format of <languagecode_countrycode> - for example, the file for Indian Bengali is named bn_IN. The list of country codes is available here. Look for your locale data in the latest glibc sources from the CVSWeb interface, here. If a locale data file is not available, search Google and try to find out if someone is working on it. If no one is working on it - write one yourself. It may seem to be slightly hairy at first, but you’ll get used to it.
  5. After the locale data is ready and has been tested with the localedef command, find out whether a font for your script exists. This has to be Unicode compliant, and if your script has advanced/complex features such as conjunct (juktakshar) formation, character reordering etc, you’ll need an OpenType font.
  6. Now you’ll have to figure out whether your script is supported by the text drawing/rendering systems that are commonly used in GNU/Linux. GNOME and GTK2 applications in general use the Pango library for rendering text. KDE uses the internal rendering engine of Qt for this. Join the Qt and Pango related mailing lists and ask the developers if your script is supported. A list of script rendering modules of Pango is available here.
  7. Once everything above is ready, you can start translating. Translation is a really huge job, but it is not very difficult. However, you have to be careful and maintain consistency - users won’t like it when they see “floppy” translated as “foo” at one place and as “bar” at another place. Translation mainly consists of trawling through PO files - an introduction to PO files is available at the Ankur website.
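
The consistency problem in the last step can be checked mechanically. What follows is only a rough sketch in Python, not a tool from any of the projects mentioned here: the function names and the sample data are made up for illustration, and the parser is deliberately naive (single-line msgid/msgstr entries only, no plural forms, no msgctxt). It collects every translation seen for a source term across several PO files and reports the terms that were translated in more than one way.

```python
import re
from collections import defaultdict

# Matches single-line entries such as: msgid "Floppy"
_ENTRY = re.compile(r'^(msgid|msgstr)\s+"(.*)"\s*$')


def read_pairs(po_text):
    """Yield (msgid, msgstr) pairs from PO text.

    Simplification (an assumption of this sketch): only single-line
    msgid/msgstr entries are handled; multi-line strings, plural forms
    and msgctxt are ignored.
    """
    msgid = None
    for line in po_text.splitlines():
        match = _ENTRY.match(line.strip())
        if not match:
            continue
        keyword, value = match.groups()
        if keyword == "msgid":
            msgid = value
        elif msgid:  # skips the header entry, whose msgid is empty
            if value:  # skips untranslated entries
                yield msgid, value
            msgid = None


def find_inconsistencies(po_texts):
    """Return {source term: sorted translations} for terms translated more than one way."""
    seen = defaultdict(set)
    for text in po_texts:
        for msgid, msgstr in read_pairs(text):
            # Compare source terms case-insensitively, so "Floppy" and "floppy" match.
            seen[msgid.lower()].add(msgstr)
    return {term: sorted(found) for term, found in seen.items() if len(found) > 1}


# Hypothetical sample data, reusing the "foo"/"bar" placeholders from the text.
APP_ONE = 'msgid "Floppy"\nmsgstr "foo"\n\nmsgid "Open"\nmsgstr "baz"\n'
APP_TWO = 'msgid "floppy"\nmsgstr "bar"\n\nmsgid "Open"\nmsgstr "baz"\n'

if __name__ == "__main__":
    print(find_inconsistencies([APP_ONE, APP_TWO]))
```

With the sample data, only “floppy” is reported, since “Open” is translated the same way in both files. For real PO files a proper parser would be needed, because actual entries often span several lines.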

Once you have translated a considerable number of PO files, take a look at SM’s thoughts - and decide for yourself what you want to do next. You may want to release a Live CD, or you may want to remain content with your translations being integrated into the various distributions. Or, if you are really really brave, you can even try to bring out a proper localised distro of your own. It’s your choice.

Commentary


  1. 4 years, 4 months ago

    Thanks for the info, Sayamindu. A few doubts...

    What if you want to Localize a language that is not encoded by a Unicode standard? And to complicate things what if the language is based upon multiple scripts?

    Furthermore, is there any central place where all the PO files for all particular languages can be found to avoid duplication? Let me rephrase: How do you avoid duplication of efforts while working on PO files? Can you use the PO file of a certain application and tweak it to work for another totally different application?

    Again, if you are localizing an internationally used software like OpenOffice.org where do you find the PO file to localize? And what about games that are currently not too high on the ’software-to-be-localized’ list?

    Mayank Sharma
  2. 4 years, 4 months ago

    For your question on where to find the .po files - all projects which have an L10n roadmap have a public CVS for these .pot files. I suggest that you take a quick peek at http://www.plone.org for further details on i18n processes.

    .po and .pot files are generally arranged for each application and then sorted on the language, so you do have a nice compilation

    Sankarshan Mukhopadhyay
