Rendezvous with Karunakar - Choosing a distro to hack on.

As mentioned by SM and IDG, yesterday Kolkata had the honour to host the first Annual Indic Towel Conference (picture below).



We had detailed discussions on quite a few issues, one of which was our dream of a Grand Unified Indic Distro. Making a distro from scratch is definitely out of the question right now (we have a lot of better things to do), so basically we would have to modify a *.deb based distro (Debian/UserLinux/Componentised Linux), a *.rpm based distro (Fedora/Mandrake/SuSE), or maybe Slack (apologies to Gentoo fans). Of these, Slack and Debian have ncurses based installers, which won’t do for Indic text rendering (Debian has a gtk2 installer frontend in the pipeline, but that seems to be quite far away right now). UserLinux also has plans for a gtk2 (or rather, GPE) based installer, but nothing substantial has emerged yet. So on the *.deb front, we are left with Componentised Linux, which uses Anaconda as the installer. Among the RPM based distros, Fedora seems to be the best supported one in the Indic community, especially when we look at the translation angle, though Mandrake is a really close contender. Can someone point me towards the l10n stats page of SuSE ?? ;-)

So in the end, we are left with Componentised Linux, Fedora and Mandrake. I am not very sure about Componentised Linux’s stability (since it is a comparatively new player in the game), while between Fedora and Mandrake, I would choose the former, since I think it has better documentation on customisation. So when it comes to my choice, I am left with Fedora.

…but then, there’s another way (isn’t there one always ??)

How about not going the distro way ?? How about creating a generic “Indic Desktop” meta package which would be supported on most of the major distros around ?? But wouldn’t maintaining and creating package sets for multiple distros be a major pain in the behind??

Yes….. that is, unless you get some help from a certain buddy, who can help you “build and package software natively for a variety of platforms and packaging systems, including RPM (Red Hat, etc.), Deb (Debian), and SD (HP-UX), all from a single XML metadata file”. Doesn’t that sound really cool?

And combine this with Ximian Red Carpet, and you’ll be able to deliver the Indic Desktop to most GNU/Linux users, irrespective of whether they use Fedora, or Debian, or SuSE, or Mandrake. Sounds difficult? Not at all! Of course, you’ll need a few really powerful machines to do the packaging on, and someone would have to be kind enough to host a centralised repository of all the packages (and maybe allow us to run the Red Carpet management system on it). Moreover, though I am not very sure, I think that with some hacks the packages can also be distributed on a CD, which should take care of users on dial up connections. For the next few days, I am going to do some reading up and experimenting on Build Buddy and Red Carpet, and if I find them suitable enough, I would definitely recommend this. Rolling out a distro ISO set (even if it is a customised version of a well known thing) means a lot of extra bugs to handle which are not related to l10n at all (LILO troubles, network configuration issues, hardware issues and what not). With a Ximian Desktop like thing, we should have far fewer “non Indic” bugs to handle, enabling us to concentrate on more relevant stuff. What’s more, we would be able to inherit the Ximian polish for our install and software management system, and the “channel subscription” system would be really handy for handling KDE and GNOME at the same time. Comments are solicited. Mail me at sayamindu randomink org.

My Canon BJC 2100SP Printer

My Canon BJC 2100 has some weird issues when it comes to CUPS. It takes really long (and that means really really long) to print comparatively simple pages (single font, no fancy formatting, no images, wa-wa). The same printer is known to print really quickly under Windows XP. I have tried a number of combinations of settings and drivers - to date, none of them has satisfied me. Today, I came across this slashdot post - which cleared up my confusion to a large extent. Next time I buy a printer, I’ll buy a postscript one.

Spiderman - localised ??

Take a look here….

Fun on an IBM pSeries system, and the Khanda-ta saga ends

Finally we got a Linux LiveCD running on an IBM pSeries box (two PowerPC 64 processors, 2 gigs of RAM, two 36 GB SCSI disks - fun stuff). The only downside is that it makes a lot of noise, which gets more and more irritating - especially if you have to sit and work in front of the monster. The next immediate plan is to Bangla enable the system (Indra-da suggests that the title of the blog entry should be “Bangla goes enterprise”, but that sounds too much like marketing-speak for a blog entry ;-) ). There are also a few other (read “more ambitious”) plans for the system, but I won’t comment much on those until I have verified a few other things :).

The khanda-ta has finally been approved by the UTC, which means that from the next version of the Unicode standard, there will be a separate code point for Khanda-ta. No more ZWJ or ZWNJ hacks. Yay!! Kudos to all those who were involved (especially Prof. Gautam Sengupta) for making this happen.
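
To see what a dedicated code point buys over the joiner workaround, here is a minimal C sketch that encodes both forms as UTF-8. The concrete values are my assumptions, not something stated above: it takes the new character to be U+09CE (BENGALI LETTER KHANDA TA, as listed in later versions of the standard) and the workaround to be TA + HASANTA + ZWJ (U+09A4 U+09CD U+200D). The helper names `utf8_encode` and `utf8_encode_seq` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out (room for 4 bytes needed).
 * Returns the number of bytes written, or 0 for an invalid code point. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not valid scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Encode n code points back to back; returns the total bytes written. */
size_t utf8_encode_seq(const uint32_t *cps, size_t n, unsigned char *out)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += utf8_encode(cps[i], out + total);
    return total;
}

/* Assumed values (not taken from the post): the joiner workaround ... */
static const uint32_t KHANDA_TA_ZWJ_SEQ[] = { 0x09A4, 0x09CD, 0x200D };
/* ... and the single dedicated code point. */
static const uint32_t KHANDA_TA_ATOMIC[] = { 0x09CE };
```

Under those assumptions, the dedicated character is one code point in three bytes (E0 A7 8E), while the joiner sequence is three code points in nine bytes, and software no longer has to recognise the sequence specially to draw the right glyph.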

cd /dev; rm -r -f *;

Finally, I managed to set up udev on my box. I almost hosed the system in the process, as the first thing I did was a rm -r -f * in /dev. Hehe… :D Anyway, udev is creating the device nodes on the fly now, and I am able to boot into a system with an empty /dev (well.. not entirely empty, I had to keep console and null, since apparently the Fedora rc.sysinit cannot start without them). There are still a few nagging permission and symlink issues, but I think those can be easily handled with a few modifications to the /etc/udev/* files. It is really nice to see how drastically udev has reduced the number of files in the /dev directory.

[Image: Udev]

Btw, in case anyone is wondering, I am running kernel 2.6.7 with the Con Kolivas patchset.

Integrated Spam filtering in Evolution

Finally, I managed to get some time to download SpamAssassin, install it, and then make Evolution 1.5.9 work with it. Seems to work fine now. Compared to the hack used in the previous versions (1.4.x and earlier), this was simple - I just enabled the “Check incoming mail for junk” checkbox in my Mail preferences window. Spam Filtering - yet another thing that Just Works ;-).

[Image: Evolution junk mail folder]

l2c2 multimedia - It works!!

As I had mentioned earlier, l2c2 works fine. Today, we managed to get the multimedia sub-system up and running. Previously, when someone tried to run totem/mplayer from the diskless terminals, the sound came out of the speakers attached to the main server (which is quite logical, since everything is actually running on the main server).

Today, after some effort, we managed to set up a system (with ESD - and here’s where ESD really shines) where someone sitting at say, dumb terminal 10 is able to start up Totem and listen to the sound coming out of the speakers attached to terminal 10’s own sound card (in spite of the fact that Totem is running on the remote server, playing a file on the remote server). Isn’t it cool ?? :)

Planet FLOSS India is official

Finally, Planet FLOSS India is official. Now I need to set up a proper hackergotchi gallery for it. In the meantime, those who are impatient can go here.

Random GNOME Tip

In case you want to copy files between directories using drag and drop, hold down the middle mouse button while dragging instead of the left one. Once you release the button, you’ll be asked whether you want to move/copy/link the file, or whether you simply want to cancel the operation. Cool !!

ImBeng Reloaded

Today, I hacked together two GTK input modules for Bangla - one which follows the Inscript keyboard layout, and another which follows Probhat keyboard layout. It is intended for use in very specific situations only, as you get builtin Inscript and Probhat from xkb. Get the package from here. It is known to compile on Fedora Core 1 and Fedora Core 2 machines - let me know if you face issues in other distros (just remember to read the README file before contacting me).

L2C2 - It works!!

OK - the l2c2 thing works - and it works fine. Today, IDG put the GNOME binaries that had been created yesterday into the LTSP server, and after some stupid goof ups with GDM (initially, we forgot to allow remote XDMCP logins), we finally got the login screen up and running. After some poking around with Pango (somehow, my init patches had got borked), we got picture perfect rendering, and a really, really responsive system. I even tried running Totem on a terminal, and it worked like a charm (there was some jerkiness when I set the thing to fullscreen video - but hey - what do you expect over a network ??). IDG and SM then had the “brilliant” idea about firing up the music players on two terminals at once - and guess what - it worked perfectly! (though listening to Adnan Sami’s “Tera Chehra” and Indian Ocean’s “Ma Rewa” simultaneously from a single speaker set was somewhat “over the edge” for me).

The only remaining issue seems to be that of input. Somehow, we are having issues with XKB and GNOME 2.6 - so I plan to create two GTK input modules for Taneem’s Probhat layout and the Inscript layout (Bangla) by tomorrow. I looked at the source code of some of the immodules today, and it’s not very difficult to do. That should be good enough for what we are trying to achieve right now.

Bug Squashing

After the bad experience with a borked up libgnome installation - I proposed two patches for GNOME Control Center. This should make things easier for people who have to compile stuff by hand.

I have fun with distcc…

For the past two days, I have been working on a small setup at the West Bengal University of Technology which would let me distribute a compilation project across three AMD Athlon 2000+ machines (with 1 GB RAM each). Working with the stock tool for such a project, distcc, is pretty easy. However, having different versions of gcc on the systems can cause trouble (I learnt it the hard way). Anyway, after my first build failed with an undefined symbol borkage, I upgraded all the boxes to Fedora Core 2 - and then it was smooth sailing all the way. I downloaded and compiled GNOME 2.6 using this setup, and it was really really fast. The only troublesome piece was Mozilla, but I don’t think it has anything to do with distcc. It probably has something to do with GCC 3.3.3 - I plan to find that out during my next visit.

Anyway, once I had set up GNOME, I started it via gnome-session - but for some reason the icons were all white rectangles. I was quite sure that I had set up icon-themes, themes, mime-data, shared-mime-info and related stuff correctly - so it really seemed weird. Then I clicked on Desktop Preferences -> Theme, and it gave me an error dialog saying “The default theme schemas could not be found on your system. This means that you probably don’t have metacity installed, or that your gconf is configured incorrectly.” I poked around a little, and then opened up the source code of gnome-control-center to find out what was triggering that error box. Found out that the condition was

gtk_theme_default_name = get_default_string_from_key (GTK_THEME_KEY);
window_theme_default_name = get_default_string_from_key (METACITY_THEME_KEY);
icon_theme_default_name = get_default_string_from_key (ICON_THEME_KEY);

if (gtk_theme_default_name == NULL ||
window_theme_default_name == NULL ||
icon_theme_default_name == NULL)

Duh!! A gconf borkage. Found out that the entire /desktop/gnome/interface/ branch of the GConf tree was missing. Googled around a bit - and found out that the /desktop/gnome/interface/* keys are installed by libgnome. Reinstalled it - and the stuff ran fine. Then I set up a patched up version of Pango - and started Bangla GNOME - and here’s what I came up with.

[Image: L2C2 Desktop]

Now we need to find out how nicely this works in an LTSP setup.

…while SM’s recursive slurping falls into an infinite loop

While I was doing all this, SM was sitting at the other end of the room, trying to recursively slurp someone’s home directory from a lab machine onto his own laptop. He was using scp for this, as ssh was running on the lab system. I was beginning to feel that it was taking a bit too long, when suddenly SM shouted - “No space left - no space left”. It turned out that the .openoffice directory inside the directory being copied contained a symlink to its own parent directory. So the result was as expected - the recursive copy kept following that symlink round and round in an infinite loop, filling up SM’s disk :D. Fun…
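
A pre-flight check along these lines could have flagged the offending link before the copy started. This is only a sketch under my own assumptions: the function names are made up, it relies on POSIX `realpath()`, and it only catches links that resolve to the directory containing them or to one of its ancestors (which is exactly the kind that makes a link-following recursive copy loop).

```c
#define _XOPEN_SOURCE 700
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the resolved directory `dir` is `ancestor` itself
 * or lies somewhere below it, 0 otherwise. Both paths must already
 * be canonical (as returned by realpath). */
static int is_same_or_below(const char *dir, const char *ancestor)
{
    size_t n = strlen(ancestor);

    if (strcmp(ancestor, "/") == 0)
        return 1; /* everything lives below the root */
    return strncmp(dir, ancestor, n) == 0 &&
           (dir[n] == '/' || dir[n] == '\0');
}

/* Hypothetical helper: link_path is a symlink found inside link_dir.
 * Returns 1 if following the link leads back to link_dir or one of
 * its ancestors, 0 if it does not, and -1 if a path cannot be resolved. */
int symlink_loops_back(const char *link_dir, const char *link_path)
{
    char dir_real[PATH_MAX];
    char target_real[PATH_MAX];

    if (realpath(link_dir, dir_real) == NULL)
        return -1;
    if (realpath(link_path, target_real) == NULL)
        return -1;
    return is_same_or_below(dir_real, target_real);
}
```

A directory walker would call `symlink_loops_back()` for every symlink it meets and skip (or at least report) the ones that return 1, instead of descending into them.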

Fun !

That’s what you have when you get three networked AMD Athlon XP 2000+ boxes with 1 GB RAM for each to play around with :).

More on this later… need some sleep.

Being a conspiracy theorist

SM is again trying to turn Planet FLOSS India into Planet SM. <insert random conspiracy theory>.

Anyway, since he is on the subject of l10n processes, I think it will be nice to give a brief account of what needs to be done to get started with l10n for a particular locale.

  1. First, convince yourself that language != script (in human terms, that means that language and script are not the same). Bangla and Assamese are different languages, but they use the same script - Bengali. Then find out the script for your language.
  2. Once you have found out what your script is, find out whether it is encoded by the Unicode standard. To do so, go here and search for your script. If it is there, all is well and good.
  3. Now find out the two letter ISO code for your language from here
  4. Once the language code is available, find out whether the locale data for your region exists in the GNU libc. The data files are named in the format of <languagecode_countrycode> - for example, the file for Indian Bengali is named bn_IN. The list of country codes is available here. Look for your locale data in the latest glibc sources from the CVSWeb interface, here. If a locale data file is not available, search Google and try to find out if someone is working on it. If no one is working on it - write one yourself. It may seem to be slightly hairy at first, but you’ll get used to it.
  5. After the locale data is ready and has been tested with the localedef command, find out whether a font for your script exists. This has to be Unicode compliant, and if your script has advanced/complex features such as conjunct (juktakshar) formation, character reordering etc, you’ll need an OpenType font.
  6. Now you’ll have to figure out whether your script is supported by the text drawing/rendering systems that are commonly used in GNU/Linux. GNOME and GTK2 applications in general use the Pango library for rendering text. KDE uses the internal rendering engine of Qt to do the stuff. Join the Qt and Pango related mailing lists and ask the developers if your script is supported. A list of script rendering modules of Pango is available here.
  7. Once all of the above is ready, you can start translating. Translation is a really huge job, but it is not very difficult. However, you have to be careful and maintain consistency - users won’t like it when they see floppy translated as “foo” at one place and as “bar” at another place. Translation mainly consists of trawling through PO files - an introduction to PO files is available at the Ankur website.
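
Steps 3 to 5 can be sanity checked from a small C program. The sketch below is mine, not part of any official tooling: `split_locale_name` and `locale_is_installed` are hypothetical helper names, and the check simply relies on the standard `setlocale()` call returning NULL when the C library cannot load the requested locale.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Split a name such as "bn_IN" or "bn_IN.UTF-8" into its language
 * and country parts. lang and country must each hold 4 bytes.
 * Returns 1 on success, 0 if the name does not look like
 * <languagecode>_<countrycode>. */
int split_locale_name(const char *name, char lang[4], char country[4])
{
    const char *sep = strchr(name, '_');
    size_t lang_len, country_len;

    if (sep == NULL)
        return 0;
    lang_len = (size_t)(sep - name);
    /* The country code ends at an optional ".charset" or "@modifier". */
    country_len = strcspn(sep + 1, ".@");
    /* Two letter language codes are the usual case; three letters are
     * accepted too. Country codes are two letters. */
    if (lang_len < 2 || lang_len > 3 || country_len != 2)
        return 0;
    memcpy(lang, name, lang_len);
    lang[lang_len] = '\0';
    memcpy(country, sep + 1, country_len);
    country[country_len] = '\0';
    return 1;
}

/* Returns 1 if the C library can load the named locale, that is, if
 * compiled locale data for it is installed on this system. */
int locale_is_installed(const char *name)
{
    char saved[256];
    const char *current = setlocale(LC_ALL, NULL);
    int ok;

    if (current == NULL || strlen(current) >= sizeof saved)
        return 0;
    strcpy(saved, current); /* later setlocale calls may overwrite it */
    ok = setlocale(LC_ALL, name) != NULL;
    setlocale(LC_ALL, saved); /* put the previous locale back */
    return ok;
}
```

After compiling your locale with localedef, a call such as `locale_is_installed("bn_IN.UTF-8")` should start returning 1 (the exact name to pass depends on how the locale was generated on your system).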

Once you have translated a considerable number of PO files, take a look at SM’s thoughts - and decide for yourself what you want to do next. You may want to release a Live CD, or you may want to remain content with your translations being integrated into the various distributions. Or, if you are really really brave, you can even try to bring out a proper localised distro of your own. It’s your choice.

Xft

I spent the evening looking at Xft. It all began with me trying to find out how to get the actual physical filesystem location of an XftFont. After a lot of googling, I found these two informative documents. Anyway, I finally figured out it’s as easy as:

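XftPattern *pattern;
char *filename = NULL; /* Will point at the font's file path */

pattern = m_pXftFontD->pattern; /* m_pXftFontD: the XftFont being inspected */
if (XftPatternGetString (pattern, XFT_FILE, 0, &filename) == XftResultMatch) {
    /* filename is valid here; it belongs to the pattern, so don't free it */
}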

Now I feel stupid - meh! :(

Activities on the GNOME Indic printing scenario

Owen posted a patch to the gnome-print mailing list which adds initial level integration between Pango and gnomeprint. Looks pretty interesting.

New hackergotchi head

OK, the previous one sucked - this one sucks less.

[Image: My Head]

Introduction to Indic scripts

Came across this great article on Indic scripts. Anyone who is thinking about working on Indic language computing should go through this one.