Stuff that I have been up to

December turned out to be a pretty busy month for me – here is some of the stuff I have been involved in/working on:

  • FOSS.in: As always, FOSS.in ‘09 turned out to be an amazing affair. For someone like me who works remotely, this event is probably one of the best opportunities to have “real” interactions. It’s a place where I can simply sit down, have long face-to-face conversations, come up with new ideas, be inspired, and most importantly, have fun. My heartfelt thanks go out to the people behind the event for making this possible. I have some photos in this Flickr photoset.
  • Book reader: This month’s priority has been stabilizing the Sugarlabs/OLPC book-reader code, and a large number of important bugfixes landed during the last few weeks. More in this status report.
  • Arduino: At FOSS.in, thanks to the efforts of the ever enthusiastic Kushal Das, I managed to get hold of an Arduino clone board (it is terribly difficult to get hold of one in Kolkata). I had heard of Arduino before and wanted to get one, and the session on it at FOSS.in by Russell Nelson finally served as the “kick” which made Kushal and me call up the local distributor and get a couple of boards for ourselves. I have been playing around with sensor support in Sugar for some time (I helped make the Measure activity work on XO 1.5 hardware), and realized that this would be yet another interesting way to connect Sugar with the “real” world. So after a couple of weekends’ worth of work, I got Arduino support in Turtle Art.


    Turtle Art with Arduino

  • XO keyboards: There may be a new AZERTY keyboard for the XO laptops very soon. See this wikipage for details.
  • Pootle: The Pootle developers have released version 2.0, which is a vastly improved edition compared to the previous releases. I have been testing it out with plans to upgrade the Sugarlabs/OLPC translation server soon. While testing, I added a quick (and ugly) hack to implement msgfmt --check style syntax checking in Pootle. This would definitely make the process of integrating the translations with the upstream code much less painful – and here’s a screenshot:


    Gettext syntax check in Pootle
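
One of the things a msgfmt --check style pass catches is a translation whose format placeholders no longer match the original string. The following is only a minimal sketch of that single check, not the actual Pootle hack: the names `placeholders` and `check_entry` are hypothetical, and it only understands Python %-style placeholders.

```python
import re

# Python %-style placeholders such as %s, %5.2f or %(name)s.
# "%%" is a literal percent sign and is skipped below.
_PLACEHOLDER = re.compile(
    r"%(?:\((?P<name>\w+)\))?[-#0 +]*\d*(?:\.\d+)?(?P<conv>[diouxXeEfFgGcrs%])"
)


def placeholders(text):
    """Return a list of (name, conversion) pairs found in text."""
    found = []
    for match in _PLACEHOLDER.finditer(text):
        if match.group("conv") == "%":
            continue  # "%%" is not a real placeholder
        found.append((match.group("name"), match.group("conv")))
    return found


def check_entry(msgid, msgstr):
    """Return a list of problems for one PO entry (empty list means OK)."""
    if not msgstr:
        return []  # untranslated entries are not errors
    source, target = placeholders(msgid), placeholders(msgstr)
    if all(name is not None for name, _ in source + target):
        # Named placeholders may legitimately be reordered by the translator.
        source, target = sorted(source), sorted(target)
    if source != target:
        return ["placeholder mismatch: %r vs %r" % (source, target)]
    return []
```

For example, `check_entry("%d files", "archivos")` reports a mismatch, while `check_entry("%(a)s of %(b)s", "%(b)s de %(a)s")` passes because named placeholders may be reordered.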

Making Books Available

It’s all over the web now – the Internet Archive has opened up over 1.6 million books for the OLPC XO laptops and, in general, any machine running Sugar. Before going into anything else, it makes sense to provide a more specific meaning of “opening up” here – it involves two main objectives completed at the Internet Archive end:

  • Making sure that the books are readable on the XO, keeping in mind its relatively low-end hardware specs and disk-space limitations
  • Ensuring that the books are available via a standardized catalog format, so that one can find, browse and download books easily using a tool more tuned for the purpose (think of feed-readers versus blog-entries in a web-page)

Now that the books are available (not just from the Internet Archive, but from a number of other sources as well), the next step is to figure out the best possible ways to actually make these books available to the XO and Sugar users. The major constraining factor is bandwidth: we do have deployments with zero, or very limited, Internet connectivity, and perhaps these are the deployments which need access to these books the most. I spent most of this week working on implementing a feature in the Get Books activity which would allow books to be distributed via what has been jokingly called a sneaker-net (or sandalnet/chappalnet, if you prefer those forms of footwear). The idea is very simple – at a centralized location with Internet access, choose a few thousand books (size of a typical book is usually a few hundred KB or less), put them on a USB pen-drive and add an OPDS catalog to the mix. Make copies of the drive, and send them to the schools without connectivity. The latest version of Get Books would recognize the drive, and let the student browse through the collection, search for books, and add whatever she wants to the Sugar Journal. Once a book is in the Journal, it can be shared among all the students using the Journal object transfer support in Sugar, or via the Read Activity directly. So essentially, you get a Library on a Stick, with thousands of books, something which, till now, in its physical form, has been largely restricted to better equipped (and usually richer) schools.
Of course, even larger collections can be distributed if a School Server (XS) is present in the mix (due to the fact that the school server can have a larger disk in it), and support for this type of distribution method involving the XS would hopefully appear within the next few releases of Get Books.
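
To make the “add an OPDS catalog to the mix” step a little more concrete, here is a rough sketch of generating an Atom-based catalog for a directory of books on the drive. It is not the format Get Books actually expects: the function name `build_catalog`, the `urn:x-library-on-a-stick` identifiers and the use of file names as titles are all assumptions, and the acquisition `rel` value is taken from the published OPDS specification, which may differ from the draft in use here.

```python
import os
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"
# Link relation used by published OPDS catalogs for downloadable books.
ACQUISITION_REL = "http://opds-spec.org/acquisition"
MIME_TYPES = {
    ".epub": "application/epub+zip",
    ".pdf": "application/pdf",
    ".djvu": "image/vnd.djvu",
}


def build_catalog(book_dir, title="Library on a Stick"):
    """Return an Atom feed (as a string) listing the books in book_dir."""
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element("{%s}feed" % ATOM_NS)
    ET.SubElement(feed, "{%s}title" % ATOM_NS).text = title
    # Hypothetical identifier scheme, used only for this sketch.
    ET.SubElement(feed, "{%s}id" % ATOM_NS).text = "urn:x-library-on-a-stick:catalog"
    updated = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    ET.SubElement(feed, "{%s}updated" % ATOM_NS).text = updated

    for name in sorted(os.listdir(book_dir)):
        stem, ext = os.path.splitext(name)
        mime = MIME_TYPES.get(ext.lower())
        if mime is None:
            continue  # not a recognised book format
        entry = ET.SubElement(feed, "{%s}entry" % ATOM_NS)
        # The file name stands in for real metadata (title, author, ...).
        ET.SubElement(entry, "{%s}title" % ATOM_NS).text = stem
        ET.SubElement(entry, "{%s}id" % ATOM_NS).text = "urn:x-library-on-a-stick:" + name
        ET.SubElement(entry, "{%s}updated" % ATOM_NS).text = updated
        # Relative href, so the catalog keeps working wherever the drive is mounted.
        ET.SubElement(entry, "{%s}link" % ATOM_NS,
                      rel=ACQUISITION_REL, type=mime, href=name)
    return ET.tostring(feed, encoding="unicode")
```

Relative links are a deliberate choice in this sketch: the mount point of the pen-drive differs from machine to machine, so absolute paths in the catalog would break on copies of the drive.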

Books, Sugar and OLPC

Articles and posts like this (and subsequently this) underline the need for a status report about ebook-reading in Sugar and in the XO laptops. For the past few months, apart from my usual duties, I have been working on the book-reading stack for OLPC and Sugar, and this may be viewed as a progress report of the things I have been doing.
I have been mostly working on the Read Activity in Sugar, which is supposed to do most of the heavy lifting as far as book-reading goes – though there is also ReadEtexts by Jim Simmons, which primarily handles plain text files from Project Gutenberg (the latest version of ReadEtexts supports RTF files as well). Currently, the ebook formats that are supported in Sugar include

  • Epub
  • PDF
  • DJVU
  • Plain Text (specifically the format used by Project Gutenberg)
  • Postscript
  • CBZ
  • RTF

There also exists a sugar-ified FBReader, with support for more formats (such as Plucker and non-DRM’ed Mobipocket).
With the last major release of Read (a part of Sugar 0.86), apart from the addition of Epub support, there have been usability improvements and tweaks (particularly for the full-screen mode), as well as support for bookmarks (notes can be associated with each bookmark).

For the next major release, I have started to work on support for highlighting text (at least in Epub files) and better usage of the XO “game-keys” in fullscreen mode (so that the overall experience in tablet mode of the XO laptops becomes smoother). Interestingly, highlighting text did not work out as I had planned, since the highlights became almost invisible in the grayscale reflective mode of the XO laptops. So instead of highlighting, Read would probably support underlining of text (when I was a kid, we often shared books, especially school books, and I was told it is always better to underline with a pencil than to use a marker pen to highlight ;-) .
Read Highlight
Of course, Read is only one part of the book-reading puzzle. There has to be a system in place for book acquisition as well (from the Internet as well as from a local schoolserver, if available). In a previous blog post, I mentioned the Open Publication Distribution System, which is built upon the Atom syndication format to allow online book distributors to publish their catalogs. I extended Jim Simmons’s Get Internet Archive Books activity to support OPDS, and now, apart from the Internet Archive, the preview version that I have can also retrieve books from Feedbooks. Here’s a video of the activity in action:

The next major step would be to implement a server side OPDS implementation in the School Server (XS), as well as some kind of caching mechanism to conserve bandwidth (if a copy of a book is found in the school server, it should be downloaded instead of the online version).
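
The caching idea can be sketched in a few lines. This is only an illustration of the lookup order (school server first, Internet second), with a plain directory standing in for the school server’s book store; `resolve_book_source` and its arguments are hypothetical names, not part of any existing XS or Get Books code.

```python
import os


def resolve_book_source(book_id, cache_dir, online_url):
    """Decide where a book should be fetched from.

    Returns ("cache", path) if a copy already exists in cache_dir,
    otherwise ("online", online_url).
    """
    # basename() guards against identifiers that try to escape cache_dir.
    cached = os.path.join(cache_dir, os.path.basename(book_id))
    if os.path.isfile(cached):
        return ("cache", cached)
    return ("online", online_url)
```

A real implementation would also need to populate the cache after an online download and decide when cached copies are stale; both are left out here.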
To keep up with the progress, you can either subscribe to the sugar-devel list or the more specialized (and low volume) olpc-bookreader list.

Braindump on ebooks

The inspiration for this post comes from a talk by Alan Kay, entitled Beyond the Printing Press: Computers as Learning Environments for All Children. You can view the video recording of the talk here.


The development version of the Read Activity is now shipping with Epub support. This makes me excited for quite a few reasons. Of course, the most obvious reason to get excited is the fast growth and adoption of Epub as a standard for e-books. However, there is more to it…
Books, once again (after Gutenberg’s time), are changing. Gutenberg brought in the transition from hand-written books to large-scale print – and now we see yet another shift, where books are transitioning from ink, paper and the printing press to bits stored inside a variety of devices. Towards the beginning of the printing press revolution, there was a strong desire and tendency to mimic the “old” format as much as possible, in terms of look and feel. Gutenberg and his associates even hand-drew illuminated decoration on the Gutenberg Bibles, to retain the similarity to the older, handwritten copies of the Bible. In what seems to be an almost eerie repetition, today, in the ebook, we see a strong desire to mimic the traditional book as much as possible (e.g. ebook readers trying to retain the older “UI” paradigm, efforts to make ebooks retain the formatting niceties of traditional books, etc.). This is not unusual, or wrong. We are used to the traditional book, and it is important to make the path to transition as smooth as possible.
However, what makes me really excited at this stage is something else. It is the potential new things we could do with ebooks, things that would not have been possible with books in the old format. This weekend, I made some changes to an Epub file, and extended the Read Activity a bit to come up with a few such things:

  • Audio-visual content inside books: This is almost obvious – with the transition to books which are read on devices having audio/video capabilities, the next logical step is to embed these into books.



    (Video from the Internet Archive, text from Wikipedia)
  • An interactive shell inside a book: An interactive Python shell inside a book teaching Python, so that small examples and snippets can be tried out inside the book, right away.



    (Text from How to Think like a Computer Scientist, Python edition)
  • A full blown, interactive environment inside books: A book on digital logic can have a small sandboxing area, where readers could connect the various virtual components together, and see what happens.



    (Text from Wikipedia and the Lorem Ipsum generator, demo from the Etoys project)

Of course, this is just a proof of concept, and probably most Epub readers will simply ignore the interactive content part. Moreover, there may be security issues with such books as well (the idea of having a Python shell inside a book will make many nervous) – but I think this is where Bitfrost, and its software implementation, Rainbow (which is essentially an isolation shell), come in.
There is another way of “interaction” which I have not covered in the above screencasts – and this is something which is already available in traditional ink and paper books, especially text-books. Ebooks need to support “exercises” like fill-in-the-blanks, multiple-choice-questions, etc. There is an urgent need to support this, and this should be done in a standardized way. The local storage standard associated with HTML5 seems to be a possible way forward, though there may be better ways to do this (especially if we want the ability to have teachers remotely check and evaluate exercises done on e-textbooks).

Read and Epub and beyond

For the past few weeks, I have been spending most of my time implementing Epub support for Sugar’s Read activity. Epub is gaining increasing acceptance, and a few weeks back, Project Gutenberg started distributing much of their material in the format, and Google + Sony also seem to have started to distribute a large chunk of public domain books as Epubs.

Today I finally reached the stage where the work could be tested on an actual XO, and here’s how it looks:
Read opening an Epub file on an XO

The rendering is done using WebkitGTK (the Python bindings) and I was a bit concerned about the possible performance issues on the XO-1 (which has a relatively ancient processor, slow filesystem access, only 256 MB of RAM and no swap). The biggest worry was the loading time – since it involves pre-rendering the entire book to gather metrics for pagination (most Epub books I have come across do not have clearly defined page-breaks, so that has to be figured out), but to my surprise (and relief) the load time turned out to be quite acceptable.
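
As a rough illustration of why the metrics are needed: once the whole book has been rendered and its total height is known, page boundaries can be derived from that height and the height of the visible area. The sketch below shows only this arithmetic. It is not the code in Read, the name `page_offsets` is made up, and it ignores details such as avoiding page breaks in the middle of a line.

```python
import math


def page_offsets(content_height, view_height):
    """Return the scroll offset (in pixels) at which each page starts.

    content_height: total height of the fully rendered book
    view_height: height of the area visible on screen at once
    """
    if view_height <= 0:
        raise ValueError("view_height must be positive")
    # Even an empty document is shown as a single page.
    pages = max(1, math.ceil(content_height / view_height))
    return [i * view_height for i in range(pages)]
```

With a 2500-pixel-tall rendering and a 1000-pixel view, this yields three pages starting at offsets 0, 1000 and 2000.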

Right now, the viewer supports a very limited subset of the Epub standard (and works only with XHTML based Epubs), but so far it has managed to handle all the files I have tested it with. The viewer is a standalone widget used by the Read activity, which should make it possible to reuse the work to develop an Epub reader for GNOME as well.

Once the Epub support in Read reaches an acceptable state, the plan is to start working on implementing support for the draft Open Publication Distribution System specs, which allows ebook distributors to distribute e-books via XML catalogues. It makes sense to support this in Read, as well as in the school server, to ease the e-books distribution process. For example, if we have a large e-book collection for a particular deployment, it may not make sense to put all of them in individual laptops – instead allowing the user to browse/search the catalogue and download the books as and when required would probably be a better option.
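
Since an OPDS catalogue is an Atom feed, browsing and searching it on the client side can be sketched with nothing more than Python’s standard XML parser. The feed below is a made-up sample (the example.org URL and the entries are invented for illustration), and `search_catalog` is a hypothetical helper, not an existing Read or school server API.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Invented sample catalogue, only for demonstrating the parser.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample catalogue</title>
  <entry>
    <title>Alice's Adventures in Wonderland</title>
    <link type="application/epub+zip" href="http://example.org/books/alice.epub"/>
  </entry>
  <entry>
    <title>Through the Looking-Glass</title>
    <link type="application/pdf" href="http://example.org/books/glass.pdf"/>
  </entry>
</feed>
"""


def search_catalog(feed_xml, query):
    """Return (title, [(mime_type, href), ...]) for entries matching query."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        if query.lower() not in title.lower():
            continue  # case-insensitive substring match on the title only
        links = [(link.get("type"), link.get("href"))
                 for link in entry.findall(ATOM + "link")]
        results.append((title, links))
    return results
```

Calling `search_catalog(SAMPLE_FEED, "alice")` returns the first entry along with its download link, so only the book the reader actually picks needs to be fetched.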

Why should I bother ?

Warning: This is a rant. Feel free to ignore

I love coding in Python, and in spite of some of the occasional issues it can cause, I feel that it lets one accomplish whatever one wants to do with the minimal amount of magic incantations. So naturally, I have been trying to convince my friends from college to try out Python, but after a few incidents I’m not so sure if I have been doing the right thing. A couple of events will explain the situation:

Scene I – Interview for positions in one of the “big four” Indian IT companies:
A friend of mine has Python listed under the skills section in his CV
Interviewer: ওরে বাবা তুমি তো পাইথন জানো। (TRANS:Wow (in the sarcastic sense) – you seem to know Python)
Friend: হ্যাঁ (TRANS:Yes)
Interviewer: আচ্ছা Java জানো কি ? (TRANS:So, do you know Java)
Friend: যতটুকু কলেজে পড়িয়েছে, ওইটুকু, তার থেকে বেশী জানি নাহ (TRANS:Not much, just whatever they have taught in college) (the college course covers Java as an example of an object-oriented language, so it does not go very deep)
Interviewer: আচ্ছা, এটা বল তো… (TRANS:All right then, answer this)
Interviewer: pretty convoluted question from Java – involving complicated API stuff and such
Friend: বলতে পারবো নাহ (TRANS:Sorry, I can’t answer this)
Interviewer: যা, এইটুকুই জানো না, আর পাইথন ফাইথন কী সব শিখে ফেলেছ ? (TRANS:Bah! You don’t know such basic stuff, and on the other hand, you have learnt Python and whatnot!!)

Needless to say – the guy did not get selected, and got rid of Python from his CV.


Scene II – Yet another interview, this time for a “research” position in academia
Friend of mine has been learning PIL, PyGTK, etc and has Python listed in his CV
Interviewer: আচ্ছা, এই পাইথনটা কি ? (everyone on the interview panel makes weird facial expressions) (TRANS:So, what is this Python “thing”?)
Friend: <explains>
Interviewer: আচ্ছা এটার এরকম বিচ্ছিরি নাম কেন ? (TRANS:So, why does this have such a weird sounding name?)
Friend: <explains, mentioning Monty Python, etc>
Interviewer: দেখো, আমরা তো এসব জানিনা, আমরা সাবজেক্ট জানি। তুমি বরং কি সাবজেক্ট জানো বল (TRANS:Look, we do not know these things, we know “subjects”. What “subjects” do you know?)
<..and the interview continued with some very standard (and stupid, IMHO) questions (most of which, I believe are lifted from this particular book). My friend answered all of the questions, except for one.>


Friend later tells me: ওইরকম মুখ বানালো – ওই দেখেই বুঝলাম হবে নাহ্‌ । আমি আর কোথাও পাইথন জানি বলছি নাহ্‌ । (TRANS:From their expression on hearing the word Python, I knew I was not going to crack this interview. I’m not going to mention Python in any future interview.)


When the first incident happened, I thought it was an isolated case. But after the second one, I don’t think it is (and there has been at least one other similar case as well). In fact, when the campus recruitment started for our batch in college, a very senior and respected faculty member told me that my chances of getting placed from college were very slim. I did not appear for any of the recruitment programs (and almost got fined by the college authorities for being “absent”), so I did not get the chance to test out his theory – but that’s a different story altogether.
For the second incident, one may claim that the interviewers were perhaps looking for someone who had a good “theoretical understanding” or had “strong fundamentals”, but I have my doubts (primarily due to the generic crappy questions that were asked afterwards). The first incident, on the other hand, points clearly towards something being very wrong with the interviewer.
The question that arises after all this is, why should I ask people to learn Python, or for that matter anything that is not covered by the officially sanctioned syllabus ? On one hand, our “progressive” political leaders and leaders of our various industries speak about nurturing and enhancing “talent” to build a better India, and what not. In the real world on the other hand, at the very ground level, the same institutions that the leaders are supposed to be the patrons and creators of, encourage nothing but mediocrity. End result: each year, thousands of bright young students get turned into zombies. What a terrible waste… what a terrible waste…

Minor update: I realize that many have mistakenly assumed that the requirements in the first interview had something to do with Java. They did not. It was a fresher interview, conducted during campus placements, and the students were expected to have zero experience. Many of the students who were actually selected were either placed in testing, or in .Net (mostly building/maintaining/troubleshooting ASP.Net/C# sites).

Updates..

This blog has not seen much activity in a while, so here goes:

  • Bought an HCL touch-screen based netbook. It’s somewhat ancient hardware, but most of the stuff works out of the box (except for the webcam, which does not even show up in lshal or lsusb). The touchscreen required a binary driver – a Free/Open Source version seems to exist, though I could not manage to calibrate the screen with the FOSS driver variant.
    [Update: The webcam works - I had to press Fn-F5 to enable it. It is turned off by default to conserve battery.]
  • Taught myself (this was long overdue – but at least now I can admit that I did not know what I used not to know) how to properly write Python extensions in C. I started out with bindings for Hunspell (I’m reading up a bit on morphology nowadays, and finding it to be tremendously entertaining). There was a Python extension for Hunspell already, but it did not compile for me, and that pushed me to figure out how to do this myself. One thing led to another, and so, as of now, there are (in progress) extensions for handling:
    • Hunspell. Usage instructions here
    • libgettext-po. This should be faster than the existing pure Python based PO file parsers out there (maybe at some point, I could make Pootle/Translate Toolkit use this, and make the work of OLPC/Sugarlabs translation team members somewhat less frustrating).
    • XKB. I must admit that I took a shortcut for this, and this extension is actually based on the awesome libxklavier. The final plan is to develop a Sugar extension for managing the keyboard options and layouts using this extension. The code in the main git repository, though fairly complete in terms of what is required for Sugar at the moment, is not implemented via (py)gobject. Implementing the pygobject-based wrapper is turning out to be a bit more complicated than I initially thought, but some code for that is also available in this repository (it is somewhat easier now, since I know (at least most of) what is happening under the hood).
  • Released a newer version of the FBReader activity, which is much improved in terms of usability (e.g. response to the game keys while the XO-1 is in tablet mode is much smoother, and all the keys do something useful). People seem to be happy with the new release.
  • Coming back to the present, right now, among other things, I’m working on a few interesting (and important) enhancements for the book-reader(s). Some of them include support for long keypresses (eg: pressing the “square” game key for two seconds will show the table of contents), notification of critical power events (I realized to my horror during dogfooding, that in tablet mode, while the book reader is open in full screen, there is no way to tell how much battery-charge is left), etc. The bookmark support feature that I came up with a few months back needs a bit of polish, but I think I can make this show up in the next release of Read.
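
The long keypress idea mentioned above boils down to remembering when a key went down and comparing against a threshold when it comes back up. Here is a toolkit-independent sketch of that logic. `LongPressDetector` is a hypothetical class, the timestamps are assumed to be in seconds, and in a real activity they would come from the toolkit’s key-press and key-release events.

```python
class LongPressDetector:
    """Classify key presses as "short" or "long" based on hold time."""

    def __init__(self, threshold=2.0):
        self.threshold = threshold  # seconds a key must be held to count as long
        self._pressed_at = {}

    def key_down(self, key, timestamp):
        # Auto-repeat delivers many key-down events for one physical press;
        # setdefault() keeps the time of the first one.
        self._pressed_at.setdefault(key, timestamp)

    def key_up(self, key, timestamp):
        """Return "long", "short", or None if the key was never pressed."""
        start = self._pressed_at.pop(key, None)
        if start is None:
            return None
        return "long" if timestamp - start >= self.threshold else "short"
```

Deciding on release keeps the sketch simple. Showing the table of contents while the key is still held after two seconds would need a timer instead.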

19th March, 2009

  • Sucrose 0.84, the latest stable version of the Sugar educational platform has been released. A large number of bugfixes, new features, improvements and tweaks have gone in during the past six months, and to try out this release, you can use Sugar on a Stick.
  • Sugarlabs (the organisation which is currently driving the development of Sugar) is a mentor organization for Google Summer of Code 2009. A list of ideas is currently on the Sugarlabs wiki.
  • It looks like Google has decided to publish out-of-copyright books as ePub files for the Sony Reader. This is awesome news, all the more so since some of us have been working for the past few weeks to ensure that more ebook formats are supported in Sugar. As a part of that, I have created a Sugarized version of FBReader, which handles epub files superbly:
    FBReader Activity for Sugar

The GNOME.Asia Summit, 2009

We have been informally discussing the idea of hosting GNOME Asia 2009 in India this year, and an initial TODO list and a set of ideas have just been posted on the gnome-india mailing list.
We would need a lot of effort to make this happen successfully, and if you are interested in helping out, by all means, please jump in :-) . Join the mailing list, and start taking part in the discussion. There is also a set of pages on the wiki; please feel free to put in your thoughts and suggestions on those pages as well.

16th February, 2009

  • Pootle migration: We are moving the OLPC/Sugarlabs Pootle instance to a newer dedicated server, which should speed it up considerably. This has also given me some opportunity to fine-tune and polish our l10n workflow – things should be a bit easier and smoother (and faster) for translators. I also managed to gather some interesting data from the log and user registration files. It turns out that we have more than 1000 translators registered with the system, among whom about half have actively contributed translations in the past year. I’m not sure what the user statistics for other Pootle installations are like, but it seems that we are one of the larger users of Pootle out there.
  • Read hacking: I have also been spending some time hacking on Read. While Mr Super Awesome Tomeu has been pushing our Evince patches upstream, I have been working on a few interesting features for Read (we have moved to Gitorious, which is so cool):
    • Support for books from the Universal Library: Many of the scanned children’s books from the Universal Library Project are too graphics heavy for the XO hardware to handle in PDF form. However, it looks like the project also stores each book as a zip file, with each scanned page archived inside it as an individual jpeg – which, in other words, is very similar to the comic book archive format which Evince (Read’s backend) supports quite nicely. More importantly, this format seems to have fewer performance issues on the XO hardware (compared to graphics heavy PDF files). So I have been making sure that Read also handles this format gracefully.
      Book from the Universal Library in Read
    • Bookmarks support: This has been one of the oft requested features for Read, apart from annotations. The original design specs for Read already provided me with ideas on how the UI should look, so with some amount of coding, I have bookmark support which mostly works :-) . I am also trying to do the implementation in such a way that it would be easy to add support for sharing of bookmarks later on. If anyone is interested in doing a project, contact me (hint.. hint ;-) )
      Bookmarks in Read

    Code for the above lives in the sayamindu-sandbox branch of Read’s Git repository. I plan to take a stab at annotations during the next few weeks – I have some ideas which, with some luck, may work. I also have some plans about a saner full-screen/ebook mode for Read – let’s see if I get the time to implement those as well.

  • This came up in one of the mailing lists a few days back. Serves as a reminder as to why the work we all do is so relevant and so important.
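
Going back to the Universal Library books mentioned above: treating a zip of scanned jpegs like a comic book archive mostly means listing the image files and showing them in page order. The sketch below illustrates just that step. It is not how Evince does it, and `list_pages` and the natural-sort helper are hypothetical names.

```python
import re
import zipfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def _natural_key(name):
    # Split "page10.jpg" into ["page", 10, ".jpg"] so that page2 sorts before page10.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def list_pages(archive):
    """Return the image members of a zip archive, in reading order.

    archive may be a file name or a file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        names = [n for n in zf.namelist()
                 if n.lower().endswith(IMAGE_EXTENSIONS)]
    return sorted(names, key=_natural_key)
```

Each page can then be decoded one at a time as the reader turns to it, which avoids holding a whole graphics-heavy document in memory at once.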

l10n: More than one language

Falling back to English when translation of a particular string is not found is not always the best solution. As a practical example, our Aymara users would prefer that the fallback language be Spanish, and only if the Spanish translation is not found, English should be shown.

I was wondering how to implement this for Sugar and its activities, and I realized that something like this is already implemented in Python’s gettext implementation. So after some changes to Sugar, I had the following:

In the screenshot, the Restart Game pop-up is not translated into Aymara, and so it shows up in Spanish as Reiniciar Juego, while the rest of the strings are in Aymara.
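
The mechanism behind this is the fallback chain in Python’s gettext module: every translations object can have a fallback attached via `add_fallback()`, and `gettext.translation()` builds such a chain when given a list of languages (for example `languages=["ay", "es"]`). The sketch below mimics that chain without needing compiled .mo files. `DictTranslations` and `build_chain` are illustrative names, and the Aymara string is a placeholder, not a real translation.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy catalog backed by a dict, standing in for a compiled .mo file."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def gettext(self, message):
        if message in self._catalog:
            return self._catalog[message]
        # NullTranslations.gettext() consults the fallback, if any, and
        # otherwise returns the original (English) message.
        return super().gettext(message)


def build_chain(catalogs):
    """Chain catalogs so that earlier ones take priority over later ones."""
    chain = None
    for catalog in catalogs:
        translations = DictTranslations(catalog)
        if chain is None:
            chain = translations
        else:
            chain.add_fallback(translations)  # appended at the end of the chain
    return chain if chain is not None else gettext.NullTranslations()


# Placeholder Aymara entry; the Spanish string is the one from the screenshot.
aymara = {"Play": "[ay] Play"}
spanish = {"Restart Game": "Reiniciar Juego"}
chain = build_chain([aymara, spanish])
```

With this chain, `chain.gettext("Play")` comes from the Aymara catalog, `chain.gettext("Restart Game")` falls back to Spanish, and a string missing from both is returned in English.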

Of course, there is a lot more to be done – the Sugar control panel language selector needs to be changed to allow selection and ordering of multiple languages, and this currently works for activities; core Sugar needs to support this feature as well.

14th October, 2008

  • There might be a Barcamp Kolkata soon:
    Barcamp Kolkata Logo
  • Got Table of Contents support working in Read Activity
    ToC Support in Read
  • Wrote a small PDF viewer tool with support for the Journal which is then used by mozplugger to show PDF files within Browse. (You can put the file in your journal if you like it)
    PDF inside Browse
  • Infoslicer is awesome. Here’s a YouTube video demo of it.


OLPC Stamps

Just noticed this:

OLPC Uruguay Stamp
Creative Commons License photo credit: Wayan Vota

Weekend hacks

Over the past few weekends, I have been working on a few (semi)hobby projects.

  • Conversion of XKB data to M17N tables
    I discovered pyparsing while working on this. The tool I wrote is supposed to extract the data out of XKB symbol files, and convert them into a format which can be easily modified into M17N db files. In fact, for some keyboard layouts, the output was directly usable in m17n (via SCIM), without any kind of direct modification at all.
    The only problem with the script is that the parsing of the XKB symbol files takes a significant amount of time, but in the end, it does provide something useful. [Gitweb]
  • An image viewer activity for Sugar
    Sugar did not have a nice Image Viewer activity which I liked, so over the weekend, I hacked together a small activity which would perform the basic stuff expected of an image viewer (zoom, rotation, etc). [Gitweb]
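
For a flavour of what the first step of the XKB conversion involves, here is a much-simplified sketch of pulling key definitions out of an XKB symbols file. Note that it uses a regular expression rather than pyparsing, only understands the simplest one-group `key <NAME> { [ a, b ] };` form, and does not handle comments or includes. `parse_symbols` is a hypothetical name, and the step of writing M17N tables is not shown.

```python
import re

# Matches e.g.:  key <AD01> { [ q, Q ] };
_KEY_LINE = re.compile(r"key\s*<(\w+)>\s*\{\s*\[([^\]]*)\]\s*\}\s*;")


def parse_symbols(text):
    """Map each key name to the list of keysyms on its levels."""
    layout = {}
    for key_name, symbols in _KEY_LINE.findall(text):
        layout[key_name] = [s.strip() for s in symbols.split(",") if s.strip()]
    return layout


SAMPLE = """
xkb_symbols "basic" {
    key <AD01> { [ q, Q ] };
    key <AD02> { [ w, W ] };
};
"""
```

Here `parse_symbols(SAMPLE)` gives a dictionary mapping `AD01` to the keysyms `q` and `Q`, and `AD02` to `w` and `W`. A grammar-based parser copes far better with the nesting and multiple groups found in real symbol files.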

Quote of the week

From this week’s community news:

Among them was the mayor of South Beirut, with whom I spoke. ‘The American government sends bombs to kill the innocent,’ he said, ‘and the American people send us computers for our children. We are very grateful to OLPC. This means opening up the world to our children.’



Also, in related news, Sugar is being translated into Aymara. If you can help in this effort, or for that matter, any of the translation efforts, you are more than welcome to jump in ;-) .