Books, Sugar and OLPC

Articles and posts like this (and subsequently this) underline the need for a status report about ebook-reading in Sugar and in the XO laptops. For the past few months, apart from my usual duties, I have been working on the book-reading stack for OLPC and Sugar, and this may be viewed as a progress report of the things I have been doing.
I have been mostly working on the Read Activity in Sugar, which is supposed to do most of the heavy lifting as far as book-reading goes – though there is also ReadEtexts by Jim Simmons, which primarily handles plain text files from Project Gutenberg (the latest version of ReadEtexts supports RTF files as well). Currently, the ebook formats supported in Sugar include:

  • Epub
  • PDF
  • DJVU
  • Plain Text (specifically the format used by Project Gutenberg)
  • Postscript
  • CBZ
  • RTF

There also exists a sugar-ified FBReader, with support for more formats (such as Plucker and non-DRM’ed Mobipocket).
With the last major release of Read (a part of Sugar 0.86), apart from the addition of Epub support, there have been usability improvements and tweaks (particularly for the full-screen mode), as well as support for bookmarks (notes can be associated with each bookmark).

For the next major release, I have started to work on support for highlighting text (at least in Epub files) and better usage of the XO “game-keys” in fullscreen mode (so that the overall experience in tablet mode of the XO laptops becomes smoother). Interestingly, highlighting text did not work out as I had planned, since the highlights became almost invisible in the grayscale reflective mode of the XO laptops. So instead of highlighting, Read will probably support underlining of text (when I was a kid, we often shared books, especially school books, and I was told it is always better to underline with a pencil than to use a marker pen to highlight ;-) ).
Read Highlight
Of course, Read is only one part of the book-reading puzzle. There has to be a system in place for book acquisition as well (from the Internet as well as from a local schoolserver, if available). In a previous blog post, I mentioned the Open Publication Distribution System (OPDS), which is built upon the Atom syndication format to allow online book distributors to publish their catalogs. I extended Jim Simmons’s Get Internet Archive Books activity to support OPDS, and now, apart from the Internet Archive, the preview version that I have can also retrieve books from Feedbooks. Here’s a video of the activity in action:

The next major step would be to build a server-side OPDS implementation for the School Server (XS), as well as some kind of caching mechanism to conserve bandwidth (if a copy of a book is found on the school server, it should be downloaded from there instead of from the online source).
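The caching idea can be sketched roughly as follows. This is only an illustration of "try the school server first, then fall back to the online copy", not code from any of the activities; the function name `fetch_book`, its parameters, and the `schoolserver` hostname used in the usage note are all assumptions made up for the example.

```python
import urllib.request


def fetch_book(book_path, schoolserver_base, online_url,
               open_url=urllib.request.urlopen, timeout=10):
    """Return (url_used, data), preferring the school server copy.

    All names here are hypothetical. `open_url` is injectable so the
    lookup order can be exercised without a network connection.
    """
    local_url = schoolserver_base.rstrip("/") + "/" + book_path.lstrip("/")
    for url in (local_url, online_url):
        try:
            with open_url(url, timeout=timeout) as response:
                return url, response.read()
        except OSError:
            # urllib's URLError and HTTPError are OSError subclasses:
            # treat any failure as "not available here" and move on.
            continue
    raise LookupError("book not available from any source")
```

A call such as `fetch_book("books/alice.epub", "http://schoolserver/", online_url)` would only touch the Internet when the local copy is missing or the school server is unreachable.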
To keep up with the progress, you can either subscribe to the sugar-devel list or the more specialized (and low volume) olpc-bookreader list.

Read and Epub and beyond

For the past few weeks, I have been spending most of my time implementing Epub support for Sugar’s Read activity. Epub is gaining increasing acceptance: a few weeks back, Project Gutenberg started distributing much of their material in the format, and Google and Sony also seem to have started to distribute a large chunk of public domain books as Epubs.

Today I finally reached the stage where the work could be tested on an actual XO, and here’s how it looks:
Read opening an Epub file on an XO

The rendering is done using WebkitGTK (the Python bindings) and I was a bit concerned about the possible performance issues on the XO-1 (which has a relatively ancient processor, slow filesystem access, only 256 MB of RAM and no swap). The biggest worry was the loading time – since it involves pre-rendering the entire book to gather metrics for pagination (most Epub books I have come across do not have clearly defined page-breaks, so that has to be figured out), but to my surprise (and relief) the load time turned out to be quite acceptable.
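To make the pagination step a little more concrete, here is a minimal sketch of how page boundaries could be derived once the rendered height of each content file is known. It is not the code used in Read: it assumes the heights have already been measured by the renderer, and the name `build_page_map` and the pixel-based units are made up for illustration.

```python
import math


def build_page_map(chapter_heights, page_height):
    """Map global page numbers to (chapter_index, scroll_offset).

    chapter_heights: rendered height of each content file, in pixels
                     (assumed to have been measured by pre-rendering).
    page_height:     visible height of one screenful, in pixels.
    """
    if page_height <= 0:
        raise ValueError("page_height must be positive")
    pages = []
    for chapter_index, height in enumerate(chapter_heights):
        # Even an empty chapter occupies one page.
        count = max(1, math.ceil(height / page_height))
        for i in range(count):
            pages.append((chapter_index, i * page_height))
    return pages
```

With such a map, "go to page N" becomes a lookup of which file to load and how far to scroll, which is why the whole book has to be measured up front.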

Right now, the viewer supports a very limited subset of the Epub standard (and works only with XHTML-based Epubs), but so far it has managed to handle all the files I have tested it with. The viewer is a standalone widget used by the Read activity, which should make it possible to reuse the work to develop an Epub reader for GNOME as well.
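For readers unfamiliar with the format: an Epub is a zip archive whose `META-INF/container.xml` points to a package (OPF) file, and that file lists the XHTML content files in reading order (the "spine"). The sketch below shows one way to extract that order using only the Python standard library. It is an illustration rather than the code in the viewer widget, and the function name `read_spine` is made up.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def read_spine(epub_file):
    """Return archive paths of the content files in reading order.

    epub_file may be a filename or a file-like object.
    """
    with zipfile.ZipFile(epub_file) as archive:
        # container.xml tells us where the package (OPF) file lives.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.get("full-path")

        package = ET.fromstring(archive.read(opf_path))
        # The manifest maps item ids to hrefs relative to the OPF file.
        manifest = {
            item.get("id"): item.get("href")
            for item in package.findall("opf:manifest/opf:item", OPF_NS)
        }
        base = posixpath.dirname(opf_path)
        # The spine references manifest ids in reading order.
        return [
            posixpath.join(base, manifest[ref.get("idref")])
            for ref in package.findall("opf:spine/opf:itemref", OPF_NS)
        ]
```

Each path returned can then be read from the archive and handed to the renderer in turn.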

Once the Epub support in Read reaches an acceptable state, the plan is to start working on support for the draft Open Publication Distribution System spec, which allows ebook distributors to distribute e-books via XML catalogues. It makes sense to support this in Read, as well as in the school server, to ease the e-book distribution process. For example, if we have a large e-book collection for a particular deployment, it may not make sense to put all of the books on individual laptops – instead, allowing the user to browse/search the catalogue and download books as and when required would probably be a better option.
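Since these catalogues are Atom feeds, a client can get quite far with a plain XML parser. Below is a rough sketch of listing the downloadable books in a catalogue. The `rel` value is the acquisition relation from the published OPDS specification, and the draft may have used something different; the function name `list_books` and the shape of the returned data are assumptions for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# Assumption: acquisition relation as in the published OPDS spec.
ACQUISITION_REL = "http://opds-spec.org/acquisition"


def list_books(feed_xml):
    """Return [{'title': ..., 'links': [(mime_type, href), ...]}, ...]."""
    feed = ET.fromstring(feed_xml)
    books = []
    for entry in feed.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        links = [
            (link.get("type"), link.get("href"))
            for link in entry.findall(ATOM + "link")
            if (link.get("rel") or "").startswith(ACQUISITION_REL)
        ]
        # Entries without acquisition links are navigation, not books.
        if links:
            books.append({"title": title, "links": links})
    return books
```

An activity could show the titles to the user and download only the chosen book, picking whichever link type (Epub, PDF, and so on) the reader supports.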