User:Alissa Sabre/Pango adaptation in SL viewer
This is a memo on adapting pango for the SL viewer, which I am working on as of Sep. 2008.
What is pango?
Pango is an open source library to draw (render) text in various languages/scripts. It helps programs handle the world's written languages.
Pango is used in various open source projects. Examples include GTK, whose components draw text through pango.
Pango is not a font (character) rendering system; in fact, it requires a font rendering backend, e.g. FreeType. A limitation of font rendering engines such as FreeType is that they primarily render on a character-by-character basis and take only minimal care of the placement of characters. They usually leave complex character layout issues to application programs. (You will see examples of simple vs complex character layout later in this memo.) Pango handles exactly those issues.
An example is the folding of a long paragraph into lines. FreeType gives application programs no hint about where a line should be broken. In the current SL viewer, this is handled by the viewer code itself. The problem is that the viewer simply looks for a space character. For English text written in the Latin alphabet, that algorithm works fine, but it may not work for other languages. Thai and Japanese, for example, rarely use space characters within sentences, and they have totally different folding rules. The pango implementation knows such writing-system-dependent rules and hides the details from application programs.
Another example is bi-di handling. As is well known, languages like Arabic and Hebrew are written right to left. A less well-known fact is that modern Arabic and Hebrew texts often contain foreign words or phrases, e.g., the words "Second Life" in English, and the English part is written left to right within a sentence written right to left. The process of finding the correct ordering of characters in this situation is called the bidirectional (or bi-di for short) algorithm. It is well defined and anybody can write the code, but it is somewhat complicated and not easy to get right. Pango contains an implementation of the bi-di algorithm and, again, hides the details from application programs.
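To make the folding problem concrete, here is a minimal sketch of a break-at-spaces algorithm. It is not the viewer's actual code: the function name fold_on_spaces is hypothetical, and line width is approximated by a character count instead of real glyph widths.

```python
def fold_on_spaces(text, max_chars):
    """Greedy line folding that may only break at ASCII spaces.

    Illustration only: width is approximated by the number of
    characters, not by measured glyph widths.
    """
    lines = []
    current = ""
    for word in text.split(" "):
        candidate = word if not current else current + " " + word
        # A word that is too long on its own still has to go somewhere,
        # so it is accepted when the current line is empty.
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


english = fold_on_spaces("Welcome to Second Life", 10)
japanese = fold_on_spaces("セカンドライフへようこそ", 10)
```

The English sentence folds into lines that fit the limit, while the Japanese sentence contains no space at all and comes back as a single line longer than the limit. A layout library that knows the folding rules of each script avoids this.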
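The reordering described above can be sketched in a few lines, under heavy simplifying assumptions. This is a toy, not the Unicode bidirectional algorithm and not pango's implementation: it assumes a right-to-left paragraph, a single embedding level, and treats digits and punctuation as neutral. All function names are hypothetical; only unicodedata.bidirectional comes from the Python standard library.

```python
import unicodedata
from itertools import groupby


def strong_direction(ch):
    """Return 'R' or 'L' for strongly directional characters, else None."""
    cls = unicodedata.bidirectional(ch)
    if cls in ("R", "AL"):
        return "R"
    if cls == "L":
        return "L"
    return None


def resolve_directions(text):
    """Give every character a direction (toy rule for neutrals).

    A neutral character (e.g. a space) becomes 'L' only when the nearest
    strong characters on both sides are 'L'; otherwise it takes the
    paragraph direction, which is assumed to be 'R'.
    """
    strong = [strong_direction(ch) for ch in text]
    resolved = []
    for i, d in enumerate(strong):
        if d is None:
            before = next((s for s in reversed(strong[:i]) if s), "R")
            after = next((s for s in strong[i + 1:] if s), "R")
            d = "L" if before == after == "L" else "R"
        resolved.append(d)
    return resolved


def visual_order_rtl(text):
    """Convert logical (typing) order to left-to-right screen order."""
    dirs = resolve_directions(text)
    runs = []
    for direction, group in groupby(zip(dirs, text), key=lambda p: p[0]):
        run = "".join(ch for _, ch in group)
        # Right-to-left runs are mirrored; left-to-right runs are kept.
        runs.append(run[::-1] if direction == "R" else run)
    # In a right-to-left paragraph the runs themselves go right to left.
    return "".join(reversed(runs))


# Three Hebrew letters, the English words "Second Life", three more letters.
logical = "\u05d0\u05d1\u05d2 Second Life \u05d3\u05d4\u05d5"
visual = visual_order_rtl(logical)
```

In the result, the Hebrew runs are mirrored and swap places, while "Second Life" keeps its left-to-right order in the middle. The real algorithm additionally deals with numbers, nested embeddings, mirrored brackets and more, which is why it is better left to a library.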
Basic idea of adaptation
The Linden SL viewer currently uses FreeType to render a character (that is, to convert outline data in a TrueType font file into a bitmap image), and lays the characters out on the screen using its own simple, English-oriented algorithm. The SL viewer's text handling is based on Unicode, and almost all characters required to write the world's languages should be available (assuming you have the necessary font files). However, due to the SL viewer's text layout algorithm, some non-English text is shown in strange ways. This makes the use of those languages in SL uncomfortable in some cases and impossible in others.
Replacing SL viewer's own text layout algorithms with pango improves the viewer's text layout features, allowing support for more languages.
Examples of the problems and possible improvements by pango
This section discusses some examples of problematic text behaviour in the current SL viewer and how they will be improved by using pango.
Various accents (aka combining marks)
Line folding
Right-to-left scripts
Cursiveness and dynamic ligatures
Implementation plans
At this moment I plan to adapt pango for the SL viewer in three phases.
Phase 1: Rewriting of LLFont/LLFontGL
LLFont, LLFontGL, and their support classes provide the basic text drawing functions for other parts of the SL viewer. The SL viewer's own text layout algorithm is implemented here. In phase 1, we will rewrite LLFont and LLFontGL using pango, keeping the external interfaces of the classes compatible with the current implementation. The SL viewer sources other than llrender/{llfont,llfontgl}.{h,cpp} will remain unmodified.
Many of the display problems occurring with chat logs or (read-only) note cards should be solved during phase 1. However, the features improved in this phase are limited. In particular, it doesn't improve input-related problems. For example, text typed from the keyboard in Hebrew or Hindi will look very strange in this phase. Text selection (by dragging the mouse over the text) may not work well.
Although performance is not a goal of phase 1, we may implement a simple glyph caching mechanism during phase 1 if pango makes the viewer too slow.
Phase 2: Rewriting of input handling
The current LLFont/LLFontGL has very little to do with input. In phase 2, we will add some new member functions (methods) to LLFontGL to expose some of the input-related pango features, and rewrite some of the UI components to use the new functions, so that typing in some complex scripts and text selection work properly.
We may also need to change parts of the LLTextEditor's internal data structures to facilitate bi-di texts.
Public interfaces of the UI components will remain unchanged as far as possible, though some small changes to existing interfaces may be necessary.
The UI components to be modified include LLTextEditor and LLLineEditor. LLViewerTextEditor may also be modified, since it is tightly coupled with the internals of LLTextEditor.
Changes to other parts of the viewer source will be minimal.
Phase 3: Rewriting of LLUI text components
During phase 1 and phase 2, we will keep as many interfaces between classes unchanged as possible. This is to complete the modification in a short period of time and to keep the viewer stable. The resulting code will be redundant and inefficient. In phase 3, we will add new interfaces to LLFont/LLFontGL and rewrite many UI components to improve performance. This process can be done slowly, part by part. Because the new LLFont/LLFontGL during phase 3 provides both the older interfaces and the new interfaces, old and new UI components can be mixed in the viewer code base.
We will add some high-level glyph caching mechanism in phase 3.
Other considerations
Pango compatibility with SL viewer
Pango supports standard platforms including Windows, MacOS X, and Linux. Pango itself is written in standard C.
Pango uses Unicode (UTF-8) for text encoding.
Pango supports several font rendering backends. One of the supported backends is FreeType 2, which the current SL viewer uses as its font rendering engine.
Pango uses the GTK (glib/gobject) type system. It requires some glue code to be integrated into the SL viewer code base, as GTK currently does (in the LLWindowSDL implementation).
License and development
Pango is distributed under the LGPL. On standard platforms including Windows, MacOS X, and Linux, pango is usually compiled into a set of dynamic link libraries. There should be no license problem in using pango with the SL viewer.
Pango is developed and maintained by a group that is part of the GTK community. The group seems very strong. The API seems stable; it has had no incompatible changes for years, although new APIs have been added.
Possible issues and risks
The pango program is somewhat large. Its working memory is also large. For a user who only requires English with the 95 printable ASCII characters, and who doesn't care about other languages or extra characters, the SL viewer's use of pango may be considered a waste of computing resources.
Pango arranges characters in slightly different ways than the current SL viewer does. A pango-based SL viewer may break some resident-created content even if it uses English/ASCII only. For example, a text label put in a very tight space on a script-created panel may appear without its last character, due to the increased width of the label's glyph images. Or, if the new text occupies a narrower space than today, the label may show at its end an extra character that was added by a mistake in the script and has been hidden for a long time.
Pango is slower than the current viewer's simple layout algorithm. We need a more complicated glyph caching mechanism than today, because we can't cache glyphs on a per-character basis (the glyph for a character depends on its context). This may significantly lower the viewer's frame rate. Alternatively, we may need more texture buffer on the graphics hardware than today, which may raise the viewer's minimum hardware requirements.
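One way to picture the caching problem is a cache keyed by the glyph that comes out of the layout step, not by the input character. The sketch below is only an illustration under stated assumptions: toy_shape is a hypothetical stand-in for a real shaping engine (it merges "f" followed by "i" into one ligature glyph), and GlyphCache and draw are hypothetical names, not part of pango or of LLFontGL.

```python
def toy_shape(text):
    """Hypothetical shaper: map characters to glyph names.

    The only context-dependent rule is that 'f' followed by 'i'
    becomes a single ligature glyph named 'f_i'.
    """
    glyphs = []
    i = 0
    while i < len(text):
        if text[i:i + 2] == "fi":
            glyphs.append("f_i")
            i += 2
        else:
            glyphs.append(text[i])
            i += 1
    return glyphs


class GlyphCache:
    """Cache rasterized glyphs by (font, glyph), not by character."""

    def __init__(self, rasterize):
        self._rasterize = rasterize  # callable(font, glyph) -> bitmap
        self._bitmaps = {}
        self.misses = 0

    def get(self, font, glyph):
        key = (font, glyph)
        if key not in self._bitmaps:
            self.misses += 1
            self._bitmaps[key] = self._rasterize(font, glyph)
        return self._bitmaps[key]


def draw(text, font, cache):
    """Shape the text first, then fetch one bitmap per resulting glyph."""
    return [cache.get(font, g) for g in toy_shape(text)]


# A fake rasterizer that returns a string in place of a bitmap.
cache = GlyphCache(lambda font, glyph: "<%s:%s>" % (font, glyph))
first = draw("fin", "sans", cache)   # glyphs: f_i, n
second = draw("fan", "sans", cache)  # glyphs: f, a, n
```

Here the character "f" ends up as two different cache entries depending on what follows it, so a per-character cache would return the wrong bitmap for one of the two words. The cost is that shaping must run before every cache lookup, which is part of the slowdown discussed above.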