Saturday, October 23, 2010

Typing mathematical characters in X

Unicode provides many useful characters for mathematics. If you've studied the traditional notation, an expression like « Γ ⊢ Λt.λ(x:t).x : ∀t. t → t » is much more readable than an ASCII equivalent. However, most systems don't provide an easy way to enter these characters.

The Compose key feature of the X Window System provides a nice solution on Linux and other UNIX systems. Compose sequences are easy-to-remember mnemonics, like -> for →, and an enormous number of characters are available with just a few keystrokes.

Setting it up

I cooked up a config file with my most-used mathematical symbols. With recent Xorg, you can drop this file in ~/.XCompose, restart X, and you should be good to go.

The include line will pull in your system-wide configuration, e.g. /usr/share/X11/locale/en_US.UTF-8/Compose. This already contains many useful characters. I was going to add <3 for ♥ and CCCP for ☭, but I found that Debian already provides these.

GTK has its own input handling. To make it defer to X, I had to add an environment variable in ~/.xsession:

export GTK_IM_MODULE="xim"

The "Fn" key on recent ThinkPads makes a good compose key. It normally acts as a modifier key in hardware, but will send a keycode to X when pressed and released by itself. I used this xmodmap setting:

keycode 151=Multi_key

Tweaking the codes

Being boilerplate-averse, I specified the key combinations in a compact format which is processed by this Haskell script.

Obviously, not everyone will like my choice of key combinations. If you tweak the file and come up with something particularly nice, I'd like to see it. If you can't run Haskell code for whatever reason, it's not too hard to edit the generated XCompose file.

My use of Haskell here may seem gratuitous, but I actually started writing this script in Python and ran into trouble with Python 2's inconsistent treatment of Unicode text. Haskell's String type with GHC ≥ 6.12 will Just Work, at least until you care about performance.
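For a feel of what such a generator does, here is a minimal sketch. It is not the script linked above. The compact input format (key sequence, a space, then the result character), the names `keysym`, `parseEntry` and `composeLine`, and the small keysym table are all assumptions made for illustration. Only the output format, the one X reads from ~/.XCompose, is standard.

```haskell
module Main where

import Data.Char (isAlphaNum, isAscii, ord, toUpper)
import Data.Maybe (mapMaybe)
import Numeric (showHex)

-- Map a typed ASCII character to its X keysym name. Letters and digits
-- are their own keysym names; only a few punctuation keys are listed
-- here (illustrative subset, extend as needed).
keysym :: Char -> Maybe String
keysym c
  | isAscii c && isAlphaNum c = Just [c]
  | otherwise                 = lookup c punctuation
  where
    punctuation =
      [ ('-', "minus"), ('>', "greater"), ('<', "less"), ('=', "equal")
      , ('|', "bar"), ('/', "slash"), ('\\', "backslash"), ('.', "period")
      , ('+', "plus"), ('*', "asterisk"), (':', "colon") ]

-- Parse one line of the hypothetical compact format: "SEQ CHAR".
parseEntry :: String -> Maybe (String, Char)
parseEntry line = case words line of
  [keys, [c]] -> Just (keys, c)
  _           -> Nothing

-- Render one entry as an XCompose line, e.g.
--   <Multi_key> <minus> <greater> : "→" U2192
-- Returns Nothing if a key in the sequence has no known keysym name.
composeLine :: (String, Char) -> Maybe String
composeLine (keys, c) = do
  syms <- mapM keysym keys
  let events = unwords [ "<" ++ s ++ ">" | s <- "Multi_key" : syms ]
  return (events ++ " : \"" ++ [c] ++ "\" U" ++ codePoint)
  where
    hex       = map toUpper (showHex (ord c) "")
    codePoint = replicate (4 - length hex) '0' ++ hex

main :: IO ()
main = do
  -- Made-up sample entries; only "->" for → is taken from the post.
  let spec = ["-> →", "=> ⇒", "|- ⊢", "\\/ ∨"]
  putStrLn "include \"%L\""  -- pull in the locale's system-wide table
  mapM_ putStrLn (mapMaybe (\l -> parseEntry l >>= composeLine) spec)
```

Saved under some name of your choosing, say GenCompose.hs, this can be run with `runghc GenCompose.hs` and the output inspected before copying it into ~/.XCompose. Lines that fail to parse, or that use a key missing from the table, are silently skipped. A real script would probably want to report them instead.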

Alternatives

If you don't like this solution, SCIM provides an input mode which uses LaTeX codes.

3 comments:

  1. When I want to write Unicode characters in text files, I'm very fond of the Emacs TeX input method. It can be activated with `M-x set-input-method` or `C-x RET C-\`, and then by selecting `TeX`.

    It lets you write Unicode characters using their TeX names: for example, typing `\lambda` in an Emacs buffer with the TeX input method will produce the `λ` character. Not all characters are accessible this way (any Unicode character can be inserted by its Unicode name with `M-x ucs-insert`), but the coherent TeX naming is very nice.

    I have used it to write simple text files about type systems that have near-LaTeX readability.

  2. Another satisfied Compose-C-C-C-P customer. :D

    Great post. I've been meaning to write something like this up. I've also been meaning to look into better synchronizing Gtk+ and X's ideas of the compose table, but it's hairy and fraught with backwards-compatibility woes. I think it would also be neat to have a graphical compose sequence viewer, maybe integrated into the Gnome character map. (So, you hunt for a symbol, find it, and there's a hint saying “to type this character, press …” or an “add a shortcut for this character…” as appropriate.)

  3. I suppose this isn't very useful to non-CJK speakers, but... on Windows, you can achieve the same thing using the IMEs (e.g. the Japanese IME). FYI.
