Typing in Linux (ibus)

If your keyboard is labeled with Latin or Cyrillic letters and you have to type in another language, especially one with a complex, non-alphabetic script, then this note on Linux input systems (put simply, "keyboard layouts") may interest you.

I apologize in advance for the loose terminology; this note does not claim to be an exhaustive technical description. Its main objective is to describe what is possible, not how it is implemented.

Input methods

The main input method (IM) in Linux is XKB. It is installed by default and is active immediately after the operating system is installed. XKB is designed for alphabetic scripts and cannot serve complex scripts such as Chinese characters or the syllabaries of India and Africa. The system can be configured with no more than 4 layouts. This last limitation can be circumvented by binding hot keys to a command call with the required combination of parameters for each language.
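The hot key workaround could look roughly like the sketch below. The function name layout_cmd and the tag-to-layout table are hypothetical and only illustrate the idea; the function prints the setxkbmap call instead of running it, so the command can be inspected before it is bound to a key.

```shell
#!/bin/sh
# Print (not run) the setxkbmap call for a language tag.
# The tags and the layouts behind them are an assumption for illustration;
# replace them with the layouts you actually need.
layout_cmd() {
    case "$1" in
        en) echo "setxkbmap -layout us" ;;
        ru) echo "setxkbmap -layout ru -variant winkeys" ;;
        es) echo "setxkbmap -layout es" ;;
        *)  echo "unknown layout tag: $1" >&2; return 1 ;;
    esac
}
```

A script bound to a hot key could then execute the printed command, for example with `eval "$(layout_cmd ru)"`. Since each key loads its own configuration, the total number of languages is no longer tied to the 4-layout limit.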

If you need more flexibility, you should turn to an input method framework. The main representatives of such systems in Linux are IBus, SCIM, and Fcitx. The framework itself does not know how to enter text; individual scripts are connected in the form of plug-ins (engines). From my experience with IBus and Fcitx, both systems support approximately the same number of plugins, and often these are practically the same plugins. For example, the Chinese Pinyin input method is implemented as a standalone library, libpinyin, and provides identical capabilities whether it is connected via IBus or Fcitx.

We can assume that over the past 6-7 years the differences between the frameworks have evened out, although each may still have peculiarities of its own. Below I list the main IBus plugins, since IBus is the system I know best.

Firstly, IBus can transparently use xkb and all its features. The only problem is that IBus cannot generate XKB configurations dynamically. The most popular ones are pre-registered in the file /usr/share/ibus/component/simple.xml, which can be changed and supplemented as necessary. (When IBus is updated, the file will be replaced with the standard one.)

For example, the Russian layout is described as follows:

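<engine>
    <name>xkb:ru::rus</name>
    <language>ru</language>
    <license>GPL</license>
    <author>Peng Huang &lt;shawn.p.huang@gmail.com&gt;</author>
    <layout>ru</layout>
    <longname>Russian</longname>
    <description>Russian</description>
    <icon>ibus-keyboard</icon>
    <rank>99</rank>
</engine>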

In addition to layout, you can specify layout_variant. Other setxkbmap parameters are not available, including the well-known typographic layout by Ilya Birman, which is enabled in xkb through the option misc:typo. To get around this restriction, or simply to create a layout for your own tasks, you need to describe the layout in full. To do this, create a file named custom in the folder /usr/share/X11/xkb/symbols (if you supplement the existing files instead, they will be overwritten when the system is updated) and define the layout there. For example, Russian with Ilya Birman's additions:

partial alphanumeric_keys
xkb_symbols "ru-typo" {
    include "ru(winkeys)"
    include "typo(base)"
    include "level3(ralt_switch)"
    // 1st keyboard row
    key <TLDE> { [ NoSymbol, NoSymbol, U0301, NoSymbol ] };  // "~"
};

The include lines assemble the configuration from ready-made templates. Accordingly, the "winkeys" variant of the Russian layout is taken from the file "ru". It is then supplemented with the "base" layout from the file "typo", and AltGr is set as the third-level switch (see the file "level3"). This is similar to the command:

setxkbmap -layout ru -variant winkeys -option lv3:ralt_switch,misc:typo

If desired, you can make your own changes. In the example above, the accent mark U+0301 (Combining Acute Accent) is assigned to AltGr + ~. The positions marked NoSymbol use the definitions from the preceding templates: "ё" and "Ё" from "winkeys", "≈" from "typo":

key <TLDE> { [ Cyrillic_io, Cyrillic_IO, NoSymbol, NoSymbol ] };  // winkeys
key <TLDE> { [    NoSymbol,    NoSymbol, NoSymbol, approxeq ] };  // typo
key <TLDE> { [    NoSymbol,    NoSymbol,    U0301, NoSymbol ] };  // custom

Next, the created layout must be added to the file /usr/share/ibus/component/simple.xml in the following form:

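<engine>
    <name>xkb:ru:typo:rus</name>
    <language>ru</language>
    <layout>custom,us</layout>
    <layout_variant>ru-typo,</layout_variant>
    <longname>Russian (with Typo)</longname>
    <description>Russian (with Typo)</description>
    <icon>ibus-keyboard</icon>
    <rank>1</rank>
</engine>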

Here custom is the name of the file in the folder /usr/share/X11/xkb/symbols, and ru-typo is the layout defined in it. The additional layout us is specified so that hot keys (Ctrl + C, Ctrl + V, etc.) work correctly. After restarting IBus (ibus restart), the new layout "Russian (with Typo)" will appear in the settings.

The second input method is m17n. This is a fairly rich library of keyboard layouts for a variety of scripts. IBus has its own similar input method, ibus-table, which is described as having "slightly lesser capabilities." I have only used the latter, to create a layout with a one-to-one correspondence between Latin letters and the letters of the required alphabet, without any complicated logic, so I cannot judge which of the two layout description formats, m17n or ibus-table, is more functional and expressive. ibus-table includes a curious "LaTeX" layout for entering characters in the corresponding notation: "\Delta" for "Δ", "\ge" for "≥", etc.

The next universal input method is KMFL. This is the Linux version of the Keyman input method for Windows. It is not a very common IM, but it supports the rarest scripts. Unlike the original Keyman, with its stated ability to type in more than 1000 scripts, KMFL is not as well developed, but it can still be useful. The layout description format is plain text, and there is a program for creating layouts under MS Windows. I use the EuroLatin layout, in which the text "2//3" is converted to the fraction "⅔" and the sequence "-a" is converted to "ā" with a macron. This resembles the Compose key in xkb, but does not require a separate modifier: KMFL itself recognizes the sequence as you type.

Other input methods specialize in individual scripts: "ibus-libpinyin" for Chinese, "ibus-unikey" for Vietnamese, etc. The settings for these plugins are also located in /usr/share/ibus/component/. In the corresponding files, you may need to set the base keyboard layout; otherwise these methods will not work when you switch to them from a non-Latin layout. For example, in libpinyin.xml you need to find the "layout" parameter and enter "us" for a QWERTY keyboard, "fr" for AZERTY, etc.


Switch layouts

Most of the time I work with language pairs: Russian-English, Chinese-Spanish, etc. Therefore, I prefer to have one hot key (CapsLock) that switches between the last two layouts, while the layouts themselves are selected by separate hot keys (Win + 1 ... 9 on the numeric keypad). Thus, I first select the working layouts with Win + 1 (en) and Win + 2 (ru), and then switch between them with CapsLock (en <-> ru).

In IBus, you can specify two hot keys: one for cyclic switching over the list of layouts, the second for the last two layouts. You can also select the desired layout through the console and, accordingly, assign a script to a hot key.
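Selecting a layout from the console can be sketched as follows. The ibus engine command prints the active engine when called without arguments and activates an engine when given its name. The helper other_engine is hypothetical, and the engine name xkb:us::eng is an assumption about what is registered on your system (ibus list-engine shows the available names).

```shell
#!/bin/sh
# Decide which engine of a pair to activate next.
# $1: currently active engine, $2 and $3: the two engines of the pair.
# If the first engine of the pair is active, pick the second;
# in every other case fall back to the first.
other_engine() {
    if [ "$1" = "$2" ]; then
        echo "$3"
    else
        echo "$2"
    fi
}

# Usage on a machine with a running IBus daemon (not executed here):
#   ibus engine "$(other_engine "$(ibus engine)" xkb:us::eng xkb:ru::rus)"
```

Bound to a hot key, the commented line toggles between the two engines of a language pair; a fixed call such as `ibus engine xkb:ru::rus` selects one layout directly.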

Note that CapsLock cannot be reassigned with xmodmap, since IBus resets such settings. Therefore, I prefer to redefine CapsLock globally as F14 with udev (file /etc/udev/hwdb.d/90-custom-keyboard.hwdb):

evdev:input:b0003v1A2Cp0E24* # my keyboard id
 KEYBOARD_KEY_70039=f14 # bind capslock to f14

I then use F14 as the hot key in IBus. In my experience, this provides the most stable configuration.

For more information on configuring udev, see the end of the article.

Virtual keyboard

Keyboards labeled for a particular writing system are mass-produced only for languages with a large number of users, for example, for Russian (ЙЦУКЕН). Neither in Armenia nor in Georgia can you buy a keyboard whose keys are labeled with the letters of the national alphabet. Similarly, people in Kazakhstan and Uzbekistan use Russian-English keyboards and have to memorize the positions of letters that are absent from the standard Latin or Cyrillic alphabet.

If you are learning a new layout, I advise you to use a virtual keyboard. I like Onboard because it adapts to the active layout and updates itself when you switch to another one. However, this works only with xkb (including xkb used via IBus).

Onboard is very convenient for testing xkb layouts and allows you to see the assigned symbols on all layers (AltGr, etc.).


Not all programs support input method frameworks correctly. In particular, Sublime Text 3 works only with SCIM; with IBus it types exclusively Latin letters, regardless of the selected layout.

I have been using IBus for quite some time and know the other systems only superficially. According to reviews on the Internet, Fcitx is more functional and better adapted for entering Chinese text. In any case, IBus suits me completely when working with Chinese texts, and the differences should not be fundamental. The last time I had to use Fcitx (2 years ago), it did not allow switching layouts unless the cursor was in a text field. I hope this bug has been fixed by now.

Another aid for working with a variety of scripts is silicone keyboard overlays. Chinese online marketplaces offer overlays (保护膜 or 键盘膜) for the Apple Magic Keyboard for a wide variety of scripts, and there are non-Chinese distributors as well. Keep in mind, however, that three generations of the Apple Magic Keyboard have been produced (each in versions for the USA, Europe, and Japan), and that Chinese replicas differ in linear dimensions and key arrangement. At times, I regret that there is no single standard for computer keyboards.

Quick reference for converting a keypress signal

The numeric code of a pressed key changes its value several times on the way to the application.

  1. scancode: When a key is pressed, the keyboard (or its driver?) sends a scancode to the Linux kernel.
  2. keycode: In the kernel, the scancode is converted to a keycode (the Linux input subsystem). This translation can be controlled with udev, keyfuzz, or setkeycodes.
  3. keysym: The X Window System receives the keycode from the kernel and translates it into a keysym, the final symbol that the client program receives as input. This conversion is configured via XKB or xmodmap (deprecated).

As this sequence shows, remapping a key at the scancode > keycode step is preferable because it does not interfere with XKB.

Udev setup instructions

The scancode-to-keycode translation is performed for each input device independently, so you first need to find out the unique identifier of the keyboard (evdev actually works with a large class of peripheral devices that have buttons, from mice to printers and webcams). Arch Linux users can use the following script (for other distributions, you may need to adjust the paths):

for DEVICE in /dev/input/by-id/*; do
    basename "$DEVICE"
    # entries in /dev/input/by-id are symlinks to ../eventNN
    ID=/sys/class/input/$(basename "$(readlink "$DEVICE")")/device/id
    # sysfs reports lowercase hex, hwdb patterns expect uppercase digits
    printf "evdev:input:b%sv%sp%se%s*\n\n" \
        "$(tr a-f A-F < "$ID/bustype")" \
        "$(tr a-f A-F < "$ID/vendor")" \
        "$(tr a-f A-F < "$ID/product")" \
        "$(tr a-f A-F < "$ID/version")"
done

The same device can be represented in the system in several instances under different names, but the identifier will be the same. For example, my keyboard is defined as two devices:


Note: the identifier can be shortened (for example, to b0003v1A2Cp0E24*), which is useful when creating common rules for a series of similar models. The asterisk "*" acts as a wildcard here.

Now create the file 90-custom-keyboard.hwdb in /etc/udev/hwdb.d/ with the following content (for samples, see /usr/lib/udev/hwdb.d/60-keyboard.hwdb):

evdev:input:b0003v5C0Ap0003e0110* # your identifier
 KEYBOARD_KEY_70039=f14           # key remapping

The KEYBOARD_KEY line begins with a space; this is important. Update the configuration:

sudo udevadm hwdb --update && sudo udevadm trigger

From then on, the configuration will be applied automatically on reboot or when the device is reconnected.

Key remappings are specified as pairs KEYBOARD_KEY_<scancode>=<keycode>. The keycode values (they must be written in lowercase) can be found in /usr/include/linux/input-event-codes.h (for Ubuntu 14.04, in /usr/include/linux/input.h).
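A KEYBOARD_KEY line can be assembled from a scancode and a key name taken from the header file, as in the following sketch. The helper hwdb_line is hypothetical; it assumes that the header lists the keys as KEY_* constants and that the hwdb entry uses the same name in lowercase without the KEY_ prefix.

```shell
#!/bin/sh
# Build one hwdb property line from a scancode and a KEY_* constant name.
# $1: scancode as reported by evtest (for example 70039)
# $2: constant name from input-event-codes.h (for example KEY_F14)
hwdb_line() {
    # strip the KEY_ prefix and convert the rest to lowercase
    name=$(printf '%s' "${2#KEY_}" | tr 'A-Z' 'a-z')
    # the leading space is required by the hwdb file format
    printf ' KEYBOARD_KEY_%s=%s\n' "$1" "$name"
}
```

For instance, `hwdb_line 70039 KEY_F14` prints the property line used in this article, including the leading space.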

You can obtain the scancode with the evtest program. First, determine the eventXX number: run the following command and find your keyboard in the output:

> ls -l /dev/input/by-id/
usb-SEM_USB_Keyboard-event-if01 -> ../event11
usb-SEM_USB_Keyboard-event-kbd -> ../event10

Select "Keyboard-event-kbd" and note its event number (in this example, 10). Now you can run evtest:

> sudo evtest /dev/input/event10
Event: time 1531562530.720076, type 4 (EV_MSC), code 4 (MSC_SCAN), value 70039

When you press the "CapsLock" key, the value "70039" is reported; this is the desired scancode.
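If many keys need to be checked, the scancodes can be filtered out of the evtest output. The sketch below assumes that the lines of interest have the same shape as the MSC_SCAN line shown above; the function name scan_value is hypothetical.

```shell
#!/bin/sh
# Read evtest output on stdin and print only the scancode values
# found on MSC_SCAN lines; all other lines are dropped.
scan_value() {
    sed -n 's/.*(MSC_SCAN), value \([0-9a-fA-F][0-9a-fA-F]*\).*/\1/p'
}

# Usage with a real device (not executed here):
#   sudo evtest /dev/input/event10 | scan_value
```

Each key press then produces a single line containing just the scancode, ready to be used in a KEYBOARD_KEY entry.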
