Comprehensive keyboard handling in terminals
There are various problems with the current state of keyboard handling in terminals. They include:
- No way to use modifiers other than ctrl and alt
- No way to reliably use multiple modifier keys, other than shift+alt and ctrl+alt.
- Many of the existing escape codes used to encode these events are ambiguous, with different key presses mapping to the same escape code.
- No way to handle different types of keyboard events, such as press, release or repeat
- No reliable way to distinguish single Esc key presses from the start of an escape sequence. Currently, client programs use fragile timing-related hacks for this, leading to bugs, for example: neovim #2035.
To solve these issues and others, kitty has created a new keyboard protocol that is backward compatible but allows applications to opt in to more advanced usages. The protocol is based on initial work in fixterms; however, it corrects various issues in that proposal, listed at the bottom of this document. For public discussion of this spec, see #3248.
You can see this protocol with all enhancements in action by running:
kitty +kitten show_key -m kitty
inside the kitty terminal to report key events.
In addition to kitty, this protocol is also implemented in:
Quickstart
If you are an application or library developer just interested in using this protocol to make keyboard handling simpler and more robust in your application, without too many changes, do the following:
1. Emit the escape code CSI > 1 u at application startup if using the main screen, or when entering alternate screen mode if using the alternate screen.
2. All key events will now be sent in only a few forms to your application, that are easy to parse unambiguously.
3. Emit the escape sequence CSI < u at application exit if using the main screen, or just before leaving alternate screen mode if using the alternate screen, to restore whatever the keyboard mode was before step 1.
Key events will all be delivered to your application either as plain UTF-8 text, or using the following escape codes, for those keys that do not produce text (CSI is the bytes 0x1b 0x5b):
CSI number ; modifiers [u~]
CSI 1; modifiers [ABCDEFHPQS]
0x0d - for the Enter key
0x7f or 0x08 - for Backspace
0x09 - for Tab
The number in the first form above will be either the Unicode codepoint for a key, such as 97 for the a key, or one of the numbers from the Functional key definitions table below. The optional modifiers parameter encodes any modifiers pressed for the key event. The encoding is described in the Modifiers section.
The second form is used for a few functional keys, such as the Home, End, Arrow keys and F1 … F4; they are enumerated in the Functional key definitions table below. Note that if no modifiers are present, the parameters are omitted entirely, giving an escape code of the form CSI [ABCDEFHPQS].
If you want support for more advanced features such as repeat and release events, alternate keys for shortcut matching et cetera, these can be turned on using Progressive enhancement as documented in the rest of this specification.
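The Quickstart steps can be wrapped in a small helper. What follows is a minimal Python sketch, not part of the specification: the name kitty_keyboard and its stream and flags parameters are hypothetical, and it assumes the output stream is connected to a terminal that implements this protocol.

```python
import sys
from contextlib import contextmanager

CSI = "\x1b["  # the bytes 0x1b 0x5b


@contextmanager
def kitty_keyboard(stream=sys.stdout, flags=1):
    """Push `flags` on entry (CSI > flags u) and pop on exit (CSI < u).

    Hypothetical helper: call it at startup for the main screen, or
    right after entering the alternate screen.
    """
    stream.write(f"{CSI}>{flags}u")
    stream.flush()
    try:
        yield
    finally:
        # Restore whatever keyboard mode was active before the push,
        # even if the application body raised an exception.
        stream.write(f"{CSI}<u")
        stream.flush()
```

Using a context manager means the pop is still emitted when the application exits through an exception, so the shell is less likely to be left in the enhanced mode.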
An overview
Key events are divided into two types, those that produce text and those that do not. When a key event produces text, the text is sent directly as UTF-8 encoded bytes. This is safe as UTF-8 contains no C0 control codes. When the key event does not have text, the key event is encoded as an escape code. In legacy compatibility mode (the default) this uses legacy escape codes, so old terminal applications continue to work. Key events that could not be represented in legacy mode are encoded using a CSI u escape code, which most terminal programs should just ignore. For more advanced features, such as release/repeat reporting etc., applications can tell the terminal they want this information by sending an escape code to progressively enhance the data reported for key events.
The central escape code used to encode key events is:
CSI unicode-key-code:alternate-key-codes ; modifiers:event-type ; text-as-codepoints u
Spaces in the above definition are present for clarity and should be ignored. CSI is the bytes 0x1b 0x5b. All parameters are decimal numbers. Fields are separated by the semi-colon and sub-fields by the colon. Only the unicode-key-code field is mandatory, everything else is optional. The escape code is terminated by the u character (the byte 0x75).
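The field layout above can be parsed with a few string splits. The following Python sketch is illustrative only: KeyEvent, parse_csi_u and the helper names are hypothetical, the modifiers value is kept exactly as it appears on the wire (the Modifiers section explains how to decode it), and error handling is reduced to a ValueError.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KeyEvent:
    key: int                               # unicode-key-code (mandatory)
    shifted_key: Optional[int] = None      # first alternate key, if any
    base_layout_key: Optional[int] = None  # second alternate key, if any
    modifiers: int = 1                     # raw field value; 1 means "none"
    event_type: int = 1                    # 1 press, 2 repeat, 3 release
    text: str = ""                         # text-as-codepoints, decoded


def _subfields(fields: List[str], index: int) -> List[str]:
    """Split one ';' field on ':'; a missing field has no sub-fields."""
    return fields[index].split(":") if index < len(fields) else []


def _number(parts: List[str], index: int, default: Optional[int]) -> Optional[int]:
    """Return parts[index] as an int, or default if absent or empty."""
    return int(parts[index]) if index < len(parts) and parts[index] else default


def parse_csi_u(sequence: str) -> KeyEvent:
    if not (sequence.startswith("\x1b[") and sequence.endswith("u")):
        raise ValueError("not a CSI ... u sequence")
    # Strip the CSI prefix and the final 'u'; spaces are not significant.
    fields = sequence[2:-1].replace(" ", "").split(";")
    keys, mods, text = (_subfields(fields, i) for i in range(3))
    if not keys or not keys[0]:
        raise ValueError("unicode-key-code is mandatory")
    return KeyEvent(
        key=int(keys[0]),
        shifted_key=_number(keys, 1, None),
        base_layout_key=_number(keys, 2, None),
        modifiers=_number(mods, 0, 1),
        event_type=_number(mods, 1, 1),
        text="".join(chr(int(cp)) for cp in text if cp),
    )
```

For example, parse_csi_u("\x1b[97;2;65u") yields key 97, a raw modifiers value of 2 and the text "A".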
Key codes
The unicode-key-code above is the Unicode codepoint representing the key, as a decimal number. For example, the A key is represented as 97, which is the Unicode codepoint for lowercase a. Note that the codepoint used is always the lower-case (or more technically, un-shifted) version of the key. If the user presses, for example, ctrl+shift+a the escape code would be CSI 97;modifiers u. It must not be CSI 65; modifiers u.
If alternate key reporting is requested by the program running in the terminal, the terminal can send two additional Unicode codepoints, the shifted key and base layout key, separated by colons. The shifted key is simply the upper-case version of unicode-key-code, or more technically, the shifted version. So a becomes A and so on, based on the current keyboard layout. This is needed to be able to match against a shortcut such as ctrl+plus, which depending on the type of keyboard could be either ctrl+shift+equal or ctrl+plus. Note that the shifted key must be present only if shift is also present in the modifiers.
The base layout key is the key corresponding to the physical key in the standard PC-101 key layout. So for example, if the user is using a Cyrillic keyboard with a Cyrillic keyboard layout, pressing the ctrl+С key will be ctrl+c in the standard layout. So the terminal should send the base layout key as 99, corresponding to the c key.
If only one alternate key is present, it is the shifted key. If the terminal wants to send only a base layout key but no shifted key, it must use an empty sub-field for the shifted key, like this:
CSI unicode-key-code::base-layout-key
Modifiers
This protocol supports eight modifiers: shift, alt, ctrl, super, hyper, meta, num_lock and caps_lock. Here super is either the Windows/Linux key or the command key on mac keyboards. The alt key is the option key on mac keyboards. hyper and meta are typically present only on X11/Wayland based systems with special XKB rules. Modifiers are encoded as a bit field with:
shift     0b1        (1)
alt       0b10       (2)
ctrl      0b100      (4)
super     0b1000     (8)
hyper     0b10000    (16)
meta      0b100000   (32)
caps_lock 0b1000000  (64)
num_lock  0b10000000 (128)
In the escape code, the modifier value is encoded as a decimal number which is 1 + actual modifiers. So to represent shift only, the value would be 1 + 1 = 2, to represent ctrl+shift the value would be 1 + 0b101 = 6 and so on. If the modifier field is not present in the escape code, its default value is 1, which means no modifiers.
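The "1 + bit field" rule is easy to get off by one, so a small worked example may help. This Python sketch uses the bit values listed above; the names MODIFIER_BITS, encode_modifiers and decode_modifiers are hypothetical.

```python
# Bit values as listed in the Modifiers section.
MODIFIER_BITS = {
    "shift": 0b1,
    "alt": 0b10,
    "ctrl": 0b100,
    "super": 0b1000,
    "hyper": 0b10000,
    "meta": 0b100000,
    "caps_lock": 0b1000000,
    "num_lock": 0b10000000,
}


def encode_modifiers(names):
    """Return the value sent in the escape code: 1 + the bit field."""
    return 1 + sum(MODIFIER_BITS[name] for name in set(names))


def decode_modifiers(value=1):
    """Return the modifier names for a received field value.

    A missing field is treated as 1, which means no modifiers.
    """
    bits = value - 1
    return [name for name, bit in MODIFIER_BITS.items() if bits & bit]


print(encode_modifiers(["shift"]))          # 2
print(encode_modifiers(["ctrl", "shift"]))  # 6
print(decode_modifiers(6))                  # ['shift', 'ctrl']
```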
Event types
There are three key event types: press, repeat and release. They are reported (if requested with the 0b10 flag) as a sub-field of the modifiers field (separated by a colon). If no modifiers are present, the modifiers field must have the value 1 and the event type sub-field the type of event. The press event type has value 1 and is the default if no event type sub-field is present. The repeat type is 2 and the release type is 3. So for example:
CSI key-code             # this is a press event
CSI key-code;modifier    # this is a press event
CSI key-code;modifier:1  # this is a press event
CSI key-code;modifier:2  # this is a repeat event
CSI key-code;modifier:3  # this is a release event
Note
Key events that result in text are reported as plain UTF-8 text, so events are not supported for them, unless the application requests key report mode, see below.
Text as code points
The terminal can optionally send the text associated with key events as a sequence of Unicode code points. This behavior is opt-in by the progressive enhancement mechanism described below. Some examples:
shift+a  -> CSI 97 ; 2 ; 65 u  # The text 'A' is reported as 65
option+a -> CSI 97 ; ; 229 u   # The text 'å' is reported as 229
If multiple code points are present, they must be separated by colons. If no known key is associated with the text, the key number 0 must be used.
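To show how the optional fields combine, here is a minimal Python sketch of the terminal-side encoding. The function name encode_key_event and its parameters are hypothetical; modifiers is the plain bit field (without the added 1), and trailing empty fields are dropped, which is how the examples above are written.

```python
def encode_key_event(key, modifiers=0, event_type=1, text="",
                     shifted_key=None, base_layout_key=None):
    """Build CSI key:alternates ; modifiers:event-type ; text u."""
    key_field = str(key)
    if base_layout_key is not None:
        # A lone base layout key needs an empty shifted-key sub-field.
        shifted = "" if shifted_key is None else str(shifted_key)
        key_field += f":{shifted}:{base_layout_key}"
    elif shifted_key is not None:
        key_field += f":{shifted_key}"

    mod_field = ""
    if modifiers or event_type != 1:
        # The wire value is 1 + bit field; with no modifiers it is 1.
        mod_field = str(1 + modifiers)
        if event_type != 1:
            mod_field += f":{event_type}"

    # Multiple code points are separated by colons.
    text_field = ":".join(str(ord(ch)) for ch in text)

    fields = [key_field, mod_field, text_field]
    while len(fields) > 1 and not fields[-1]:
        fields.pop()  # omit trailing optional fields
    return "\x1b[" + ";".join(fields) + "u"


print(repr(encode_key_event(97, modifiers=1, text="A")))  # '\x1b[97;2;65u'
print(repr(encode_key_event(97, event_type=3)))           # '\x1b[97;1:3u'
```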
Non-Unicode keys
There are many keys that don’t correspond to letters from human languages, and thus aren’t represented in Unicode. Think of functional keys, such as Escape, Play, Pause, F1, Home, etc. These are encoded using Unicode code points from the Private Use Area (57344 - 63743). The mapping of key names to code points for these keys is in the Functional key definitions table below.
Progressive enhancement
While, in theory, every key event could be completely represented by this protocol and all would be hunky-dory, in reality there is a vast universe of existing terminal programs that expect legacy control codes for key events and that are not likely to ever be updated. To support these, in default mode, the terminal will emit legacy escape codes for compatibility. If a terminal program wants more robust key handling, it can request it from the terminal, via the mechanism described here. Each enhancement is described in detail below. The escape code for requesting enhancements is:
CSI = flags ; mode u
Here flags is a decimal encoded integer to specify a set of bit-flags. The meanings of the flags are given below. The second, mode parameter is optional (defaulting to 1) and specifies how the flags are applied. The value 1 means all set bits are set and all unset bits are reset. The value 2 means all set bits are set, unset bits are left unchanged. The value 3 means all set bits are reset, unset bits are left unchanged.
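The three mode values amount to assign, bitwise OR, and bitwise AND-NOT. The Python sketch below models only that arithmetic; the function name apply_flags and the constant names are hypothetical, while the bit values are those given for each enhancement later in this document.

```python
# Bit values of the progressive enhancement flags (names are hypothetical).
DISAMBIGUATE_ESCAPE_CODES = 0b1
REPORT_EVENT_TYPES = 0b10
REPORT_ALTERNATE_KEYS = 0b100
REPORT_ALL_KEYS_AS_ESCAPE_CODES = 0b1000
REPORT_ASSOCIATED_TEXT = 0b10000


def apply_flags(current, flags, mode=1):
    """Return the new flag set after a request with `flags` and `mode`."""
    if mode == 1:
        return flags             # set bits are set, unset bits are reset
    if mode == 2:
        return current | flags   # set bits are set, others unchanged
    if mode == 3:
        return current & ~flags  # set bits are reset, others unchanged
    raise ValueError(f"unknown mode: {mode}")


state = apply_flags(0, DISAMBIGUATE_ESCAPE_CODES | REPORT_ALTERNATE_KEYS)
state = apply_flags(state, REPORT_EVENT_TYPES, mode=2)
state = apply_flags(state, REPORT_ALTERNATE_KEYS, mode=3)
print(bin(state))  # 0b11
```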
The program running in the terminal can query the terminal for the current values of the flags by sending:
CSI ? u
The terminal will reply with:
CSI ? flags u
The program can also push/pop the current flags onto a stack in the terminal with:
CSI > flags u   # for push, if flags omitted default to zero
CSI < number u  # to pop number entries, defaulting to 1 if unspecified
Terminals should limit the size of the stack as appropriate, to prevent Denial-of-Service attacks. Terminals must maintain separate stacks for the main and alternate screens. If a pop request is received that empties the stack, all flags are reset. If a push request is received and the stack is full, the oldest entry from the stack must be evicted.
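The stack rules above can be modelled in a few lines. This Python sketch is one plausible terminal-side model, not kitty's implementation: the class name KeyboardModeStack, the default size limit of 16 and the choice to treat the top entry as the active flags are all assumptions made for illustration.

```python
from collections import deque


class KeyboardModeStack:
    """Bounded stack of keyboard flags for a single screen."""

    def __init__(self, max_size=16):
        # A deque with maxlen drops the oldest entry when a push
        # arrives while the stack is already full.
        self._entries = deque(maxlen=max_size)

    @property
    def flags(self):
        """Active flags: the top entry, or 0 when the stack is empty."""
        return self._entries[-1] if self._entries else 0

    def push(self, flags=0):
        self._entries.append(flags)

    def pop(self, count=1):
        for _ in range(count):
            if not self._entries:
                break  # popping an empty stack leaves all flags reset
            self._entries.pop()


# Separate, independent stacks for the main and alternate screens.
screens = {"main": KeyboardModeStack(), "alternate": KeyboardModeStack()}
screens["alternate"].push(0b1)
print(screens["alternate"].flags, screens["main"].flags)  # 1 0
```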
Note
The main and alternate screens in the terminal emulator must maintain their own, independent, keyboard mode stacks. This is so that a program that uses the alternate screen such as an editor, can change the keyboard mode in the alternate screen only, without affecting the mode in the main screen or even knowing what that mode is. Without this, and if no stack is implemented for keyboard modes (such as in some legacy terminal emulators) the editor would have to somehow know what the keyboard mode of the main screen is and restore to that mode on exit.
Disambiguate escape codes
This type of progressive enhancement (0b1) fixes the problem of some legacy key press encodings overlapping with other control codes. For instance, pressing the Esc key generates the byte 0x1b, which is also used to indicate the start of an escape code. Similarly, pressing the key alt+[ will generate the bytes used for CSI control codes.
Turning on this flag will cause the terminal to report the Esc, alt+key, ctrl+key, ctrl+alt+key, shift+alt+key keys using CSI u sequences instead of legacy ones. Here key is any ASCII key as described in Legacy text keys. Additionally, all keypad keys will be reported as separate keys with CSI u encoding, using dedicated numbers from the table below.
With this flag turned on, all key events that do not generate text are represented in one of the following two forms:
CSI number; modifier u
CSI 1; modifier [~ABCDEFHPQS]
This makes it very easy to parse key events in an application. In particular, ctrl+c will no longer generate the SIGINT signal, but instead be delivered as a CSI u escape code. This has the nice side effect of making it much easier to integrate into the application event loop. The only exceptions are the Enter, Tab and Backspace keys, which still generate the same bytes as in legacy mode. This is to allow the user to type and execute commands in the shell, such as reset, after a program that sets this mode crashes without clearing it.
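Because only two escape code shapes remain, a single regular expression is enough to split input into key events and plain text. The Python sketch below illustrates this under simplifying assumptions: the names KEY_RE and split_input are hypothetical, the whole input is assumed to be available as one decoded string, and partial sequences at the end of a read are not handled.

```python
import re

# CSI, then optional parameters, then one of the final characters
# used by the two forms described above.
KEY_RE = re.compile(r"\x1b\[([0-9:;]*)([u~ABCDEFHPQS])")


def split_input(data):
    """Split decoded terminal input into ('text', ...) and ('key', ...)."""
    events = []
    pos = 0
    for match in KEY_RE.finditer(data):
        if match.start() > pos:
            events.append(("text", data[pos:match.start()]))
        # Keep the raw parameter string and the final character.
        events.append(("key", match.group(1), match.group(2)))
        pos = match.end()
    if pos < len(data):
        events.append(("text", data[pos:]))
    return events


print(split_input("ab\x1b[97;5uc\x1b[1;2A"))
# [('text', 'ab'), ('key', '97;5', 'u'), ('text', 'c'), ('key', '1;2', 'A')]
```

A real application would also buffer incomplete sequences across reads; that is left out to keep the sketch short.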
Report event types
This progressive enhancement (0b10) causes the terminal to report key repeat and key release events. Normally only key press events are reported and key repeat events are treated as key press events. See Event types for details on how these are reported.
Report alternate keys
This progressive enhancement (0b100) causes the terminal to report alternate key values in addition to the main value, to aid in shortcut matching. See Key codes for details on how these are reported.
Report all keys as escape codes
Key events that generate text, such as plain key presses without modifiers, result in just the text being sent, in the legacy protocol. There is no way to be notified of key repeat/release events. These types of events are needed for some applications, such as games (think of movement using the WASD keys).
This progressive enhancement (0b1000) turns on key reporting even for key events that generate text. When it is enabled, text will not be sent; instead, only key events are sent. If the text is needed as well, combine with the Report associated text enhancement below.
Additionally, with this mode, events for pressing modifier keys are reported. Note that all keys are reported as escape codes, including Enter, Tab, Backspace etc.
Report associated text
This progressive enhancement (0b10000) causes key events that generate text to be reported as CSI u escape codes with the text embedded in the escape code. See Text as code points above for details on the mechanism.
Detection of support for this protocol
An application can query the terminal for support of this protocol by sending the escape code querying for the current progressive enhancement status followed by request for the primary device attributes. If an answer for the device attributes is received without getting back an answer for the progressive enhancement the terminal does not support this protocol.
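The detection logic reduces to checking which replies arrive and in what order. The Python sketch below classifies an already-collected response string; it does not perform terminal I/O. It assumes the status query is CSI ? u with a reply of the form CSI ? flags u, and that the primary device attributes request is CSI c with a reply of the form CSI ? ... c. Treat those byte strings, and the names QUERY and detect_support, as assumptions to verify against your terminal.

```python
import re

# Assumed query: progressive enhancement status, then primary
# device attributes (DA1).
QUERY = "\x1b[?u\x1b[c"

_FLAGS_REPLY = re.compile(r"\x1b\[\?(\d+)u")
_DA1_REPLY = re.compile(r"\x1b\[\?[\d;]*c")


def detect_support(response):
    """Return the reported flags, or None if the protocol is unsupported.

    `response` is everything read from the terminal after writing
    QUERY, up to and including the device attributes reply.
    """
    da1 = _DA1_REPLY.search(response)
    if da1 is None:
        raise ValueError("no device attributes reply received yet")
    # Only a flags reply that precedes the DA1 reply counts.
    flags = _FLAGS_REPLY.search(response, 0, da1.start())
    return int(flags.group(1)) if flags else None


print(detect_support("\x1b[?0u\x1b[?62;c"))  # 0
print(detect_support("\x1b[?62;c"))          # None
```

Note that a return value of 0 means "supported, no enhancements active", so callers should compare against None instead of testing truthiness.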
Legacy key event encoding
In the default mode, the terminal uses a legacy encoding for key events. In this encoding, only key press and repeat events are sent and there is no way to distinguish between them. Text is sent directly as UTF-8 bytes.
Any key events not described in this section are sent using the standard CSI u encoding. This includes keys that are not encodable in the legacy encoding, thereby increasing the space of usable key combinations even without progressive enhancement.
Legacy functional keys
These keys are encoded using three schemes:
CSI number ; modifier ~
CSI 1 ; modifier {ABCDEFHPQS}
SS3 {ABCDEFHPQRS}
In the above, if there are no modifiers, the modifier parameter is omitted. The modifier value is encoded as described in the Modifiers section, above. When the second form is used, the number is always 1 and must be omitted if the modifiers field is also absent. The third form becomes the second form when modifiers are present (SS3 is the bytes 0x1b 0x4f).
These sequences must match entries in the terminfo database for maximum compatibility. The table below lists the key, its terminfo entry name and the escape code used for it by kitty. A different terminal would use whatever escape code is present in its terminfo database for the key. Some keys have an alternate representation when the terminal is in cursor key mode (the smkx/rmkx terminfo capabilities). This form is used only in cursor key mode and only when no modifiers are present.
Name | Terminfo name | Escape code |
---|---|---|
INSERT | kich1 | CSI 2 ~ |
DELETE | kdch1 | CSI 3 ~ |
PAGE_UP | kpp | CSI 5 ~ |
PAGE_DOWN | knp | CSI 6 ~ |
UP | cuu1,kcuu1 | CSI A, SS3 A |
DOWN | cud1,kcud1 | CSI B, SS3 B |
RIGHT | cuf1,kcuf1 | CSI C, SS3 C |
LEFT | cub1,kcub1 | CSI D, SS3 D |
HOME | home,khome | CSI H, SS3 H |
END | -,kend | CSI F, SS3 F |
F1 | kf1 | SS3 P |
F2 | kf2 | SS3 Q |
F3 | kf3 | SS3 R |
F4 | kf4 | SS3 S |
F5 | kf5 | CSI 15 ~ |
F6 | kf6 | CSI 17 ~ |
F7 | kf7 | CSI 18 ~ |
F8 | kf8 | CSI 19 ~ |
F9 | kf9 | CSI 20 ~ |
F10 | kf10 | CSI 21 ~ |
F11 | kf11 | CSI 23 ~ |
F12 | kf12 | CSI 24 ~ |
MENU | kf16 | CSI 29 ~ |
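The three schemes and the table above can be expressed as a small lookup-based encoder. This Python sketch is an illustration only: the names encode_legacy_functional, TILDE_KEYS, CURSOR_KEYS and SS3_ONLY_KEYS are hypothetical, the modifiers argument is the already-encoded value (1 means none), and F3 is deliberately left out because its modified form is not spelled out by the forms listed here.

```python
CSI = "\x1b["  # 0x1b 0x5b
SS3 = "\x1bO"  # 0x1b 0x4f

# Keys using the "CSI number ; modifier ~" scheme.
TILDE_KEYS = {
    "INSERT": 2, "DELETE": 3, "PAGE_UP": 5, "PAGE_DOWN": 6,
    "F5": 15, "F6": 17, "F7": 18, "F8": 19, "F9": 20,
    "F10": 21, "F11": 23, "F12": 24, "MENU": 29,
}
# Keys with a CSI letter form and an SS3 form in cursor key mode.
CURSOR_KEYS = {"UP": "A", "DOWN": "B", "RIGHT": "C", "LEFT": "D",
               "HOME": "H", "END": "F"}
# Keys listed with only an SS3 form when unmodified (F3 omitted).
SS3_ONLY_KEYS = {"F1": "P", "F2": "Q", "F4": "S"}


def encode_legacy_functional(name, modifiers=1, cursor_key_mode=False):
    if name in TILDE_KEYS:
        number = TILDE_KEYS[name]
        if modifiers == 1:
            return f"{CSI}{number}~"
        return f"{CSI}{number};{modifiers}~"
    if name in CURSOR_KEYS:
        letter = CURSOR_KEYS[name]
        if modifiers != 1:
            return f"{CSI}1;{modifiers}{letter}"
        # The SS3 form is used only in cursor key mode, without modifiers.
        return (SS3 if cursor_key_mode else CSI) + letter
    if name in SS3_ONLY_KEYS:
        letter = SS3_ONLY_KEYS[name]
        if modifiers != 1:
            return f"{CSI}1;{modifiers}{letter}"  # third form becomes second
        return SS3 + letter
    raise KeyError(name)


print(repr(encode_legacy_functional("UP", modifiers=5)))  # '\x1b[1;5A'
```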
There are a few more functional keys that have special cased legacy encodings. These are present because they are commonly used and for the sake of legacy terminal applications that get confused when seeing CSI u escape codes:
Key | No mods | Ctrl | Alt | Shift | Ctrl + Shift | Alt + Shift | Ctrl + Alt |
---|---|---|---|---|---|---|---|
Enter | 0xd | 0xd | 0x1b 0xd | 0xd | 0xd | 0x1b 0xd | 0x1b 0xd |
Escape | 0x1b | 0x1b | 0x1b 0x1b | 0x1b | 0x1b | 0x1b 0x1b | 0x1b 0x1b |
Backspace | 0x7f | 0x8 | 0x1b 0x7f | 0x7f | 0x8 | 0x1b 0x7f | 0x1b 0x8 |
Tab | 0x9 | 0x9 | 0x1b 0x9 | CSI Z | CSI Z | 0x1b CSI Z | 0x1b 0x9 |
Space | 0x20 | 0x0 | 0x1b 0x20 | 0x20 | 0x0 | 0x1b 0x20 | 0x1b 0x0 |
Note that Backspace and ctrl+Backspace are swapped in some terminals; this can be detected using the kbs terminfo property, which must correspond to the Backspace key.
All keypad keys are reported as their equivalent non-keypad keys. To distinguish these, use the disambiguate flag.
Legacy text keys
For legacy compatibility, the keys a-z 0-9 ` - = [ ] \ ; ' , . / with the modifiers shift, alt, ctrl, shift+alt, ctrl+alt are output using the following algorithm:
1. If the alt key is pressed, output the byte for ESC (0x1b)
2. If the ctrl modifier is pressed, map the key using the table in Legacy ctrl mapping of ASCII keys.
3. Otherwise, if the shift modifier is pressed, output the shifted key, for example, A for a and $ for 4.
4. Otherwise, output the key unmodified
Additionally, ctrl+space is output as the NULL byte (0x0).
Any other combination of modifiers with these keys is output as the appropriate CSI u escape code.
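The algorithm above translates almost line for line into code. The following Python sketch is illustrative: encode_legacy_text_key, CTRL_MAP and US_SHIFT are hypothetical names, CTRL_MAP copies the table in Legacy ctrl mapping of ASCII keys, and US_SHIFT assumes a US keyboard layout (the real shifted key depends on the active layout). Combinations outside the listed modifier sets return None, standing in for "use a CSI u escape code".

```python
# Copied from the "Legacy ctrl mapping of ASCII keys" table.
CTRL_MAP = {
    " ": 0, "/": 31, "0": 48, "1": 49, "2": 0, "3": 27, "4": 28,
    "5": 29, "6": 30, "7": 31, "8": 127, "9": 57, "?": 127, "@": 0,
    "[": 27, "\\": 28, "]": 29, "^": 30, "_": 31, "~": 30,
}
CTRL_MAP.update({chr(c): c - 96 for c in range(ord("a"), ord("z") + 1)})

# Assumption: US layout shifted characters for the non-letter keys.
US_SHIFT = {
    "`": "~", "1": "!", "2": "@", "3": "#", "4": "$", "5": "%",
    "6": "^", "7": "&", "8": "*", "9": "(", "0": ")", "-": "_",
    "=": "+", "[": "{", "]": "}", "\\": "|", ";": ":", "'": '"',
    ",": "<", ".": ">", "/": "?",
}


def encode_legacy_text_key(key, shift=False, alt=False, ctrl=False):
    """Return the legacy bytes for `key`, or None if CSI u is required."""
    if ctrl and shift:
        return None  # not one of the listed legacy modifier combinations
    prefix = b"\x1b" if alt else b""                      # step 1
    if ctrl:
        body = bytes([CTRL_MAP.get(key, ord(key))])       # step 2
    elif shift:
        body = US_SHIFT.get(key, key.upper()).encode("ascii")  # step 3
    else:
        body = key.encode("ascii")                        # step 4
    return prefix + body


print(encode_legacy_text_key("a", ctrl=True))             # b'\x01'
print(encode_legacy_text_key("a", shift=True, alt=True))  # b'\x1bA'
```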
Key | Plain | shift | alt | ctrl | shift+alt | alt+ctrl | ctrl+shift |
---|---|---|---|---|---|---|---|
i | i (105) | I (73) | ESC i | ) (41) | ESC I | ESC ) | CSI 105; 6 u |
3 | 3 (51) | # (35) | ESC 3 | 3 (51) | ESC # | ESC 3 | CSI 51; 6 u |
; | ; (59) | : (58) | ESC ; | ; (59) | ESC : | ESC ; | CSI 59; 6 u |
Note
Many of the legacy escape codes are ambiguous, with multiple different key presses yielding the same escape code(s). For example, ctrl+i is the same as Tab, ctrl+m is the same as Enter, ctrl+r is the same as ctrl+shift+r, etc. To resolve these, use the disambiguate progressive enhancement.
Functional key definitions
All numbers are in the Unicode Private Use Area (57344 - 63743) except for a handful of keys that use numbers under 32 and 127 (C0 control codes) for legacy compatibility reasons.
Name | CSI | Name | CSI |
---|---|---|---|
ESCAPE | | ENTER | |
TAB | | BACKSPACE | |
INSERT | | DELETE | |
LEFT | | RIGHT | |
UP | | DOWN | |
PAGE_UP | | PAGE_DOWN | |
HOME | | END | |
CAPS_LOCK | | SCROLL_LOCK | |
NUM_LOCK | | PRINT_SCREEN | |
PAUSE | | MENU | |
F1 | | F2 | |
F3 | | F4 | |
F5 | | F6 | |
F7 | | F8 | |
F9 | | F10 | |
F11 | | F12 | |
F13 | | F14 | |
F15 | | F16 | |
F17 | | F18 | |
F19 | | F20 | |
F21 | | F22 | |
F23 | | F24 | |
F25 | | F26 | |
F27 | | F28 | |
F29 | | F30 | |
F31 | | F32 | |
F33 | | F34 | |
F35 | | KP_0 | |
KP_1 | | KP_2 | |
KP_3 | | KP_4 | |
KP_5 | | KP_6 | |
KP_7 | | KP_8 | |
KP_9 | | KP_DECIMAL | |
KP_DIVIDE | | KP_MULTIPLY | |
KP_SUBTRACT | | KP_ADD | |
KP_ENTER | | KP_EQUAL | |
KP_SEPARATOR | | KP_LEFT | |
KP_RIGHT | | KP_UP | |
KP_DOWN | | KP_PAGE_UP | |
KP_PAGE_DOWN | | KP_HOME | |
KP_END | | KP_INSERT | |
KP_DELETE | | KP_BEGIN | |
MEDIA_PLAY | | MEDIA_PAUSE | |
MEDIA_PLAY_PAUSE | | MEDIA_REVERSE | |
MEDIA_STOP | | MEDIA_FAST_FORWARD | |
MEDIA_REWIND | | MEDIA_TRACK_NEXT | |
MEDIA_TRACK_PREVIOUS | | MEDIA_RECORD | |
LOWER_VOLUME | | RAISE_VOLUME | |
MUTE_VOLUME | | LEFT_SHIFT | |
LEFT_CONTROL | | LEFT_ALT | |
LEFT_SUPER | | LEFT_HYPER | |
LEFT_META | | RIGHT_SHIFT | |
RIGHT_CONTROL | | RIGHT_ALT | |
RIGHT_SUPER | | RIGHT_HYPER | |
RIGHT_META | | ISO_LEVEL3_SHIFT | |
ISO_LEVEL5_SHIFT | | | |
Note
The escape codes above of the form CSI 1 letter will omit the 1 if there are no modifiers, since 1 is the default value.
Note
The original version of this specification allowed F3 to be encoded as both CSI R and CSI ~. However, CSI R conflicts with the Cursor Position Report, so it was removed.
Legacy ctrl mapping of ASCII keys
When the ctrl key and another key are pressed on the keyboard, terminals map the result for some keys to a C0 control code, i.e. a value from 0 - 31. This mapping was historically dependent on the layout of hardware terminal keyboards and is not completely specified anywhere. The best known reference is Table 3-5 in the VT-100 docs.
The table below provides a mapping that is a commonly used superset of the table above. Any ASCII keys not in the table must be left untouched by ctrl.
Key | Byte | Key | Byte | Key | Byte |
---|---|---|---|---|---|
SPC | 0 | / | 31 | 0 | 48 |
1 | 49 | 2 | 0 | 3 | 27 |
4 | 28 | 5 | 29 | 6 | 30 |
7 | 31 | 8 | 127 | 9 | 57 |
? | 127 | @ | 0 | [ | 27 |
\ | 28 | ] | 29 | ^ | 30 |
_ | 31 | a | 1 | b | 2 |
c | 3 | d | 4 | e | 5 |
f | 6 | g | 7 | h | 8 |
i | 9 | j | 10 | k | 11 |
l | 12 | m | 13 | n | 14 |
o | 15 | p | 16 | q | 17 |
r | 18 | s | 19 | t | 20 |
u | 21 | v | 22 | w | 23 |
x | 24 | y | 25 | z | 26 |
~ | 30 | | | | |
Bugs in fixterms
The following is a list of errata in the original fixterms proposal, corrected in this specification.
- No way to disambiguate Esc key presses, other than using 8-bit controls which are undesirable for other reasons
- Incorrectly claims special keys are sometimes encoded using CSI letter encodings when it is actually SS3 letter in all terminals newer than a VT-52, which is pretty much everything.
- ctrl+shift+tab should be CSI 9 ; 6 u not CSI 1 ; 5 Z (shift+tab is not a separate key from tab)
- No support for the super modifier.
- Makes no mention of cursor key mode and how it changes encodings
- Incorrectly encodes shifted keys when the shift modifier is used; for instance, ctrl+shift+i is encoded as ctrl+I.
- No way to have non-conflicting escape codes for alt+letter, ctrl+letter, ctrl+alt+letter key presses
- No way to specify both shifted and unshifted keys for robust shortcut matching (think matching ctrl+shift+equal and ctrl+plus)
- No way to specify alternate layout key. This is useful for keyboard layouts such as Cyrillic where you want the shortcut ctrl+c to work when pressing the ctrl+С on the keyboard.
- No way to report repeat and release key events, only key press events
- No way to report key events for presses that generate text, useful for gaming. Think of using the WASD keys to control movement.
- Only a small subset of all possible functional keys are assigned numbers.
- Claims the CSI u escape code has no fixed meaning, but it has been used for decades as SCORC, for instance by xterm and ansi.sys, and as DECSMBV by the VT-510 hardware terminal. This doesn’t really matter since these uses are for communication to the terminal not from the terminal.
- Handwaves that ctrl tends to mask with 0x1f. In actual fact it does this only for some keys. The action of ctrl is not specified and varies between terminals, historically because of different keyboard layouts.
Why xterm’s modifyOtherKeys should not be used
- Does not support release events
- Does not fix the issue of Esc key presses not being distinguishable from escape codes.
- Does not fix the issue of some keypresses generating identical bytes and thus being indistinguishable
- There is no robust way to query it or manage its state from a program running in the terminal.
- No support for shifted keys.
- No support for alternate keyboard layouts.
- No support for modifiers beyond the basic four.
- No support for lock keys like Num lock and Caps lock.
- Is completely unspecified. The most discussion of it available anywhere is here, and it contains no specification of what numbers to assign to what function keys beyond running a Perl script on an X11 system!!
from Hacker News https://ift.tt/akCVFLe