Bug 435975 - Implement fixterms
Summary: Implement fixterms
Status: REPORTED
Alias: None
Product: konsole
Classification: Applications
Component: keyboard (other bugs)
Version First Reported In: master
Platform: Other Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Konsole Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-04-20 20:48 UTC by ariasuni
Modified: 2025-12-06 13:50 UTC

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Description ariasuni 2021-04-20 20:48:58 UTC
SUMMARY
Some terminals started implementing fixterms. Kitty implemented a slightly different who apparently fixes some problems with fixterms in 2012.

According to fixterms’ website:

> Keyboard input on Terminals has many deficiencies to it. I want them all
> fixed. I have a plan on how to do it but it Needs Your Help

Neovim implemented the protocol in 2015 by using libtickit:
https://github.com/neovim/neovim/issues/176#issuecomment-77786940

fixterms: http://www.leonerd.org.uk/hacks/fixterms/
kitty’s new keyboard protocol: https://sw.kovidgoyal.net/kitty/keyboard-protocol.html

SOFTWARE/OS VERSIONS
Operating System: Arch Linux
KDE Plasma Version: 5.21.4
KDE Frameworks Version: 5.81.0
Qt Version: 5.15.2
Comment 1 ariasuni 2021-04-20 20:49:59 UTC
Sorry I hit enter too early, I meant:
Kitty implemented in *2021* a slightly different *version* that apparently fixes some problems with fixterms.
Comment 2 ariasuni 2021-04-20 21:15:37 UTC
Also see this comment about the «state of the art» of terminals:

https://github.com/mawww/kakoune/issues/2554#issuecomment-436300959
Comment 3 Ivan Sorokin 2023-04-06 10:13:51 UTC
There are also at least three other protocols for "raw keyboard" input in terminals.

iTerm2 protocol
https://gitlab.com/gnachman/iterm2/-/issues/7440#note_129307021

Windows Terminal's "win32-input-mode"
https://github.com/microsoft/terminal/pull/6309

far2l terminal extensions
https://github.com/elfmz/far2l/blob/master/WinPort/FarTTY.h

The only one of them that already has some support in apps is far2l's (apps supporting it are the "turbo" text editor, the "putty4far2l" putty fork, and cyd01's "KiTTY"). It is tied to Windows keyboard codes, but translation from x11 codes is easy:
https://github.com/unxed/xkb2win/
Comment 4 Ivan Sorokin 2024-10-06 12:22:05 UTC
I feel it's important to explain why the kitty keyboard protocol is truly essential.

I'm one of the developers of far2l—a port of the Far Manager file manager to Linux.  Since this program was originally developed for the Windows console, it heavily utilizes its advantages, particularly the ability to use any key combinations.

When migrating from Windows to Linux, users look for a familiar UX; they need their favorite applications, not a new operating system.  If we tell them "learn 100 new keyboard shortcuts," they're more likely to stay on Windows than listen to us.

Is it really acceptable that *nix console capabilities in the 21st century are still inferior to those of the ancient Windows 2000 console? This situation needs improvement.

In the far2l project, we came up with a hack that solves the problem under X11: we simply connect to X11 and listen to all keyboard input. This allows us to "refine" the key press information coming into the terminal, "deciphering" what the user actually pressed. A terrible, horribly dirty hack (but it works!)—but better than depriving the user of familiar functionality.

However, with the widespread transition to Wayland, this hack stops working.  What do you think users will do, give up their familiar keyboard shortcuts—or stay on X11?

This is precisely why we need the ability to use any keyboard shortcut in console applications. Not the day after tomorrow, not tomorrow. We needed it yesterday.

Of all the available solutions (far2l terminal extensions, iTerm2 raw keyboard protocol, win32-input-mode, and kitty keyboard protocol), the kitty protocol is the most suitable for UNIX-like operating systems. It's designed with backward compatibility in mind, and adapting terminals to support it is straightforward. Look at this code example. This is a reference implementation of basic kitty protocol support that I made as an example for Windows Terminal developers.  It's only 196 lines of code: https://gist.github.com/unxed/d979fe069039fe075c18eb0218b1f8f5

I hope you will consider these arguments, and together we can bring the console capabilities of UNIX-like operating systems up to par with what the Windows 2000 console offered 24 years ago. I'll be happy to answer any questions if anyone takes this on.  Thank you in advance!
Comment 5 Luca Weiss 2025-10-01 14:22:16 UTC
This should also help with fish shell being able to use some keybinds, e.g. ctrl-backspace doesn't seem to really be possible right now, or at least is ambiguous with ctrl-h.

Ref https://github.com/fish-shell/fish-shell/issues/11538#issuecomment-3356580330
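
To illustrate the ambiguity, here is a minimal Python sketch. It assumes the legacy convention where Ctrl+H is sent as the byte 0x08 (many terminals send the same byte for Ctrl+Backspace, though this varies between terminals) and the `CSI number ; modifiers u` form from the kitty keyboard protocol specification. The function names are hypothetical and only serve the illustration.

```python
# Hypothetical helper names; a sketch of the two encodings, not Konsole code.
CTRL = 0b100  # Ctrl bit in the kitty keyboard protocol modifier field


def legacy_ctrl(ch: str) -> bytes:
    """Legacy encoding of Ctrl+<letter>: keep only the low 5 bits of the ASCII code."""
    return bytes([ord(ch.upper()) & 0x1F])


def csi_u(codepoint: int, modifiers: int) -> str:
    """Encode a key as CSI <codepoint> ; <1 + modifier bits> u."""
    return f"\x1b[{codepoint};{1 + modifiers}u"


# Legacy: Ctrl+H collapses to the BS control character, which many terminals
# also send for Ctrl+Backspace, so an application cannot tell them apart.
print(legacy_ctrl("h"))

# Protocol form: the key itself is reported, so the two stay distinct.
print(repr(csi_u(ord("h"), CTRL)))  # Ctrl+H
print(repr(csi_u(127, CTRL)))       # Ctrl+Backspace (key code 127)
```

With the protocol form, the shell receives a different sequence for each combination and can bind them separately.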
Comment 6 Ivan Sorokin 2025-10-04 12:32:17 UTC
Here are some thoughts on the fixterms/kitty approach vs another alternative, win32 input mode. I tend to believe that the win32 input mode is, in fact, much better designed.

Arguments in its favor:

1. It's based on a time-tested keyboard event structure of Windows - there have never been complaints from application developers that something is missing in it. The structure is stable and hasn't changed for decades.

2. It avoids the overcomplication of kitty's mode-state stack.

3. It doesn't include the unnecessary and hard-to-obtain unshifted key field. (I understand how to implement this on top of GTK - it would require an additional xkbcommon context that tracks the current kb layout - but that's an absurd amount of terminal-side complexity for something that can be handled in a single line on the application side, simply by normalizing case before hotkey checks.)

4. It clearly separates Unicode character (which depends on layout and modifiers) and key code (which does not), unlike kitty, which allocates four (!) separate fields for the same two concepts: key code, shifted key code, base layout key code, and Unicode text.

5. Accordingly, there is no pointless duplication of the same Unicode character between the shifted field and the Unicode text fields (as happens in kitty). Moreover, in kitty, for roughly half of all keys, the key code field itself duplicates the same character code as those two fields.

6. Parsing is simpler - no nested delimiters. Because of its simplicity, it's trivial to integrate into applications.

7. It's supported out of the box on every Windows machine.

8. The specification is clear and unambiguous.

9. It doesn't carry the 50-year-old technical debt that kitty indirectly inherits.

10. It's already used outside the Windows ecosystem - for example, in magiblot's Turbo Vision fork and his turbo editor, and also in far2l - a Linux port of a classic file manager from the Windows world.

On macOS, virtual key codes can be mapped the same way as in Wine, providing a familiar behavior at least for some users.

Essentially, with kitty we get a beautifully formatted specification that is, in at least one case, impossible to fully adhere to - one that contains fundamental design flaws and is excessively overengineered. To be honest, its appeal was greatly overestimated due to the lack of competition.

From an engineering standpoint, a protocol based on win32 input looks like a far more attractive alternative.

And it is already implemented for Konsole:
https://invent.kde.org/utilities/konsole/-/merge_requests/1133
Comment 7 Kovid Goyal 2025-12-02 17:03:00 UTC
(In reply to Ivan Sorokin from comment #6)
> Here are some thoughts on the fixterms/kitty approach vs another alternative,
> win32 input mode. I tend to believe that the win32 input mode is, in fact,
> much better designed.

I happened to chance across this and boy, it is so full of wrongheaded and barely coherent assertions, I don't even know where to start.

> 
> Arguments in its favor:
> 
> 1. It's based on a time-tested keyboard event structure of Windows - there
> have never been complaints from application developers that something is
> missing in it. The structure is stable and hasn't changed for decades.

You want a cross platform protocol to adopt a single OS specification, seriously?!! And you tout that as an advantage!! 

> 
> 2. It avoids the overcomplication of kitty's mode-state stack.

The so-called "overcomplication" means, for instance, that at lower enhancement levels, a crash or an ssh disconnect does not leave the terminal in an unusable state where the user cannot even type reset. A basic QoL feature that win32-input-mode lacks. It further means that applications that don't need release events don't have to deal with them. This has the huge advantage that, for example, an application can quit on a key press rather than a key release, and not worry about the terminal sending the release event after the application has quit, leaving garbage to be printed on the tty.

> 
> 3. It doesn't include the unnecessary and hard-to-obtain unshifted key field. (I
> understand how to implement this on top of GTK - it would require an
> additional xkbcommon context that tracks the current kb layout - but that's
> an absurd amount of terminal-side complexity for something that can be
> handled in a single line on the application side, simply by normalizing case
> before hotkey checks.)

It's only hard for you; the dozen or so terminal implementations of this protocol disagree with you. And the fact that you think changing between shifted and unshifted keys is a matter of case change shows you have zero understanding of how keyboards work. Please explain to me how a case change will normalize between plus and equal, which, in the most widely used keyboard layout in the world by far, the US PC keyboard layout, are distinguished by shift. Indeed, if you spent a bit more time studying how keyboard key mapping works, you would realize that shift can be used to perform arbitrary, layout-dependent mapping between Unicode codepoints.
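
The plus/equal point can be shown in a few lines of Python. The table below is a hand-written excerpt of the US layout's Shift mapping and is purely illustrative; a terminal would obtain this information from the active keyboard layout rather than hardcode it. The function names are hypothetical.

```python
# Illustrative excerpt of the US PC layout: unshifted key -> Shift+key.
# A real terminal would query the keyboard layout instead of hardcoding this.
US_SHIFT = {"a": "A", "=": "+", "1": "!", "/": "?", ";": ":"}
US_UNSHIFT = {shifted: base for base, shifted in US_SHIFT.items()}


def normalize_case(ch: str) -> str:
    """Case folding: the 'single line on the application side' approach."""
    return ch.lower()


def unshift_us(ch: str) -> str:
    """Layout-aware mapping back to the unshifted key (US layout excerpt only)."""
    return US_UNSHIFT.get(ch, ch)


# Columns: typed character, after case folding, after layout-aware unshifting.
for typed in ("A", "+", "?"):
    print(typed, normalize_case(typed), unshift_us(typed))
```

Case folding recovers `a` from `A`, but leaves `+` and `?` unchanged, whereas the layout table maps them back to `=` and `/`.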

> 
> 4. It clearly separates Unicode character (which depends on layout and
> modifiers) and key code (which does not), unlike kitty, which allocates four
> (!) separate fields for the same two concepts: key code, shifted key code,
> base layout key code, and Unicode text.
> 

Again, you don't know what you are talking about. There is no such thing as a "key code". Different hardware keyboards produce different electrical signals, and these get turned into arbitrary numbers by the OS kernel. These numbers vary from platform to platform. Cross-platform keyboard protocols therefore never adopt the numbers from one platform. Instead, they define an unambiguous "encoding" using numbers, names, symbols or, as in the case of the kitty keyboard protocol, Unicode codepoints. You declare those "four!!" fields are the same thing. They are not. Every single field can differ in different situations and is needed for maximally robust shortcut matching.


> 5. Accordingly, there is no pointless duplication of the same Unicode
> character between the shifted field and the Unicode text fields (as happens
> in kitty). Moreover, in kitty, for roughly half of all keys, the key code
> field itself duplicates the same character code as those two fields.

Again the duplication is only pointless to you, because again, you haven't a clue what you are talking about.

> 
> 6. Parsing is simpler - no nested delimiters. Because of its simplicity,
> it's trivial to integrate into applications.

It takes approx 50 lines of code to parse the kitty keyboard protocol key structure. But I suppose that's too complex for you. Why am I not surprised. Proof: the decode_key_event() function in the kitty source code is exactly 37 lines long in Python. It will be about 100 lines in a less expressive language; proof: KeyEventFromCSI is 97 lines long in Go in the kitty source code.
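
For a sense of scale, here is a minimal Python sketch of such a parser. It is not the decode_key_event() code from kitty; it assumes the `CSI key:shifted:base ; modifiers:event-type ; text u` layout from the kitty keyboard protocol specification, handles only sequences ending in `u` (not the legacy-compatible forms for arrows and function keys), and uses hypothetical names throughout.

```python
import re
from dataclasses import dataclass


@dataclass
class KeyEvent:
    key: int                  # unicode-key-code
    shifted_key: int = 0      # 0 means "not reported"
    base_layout_key: int = 0  # 0 means "not reported"
    modifiers: int = 0        # bit field, with the protocol's +1 offset removed
    event_type: int = 1       # 1 press, 2 repeat, 3 release
    text: str = ""            # associated text, if reported


_CSI_U = re.compile(r"\x1b\[([0-9][0-9:;]*)u")


def _ints(field: str, sep: str) -> list:
    # Empty sub-fields are allowed and are read as 0 ("use the default").
    return [int(part) if part else 0 for part in field.split(sep)]


def parse_csi_u(seq: str) -> KeyEvent:
    match = _CSI_U.fullmatch(seq)
    if match is None:
        raise ValueError(f"not a CSI u sequence: {seq!r}")
    sections = match.group(1).split(";")

    keys = _ints(sections[0], ":")
    event = KeyEvent(key=keys[0])
    if len(keys) > 1:
        event.shifted_key = keys[1]
    if len(keys) > 2:
        event.base_layout_key = keys[2]

    if len(sections) > 1 and sections[1]:
        mods = _ints(sections[1], ":")
        event.modifiers = (mods[0] or 1) - 1  # field is 1 + modifier bits
        if len(mods) > 1:
            event.event_type = mods[1] or 1

    if len(sections) > 2 and sections[2]:
        event.text = "".join(chr(cp) for cp in _ints(sections[2], ":") if cp)
    return event


print(parse_csi_u("\x1b[97:65;2;65u"))   # Shift+a with alternate key and text
print(parse_csi_u("\x1b[97;5:3u"))       # release of Ctrl+a
print(parse_csi_u("\x1b[1089::99;5u"))   # Ctrl + Cyrillic key, base layout "c"
```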

> 
> 7. It's supported out of the box on every Windows machine.
> 

And this matters to Konsole, because?

> 8. The specification is clear and unambiguous.

So is the specification of the kitty keyboard protocol

> 
> 9. It doesn't carry the 50-year-old technical debt that kitty indirectly
> inherits.

What technical debt is that?

> 
> 10. It's already used outside the Windows ecosystem - for example, in
> magiblot's Turbo Vision fork and his turbo editor, and also in far2l - a Linux
> port of a classic file manager from the Windows world.
> 

You mean in two applications, at least one of which you wrote. In contrast, let's see the roster of applications supporting the kitty protocol: vim, neovim, kakoune, emacs, dte, helix, flow, yazi, awrit, aerc, far2l and on and on.

> On macOS, virtual key codes can be mapped the same way as in Wine, providing
> a familiar behavior at least for some users.

I have no clue what this means. You are suggesting konsole use wine code on macOS?

> 
> Essentially, with kitty we get a beautifully formatted specification that
> is, in at least one case, impossible to fully adhere to - one that contains
> fundamental design flaws and is excessively overengineered. To be honest,
> its appeal was greatly overestimated due to the lack of competition.

You forgot to add: in your extremely flawed opinion.

> 
> From an engineering standpoint, a protocol based on win32 input looks like a
> far more attractive alternative.

To *you*. 

> 
> And it is already implemented for Konsole:
> https://invent.kde.org/utilities/konsole/-/merge_requests/1133

I hope for their sake the Konsole maintainers don't merge code written by you. It will be full of bugs stemming from an appalling lack of understanding of even the basics of how keyboards work.
Comment 8 Ivan Sorokin 2025-12-05 12:24:34 UTC
Thank you for the detailed response, Kovid.

First, I would like to express my regret for the harsh tone of my previous comment. My criticism stemmed largely from the fatigue of attempting to implement the full scope of the kitty protocol specs. I found myself struggling with the complexity of correctly mapping unshifted vs shifted keys across various layouts and handling the duplication of unicode symbols on the terminal side. I respect the work you have done to push the ecosystem forward.

To move this discussion into a constructive direction for Konsole, I believe it is useful to summarize that both approaches have their specific strengths and trade-offs.

The kitty protocol is the current de-facto standard for modern unix-like terminal capabilities with wide adoption among popular applications like neovim and helix. It handles complex state management very robustly.

The win32 input mode offers a very simple structure that is easy to parse and clearly separates virtual key codes from text. It is ideal for applications ported from the windows world or those preferring a simpler event loop.

Given this reality, in far2l we decided against a "one or the other" approach. Instead, we implemented both, to give application developers the freedom to choose what fits best.

However, regarding the kitty protocol support in our project, we opted for a pragmatic subset rather than the full specification to keep complexity manageable. We implemented the core event structure but omitted the complexity of the full flags stack, using a single apply mode. We also simplified the unshifted vs shifted logic, as we discussed previously on the kitty bug tracker.

It is worth noting that this implemented subset has proven sufficient for all practical cases. We have not received a single complaint regarding missing capabilities in the kitty protocol implementation in far2l for several years.

I believe supporting both (the industry-standard kitty protocol as a functional subset and the simpler win32 mode) provides the best flexibility for users and developers.
Comment 9 Ivan Sorokin 2025-12-05 12:34:16 UTC
I used here "preferring a simpler event loop" as a shorthand for "preferring simpler input parsing and handling logic."

Parsing win32 input mode is simpler - no nested delimiters. Because of its simplicity, it’s trivial to integrate into applications.

In win32 input mode, the application receives a clear separation of Virtual Key Code and Unicode Character. The developer doesn’t need to write logic to determine which field to use in which case. The code inside the application’s main loop that processes input can essentially be: Read packet -> Map directly to internal action.
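
As an illustration of that "read packet" step, here is a minimal Python sketch. It assumes the `CSI Vk ; Sc ; Uc ; Kd ; Cs ; Rc _` layout described in the Windows Terminal win32-input-mode pull request linked earlier in this thread, with omitted fields defaulting to 0 (and the repeat count to 1). The type and function names are hypothetical.

```python
from typing import NamedTuple


class Win32KeyEvent(NamedTuple):
    virtual_key: int    # Vk: Windows virtual key code
    scan_code: int      # Sc: virtual scan code
    unicode_char: str   # Uc: character for this event ("" when the field is 0)
    key_down: bool      # Kd: True for press, False for release
    control_state: int  # Cs: dwControlKeyState bit field
    repeat_count: int   # Rc


_DEFAULTS = (0, 0, 0, 0, 0, 1)  # assumed defaults for omitted fields


def parse_win32_input(seq: str) -> Win32KeyEvent:
    if not (seq.startswith("\x1b[") and seq.endswith("_")):
        raise ValueError(f"not a win32-input-mode sequence: {seq!r}")
    parts = seq[2:-1].split(";")
    if len(parts) > len(_DEFAULTS):
        raise ValueError(f"too many fields: {seq!r}")
    values = list(_DEFAULTS)
    for index, part in enumerate(parts):
        if part:  # an empty field keeps its default
            values[index] = int(part)
    vk, sc, uc, kd, cs, rc = values
    return Win32KeyEvent(vk, sc, chr(uc) if uc else "", bool(kd), cs, rc)


LEFT_CTRL_PRESSED = 0x0008  # value from the Windows console API

event = parse_win32_input("\x1b[65;30;1;1;8;1_")  # Ctrl+A key down
print(event)
print(bool(event.control_state & LEFT_CTRL_PRESSED))
```

The flat, semicolon-only layout is what keeps this short: there are no sub-fields to split a second time.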
Comment 10 Ivan Sorokin 2025-12-05 15:30:51 UTC
Here's a more detailed list of kitty protocol features not supported in far2l. Perhaps implementing this protocol in Konsole in the same scope could be a good start, reducing the implementation complexity to a reasonable level while still covering a significant portion, if not all, of real-world application needs:

1. The keyboard flags stack operations (`push`/`pop`) and independent state maintenance for the alternate screen are not implemented.
2. The bitwise flag manipulation modes (set, OR, AND-NOT) for the `CSI =` command are not supported.
3. Key repeat events (event type 2) are not supported.
4. `Super`, `Hyper`, and `Meta` modifiers are not supported.
5. PUA code mappings are missing for Keypad keys, extended function keys (`F13`-`F35`) and Multimedia keys.
6. Keys pressed with `Shift` report the shifted keycode instead of the unshifted keycode.
7. The "associated text" field contains the same value as the shifted keycode.
8. UTF-16 surrogate pairs (e.g., Emoji) are transmitted as two separate events rather than a single UTF-32 codepoint.
9. IME input bypasses the escape sequence generation even in full reporting mode (Mode 8).
10. The implementation ignores the Application Cursor Keys mode (`DECCKM`) state when encoding arrow keys.
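
As a rough illustration of what items 1 and 2 refer to, the sketch below models the flags stack in Python. It assumes the semantics described in the kitty keyboard protocol specification: `CSI > flags u` pushes, `CSI < number u` pops, and `CSI = flags ; mode u` modifies the current flags with mode 1 (set), 2 (OR) or 3 (AND-NOT). The class name and the depth limit are hypothetical choices for this example.

```python
class KeyboardFlags:
    """Stack of keyboard enhancement flags for one screen (sketch)."""

    MAX_DEPTH = 16  # arbitrary limit chosen for this sketch

    def __init__(self) -> None:
        self.stack = [0]  # the top of the stack holds the current flags

    @property
    def current(self) -> int:
        return self.stack[-1]

    def set(self, flags: int, mode: int = 1) -> None:
        """CSI = flags ; mode u"""
        if mode == 1:      # replace all bits
            new = flags
        elif mode == 2:    # OR: turn on the given bits, keep the rest
            new = self.current | flags
        elif mode == 3:    # AND-NOT: turn off the given bits, keep the rest
            new = self.current & ~flags
        else:
            raise ValueError(f"unknown mode {mode}")
        self.stack[-1] = new

    def push(self, flags: int = 0) -> None:
        """CSI > flags u"""
        if len(self.stack) >= self.MAX_DEPTH:
            del self.stack[0]  # evict the oldest entry when full
        self.stack.append(flags)

    def pop(self, count: int = 1) -> None:
        """CSI < number u"""
        for _ in range(count):
            if len(self.stack) > 1:
                self.stack.pop()
            else:
                self.stack[0] = 0  # popping past the bottom resets all flags
                break


# One independent stack per screen, as the protocol describes.
main_screen, alternate_screen = KeyboardFlags(), KeyboardFlags()
main_screen.push(0b1)           # application enables "disambiguate"
main_screen.set(0b10, mode=2)   # additionally request event types
print(bin(main_screen.current))
main_screen.pop()               # application exits and restores the old state
print(main_screen.current, alternate_screen.current)
```

An implementation with a single "apply" mode, as described above for far2l, corresponds to keeping only `set()` with mode 1 and no stack.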
Comment 11 Kovid Goyal 2025-12-06 03:53:58 UTC
I can certainly appreciate that implementing the kitty protocol in a terminal takes some effort, but it is very much *necessary* effort, and I strongly suggest that if Konsole wants to implement it, it should do so properly, with full support for all features of the protocol. It takes roughly 500 lines of C code to implement the key encoding and another few hundred to implement support for tracking unshifted and base layout keys, which, IMO, is a very small price to pay to enable robust key handling for all terminal programs. See kitty/key_encoding.c (encode_glfw_key_event) and xkb_glfw.c for how to translate X11 and Wayland key events into the GLFW key events used by encode_glfw_key_event. Including the table mapping XKB key names to GLFW key names, this all comes out to less than 1500 lines of code.

This code is pretty trivially ported to Konsole since Konsole is in C++ and based on Qt. The QKeyEvent class gives us the nice nativeModifiers(), nativeScanCode() and nativeVirtualKey() functions which can be fed directly into the code in xkb_glfw.c to convert them into a normalised form for encoding to the terminal. The remaining slightly hairy bits will be handling keyboard layout change events. Again you can directly lift the code from xkb_glfw.c using the Qt facility to access native events (in this case the keyboard change events).
Comment 12 Ivan Sorokin 2025-12-06 13:50:23 UTC
I'd like to point out that I'm not arguing against implementing all of the protocol's capabilities. I'm proposing an initial scope, a first milestone, which will still be useful in its own right.