Why Is It so Hard to Detect Keyup Event on Linux?
2019-01-10 - By Robert Elder
Introduction
This article is dedicated to discussing why it's so hard to detect the 'key up' or 'key release' event in a Linux terminal environment without relying on X server. Several techniques and code examples will be shown that are capable of detecting the keyup event on Linux (with and without an X server), but all techniques presented here have significant limitations due to historical reasons that will be discussed. These limitations will be discussed in terms of implementation details of the Linux console, terminal emulators, X server, and SSH based environments.
A reader of this article is most likely interested in this topic for the same reason that I was before I dove into this topic: They are interested in performing some real-time based task that is controlled using keyboard presses. In my case, the goal was to remotely navigate a robot over an SSH connection using the 'w', 'a', 's', 'd' keys. Real-time tasks like this require extremely high responsiveness to key events for palatable performance. Methods such as polling 'getch' are inadequate for applications that require high responsiveness since they must contend with features like keyboard debouncing, key auto-repeating, and missed keyup events that happen when multiple keys are pressed together. Furthermore, naive implementations of this problem often employ polling which burns up 100% CPU, or 'delay' functions which increase response time (or a combination of both). An example of this simple and naive implementation is presented at the end of this article. Another implementation that avoids both of these pitfalls will be shown using I/O multiplexing.
As a high-level summary, here are your only options if you want to detect minimum latency, real-time keyup events on Linux:
- You can detect keyup in any Linux terminal/console (non-graphical environment), but this always requires elevated privileges. Also, if you're running this program in a graphical terminal environment like desktop Ubuntu, you will notice that you get keyup events globally from all applications regardless of whether they have keyboard focus or not.
- You can detect keyup events sourced from the currently focused window in an X server (graphical environment) without gaining root privileges, but this will require that the computer that collects the events be running an X server, and the computer that collects the events be running software that can receive and process these X events. Therefore, if you're doing this over an SSH connection, you'll need to use X forwarding, which may not be possible if your remote client is a headless server, or a constrained embedded system.[3]
If you don't care about minimizing latency and are ok with some false positives, see the 'Corrections & Updates (Added on 2021-11-26)' section near the end of this article.
The Simplest Way To Detect Key Release Anywhere in Linux
If you're just here for a quick python solution to detect keyup/keydown events in a Linux terminal/console environment, you can use the 'keyboard' package:
# At least one of these should work with python2:
sudo pip install keyboard
sudo pip2 install keyboard
# This should work with python3
sudo pip3 install keyboard
And here is a corresponding example program that will print a message on key release and key press down:
import signal
import keyboard
import time

class MyKeyEventClass1(object):
    def __init__(self):
        self.done = False
        signal.signal(signal.SIGINT, self.cleanup)
        keyboard.hook(self.my_on_key_event)
        while not self.done:
            time.sleep(1)  # Wait for Ctrl+C

    def cleanup(self, signum, frame):
        self.done = True

    def my_on_key_event(self, e):
        # The hook fires for both directions: e.event_type is 'down' or 'up'.
        print("Got key " + str(e.event_type) + " event: " + str(e))

a = MyKeyEventClass1()
Now, if you run the above program, you'll find that it requires root privileges[1]. Root privileges to detect the key release event? Really? You obviously don't need to be root to detect when a key is pressed, so why do you need root to detect when it gets released? Why does such a fundamental and common event require root access? Furthermore, why do we also get all events from every window regardless of whether it has focus or not? Well, let's take a look at what the python 'keyboard' module is doing internally. Here is a stripped down version of how it detects key events on Linux:
import signal
import re
import threading
import time
import struct

class MyKeyEventClass2(object):
    def __init__(self):
        self.done = False
        signal.signal(signal.SIGINT, self.cleanup)
        with open('/proc/bus/input/devices') as f:
            devices_file_contents = f.read()
        for handlers in re.findall(r"""H: Handlers=([^\n]+)""", devices_file_contents, re.DOTALL):
            event_match = re.search(r'event(\d+)', handlers)
            if 'kbd' in handlers and event_match:
                dev_event_file = '/dev/input/event' + event_match.group(1)
                t = threading.Thread(target=self.read_events, kwargs={'dev_event_file': dev_event_file})
                t.daemon = True
                t.start()
        while not self.done:  # Wait for Ctrl+C
            time.sleep(1)

    def cleanup(self, signum, frame):
        self.done = True

    def read_events(self, dev_event_file):
        print("Listening for kbd events on dev_event_file=" + str(dev_event_file))
        try:
            of = open(dev_event_file, 'rb')
        except IOError as e:
            if e.strerror == 'Permission denied':
                print("You don't have read permission on ({}). Are you root?".format(dev_event_file))
            return  # Cannot read from this device.
        # See kernel documentation for 'struct input_event' ('value' is a signed int).
        # For details, read section 5 of this document:
        # https://www.kernel.org/doc/Documentation/input/input.txt
        event_bin_format = 'llHHi'
        event_size = struct.calcsize(event_bin_format)
        while True:
            data = of.read(event_size)
            if len(data) < event_size:
                return  # Device went away or returned a short read.
            seconds, microseconds, e_type, code, value = struct.unpack(event_bin_format, data)
            full_time = seconds + microseconds / 1000000.0
            if e_type == 0x1:  # 0x1 == EV_KEY means key press or release.
                # value == 0 release, value == 1 press, value == 2 autorepeat
                d = {0: "RELEASE", 1: "PRESS"}.get(value, "AUTOREPEAT")
                print("Got key " + d + " from " + str(dev_event_file) + ": t=" + str(full_time) + "s type=" + str(e_type) + " code=" + str(code))

a = MyKeyEventClass2()
The above code works by parsing the file at '/proc/bus/input/devices', and looking for any input nodes that are associated with the 'kbd' handler. Once it finds all of these device files (at locations like '/dev/input/event4'), it opens each of these files and creates dedicated threads to do blocking reads from each of these event files. When a keyboard event occurs, the Linux kernel communicates it through these event files using a format that is described in Linux Kernel Input. Each read will return a 'struct input_event' which is described at the link just mentioned. Note that the above example only gives you 'keycodes' (not ASCII) so you have to decode these to actual 'keys' (explained in a later section).
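To make the record layout a bit more concrete, here is a minimal sketch that packs a few synthetic 'struct input_event' records and decodes them again, so it can be run without root and without a keyboard. The helper name 'decode_input_event' and the sample values are made up for illustration, and the format string assumes that the native 'long' on your machine matches the kernel's timestamp fields (true on typical 64-bit Linux):

```python
import struct

# Native layout of 'struct input_event': timeval (two longs), type, code, value.
# Assumption: native 'long' matches the kernel's timeval fields on this machine.
EVENT_FORMAT = 'llHHi'
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
EV_KEY = 0x01

VALUE_NAMES = {0: "RELEASE", 1: "PRESS", 2: "AUTOREPEAT"}

def decode_input_event(data):
    """Decode one raw record; returns None for anything that is not EV_KEY."""
    seconds, microseconds, e_type, code, value = struct.unpack(EVENT_FORMAT, data)
    if e_type != EV_KEY:
        return None
    return {
        'time': seconds + microseconds / 1000000.0,
        'code': code,
        'action': VALUE_NAMES.get(value, "UNKNOWN"),
    }

# Synthetic records standing in for what a read() on /dev/input/eventN returns.
press = struct.pack(EVENT_FORMAT, 100, 500000, EV_KEY, 30, 1)
release = struct.pack(EVENT_FORMAT, 101, 0, EV_KEY, 30, 0)
sync = struct.pack(EVENT_FORMAT, 101, 0, 0x00, 0, 0)  # EV_SYN record, ignored

for record in (press, release, sync):
    print(decode_input_event(record))
```

The third record shows why the type check matters: the kernel interleaves non-key records (such as synchronization markers) with the key events, and those should simply be skipped.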
The fact that this example sources the events from /dev/input/event* shows that this application is collecting all events from the entire system, rather than ones that are specific to any user on the system let alone from any individual application. This explains why we need root access, since this is basically a key logger that reads keystrokes bound for all applications on the system. It also explains why it gives us all events regardless of which window has focus or not.
So why can't we just let our application query for the current 'stdin' or 'tty' or 'pty' or whatever and read keyboard events from there? Well, here's the key historical insight: A tty doesn't know about the concept of key press or key release events. It only knows about 'data stream in' and 'data stream out'. Many commonly used methods to detect key press events in a terminal are actually not detecting the key press, they are detecting the 'new input character is available' event, which happens to correlate closely with a key press. There is no corresponding event that a tty knows about for 'key up'.
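The 'data stream in, data stream out' model can be seen directly with a pseudo-terminal pair. The following is only a small sketch, assuming a Linux system where Python's 'pty' module can allocate a pseudo-terminal; the master side plays the role of the terminal emulator and the slave side is what an application would see as its tty:

```python
import os
import pty
import tty

# Create a pseudo-terminal pair: the master side plays the role of the
# terminal emulator, the slave side is what an application sees as its tty.
master_fd, slave_fd = pty.openpty()
tty.setraw(slave_fd)  # no line buffering, no echo

# "Typing" a key is nothing more than the emulator writing a byte.
os.write(master_fd, b"a")
received = os.read(slave_fd, 1)
print("Application received: %r" % received)

# There is no write the emulator could make that means "the 'a' key went up";
# anything written here would simply show up as more input characters.
os.close(master_fd)
os.close(slave_fd)
```

All the application ever gets is the byte. Whether the key is still held down, or was released a second ago, is not part of the stream.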
It is, however, possible to use some ioctl magic and tty mode changing tricks to peek into the implementation details of the current tty and detect if there is a local keyboard attached somewhere. You can then change the tty mode to directly read keyboard codes and detect key release this way, but this also requires root privileges, and doesn't take into consideration the window 'focus' of which terminal you are currently using. An example of this will be demonstrated later.
Since many of you are likely using 'terminals' within the context of a graphical Linux distribution using an X server, it might seem like a strange bug that the techniques discussed so far don't consider the current 'terminal window focus' when generating these key events. This can be explained easily: Your 'terminal' (such as gnome-terminal, xterm, terminator etc.) is a terminal emulator. It is a modern software emulation based on the historical model of the teletype terminal, which is basically just an electronic typewriter. In fact, you can find videos on YouTube of people using teletype terminals as real Linux consoles and it still works! Many of you probably have never touched a typewriter before, but if you have, it might make a bit more sense to you as to why a modern Unix 'tty' doesn't support informing you about a 'key up' event: On an old typewriter the primary thing you're communicating is 'characters' and events like 'key up' never make it onto the page. In this sense, key up events are effectively an irrelevant implementation detail of the typing process as far as the teletype is concerned. The most that holding down a key for longer will get you is repeated character input onto your page, but you'll never be able to determine how many key up events took place (and in what order) from looking at the resulting sheet of paper.
Detecting Key Release In An SSH Connection
Are you interested in detecting local 'key up' events over an SSH connection without involving an X server? Oh boy, are you in for a disappointment! It turns out, that it's impossible. I don't mean the "I tried really hard and couldn't figure out how to do it, so it must therefore be impossible" kind of impossible, but the real kind of impossible where the keyup event doesn't get communicated at all during a non-X forwarded SSH session. Here is an experiment you can do to prove this to yourself: Open two terminals side-by-side. In one terminal window, run this tcpdump command which will show all the packets that are sent over the ssh session (assuming the machine you ssh into is listening on port 22):
sudo tcpdump dst port 22
Now, in your other terminal window, ssh into your remote machine and run the following sequence of key presses/releases while logged into the remote server: 1) Hold down the 'a' key and keep it held down. 2) While keeping the 'a' key held down, now press and hold down the 'b' key. All this while, notice that you can see packets being sent due to holding down the 'a' and 'b' keys. 3) Now, release the 'b' key while keeping the 'a' key held down. Notice that there are no more packets being sent to the remote server since pressing down the 'b' key canceled the autorepeat of the 'a' key. 4) Now, release the 'a' key and take note of how many packets are sent through SSH connection. Zero! There is no way the remote machine is going to detect the fact that you released the 'a' key if there are 0 bytes of information transmitted about the event!
So, what do you do when you want to send 'key release' events over SSH when the target machine doesn't have an X server? Well, you have to build your own client/server application where a client listens locally on the machine with the keyboard, and then forwards these events to the remote machine where you're running the application that needs to respond to these events. That sounds like a lot of work because you have to set up all the sockets and custom messaging protocols, but there's no way around it because the key release events simply aren't forwarded over your terminal-based SSH session when there is no X server forwarding configured.
Something worth noting to avoid confusion is that if you run the python 'keyboard' example from the start of this article over an SSH connection, the code will still work and run, but when you type characters, nothing will happen. What gives?!? That's because it will be detecting 'local' keyboard events from the machine you just SSHed into! If it's a cloud server like EC2 or something, there probably aren't any keyboards attached to it!
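The messaging part of such a client/server pair does not have to be elaborate. As a rough sketch (the 3-byte wire format and all function names below are invented for illustration, not taken from any existing tool), each key event could be sent as a keycode plus an up/down flag. A connected socket pair stands in for the real network connection so the example can run on one machine:

```python
import socket
import struct

# Hypothetical wire format: 2-byte keycode (network byte order) + 1-byte flag.
MESSAGE_FORMAT = '!HB'
MESSAGE_SIZE = struct.calcsize(MESSAGE_FORMAT)

def encode_key_event(keycode, is_up):
    return struct.pack(MESSAGE_FORMAT, keycode, 1 if is_up else 0)

def decode_key_event(data):
    keycode, flag = struct.unpack(MESSAGE_FORMAT, data)
    return keycode, bool(flag)

def recv_exactly(sock, size):
    """Read exactly 'size' bytes, since a stream socket may return fewer."""
    chunks = b""
    while len(chunks) < size:
        chunk = sock.recv(size - len(chunks))
        if not chunk:
            raise EOFError("peer closed the connection")
        chunks += chunk
    return chunks

# A connected socket pair stands in for the real client/server TCP connection.
local_side, robot_side = socket.socketpair()
local_side.sendall(encode_key_event(17, False))  # keycode 17 pressed
local_side.sendall(encode_key_event(17, True))   # keycode 17 released

first = decode_key_event(recv_exactly(robot_side, MESSAGE_SIZE))
second = decode_key_event(recv_exactly(robot_side, MESSAGE_SIZE))
print(first, second)
local_side.close()
robot_side.close()
```

Fixed-size messages keep the receiving side simple: it never has to search for message boundaries, it just reads three bytes at a time.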
If, however, you are running X on the source and destination machine, you can send key release events quite easily. When you start your SSH connection, simply include the '-X' parameter:
ssh -X my-remote
The C++ example included in the next section can be used to demonstrate detecting keyup event over an X forwarded SSH connection.
C++ Code to Detect Key Release in Linux X Server
If you're not bound by the requirement to be able to detect the key release event in a terminal/console environment, you can of course use XLib. The example code below will open a small window that will detect key press and key release events. This example doesn't require root, and it will behave as a user would expect with key events only being detected when the user has focus of the window.
/* Dependencies:
       sudo apt-get install libx11-dev
   I used the following command to compile and run this code:
       g++ main.cpp -g -L/usr/X11/lib -lX11 && ./a.out
*/
#include <stdio.h>
#include <stdlib.h>
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <X11/Xos.h>
#include <X11/Xatom.h>
#include <X11/keysym.h>

bool was_it_auto_repeat(Display * d, XEvent * event, int current_type, int next_type){
    /* Holding down a key will cause 'autorepeat' to send fake keyup/keydown events, but we want to ignore these: */
    if(event->type == current_type && XEventsQueued(d, QueuedAfterReading)){
        XEvent nev;
        XPeekEvent(d, &nev);
        return (nev.type == next_type && nev.xkey.time == event->xkey.time && nev.xkey.keycode == event->xkey.keycode);
    }
    return false;
}

int main () {
    Display * d = XOpenDisplay(NULL);
    if(d == NULL){
        fprintf(stderr, "Unable to open display. Is an X server available?\n");
        return 1;
    }
    Window win = XCreateSimpleWindow(d, RootWindow(d, 0), 1, 1, 400, 300, 0, BlackPixel(d, 0), BlackPixel(d, 0));
    /* ClientMessage is an event type, not an event mask, and these events are delivered without being selected. */
    XSelectInput(d, win, KeyPressMask | KeyReleaseMask);
    XMapWindow(d, win);
    XFlush(d);
    XEvent event;
    /* Pass False so the atom is created if it does not exist yet. */
    Atom closeMessage = XInternAtom(d, "WM_DELETE_WINDOW", False);
    XSetWMProtocols(d, win, &closeMessage, 1);
    bool done = false;
    while(!done) {
        XNextEvent(d, &event);
        switch(event.type) {
            case KeyPress: {
                fprintf(stdout, "key #%ld was pressed.\n", (long) XLookupKeysym(&event.xkey, 0));
                break;
            }case KeyRelease:{
                if(was_it_auto_repeat(d, &event, KeyRelease, KeyPress)){
                    XNextEvent(d, &event); /* Consume the extra event so we can ignore it. */
                }else{
                    fprintf(stdout, "key #%ld was released.\n", (long) XLookupKeysym(&event.xkey, 0));
                }
                break;
            }case ClientMessage:{
                if ((Atom)event.xclient.data.l[0] == closeMessage) {
                    done = true;
                }
                break;
            }
        }
    }
    XDestroyWindow(d, win);
    XCloseDisplay(d);
    return 0;
}
The example above includes an extra utility function that attempts to filter out the 'auto repeat' characters that you get from holding a key down. There is a function to turn this off in X Server, but this can affect other applications and the method used above will avoid this problem.
Finally, if you want to use python, here is an example of detecting key release using pygame that depends on X server:
import pygame
import time

pygame.init()
d = pygame.display.set_mode((800,600))
d.fill((255,255,255))
done = False
while not done:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            done = True
        if event.type == pygame.KEYUP:
            print("Got keyup event: " + str(event))
pygame.quit()
quit()
Minimizing Response Time With Poll or Select
We're not done talking about the solution to the original problem! Now that we have some example code that can detect the key up event in a non-graphical environment, we still want to keep response time to a minimum. The earlier example based on the 'keyboard' module uses multiple threads to actually listen for the events on each event device. Using extra threads means the 'keyboard' module will have to prevent race conditions on buffers (which requires locking) and perform more context switching before the key release data is actually dealt with and processed. The model that I would prefer to use employs 'select' or 'poll':
import signal
import re
import struct
import select
import os

class MyKeyEventClass3(object):
    def on_fd_read(self, fd):
        recv_return = bytearray(os.read(fd, struct.calcsize(self.event_bin_format)))
        if len(recv_return) < struct.calcsize(self.event_bin_format):
            return  # Short read, nothing to decode.
        seconds, microseconds, e_type, code, value = struct.unpack(self.event_bin_format, recv_return)
        full_time = seconds + microseconds / 1000000.0
        if e_type == 0x1:  # 0x1 == EV_KEY means key press or release.
            # value == 0 release, value == 1 press, value == 2 autorepeat
            d = {0: "RELEASE", 1: "PRESS"}.get(value, "AUTOREPEAT")
            print("Got key " + d + " from fd " + str(fd) + ": t=" + str(full_time) + "s type=" + str(e_type) + " code=" + str(code))

    def __init__(self):
        # See kernel documentation for 'struct input_event' ('value' is a signed int).
        self.event_bin_format = 'llHHi'
        self.done = False
        signal.signal(signal.SIGINT, self.cleanup)
        self.poller = select.poll()
        initial_event_mask = select.POLLIN | select.POLLPRI | select.POLLHUP | select.POLLERR
        with open('/proc/bus/input/devices') as f:
            devices_file_contents = f.read()
        files = {}
        for handlers in re.findall(r"""H: Handlers=([^\n]+)""", devices_file_contents, re.DOTALL):
            event_match = re.search(r'event(\d+)', handlers)
            if 'kbd' in handlers and event_match:
                dev_event_file = '/dev/input/event' + event_match.group(1)
                try:
                    files[dev_event_file] = open(dev_event_file, 'rb')
                    # Listen for events on this file descriptor:
                    self.poller.register(files[dev_event_file].fileno(), initial_event_mask)
                    print("Listening to " + str(dev_event_file) + " on fd " + str(files[dev_event_file].fileno()))
                except IOError as e:
                    if e.strerror == 'Permission denied':
                        print("You don't have read permission on ({}). Are you root?".format(dev_event_file))
                        return
        while not self.done:
            try:
                events = self.poller.poll(None)
                for fd, flag in events:
                    if flag & (select.POLLIN | select.POLLPRI):
                        self.on_fd_read(fd)
                    if flag & (select.POLLHUP | select.POLLERR):
                        return  # Lost the file descriptor
            except Exception as e:
                return  # Probably interrupted system call

    def cleanup(self, signum, frame):
        self.done = True

a = MyKeyEventClass3()
The reason I prefer this method is because this allows us to deal with the incoming data from the key press using the same APIs that we would use to send the data elsewhere once we've got it. This is because APIs like select and poll allow you to operate on network sockets and files as if they were the same kinds of objects! In a client/server model where the key press event gets sent to a remote robot, we don't really care about the key event on the computer where it occurred, we just want to get it sent to the robot as fast as possible. If we use separate threads, there is still going to be a (very small) amount of latency involved with switching between the event reading thread and the event processing thread once we do get the event. Locking and unlocking primitives are also usually implemented as kernel calls which also introduce delay. As a bonus, your code is now single-threaded!
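To illustrate the point about files and sockets being handled through the same API, here is a small single-threaded sketch in which one 'poll' object watches both an event source and a network connection. Everything in it is a stand-in chosen so that it runs anywhere: a pipe plays the role of the opened '/dev/input/eventN' file, a socket pair plays the role of the connection to the robot, and 'forward_once' is a made-up helper name:

```python
import os
import select
import socket

# Stand-ins: a pipe plays the role of an opened /dev/input/eventN file and a
# socket pair plays the role of the connection to the remote machine.
event_read_fd, event_write_fd = os.pipe()
to_remote, remote_end = socket.socketpair()

poller = select.poll()
poller.register(event_read_fd, select.POLLIN)
poller.register(to_remote.fileno(), select.POLLIN)

def forward_once(timeout_ms):
    """Wait for readiness on either descriptor and handle whatever is ready."""
    handled = []
    for fd, flag in poller.poll(timeout_ms):
        if fd == event_read_fd and flag & select.POLLIN:
            to_remote.sendall(os.read(fd, 64))   # local event -> remote
            handled.append('event')
        elif fd == to_remote.fileno() and flag & select.POLLIN:
            to_remote.recv(64)                   # e.g. an acknowledgement
            handled.append('reply')
    return handled

# Simulate a key event arriving locally and check that it reaches the far end.
os.write(event_write_fd, b"key-event-bytes")
handled = forward_once(1000)
forwarded = remote_end.recv(64)
print(handled, forwarded)
```

The same loop wakes up for a local key event and for data coming back from the remote side, with no second thread and no locks.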
Converting Keycodes to Keys and Characters
We're still not done because most of the examples explained above only give you the 'keycodes' and not the actual decoded keys that you'd recognize. On Linux, you can use a program called 'dumpkeys' to get the mappings from keycodes -> keys. Here are a couple of python utility functions that will do this for you:
import subprocess
import re

def get_keymap_as_string():
    try:
        # External call to 'dumpkeys' executable.
        child = subprocess.Popen('dumpkeys', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        stdout, stderr = child.communicate()
    except Exception as e:
        print("An exception happened when trying to get the keymap data: " + str(e) + ". Note that you need to be root in order for dumpkeys to work.")
        return None
    if child.returncode > 0:
        print("Return code was " + str(child.returncode) + " when getting keymap data. stderr was " + str(stderr))
        return None
    else:
        return stdout.decode("utf-8")

def parse_keymap_file(s):
    m = {}
    try:
        for line in s.splitlines():
            trimmed_line = line.strip()
            if re.match(r"^keycode.*", trimmed_line):
                # Example format of the split format of each line we expect:
                # ['keycode', '30', '=', '+a']
                # Some keymappings will have multiple things they map to, but we just take the first one:
                # ['keycode', '86', '=', 'less', 'greater', 'bar']
                parts = trimmed_line.split()
                if(len(parts) >= 4):
                    m[int(parts[1])] = parts[3]
        return m
    except Exception as e:
        print("An exception happened when trying to parse the keymap data: " + str(e) + ".")
        return None

s = get_keymap_as_string()
if s:
    keymap = parse_keymap_file(s)
    if keymap is not None:
        some_keycodes = [30, 31, 32, 100]
        print("Here are examples of decoded keycodes:")
        for k in some_keycodes:
            # Use .get() so a keycode missing from this keymap doesn't raise KeyError.
            print("keycode=" + str(k) + " key=" + str(keymap.get(k)))
    else:
        print("Error while decoding keycode map.")
else:
    print("Unable to obtain keycode map, perhaps you need to use 'sudo'?")
For me, the above will output the following:
Here are examples of decoded keycodes:
keycode=30 key=+a
keycode=31 key=+s
keycode=32 key=+d
keycode=100 key=Alt
The above code might not be very portable, and you may have to dig into the parsing yourself to make adjustments, but it was adequate for my purposes.
Detecting Keyup Using ioctl and MEDIUMRAW
There is yet another way to detect key release events in a terminal environment that doesn't involve reading from /dev/input/event*, but unfortunately, this method also requires root and has the same 'window focus' related problem due to the fundamental nature of ttys. The following example code shows how to detect keyup events by reading directly from the first console/tty like device that can be identified to have a keyboard associated with it:
import signal
import re
import array
import fcntl
import struct
import termios
import os
import sys
import subprocess
import traceback

class PyKeyUpKeyDown(object):
    def __init__(self):
        # All of the following constants are defined in
        # the Linux kernel: include/uapi/linux/kd.h
        self.KDGKBMODE = 0x4B44  # Get current keyboard mode
        self.KDSKBMODE = 0x4B45  # Set current keyboard mode
        self.KDGKBTYPE = 0x4B33  # Get current keyboard type
        self.K_MEDIUMRAW = 0x02  # Medium raw (keycode) mode
        # Operational variables for this class:
        self.original_mode = None
        self.old_attr = None
        self.fd = None
        self.done = False
        signal.signal(signal.SIGINT, self.shutdown)
        signal.signal(signal.SIGHUP, self.shutdown)
        signal.signal(signal.SIGQUIT, self.shutdown)
        signal.signal(signal.SIGILL, self.shutdown)
        rtn = self.setup_keylisten()
        if rtn:
            print("Successfully set up keylistener. Press Ctrl+c to exit.")
            while not self.done:
                e = self.get_next_key_event()
                self.on_key_event(e)
        else:
            print("Was unable to set up keylistener. Perhaps you need to use 'sudo'?")
        # Always restore the keyboard and terminal modes before exiting.
        self.cleanup()

    def on_key_event(self, e):
        if e is not None:
            print("Observed %s for keycode=%u" % ("keyup" if e['is_up'] else "keydown", e['keycode']))

    def shutdown(self, signum, frame):
        print("Caught signal %s. Shutting down." % (str(signum)))
        self.done = True

    def has_a_keyboard(self, f):
        # Looking inside the linux kernel in include/uapi/linux/kd.h,
        # it looks like the kernel defines all of
        # KB_84 0x01
        # KB_101 0x02
        # KB_OTHER 0x03
        # as potential return values of ioctl -> tty3270_ioctl -> kbd_ioctl, for self.KDGKBTYPE
        # however, only KB_101 is ever returned anywhere...
        # Therefore, if this ioctl doesn't trigger an exception, assume
        # that it is associated with an underlying keyboard.
        try:
            buf = array.array('i', [0])
            fcntl.ioctl(f, self.KDGKBTYPE, buf, 1)
            return True
        except Exception:
            return False

    def identify_keyboard_sources(self, a):
        rtn = {}
        for x in a:
            rtn[x] = {}
            f = None
            try:
                f = os.open(x, os.O_RDONLY, 0)
                rtn[x]['has_keyboard'] = self.has_a_keyboard(f)
                rtn[x]['exception_on_open'] = False
            except Exception as e:
                rtn[x]['has_keyboard'] = False
                rtn[x]['exception_on_open'] = str(e)
            if f is not None:
                os.close(f)
        return rtn

    def cleanup(self):
        if self.original_mode is not None:
            if self.fd:
                fcntl.ioctl(self.fd, self.KDSKBMODE, self.original_mode)
        if self.old_attr is not None:
            if self.fd:
                termios.tcsetattr(self.fd, termios.TCSANOW, self.old_attr)
        if self.fd is not None:
            if self.fd:
                os.close(self.fd)
            self.fd = None

    def modeset(self):
        # These file paths sourced from getfd.c (See 'showkey' Linux tool source code.)
        sources = self.identify_keyboard_sources(["/dev/tty", "/dev/tty0", "/dev/console", "/dev/vc/0"])
        self.fd = None
        for source in sources:
            if sources[source]['has_keyboard']:
                self.fd = os.open(source, os.O_RDONLY, 0)
                break
        if self.fd is None:
            sys.stdout.write("No keyboard device could be identified. Perhaps you forgot to use 'sudo'?\n")
            sys.exit(1)
        # Query to determine the current keyboard mode so we can restore it later.
        buf = array.array('i', [0])
        fcntl.ioctl(self.fd, self.KDGKBMODE, buf, True)
        self.original_mode = buf[0]
        # Save terminal parameters to restore them later.
        self.old_attr = termios.tcgetattr(self.fd)
        self.new_attr = termios.tcgetattr(self.fd)
        # See comments and code on method
        # 'termios_tcgetattr(PyObject *self, PyObject *args)' of the python
        # standard library implementation in cpython/Modules/termios.c
        self.new_attr[0] = 0  # iflag
        # self.new_attr[1] is oflags.
        # self.new_attr[2] is cflags.
        # Turn off canonical mode, turn off character echo, turn on control character signals.
        self.new_attr[3] = ((self.new_attr[3] & ~termios.ICANON & ~termios.ECHO) | termios.ISIG)  # lflags
        # self.new_attr[4] is ispeed.
        # self.new_attr[5] is ospeed.
        # See http://www.unixwiz.net/techtips/termios-vmin-vtime.html for
        # details on VMIN and VTIME. Current values give blocking reads for maximum responsiveness:
        self.new_attr[6][termios.VMIN] = 1
        self.new_attr[6][termios.VTIME] = 0
        # Apply the terminal mode changes:
        termios.tcsetattr(self.fd, termios.TCSAFLUSH, self.new_attr)
        fcntl.ioctl(self.fd, self.KDSKBMODE, self.K_MEDIUMRAW)

    def get_keyboard_file_descriptor(self):
        return self.fd

    def get_next_key_event(self):
        try:
            if self.fd:
                buf = bytearray(os.read(self.fd, 1))
                return self.key_process(buf)
            else:
                return None
        except Exception as e:
            return None

    def key_process(self, buf):
        i_c = 0
        while i_c < len(buf):
            # The high bit of the first byte is set for a key release.
            s = (buf[i_c+0] & 0x80)
            # This calculation is implemented in showkey.c of the kbd package.
            # I think it has a dependency on having at least a 2.6 kernel.
            if i_c + 2 < len(buf) and (buf[i_c+0] & 0x7f) == 0 and (buf[i_c+1] & 0x80 != 0) and (buf[i_c+2] & 0x80 != 0):
                kc = (buf[i_c+1] & 0x7f) << 7 | (buf[i_c+2] & 0x7f)
                i_c += 3
            else:
                kc = (buf[i_c] & 0x7f)
                i_c += 1
            return {
                'keycode': kc,
                'is_up': bool(s)
            }

    def setup_keylisten(self):
        try:
            self.modeset()
        except Exception as e:
            traceback.print_exc()
            sys.stdout.write("An exception happened while setting up the key listener: " + str(e) + ".\n")
            return False
        return True

a = PyKeyUpKeyDown()
The above code is a lot more verbose and requires various terminal mode hacks that are quite ugly (especially when the terminal isn't restored properly due to an error case), so I do prefer the /dev/input based method, but I included this example anyway because it may (or may not?) be a bit more portable if there are cases out there where searching for devices subscribed to the 'kbd' handler doesn't work. The above 'os.read' call can also be placed behind a 'poll' or 'select' call in a server too. The same keycode map referenced earlier can be used to decode characters in this example too. This example was heavily inspired by the source code in showkey.
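Since the listener above reads one byte at a time, the part of 'key_process' that actually matters in practice is the single-byte case: the top bit says whether the key went up, and the low seven bits are the keycode. Here is that step in isolation as a tiny sketch (the function name is made up, and it deliberately ignores the three-byte form used for larger keycodes):

```python
def decode_mediumraw_byte(b):
    """Split one MEDIUMRAW byte into (keycode, is_up)."""
    return (b & 0x7f, bool(b & 0x80))

# 0x1e is keycode 30 going down; 0x9e is the same keycode with the high bit set.
for sample in (0x1e, 0x9e):
    keycode, is_up = decode_mediumraw_byte(sample)
    print("byte=0x%02x keycode=%u %s" % (sample, keycode, "keyup" if is_up else "keydown"))
```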
NCURSES
I have read online that there is a demo program in ncurses that illustrates a way of detecting the key up event, but I compiled all the examples from source and I wasn't able to find it. I believe that there probably is no such example in ncurses, but if you know of a counter-example, please shoot me an email to correct me.
Conclusion
Let's revisit the original question in the title: 'Why is it so hard to detect the keyup event in Linux?'. The answer is due to the fact that pure tty based environments in Linux have not historically needed support for 'key' related events but are instead only concerned with 'streams' and 'characters'. Although it is technically possible to detect the key release events in a terminal if there is an attached keyboard, this requires root privileges since there is no way to distinguish between key events for different applications. Furthermore, monitoring key events for all other applications would be considered a security issue. Modern graphical environments that host an X server can isolate key up events for individual applications, although if you want to forward these events in real-time to a remote server that doesn't host an X server, you will need to build your own custom protocol to send these events. If both computers support an X server, you can use X forwarding over SSH with the '-X' argument.
Corrections & Updates (Added on 2021-09-12)
[1] On many systems, the files in '/dev/input' belong to group 'input', so you wouldn't need to be 'root' as long as your user was a member of the 'input' group.
[2] Another resource that people mentioned for reading keyboard events on Linux was the SDL Library. See SDL_KEYDOWN and SDL_KEYUP. As far as I can tell, this library also collects keyboard events from '/dev/input' as described above (so it doesn't really offer a superior solution).
[3] Made a correction here regarding software requirements for X servers.
[4] This article made the front page of HackerNews twice, so there are lots of additional comments and insights to read: Jan 27, 2019 and May 5, 2021.
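Regarding note [1], a quick way to find out whether your current user can already read the event devices (for example through membership in the 'input' group) is to simply ask the operating system. This is only a small sketch, and 'readable_event_devices' is a name I made up for it:

```python
import glob
import os

def readable_event_devices(pattern='/dev/input/event*'):
    """Return the event device paths the current user is allowed to read."""
    return [path for path in sorted(glob.glob(pattern)) if os.access(path, os.R_OK)]

devices = readable_event_devices()
if devices:
    print("Readable without further privileges: " + ", ".join(devices))
else:
    print("No readable event devices; try 'sudo' or membership in the 'input' group.")
```

If the list is non-empty, the /dev/input based examples in this article should run for that user without 'sudo'.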
Corrections & Updates (Added on 2021-11-26)
This article was written specifically to discuss detection of keyup with a blocking I/O model that will minimize latency, but not burn up extra CPU. However, after writing this article, I've received some comments from people who have more flexible use cases. In particular, someone discussed their need to trigger a garage door opener upon the keyup event. For this use case, the following simple solution will probably be adequate and does not require any extra permissions to be run. However, this solution isn't perfect so take note of the caveats below:
import select
import sys
import tty
import termios

# Demonstrates how to detect something close to
# a 'keyup' event in terminal with a slight delay.
# Does not require elevated privileges, but
# can suffer from false positives and false negatives
# depending on keyboard auto-repeat timer values.
class MyPollingDelayTimeoutClass(object):
    def __init__(self):
        self.isDone = False
        # If this value is set too low, you can get false positives
        # If this value is set too high, you can get false negatives
        self.SELECT_TIMEOUT = 0.5

    def main_loop(self):
        is_key_currently_down = False
        fd = sys.stdin.fileno()
        old_settings = termios.tcgetattr(fd)
        tty.setraw(fd)
        try:
            while not self.isDone:
                in_files = select.select([sys.stdin], [], [], self.SELECT_TIMEOUT)[0]
                if in_files:
                    for file in in_files:
                        is_key_currently_down = True
                        c = file.read(1)
                        sys.stdout.write("Saw char: '" + c + "' = 0x" + format(ord(c), "x") + "\n\r")
                        if c == '\x03':  # Ctrl+C arrives as a character in raw mode.
                            self.isDone = True
                            sys.stdout.write("Got signal to cleanup...\n\r")
                else:
                    if is_key_currently_down == True:
                        sys.stdout.write("Observed transition from is_key_currently_down=True to is_key_currently_down=False. Must be a keyup event.\n\r")
                    is_key_currently_down = False
        finally:
            # Restore the terminal even if something goes wrong in the loop.
            termios.tcsetattr(fd, termios.TCSAFLUSH, old_settings)

a = MyPollingDelayTimeoutClass()
a.main_loop()
However, this solution suffers from a few pitfalls: If the value specified for SELECT_TIMEOUT is too low, it will conclude that there are more 'keyup' events than there actually are. If this value is set too high, it will miss some keyup events entirely. The behaviour of false negatives and positives will depend on the current value of the keyboard auto-repeat feature on your system. In a graphical session, this will be controlled by the 'X' server and can be changed using a command like this:
# Change initial delay after pressing character to 100ms, and
# delay time between characters to 20ms
# Warning: Running this command will likely make it hard to type normally.
xset r rate 100 20
As you can see, there are two different values that are used for auto-repeat settings, and when they are different (they are by default), it will be impossible to avoid both false-positives, and false-negatives as discussed above.
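The trade-off can be worked through with a little arithmetic. The sketch below models a single key being held down and counts how many 'keyup' events a timeout-based detector would report. The function name is made up, and the 500ms initial delay and 30ms repeat interval are assumed values chosen for illustration, not read from any particular system:

```python
def simulate_timeout_detector(hold_ms, initial_delay_ms, repeat_ms, timeout_ms):
    """Return (keyups reported, delay of the final report after the real release)."""
    # Times at which auto-repeat delivers characters while the key is held.
    char_times = [0]
    t = initial_delay_ms
    while t <= hold_ms:
        char_times.append(t)
        t += repeat_ms
    # A keyup is reported whenever no character arrives for timeout_ms.
    reported = 1  # the report that follows the last character
    for earlier, later in zip(char_times, char_times[1:]):
        if later - earlier > timeout_ms:
            reported += 1
    final_report_delay = char_times[-1] + timeout_ms - hold_ms
    return reported, final_report_delay

# One key held for 2 seconds, with assumed auto-repeat values of 500ms / 30ms.
print(simulate_timeout_detector(2000, 500, 30, 100))  # short timeout
print(simulate_timeout_detector(2000, 500, 30, 600))  # long timeout
```

With the 100ms timeout, the 500ms pause before auto-repeat begins is mistaken for a release, so two keyups are reported for a single real one. With the 600ms timeout only one keyup is reported, but it arrives 600ms after the key was actually released, and two quick taps less than 600ms apart would be merged into one.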
If you're working in a non-graphical session, there will likely be a completely different set of settings to control the 'console' auto-repeat behaviour.
If you decide to modify the auto-repeat value to a lower one and set both of them to be the same value, you will find yourself in a trade-off position where you can get high responsiveness to keyup events if you set the timeout value to be low enough (but you'll waste lots of CPU), or you can save CPU by settling for a slower response time. Neither of these cases is ideal, hence the need to write this article. If you are interested in experimenting with the timings, also see the VMIN and VTIME termios values mentioned earlier in the article.
Confusion About Polling
This article uses the word 'poll' in two confusingly similar contexts. The first is related to the computer science concept of polling which is used to describe a situation where you perform a check over and over (which usually burns up CPU). This is contrasted with event-based logic that doesn't use extra CPU and waits for something like a callback or interrupt to fire before more processing happens.
The second reference to 'poll' is in the use of the kernel function 'poll' (and its friends select and epoll), which are used to 'poll' for events... except that their internal implementations don't use polling at all (unless I'm very mistaken). In fact, every programmer who uses poll/select/epoll does so specifically because they want to avoid polling and use a more efficient blocking model for I/O instead. Therefore, the function 'poll' exists to do the exact opposite thing that its name says, which understandably leads to confusion.
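The difference is easy to measure. The following sketch waits the same amount of wall-clock time twice on a pipe that never becomes readable: once by checking over and over with a zero timeout (polling in the computer science sense), and once with a single blocking 'select' call. The 0.2 second wait is an arbitrary value chosen to keep the demo short:

```python
import os
import select
import time

read_fd, write_fd = os.pipe()  # nothing is ever written, so it never becomes readable
WAIT_SECONDS = 0.2

# Busy polling: ask "anything yet?" over and over with a zero timeout.
cpu_start = time.process_time()
deadline = time.monotonic() + WAIT_SECONDS
while time.monotonic() < deadline:
    select.select([read_fd], [], [], 0)
busy_cpu = time.process_time() - cpu_start

# Blocking: one call that sleeps in the kernel until ready or timed out.
cpu_start = time.process_time()
select.select([read_fd], [], [], WAIT_SECONDS)
blocking_cpu = time.process_time() - cpu_start

print("busy polling CPU time: %.3fs, blocking CPU time: %.3fs" % (busy_cpu, blocking_cpu))
os.close(read_fd)
os.close(write_fd)
```

On a typical system the busy loop consumes roughly the whole waiting period in CPU time, while the blocking call consumes almost none, even though both use the same 'select' function.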