
egl dri


The VIDEO_PLATFORM=egl-dri build on Linux and BSD is special in many ways, and the complexity of setting it up varies a lot depending on your wants and needs. Configuration and tuning is currently limited to a set of environment variables (run arcan without any arguments to list them), though this will shortly be refactored to use a more flexible configuration format as part of improving multi-vendor GPU support.


First, since the platform is built around the drm/kms infrastructure, it requires a recent kernel (4.4+) and a Mesa/libdrm build that exposes EGL and GL2.1 with support for your graphics card, preferably also with render-node support (for arcan_lwa and the accelerated game frameserver). These prerequisites are similar to what Wayland compositors like Weston have. MAKE SURE that you have AT LEAST one /dev/dri/cardN node, and hopefully also matching renderD nodes. Otherwise, you need to fix your kernel/driver situation first or you won't get anywhere.
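As a quick sanity check (a sketch; node names and numbers vary per driver and GPU count):

ls -l /dev/dri/
# expect at least one cardN entry, e.g. card0, and ideally
# matching renderDN entries, e.g. renderD128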

Second, environment variables to consider. Note that we neither follow nor care much for the XDG_ set of specifications; if those are relevant to your case, use a wrapper shell script that maps that set of folders to the namespaces the engine uses internally. See the manpage for the ARCAN_*PATH entries.
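As an illustration, such a wrapper could look like the sketch below. The variable names here are examples to check against the manpage, not an authoritative list:

#!/bin/sh
# map XDG-style folders to arcan namespaces (names are assumptions,
# see the arcan manpage for the real ARCAN_*PATH set)
base="${XDG_DATA_HOME:-$HOME/.local/share}/arcan"
export ARCAN_APPLBASEPATH="$base/appl"
export ARCAN_RESOURCEPATH="$base/resources"
exec arcan "$@"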

Third, permissions and devices. By default, the engine will just grab the first available /dev/dri/cardN entry; if your preferences differ, set the corresponding environment variable (ARCAN_VIDEO_DEVICE). You will need permissions on that device node (some distributions map this to the 'video' group) and possibly root/capabilities to become drmMaster (don't ask...), or try your luck with ARCAN_VIDEO_DRM_NOMASTER. The reason this isn't easier or more configurable right now is the ongoing engine refactoring to support multi-GPU and GPU hotplugging. Note that some device-node creation setups will not give you a deterministic allocation for cardN with multiple GPUs; fantastic. Some have also reported success with putting the cardN node in the group arcan will run as, and the renderD nodes in a group that untrusted clients can access.
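For example (assuming the node you want is card1 and that your distribution assigns it to the 'video' group):

ls -l /dev/dri/card*                 # check the owning group
sudo usermod -aG video "$USER"       # then log out and back in
ARCAN_VIDEO_DEVICE=/dev/dri/card1 arcan durden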

Running arcan without any arguments will list the variables relevant to the video and input platforms. Some of special interest here are ARCAN_VIDEO_WAIT_CONNECTOR, ARCAN_VIDEO_DRM_NOMASTER, ARCAN_VIDEO_DRM_NOBUFFER and ARCAN_INPUT_SCANDIR.
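For example, to wait for a connector and skip the drmMaster requirement in one go (a sketch; run arcan without arguments to confirm the exact names and accepted values on your build):

ARCAN_VIDEO_WAIT_CONNECTOR=1 ARCAN_VIDEO_DRM_NOMASTER=1 arcan durden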

The corresponding (Linux) event backend deliberately avoids udev. Either give the user arcan is running as access to the /dev/input set of nodes (usually by adding it to the input group), or have the udev setup generate a suitable folder of nodes and refer to it using the ARCAN_INPUT_SCANDIR environment variable.

The backend will use inotify to monitor that folder for new entries and try to probe / take control over those nodes when that happens. Note that we also need permission to run inotify on the folder. Restricting this to a whitelist is a good move in the days of devices like the Rubber Ducky, but also because there are a ton of terrible input devices out there that generate many kHz of bad/broken samples, adding a lot of processing overhead.
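One way to build such a whitelist folder is with a udev rule that symlinks only the device classes you trust into a dedicated directory (a sketch; the match rules and folder name are assumptions to adapt):

# /etc/udev/rules.d/99-arcan-input.rules
SUBSYSTEM=="input", KERNEL=="event*", ENV{ID_INPUT_KEYBOARD}=="1", SYMLINK+="arcan-input/%k"
SUBSYSTEM=="input", KERNEL=="event*", ENV{ID_INPUT_MOUSE}=="1", SYMLINK+="arcan-input/%k"

and then point the engine at it:

ARCAN_INPUT_SCANDIR=/dev/arcan-input arcan durden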

Keyboard maps, translation and other external notification systems are not part of the engine but of the running appl. There is a tools/conv.c utility for converting the Linux native keymaps (the ones you'd use loadkeys for) into .lua scripts that work with the Durden appl, and with other appls that use the symtable.lua support script for translation tables, but it does not do a fantastic job. The truth of the matter is that there is no good, reliable keymap format on Linux. The least broken / best documented format is the one used in Android, but its adoption in desktop hardware configurations is limited at best.
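A rough usage sketch (the build line, invocation and keymap path are all assumptions; adapt to your source tree and distribution):

cc -o keymap_conv tools/conv.c        # build from the arcan source tree
gzip -dc /usr/share/kbd/keymaps/i386/qwerty/us.map.gz > us.map
./keymap_conv us.map > us.lua         # drop the result where your appl looks for keymaps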

Note, the synchronization strategies in this platform are currently rather primitive. To handle multiple displays tear-free, we let the VSYNC of the primary display determine buffer swaps, and then let the other displays update and synch when ready. This means that realtime content like games and videos may be less smooth on displays other than the primary one. This limit comes from the lack of lower-level APIs and driver support; the features that will eventually fix this problem are 'atomic modeset' and 'synchronization fences', but both are very much works in progress.

Note, another concession is that although there is support for virtual terminal switching (and "login managers"), it can, and should, be disabled. There are just too many race conditions in every available layer for this to work reliably for everyone on every occasion; the underlying interface is frankly terrible and has a number of side effects that may or may not be relevant. Most engine features are in place to support multiple parallel sessions, or "seats" in XDG terminology, but we consider it a bad design and a bad idea, and it will receive considerably less favor, attention and priority than important use cases (reliable suspend/resume, low energy consumption, ...).

Note, the platform currently does a poor job (actually, none at all) of display hotplug detection (why this isn't provided through the normal device node and ioctls, rather than resorting to sysfs scraping, is surprising to say the least; there might be valid reasons hiding in the drivers). It instead relies on an explicit rescan called from the scripting layer, which can stall the graphics pipeline for several hundred milliseconds. Durden, for instance, permits rescan commands over its command-channel named pipe; hook that up to some other event layer and you're there. The problem is that there seem to be quite a few hard-to-catch race conditions (we are talking kernel crashes) from rapid, wild plug/unplug-while-scanning operations.
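For example (the FIFO location and menu path are assumptions; check the Durden documentation for the exact control-channel setup):

# ask durden to rescan displays via its control FIFO
echo "display/rescan" > "$HOME/.arcan/appl-out/durden/ipc/control"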

Note, drm/kms hardware support varies wildly, with a lot of instabilities tied directly to the running kernel version and so on. Concurrent use of multiple GPUs, from the same or even from different vendors, is not working yet, but it is a priority.

Note, if arcan is running incredibly sluggishly with high CPU use, check that you are not accidentally running Mesa with the llvmpipe / software fallback. In many distributions, the packages are split up, with an individual Mesa package for each GPU driver. In Void Linux, for instance, if you install mesa but forget mesa-intel-dri (assuming an Intel GPU), things may very well work, but slowly.
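On Void Linux, that check could look like this (package names are examples; adapt to your distribution and GPU):

xbps-query -l | grep mesa      # see which Mesa driver packages are installed
xbps-install -S mesa-intel-dri # add the accelerated driver for an Intel GPU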

Note, on getting rid of X dependencies: MesaGL and libdrm packaging typically comes with dependencies on the entire set of X libraries and everything that entails. It is possible, however, to build them without (assuming you don't want to use any X compatibility layers either). Some have reported success by cloning and building separately, with arguments like these to the MesaGL configure (add gallium drivers to fit your hardware):

./configure --enable-gles2 --enable-gles1 --disable-glx --enable-egl --enable-gallium-egl --with-gallium-drivers=nouveau,i915,radeonsi,swrast --enable-gallium-osmesa --with-egl-platforms=drm

In addition, the OPENGL_gl_LIBRARY entry in CMakeCache should point to libOSMesa.so, as this is not always detected by the find scripts.
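For example, from a clean build directory (the library path is an assumption; point it at wherever your build installed libOSMesa.so):

cmake -DOPENGL_gl_LIBRARY=/usr/local/lib/libOSMesa.so ../src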

NVidia-binary specifics

There is experimental support for running recent versions of the NVIDIA binary drivers, but expect it to have quite a few flaws when it comes to resolution switching, multiple displays, synchronization/timing and hot-plugging for a while. The normal Xorg-like restrictions apply, such as making sure there's no conflict with the Mesa GL implementation and that the nouveau driver is blacklisted.

You will need to load the nvidia-drm module with the modeset=1 argument and set the ARCAN_VIDEO_EGL_DEVICE environment variable. This is a temporary restriction due to the lack of a more flexible configuration setup, and to difficulties getting our probing to figure out which buffer transfer mechanism to use.
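For example (a sketch; the persistence mechanism varies per distribution, and whether the variable's value matters or only its presence should be checked against arcan's help output):

# /etc/modprobe.d/nvidia-drm.conf
options nvidia-drm modeset=1

# or, for the current boot only:
modprobe -r nvidia-drm && modprobe nvidia-drm modeset=1

ARCAN_VIDEO_EGL_DEVICE=1 arcan durden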

Accelerated clients won't get handle/buffer passing either at the moment; we are still missing code in shmif/egl-dri to register as an "EGL External Platform interface" provider (which does seem like a decent workaround to the EGL spec, though).