Guix, grafts, gee… #96

Open
LordYuuma opened this issue Oct 25, 2020 · 57 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@LordYuuma
Collaborator

LordYuuma commented Oct 25, 2020

Part of our design goals seems to be building against one version of GLib/GObject/GI, while allowing users to load any version. Today I am here to tell you that this is extremely broken.

Why Guix?

Guix is just the messenger, but there is a stronger reason behind why GI appears rather broken in any Guix package, and that is grafts.

What are grafts?

Grafts are Guix's way of not rebuilding the world when an important security update is rolled out. Basically, they allow you to build and link against the old version of a library while running the program against the new one. Traditional distros do that all the time and you don't even notice, but on Guix you actually have two versions of that library still lying around: the ungrafted one and the grafted one.

Why is this an issue?

Because it is possible to get those two mixed up, e.g. in guix environment. I am not sure which use cases are affected, but that one surely is. To see the difference, run

./configure --without-gir-hacks
make clean check

once inside a guix environment with grafts, and once in one without them. If you want to use Guix environments to prototype your applications, that means you'll have to use --no-grafts to work around these types of issues for now.

What to do from now on?

It is pretty clear to me that the main culprit here is a different version of GLib being linked to Guile-GI than the one that should be loaded through Guile-GI. To fix that, we'll probably have to overhaul our entire bootstrapping procedure starting at GTypes. And we'll likely have to preload some version of GObject before defining them. Much fun.

Workaround

If you are working on Guile-GI code inside guix environment and do not want to be haunted by this issue or by the question of how to resolve it, add --with-gir-hacks to your invocation of ./configure for the time being. If you are experiencing similar issues in your own GI-based projects, consider patching your GIRs in a manner similar to what we do.

@LordYuuma LordYuuma added the help wanted Extra attention is needed label Oct 25, 2020
@LordYuuma LordYuuma pinned this issue Oct 25, 2020
@ZelphirKaltstahl

I've only followed this repo on the side and I don't really understand much about it. I'm only interested in building nice GTK apps using GNU Guile at some point. That said, since you already started explaining, let me ask some uninformed, possibly dumb questions : )

  • How long would it take to "rebuild the world" for this project using --no-grafts?
  • If --no-grafts solves the problem already, why would you need to overhaul the entire bootstrapping procedure?
  • Why does one usually not notice grafting, when it is done by traditional distros?
  • Is there no way to tell Guix which version of a library to use, when it grafts?
  • If this is a bug in GNU Guix, why not wait / develop for a bugfix instead of throwing things away? Or is it too unlikely that the behavior will be fixed?

@LordYuuma
Collaborator Author

  • --no-grafts does not cause any rebuilds, you're stuck with the old library version.
  • I am really not certain that --no-grafts is a fix. If development environments are the only thing this bug affects, then fine, but having packaged some GI applications for Guix I am not sure what exactly happens there.
  • Traditional distros don't graft. They merely place an updated shared library in the same location, which Guix can't do.
  • As far as I know it's difficult to predict and not something a library should actually care about.
  • It is not a Guix-specific bug, Guix is just the messenger. The issue comes from having two versions of GLib/GObject side by side – the one we link against vs. the one we load dynamically through typelibs. The only thing special about Guix is that this routinely happens with GObject-2.0 against GObject-2.0, whereas on other distros you'd probably notice it if you tried loading GObject-1.0 or GObject-3.0 (were they to exist) through Guile-GI.

@daym

daym commented Oct 26, 2020

Wait, shouldn't Guix grafting also graft guile-gi (it should actually edit the shared objects in the derivation, too)? Then everything would be fine.

Does this problem actually happen when using an installed/dependent guix guile-gi package?

Right now I'm always using a manual git checkout of guile-gi, so that can't be grafted of course (because guix doesn't know that that checkout exists). Also, I recompile it very rarely, even as I update guix (read: it's stale).

I don't want to be the one responsible for a lot of unnecessary rework. Please make sure it's actually required and I've not been making the problem seem worse than it is.

@LordYuuma
Collaborator Author

You are not the only one responsible for this. I use Guix myself as a basis for developing this library and am very weirded out by having to resort to such hacks.

There is so far no precedent for Guile-GI packages in Guix, which might also have to do with the fact that the guile-gi recipe in Guix was rather broken for a long time (is it fixed now? I don't remember). I would assume some weird workaround would be required to get them to run, as with all the PyGI and GJS packages.

@daym

daym commented Oct 26, 2020

$ cat run
#!/bin/sh
exec ${HOME}/src/guile-gi/guile-gi-dannym/guile-gi/tools/uninstalled-env guix repl -L . "$@"
#exec guix environment -l ${HOME}/src/guile-gi/guile-gi-dannym/guile-gi/guix.scm --ad-hoc guile gdk-pixbuf adwaita-icon-theme shared-mime-info -- "$@" ${HOME}/src/guile-gi/guile-gi-dannym/guile-gi/tools/uninstalled-env guix repl a.scm
 dannym@dayas ~/src/guix-gui$ strace -f ./run main.scm 2>&1 |grep open  |grep glib |grep 'libglib.*\.so' |grep -v -- '-1'
[pid  5665] openat(AT_FDCWD, "/gnu/store/dp5l10lbgh66ap4idqvmkfms1qgjsj4r-profile/lib/libglib-2.0.so.0", O_RDONLY|O_CLOEXEC) = 15
[pid  5665] openat(AT_FDCWD, "/gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libglib-2.0.so.0", O_RDONLY|O_CLOEXEC) = 16
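
The same observation can be made from inside the process instead of through strace. Below is a minimal sketch, assuming a glibc/Linux system where dl_iterate_phdr is available; the names match_state, count_cb and count_loaded_objects are made up for this illustration and are not part of Guile-GI.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stdio.h>
#include <string.h>

/* State shared with the callback: what to look for and how often it was seen. */
struct match_state {
    const char *needle;   /* substring of the file name, e.g. "libglib-2.0.so" */
    int count;            /* number of loaded objects whose path contains it */
    int verbose;          /* non-zero: print every match to stderr */
};

static int
count_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct match_state *st = data;
    (void) size;
    if (info->dlpi_name && strstr(info->dlpi_name, st->needle)) {
        st->count++;
        if (st->verbose)
            fprintf(stderr, "loaded: %s\n", info->dlpi_name);
    }
    return 0;   /* a non-zero return would stop the iteration */
}

/* Returns how many currently loaded objects have `needle` in their path. */
int
count_loaded_objects(const char *needle, int verbose)
{
    struct match_state st = { needle, 0, verbose };
    assert(needle != NULL);
    dl_iterate_phdr(count_cb, &st);
    return st.count;
}
```

Called as count_loaded_objects("libglib-2.0.so", 1) after the typelibs have been loaded, a result greater than 1 would point at the situation shown in the strace output above: two different files providing the same library in one process.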

In order to debug this, I'd LD_PRELOAD something that overrides dlopen and prevents one of them from being opened. That way, hopefully the requester will fail and then we know who it is:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <string.h>
#include <stdio.h>

typedef void *(*dlopen_t)(const char *filename, int flags);

void *dlopen(const char *filename, int flags) {
        void* result;
        dlopen_t dlopen = dlsym(RTLD_NEXT, "dlopen");
        if (filename && strstr(filename, "xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libglib-2.0")) {
                fprintf(stderr, "dlopen %s\n", filename);
                return NULL;
        }
        result = dlopen(filename, flags);
        return result;
}

@LordYuuma
Collaborator Author

I'm sorry to say that, but I don't get any meaningful results (or even results at all) from adding this to LD_PRELOAD. It doesn't even appear to execute at all.

@daym

daym commented Oct 26, 2020

I do. Result: dlopen is sometimes called without a full path (for example: libcairo-gobject.so.2)! If that is a string literal in some executable file, that is not good--because those references won't be found by the grafter.

New version (inside ~/src/guile-gi/guile-gi-dannym/guile-gi):

Create block-open.c:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

typedef void *(*dlopen_t)(const char *filename, int flags);

void *dlopen(const char *filename, int flags) {
        void* result;
        dlopen_t dlopen = dlsym(RTLD_NEXT, "dlopen");
        fprintf(stderr, "dlopen %s\n", filename);
        if (filename && strstr(filename, "/") == NULL && strstr(filename, "libcairo-gobject.so")) {
                fprintf(stderr, "dlopen blocked %s\n", filename);
                result = dlopen(filename, flags);
                if (result) {
                        fprintf(stderr, "and would have been found! Aborting!\n");
                        int* x = 0;
                        *x = 5;
                        abort();
                }
                return NULL;
        }
        result = dlopen(filename, flags);
        return result;
}

Then compile it via gcc -fPIC -shared -o block-open.so block-open.c.

Then create a run script:

#!/bin/sh -e

LD_PRELOAD="$PWD/block-open.so" guix environment --preserve=LD_PRELOAD -l guix.scm --ad-hoc guile gdk-pixbuf adwaita-icon-theme shared-mime-info -- make tools/uninstalled-env tools/run-guile "$@"

I also edited tools/run-guile bottom to say:

exec ${top_builddir}/tools/uninstalled-env ${top_builddir}/libtool --mode=execute \
     -dlopen ${top_builddir}/libguile-gi.la \
     gdb --args "/gnu/store/0l5a4vx5w8xv6xwq7a6s7hc4r1790lvl-profile/bin/guile" "$@"

Then LD_PRELOAD=$PWD/block-open.so ./run examples/button1.scm.

Then r.

I got this:

[...]
dlopen libcairo-gobject.so.2
dlopen blocked libcairo-gobject.so.2
and would have been found! Aborting!
(gdb) bt
#0  0x00007ffff7fc724c in dlopen () from /home/dannym/src/guile-gi/guile-gi-dannym/guile-gi/block-open.so
#1  0x00007ffff3a4d7e9 in g_module_open () from /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgmodule-2.0.so.0
#2  0x00007ffff3beaac1 in g_typelib_symbol () from /gnu/store/dp5l10lbgh66ap4idqvmkfms1qgjsj4r-profile/lib/libgirepository-1.0.so.1
#3  0x00007ffff3be4475 in g_registered_type_info_get_g_type () from /gnu/store/dp5l10lbgh66ap4idqvmkfms1qgjsj4r-profile/lib/libgirepository-1.0.so.1
#4  0x00007ffff3c1f350 in gig_type_meta_init_from_type_info (meta=meta@entry=0x4ccdf0, type_info=type_info@entry=0x4c7540) at src/gig_data_type.c:231
#5  0x00007ffff3c1f72a in gig_type_meta_init_from_arg_info (meta=0x4ccdf0, ai=0x4c6cf0) at src/gig_data_type.c:33
#6  0x00007ffff3c23375 in arg_map_apply_function_info (func_info=0x4cb370, amap=<optimized out>) at src/gig_arg_map.c:108
#7  gig_amap_new (name=name@entry=0x4cc6c0 "container:propagate-draw", function_info=function_info@entry=0x4cb370) at src/gig_arg_map.c:69
#8  0x00007ffff3c26d83 in create_gsubr (specializers=0x7fffffffcb28, formals=0x7fffffffcb20, optional_input_count=0x7fffffffcb1c, required_input_count=0x7fffffffcb18, self_type=0x7ffff353d580, name=0x4cc6c0 "container:propagate-draw", function_info=0x4cb370) at src/gig_function.c:377

@daym

daym commented Oct 26, 2020

Reading the source code of gobject-introspection, they do _g_typelib_do_dlopen in order to actually dlopen (that was inlined).

Aaaand that was patched by Guix.

          /* 'gobject-introspection' doesn't store the path of shared
             libraries into '.typelib' and '.gir' files.  Shared
             libraries are searched for in the dynamic linker search
             path.  In Guix we patch 'gobject-introspection' such that
             it stores the absolute path of shared libraries in
             '.typelib' and '.gir' files.  Here, in order to minimize
             side effects, we make sure that if the library is not
             found at the indicated path location, we try with just
             the basename and the system dynamic library
             infrastructure, as per default behaviour of the
             library. */
          module = load_one_shared_library (shlibs[i]);
          if (module == NULL && g_path_is_absolute (shlibs[i]))
            {
              module = load_one_shared_library (g_basename(shlibs[i]));
            }

I suspect it gets into the if block body.
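
To make the suspected path easier to reason about, the fallback can be modelled outside gobject-introspection. This is only a sketch of the logic quoted above: load_fn, load_with_fallback and bare_names_only are hypothetical names, the loader is passed in as a function pointer so the logic can be exercised without real libraries, and strrchr stands in for g_basename.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical loader signature; stands in for load_one_shared_library(). */
typedef void *(*load_fn)(const char *path);

/* Last path component, without modifying the input. */
static const char *
base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Try the recorded path first; if that fails and the path was absolute,
   retry with the bare file name.  *fell_back reports which branch was taken,
   which is the spot where a debug message would be useful. */
void *
load_with_fallback(const char *path, load_fn load, int *fell_back)
{
    void *module;
    assert(path != NULL && load != NULL && fell_back != NULL);
    *fell_back = 0;
    module = load(path);
    if (module == NULL && path[0] == '/') {
        *fell_back = 1;
        module = load(base_name(path));
    }
    return module;
}

/* Demo loader for experiments: pretends that only bare names resolve, as if
   the recorded store path were gone but the search path still had a copy. */
void *
bare_names_only(const char *path)
{
    static int token;
    return strchr(path, '/') ? NULL : &token;
}
```

With bare_names_only as the loader, an absolute path that no longer exists still "loads" through the fallback, and nothing tells the caller that a different file was picked up.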

A kingdom for a g_debug in there...

@LordYuuma
Collaborator Author

I think I have something simpler:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

typedef void *(*dlopen_t) (const char *filename, int flags);
#ifndef DLOPEN_BREAK_KEY
#define DLOPEN_BREAK_KEY "gobject"
#endif

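void *dlopen(const char *filename, int flags) {
  /* filename may be NULL (a handle for the main program is requested) */
  fprintf (stderr, "dlopen %s\n", filename ? filename : "(null)");
  dlopen_t dlopen = dlsym (RTLD_NEXT, "dlopen");
  /* x86 only: trap into the debugger when the key matches */
  if (filename && strstr (filename, DLOPEN_BREAK_KEY)) asm volatile ("int $03");
  return dlopen (filename, flags);
}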

That allows you to GDB into the dlopens of any particular shared library simply by defining DLOPEN_BREAK_KEY. And as expected, it gets called via g_module_open in the first load-info of test/insanity.scm.

@daym

daym commented Oct 26, 2020

Good idea!

I've read through gobject-introspection source code by now and it seems that gobject-introspection upstream take it upon themselves to provide gir files for a few other libraries (like cairo--see gir/cairo-1.0.gir.in in gobject-introspection).

But if I understand the patch that Guix did to gobject-introspection's build system (not pasted above) correctly, then they only pick up libraries in the output of the package currently built. Well, currently we are building gobject-introspection, not cairo. So it won't pick up cairo. That's why it's in there with a relative path (a fallback of the guix patch).

The reason why it then doesn't fail nicely at startup of guile-gi as it should is because I have cairo in my profile (it was probably propagated by something--I can't remove it).

Other gir files provided in gobject-introspection are: DBus DBusGLib fontconfig freetype2 gio GL libxml2 Vulkan win32 xfixes xft xlib xrandr--so those will cause trouble eventually if they refer to glib or any other gobjects (some of those definitely do). If such a package is additionally propagated into a profile, you are gonna have a hell of a time finding the problem--as we did.

It's possible to manually specify --fallback-library-path= (presumably on g-ir-scanner), so it could be a workaround to make gobject-introspection depend on a union of the respective packages (see above), and then specify --fallback-library-path manually.

In any case, I think this is a Guix bug (at least additionally).

@LordYuuma
Collaborator Author

LordYuuma commented Oct 26, 2020

This is somewhat off-topic, but I think the issue becomes clear if you do the following:

$ ls -l $GUIX_ENVIRONMENT/lib/libgobject-2.0.so
$ grep libgobject $GUIX_ENVIRONMENT/share/gir-1.0/GObject-2.0.gir

Perhaps Guix fails to graft typelibs at all? In that case, we might as well file an upstream bug, but the fundamental issue remains that we can't load any GObject other than the one we're linking against.

Interestingly guix build glib and guix build glib --no-grafts also return different results from the two paths listed above. There is probably a glib-minimal for building packages, so even if we were to fix the grafting issue, the general issue still remains.

@daym

daym commented Oct 26, 2020

What does

readlink $GUIX_ENVIRONMENT/share/gir-1.0/GObject-2.0.gir

say?

FWIW, for me, there seem to be references to absolute paths of libgobject in the typelibs (and the girs):

~/x/gobject-introspection-1.62.0/compile/gir$ strings  GObject-2.0.typelib  |grep libgob
/gnu/store/xkfc1275h55ynpgfr3wwmzy9707nblwc-glib-2.62.6/lib/libgobject-2.0.so.0
$ strings `guix build gobject-introspection`/lib/girepository-1.0/GObject-2.0.typelib |grep libgobject
/gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgobject-2.0.so.0
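
Comparing such paths by eye is error-prone because only the hash differs. A minimal sketch of a helper that decides whether two paths belong to the same store item follows; it assumes the usual /gnu/store/<hash>-<name>/ layout, and store_item_length and same_store_item are names invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STORE_PREFIX "/gnu/store/"

/* Length of the "/gnu/store/<hash>-<name>" prefix of `path`,
   or 0 if `path` does not point into the store. */
size_t
store_item_length(const char *path)
{
    size_t prefix = strlen(STORE_PREFIX);
    const char *end;
    assert(path != NULL);
    if (strncmp(path, STORE_PREFIX, prefix) != 0)
        return 0;
    end = strchr(path + prefix, '/');
    return end ? (size_t) (end - path) : strlen(path);
}

/* 1 if both paths live in the same store item, 0 otherwise
   (also 0 when either path is outside the store, e.g. a bare soname). */
int
same_store_item(const char *a, const char *b)
{
    size_t la = store_item_length(a);
    size_t lb = store_item_length(b);
    return la != 0 && la == lb && strncmp(a, b, la) == 0;
}
```

Fed with the two libgobject paths printed above, same_store_item returns 0: both say glib-2.62.6, but they are different store items.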

I'm 100% sure I found how the weird reference got in. The problem is making Guix find the problem automatically.

It's because:

(1) I don't update my user profile often
(2) guix environment "updates" a lot
(3) gobject-introspection refers to cairo by relative name (to dlopen); reason: cairo is not a dependency to gobject-introspection; and even if it was, the Guix patch to gobject-introspection as it is now wouldn't pick it up anyway.
(4) cairo uses gobject too nowadays (!!!!!), and thus refers to (another) glib
(5) cairo was propagated into my user profile ages ago, and is stale
(6) There's a cairo reference and thus gobject-introspection loads the cairo library (with the RELATIVE name, see above). It gets the one from (5). That refers to ANOTHER glib. See above.

Next I'm trying to fix Guix's gobject-introspection to FAIL instead of embedding relative references if built inside a guix build container.

@LordYuuma
Collaborator Author

LordYuuma commented Oct 26, 2020

Hmm, it appears the gir inside $GUIX_ENVIRONMENT is already grafted and grafting does actually change stuff, but it's still a different gobject shlib.

Your venture into the depths of cairo is nice and all, but keep things simple and use test/insanity.scm.

@daym

daym commented Oct 26, 2020

The point is that the "depth of cairo" loaded another version of glib. So that's where another version of glib comes from.

I'm trying test/insanity.scm now--I only now saw the new file. Thanks.

@LordYuuma
Collaborator Author

The thing is, you don't need cairo to load a different version of GLib. It already happens with GObject alone.

@daym

daym commented Oct 26, 2020

I agree.

But what guix's gobject-introspection does is a wrong thing in general, and libcairo-gobject is being loaded by guile-gi right now (note: I did not directly use cairo anywhere). That is not going to end well.

Fixing it might fix this problem and other potential problems, too.

(I tried making gobject-introspection depend on cairo by now. That causes a circular reference. Sigh.)

In any case, trying insanity.scm now.

@daym

daym commented Oct 26, 2020

Ohhh, libguile-gi links to libgobject at compile time, too. Well, that's gonna be a problem if that library is different to the one dlopen'ed...

@LordYuuma
Collaborator Author

You'll get cairo through GTK, Pango, and a lot of other graphical stuff, and the recipe seems to contain gtk+ even though it is afaik not strictly necessary.

I highly doubt you'll find a fix to this in Guix. It is likelier that whatever change you make breaks something in PyGI or GJS instead, so please be cautious. A patch to dynamically link libguile-gi against GObject etc. would be very welcome on the other hand.

@daym

daym commented Oct 26, 2020

> I highly doubt you'll find a fix to this in Guix.

Right now I'll settle for a procedure to flag this problem in Guix--never mind fixing it.

> A patch to dynamically link libguile-gi against GObject etc. would be very welcome on the other hand.

I don't know at all how something like that would look.

gobject-introspection seems to require glib (see below). Things it loads then also depend on glib, but not necessarily on the same version. That is not going to end well.

ldd /gnu/store/64xq4j8b181s6yz7gpg4w8ny3i6r6irk-gobject-introspection-1.62.0/lib/libgirepository-1.0.so |grep glib-
libglib-2.0.so.0 => /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libglib-2.0.so.0 (0x00007fb5b5c8f000)
libgobject-2.0.so.0 => /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgobject-2.0.so.0 (0x00007fb5b5c30000)
libgmodule-2.0.so.0 => /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgmodule-2.0.so.0 (0x00007fb5b5c29000)
libgio-2.0.so.0 => /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgio-2.0.so.0 (0x00007fb5b5a5b000)

I don't think that this can be fixed in guile-gi. It can be fixed in gobject-introspection (by changing its architecture), I guess.

@LordYuuma
Collaborator Author

ldd shows dynamic links, or is this a static link? In that case, having libguile-gi linked statically against GLib is a problem. Though either way dynamic linking would just be a crutch. The real issue is that we as libguile-gi mix our GLib into the GLib that the user actually wants to load and that's wrong.

@daym

daym commented Oct 26, 2020

ldd shows what would be loaded when loading this so. It can only show things that are known without actually executing user code of that so. That means it shows things that are in the header of that so. But the loader ld.so will load those using dlopen when loading that so, in the course of running an executable. Guix uses ld's rpath option in order to make sure those headers always contain full paths.

> The real issue is that we as libguile-gi mix our GLib into the GLib that the user actually wants to load and that's wrong.

So does gobject-introspection, and that's just as wrong.

@daym

daym commented Oct 26, 2020

In the end this is the usual recursive definition problem. I don't get why people always do that--that has to lead to problems sooner or later.

For example having a compiler written in the language of that compiler, or xslt (which is a language for transforming xml) specs itself written in XML etc.

In this case, gobject-introspection is supposed to make glib usable in target languages other than C.

In a sane world that would mean that a binding generator doesn't use glib as a non-native input. Or if it absolutely has to (it really shouldn't [1]), then at least it wouldn't expose that glib or any of its contents to target language users (because philosophically, that's just wrong--even if it happens to work sometimes), i.e. it should be a native-input.

But no, gobject-introspection has glib as a regular input. (Trying to move gobject-introspection to native-inputs, I get build system meson does not support cross-compilation--see https://issues.guix.gnu.org/44244 )

If it has to do stuff like that, you'd think at least it would have two levels: a meta-level where it mangles definitions of glib, and that just happens to use glib for the mangling internally (but not ever expose that glib or any of its objects directly to the user), and another normal level where it actually provides glib to the target language. But no :(

Personally I'd write a replacement for libgirepository that just parses the girs or typelibs on its own for guile-gi (not using glib or girepository.so to do it). Then it's much simpler--both practically and philosophically.

Because what gir files do is describe glib at a C level. The interface of the glib is gobjects, but the implementation of glib is C (Pascal got this right in 1970, only to be ignored by almost everyone). So there's no reason for a binding generator to depend on glib at all--it would have to use the gobject interface if it did.

Of course there'll be some "syntactic sugar" provided for the target language depending on the library when it is loaded--but that's pretty much it. I remember pygtk doing this right ("override" files) a long long time ago.

An easy way to catch those problems early-on is to try to cross-compile your program. If it tries the self-reference outlined above, that would cause a failure (which is what you want) because the architectures of the parts don't match.

[1] because eventually it will become part of glib, and what then?

@spk121
Owner

spk121 commented Oct 28, 2020

I'm not sure if this adds any useful information to this discussion, but, on Linux, you can pull the results of dlopening in the following fashion. Compile this with -ldl

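#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stdlib.h>
#include <stdio.h>

static int
callback(struct dl_phdr_info *info, size_t size, void *data)
{
    /* Called once per loaded object; the main program has an empty name. */
    printf("Name: \"%s\" (%d segments)\n", info->dlpi_name,
           info->dlpi_phnum);
    return 0;
}

int
main(int argc, char *argv[])
{
    void *handle;
    handle = dlopen("libgtk-3.so", RTLD_LAZY);
    if (handle == NULL)
        fprintf(stderr, "dlopen failed: %s\n", dlerror());

    /* List every object that is loaded now, dependencies included. */
    dl_iterate_phdr(callback, NULL);

    exit(EXIT_SUCCESS);
}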

@spk121
Owner

spk121 commented Oct 28, 2020

> Personally I'd write a replacement for libgirepository that just parses the girs or typelibs on its own for guile-gi (not using glib or girepository.so to do it). Then it's much simpler--both practically and philosophically.

To guess the level of effort, I built libgirepository without linking to glib. I get 250 glib procedures that would need stubs. But guile-gi doesn't use the whole of libgirepository, so that's an upper bound.
But say we circumvented linking to glib, would the underlying libffi dependency cause similar problems?

My rough estimate was done this way, after removing glib from meson.build

ninja 2>&1 | awk -F '`' '{print $2}' | awk -F "'" '{print $1}' | sort | uniq | grep g_ | wc

@LordYuuma
Collaborator Author

Not as far as I know, but I'm personally not convinced that this is the right move here. Philosophically speaking, it wouldn't be much of an introspection if one was doing it from the outside, would it?

I've had a short look at G-Golf and they seem to be doing stuff similar to us in that they partly export the base GLib that they get, but I assume this is fine for them, since they use dynamic links for GObject. That being said, I still don't have any experience with using G-Golf as a library, but there seem to be projects built on it in Guix, so it appears likely to work out that way.

TL;DR: Dynamic linking would probably be at least a short-term solution. For the long term we should think about what "introspecting your twin" really means.

@daym Do you count GIBaseInfo being a registered struct type as "exposing GLib to the user"? Because that's pretty much the only thing I can think of that fits your description here.

@daym

daym commented Oct 29, 2020

> Philosophically speaking, it wouldn't be much of an introspection if one was doing it from the outside, would it?

It's doing it from the outside anyway because C has no introspection (and neither does glib, generally--except for small islands). They can be faking it, but that doesn't change this fundamental fact.

But I know what you mean: gobject-introspection itself wants to be a gobject.

But if gobject-introspection itself wants to be a gobject, then it should be made impossible to load another glib using gobject-introspection (making the rest of glib similar to what compiler or shell builtins would be).

> @daym Do you count GIBaseInfo being a registered struct type as "exposing GLib to the user"? Because that's pretty much the only thing I can think of that fits your description here.

Yeah.

In the end I don't see how gobject-introspection can work reliably on Guix like that--nevermind guile-gi for the time being.

The easiest way to find out in detail what is what would be to remove glib from the dependencies of gobject-introspection (and their header files) entirely and stub the things listed below (also remove #include <glib.h> and #include <glib-object.h> from the gobject-introspection public interface). Ideally, it should still build and install. Does it?

If not, add glib back to the package dependencies but remove #include <glib.h> and #include <glib-object.h> from the gobject-introspection public interface. That should definitely work (but would still be pretty bad). If not, that's definitely very bad.

Object-like glib interface types that even gitypelib-internal.h uses:

typedef struct _GMappedFile GMappedFile;
typedef struct _GList GList;
typedef struct _GITypelib GITypelib;
typedef int GQuark; // this one is even returned as a NON-pointer
typedef struct _GError GError;
typedef struct _GIBaseInfo GIBaseInfo;

And the public interface of gobject-introspection has:

GType                  g_base_info_gtype_get_type   (void) G_GNUC_CONST; // sigh...

Primitive glib interface types which are used and fine to use since they have obvious definitions and shouldn't change (and could just be defined manually):

#define G_BEGIN_DECLS
#define G_END_DECLS
#define GI_AVAILABLE_IN_ALL

typedef char gchar;
typedef unsigned char guchar;
typedef unsigned char guint8;
typedef unsigned short guint16;
typedef unsigned int guint32;
typedef unsigned int guint;
typedef int gint; // for gboolean
typedef signed char gint8;
typedef int gint32;
typedef unsigned long gsize;
typedef gint gboolean;
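
If these types were defined by hand as suggested, compile-time checks could guard the "obvious definitions" assumption. A small sketch, assuming a C11 compiler; the typedefs are copied from the list above and the checks are an addition for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hand-written stand-ins, copied from the list above. */
typedef unsigned char guint8;
typedef unsigned short guint16;
typedef unsigned int guint32;
typedef signed char gint8;
typedef int gint32;
typedef unsigned long gsize;

/* C11 compile-time checks: the build fails if an assumption does not hold. */
static_assert(CHAR_BIT == 8, "8-bit bytes assumed");
static_assert(sizeof(guint8) == 1, "guint8 must be 1 byte");
static_assert(sizeof(gint8) == 1, "gint8 must be 1 byte");
static_assert(sizeof(guint16) == 2, "guint16 must be 2 bytes");
static_assert(sizeof(guint32) == 4, "guint32 must be 4 bytes");
static_assert(sizeof(gint32) == 4, "gint32 must be 4 bytes");
/* Holds on LP64 Linux; would fail on LLP64 platforms such as 64-bit Windows,
   where unsigned long is 32 bits wide but size_t is 64. */
static_assert(sizeof(gsize) == sizeof(size_t), "gsize must match size_t");
```

The last check is the one most likely to bite: the widths of the fixed-size types are stable in practice, while the unsigned long definition of gsize is only right on some platforms.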

I'm all for dynamically loading stuff from glib in guile-gi, but I just want to make sure first that this actually fixes the entire problem. Otherwise the problem will be back, one library over there.

(Also, the "cairo" problem won't vanish: gobject-introspection totally can load yet another glib version when traversing through to cairo, even after all those fixes. And it does traverse there. That could also happen with other libraries like gnome highlevel libs etc. I don't want to single glib and cairo out--it's just an example)

All in all, I wonder if GNOME can be made to work reliably like Guix wants it to at all. After all, gobject.so has a central type registry and thus having two gobject.so loaded (even indirectly) in the same executable is going to make things weird...
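
Why two registries in one process get weird can be shown with a toy model. This sketch is not how GObject is implemented; it only assumes that each copy of the library hands out numeric ids independently, and struct registry, registry_intern and registry_name are invented for the illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_TYPES 16

/* A toy type registry: ids are handed out in registration order. */
struct registry {
    const char *names[MAX_TYPES];
    int count;
};

/* Returns the id of `name`, registering it first if needed; -1 when full. */
int
registry_intern(struct registry *r, const char *name)
{
    int i;
    assert(r != NULL && name != NULL);
    for (i = 0; i < r->count; i++)
        if (strcmp(r->names[i], name) == 0)
            return i;
    if (r->count == MAX_TYPES)
        return -1;
    r->names[r->count] = name;
    return r->count++;
}

/* Name for an id in this registry, or NULL if the id is unknown here. */
const char *
registry_name(const struct registry *r, int id)
{
    assert(r != NULL);
    return (id >= 0 && id < r->count) ? r->names[id] : NULL;
}
```

If one registry has seen "GObject" before "GtkWidget" and the other has not, the same name maps to different ids, and an id taken from one copy is meaningless or wrong in the other. Only the name survives the trip between the two.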

@spk121
Owner

spk121 commented Dec 29, 2021

Well, this has been sitting here for a while. As someone who rarely uses Guix, I have a simplistic understanding, but, I gather that a step toward improvement could be something like this:

  • avoid using GLib and GObject directly
  • use standard C library plus the API provided by libgirepository, libguile, and libffi
  • use libgirepository functionality and the GObject typelib to get any functions and types we need from GObject at runtime
  • ensure that no procedure defined by or used by libguile-gi has the same link name as one from GLib or GObject

@LordYuuma
Collaborator Author

That would be a big step towards improvement, as far as I can see, yes. The open question (whether libgirepository allows us to load a GObject typelib other than the one it was linked against) does remain, but there's no way of answering that other than trying to implement such a thing.

spk121 added a commit that referenced this issue Jan 2, 2022
The only functions that need default linkage are those called with
load-extension. The new macro GIG_API explicitly marks a function
as visible.

* src/gig_visibility.h (GIG_API, GIG_LOCAL): new macros
* src/gig.h: new file, make gig_init_visible
* src/gig_document.h: make gig_init_document visible
* src/gig_callback.h: make gig_init_callback visible
* src/gig_closure.h: make gig_init_closure visible
* src/gig_logging.h: make gig_init_logging visible
* src/gig_object.h: make gig_init_object visible
* src/gig_repository.h: make gig_init_repository visible
* src/gig_type.h: make gig_init_types visible
* src/gig_value.h: make git_init_value visible
* src/gig_util.h: updated
* src/gig.c: use new header
* src/gig_logging.c: use new header
* src/gig_document.c: use new header
* Makefile.am: add new header files, don't export gig_* functions by default
  (libguile_gi_la_CPPFLAGS): new macro BUILDING_GIG
  (CFLAG_VISIBILITY): new macro
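
For readers unfamiliar with the technique, the visibility split described in this commit message could look roughly like the sketch below. The real contents of src/gig_visibility.h are not shown in this thread, so MY_API, MY_LOCAL, my_helper and my_init are hypothetical stand-ins, and the attribute syntax assumes GCC or Clang.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for the GIG_API / GIG_LOCAL macros named in the
   commit message; the real header may differ. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_API   __attribute__((visibility("default")))
#  define MY_LOCAL __attribute__((visibility("hidden")))
#else
#  define MY_API
#  define MY_LOCAL
#endif

/* Hidden: usable from other files of the same library, but not exported,
   so it cannot clash with a same-named symbol from another library. */
MY_LOCAL int
my_helper(int x)
{
    assert(x < INT_MAX);   /* precondition: no overflow */
    return x + 1;
}

/* Exported: the kind of entry point that load-extension looks up by name. */
MY_API int
my_init(void)
{
    return my_helper(41);
}
```

Building the shared library with GCC's -fvisibility=hidden makes hidden the default, so only functions explicitly marked as exported end up in the dynamic symbol table. Whether CFLAG_VISIBILITY carries exactly that flag is an assumption here.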
spk121 added a commit that referenced this issue Jan 3, 2022
* src/gig_logging.c (gig_log_custom_helper): use strcmp instead
spk121 added a commit that referenced this issue Jan 3, 2022
* src/gig_util.h: declare xstrdup
* src/gig_util.c (xstrdup): new function
  (g_registered_type_info_get_qualified_name): use xstrdup
* src/gig_function.c (create_gsubr): use xstrdup
* src/gig_callback.c (callback_binding_inner): use xstrdup
* src/gig_arg_map.c (arg_map_entry_init): use xstrdup
spk121 added a commit that referenced this issue Jan 3, 2022
* src/gig_argument.c (scm_to_c_string): use xstrndup
* src/gig_util.c (xstrndup): new function
* src/gig_util.h: declare xstrndup
spk121 added a commit that referenced this issue Jan 3, 2022
* src/gig_logging.c (gig_log_writer): use getenv
spk121 added a commit that referenced this issue Jan 3, 2022
@spk121
Owner

spk121 commented Mar 7, 2022

Well, I did do some hacking in a branch called type_init_noodle2

It doesn't solve the problem but it does take a couple of small steps in that direction

  • Internally, GLib functions are only used when converting function arguments
  • The creation of scheme classes for GTypes now happens as a result of function calls in the scheme side of (gi types). The idea was that this would ultimately make it easier to order them to occur after a call to require. There could be a two-step initialization: those classes that don't depend on GLib/GObject versions could happen first, the rest could happen later.

It does appear that to solve the problem, the right approach was as previously suggested: reimplementing GIRepository itself.

@LordYuuma
Collaborator Author

As a stylistic choice, I think we ought to avoid "_pub.h" and instead put private implementation details into a _private.h, as is the normal GNOME convention. I haven't had too much of a look at the internals, but structurally speaking, this might be an improvement. Yet, as far as linking is concerned, $(GOBJECT_INTROSPECTION_LIBS) $(GLIB_LIBS) $(GOBJECT_LIBS) means we're still at square 1.

@spk121
Owner

spk121 commented Mar 12, 2022

OK. From here, split the libguile-gi.so into two separate shared libraries: a preprocessor of typelib, and a runtime.

The preprocessor will link to libgirepository, parse typelibs, and output intermediate information for gig_type_define, gig_function_define, etc. This intermediate information is Scheme. I'll borrow the term IL for it. Later, this whole process could be replaced by a GIR XML parser.

The runtime will only depend on libguile and libffi at link time, but use GLib and GObject via dlopen. It will load the IL, and use dlopen and libffi to construct everything necessary. This would mean:

  • the runtime first requires that the IL has the path to the GLib and GObject *.so to be used. The preprocessor can get this from the require call.
  • the runtime uses dlopen and dlsym to get the set of GLib/GObject functions needed for SCM to C argument conversion
  • the runtime parses the IL, constructing types and functions. For a function, the IL will have the path to the shared object and the C name of the function, and enough info to construct the arg map. For types, since GType numerical values are not safe, the IL would have the GType names instead.
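The dlopen/dlsym step in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: sketch_lookup and sketch_gtype_from_name are hypothetical names, the library path would come from the IL's require entry, and GType is assumed to have the width of size_t. The only GObject call used is g_type_from_name, which returns 0 for an unregistered name.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Open the shared object at PATH and look up NAME in it.
 * Returns NULL on any failure. On success *HANDLE_OUT holds the
 * dlopen handle, which must stay open while the symbol is in use. */
void *sketch_lookup(const char *path, const char *name, void **handle_out)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    void *sym = NULL;

    if (handle != NULL) {
        sym = dlsym(handle, name);
        if (sym == NULL) {
            dlclose(handle);
            handle = NULL;
        }
    }
    if (handle_out != NULL)
        *handle_out = handle;
    return sym;
}

/* Assumption: GType is an unsigned integer as wide as size_t. */
typedef size_t (*sketch_type_from_name_fn)(const char *name);

/* Resolve a GType by name through the GObject library at GOBJECT_PATH,
 * instead of trusting a numeric value recorded at build time.
 * Returns 0 when the library, the symbol, or the type is missing. */
size_t sketch_gtype_from_name(const char *gobject_path, const char *type_name)
{
    void *handle = NULL;
    void *sym = sketch_lookup(gobject_path, "g_type_from_name", &handle);
    sketch_type_from_name_fn fn;

    if (sym == NULL)
        return 0;
    /* POSIX idiom for turning a dlsym result into a function pointer. */
    *(void **)(&fn) = sym;
    /* The handle is deliberately left open: the library must stay loaded. */
    return fn(type_name);
}
```

A call such as sketch_gtype_from_name("libgobject-2.0.so.0", "GObject") would then ask whichever libgobject the dynamic loader finds under that name, rather than the one present when the runtime was built. On older glibc versions, linking may additionally need -ldl.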

The number of GLib functions required for SCM/C arg conversion stands at 140 in my hacking branch, down from 179 in version 0.3.2.

nm --undefined-only .libs/libguile-gi.so | grep ' g_' | grep -v '_info_' | grep -v 'irepository'

This would require replacement of libgirepository functions

  • Replace GIRepository's invoke functions with libffi using info from the arg map
  • Eliminate GIRepository types used in the type and func system, like GigTypeMeta's use of GICallableInfo and GIEnumInfo, and GigFunction's use of GIFunctionInfo

@LordYuuma
Collaborator Author

LordYuuma commented Mar 18, 2022

I'm not sure whether we can replace the typelibs with their XML counterparts. At the very least, that'd be difficult w.r.t. environment variables. What we could do OTOH is move from GIRepository's internal representation to the one we actually require (which we could describe in XML, JSON, what have you): first have a GI-based library write that internally, with the rest of our libraries consuming it, and then have it generated by a pure C/Scheme implementation of said library. WDYT?

@spk121
Owner

spk121 commented Mar 19, 2022

Well, this branch I've been playing with, https://github.com/spk121/guile-gi/tree/split-parse-runtime, is trying to create an intermediate representation in the hopes of splitting the C library in twain. The branch is totally broken at the moment, but it runs well enough to generate, and then parse, intermediate code like the following:

(require "GLib" "2.0" ("libgobject-2.0.so.0" "libglib-2.0.so.0"))
(type "GArray")
(type-info "%GAsciiType" flags ((alnum . 1) (alpha . 2) (cntrl . 4) (digit . 8) (graph . 16) (lower . 32) (print . 64) (punct . 128) (space . 256) (upper . 512) (xdigit . 1024)) ())
(flag-conversion "AsciiType" #f "%GAsciiType")
(type-info "%GBookmarkFileError" enum ((invalid-uri . 0) (invalid-value . 1) (app-not-registered . 2) (uri-not-found . 3) (read . 4) (unknown-encoding . 5) (write . 6) (file-not-found . 7)) ())
(enum-conversion "BookmarkFileError" #f "%GBookmarkFileError")
(type "GByteArray")
($function "byte-array:free" "g_byte_array_free" 
  ((name . "byte-array:free") (s-input-req . 2) (c-input-len . 2) 
    (pdata
     ((name . "array") (meta (arg-type . GByteArray) (flags ptr in) (transfer . nothing) (params ((arg-type . uint8) (item-size . 1) (transfer . nothing)))) (s-direction . input) (tuple . singleton) (presence . required) (i . 0) (c-input-pos . 0) (s-input-pos . 0)) 
     ((name . "free_segment") (meta (arg-type . gboolean) (flags in) (transfer . nothing)) (s-direction . input) (tuple . singleton) (presence . required) (i . 1) (c-input-pos . 1) (s-input-pos . 1))) 
    (return-val (name . "%return") (meta (arg-type . uint8) (flags ptr out) (transfer . nothing)) (s-direction . output) (tuple . singleton) (presence . required) (i . 0))))
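To make the flags entry in the listing above concrete, here is a small sketch of what a runtime could do with it: turn a list of flag names into the C bitmask. The values are copied from the %GAsciiType entry; the struct, the table, and sketch_flags_to_uint are hypothetical and not guile-gi's actual data structures.

```c
#include <stddef.h>
#include <string.h>

/* One (symbol . value) pair from a type-info entry. */
struct sketch_flag {
    const char *name;
    unsigned int value;
};

/* Values copied from the %GAsciiType entry in the listing above. */
static const struct sketch_flag sketch_ascii_type[] = {
    {"alnum", 1},  {"alpha", 2},  {"cntrl", 4},   {"digit", 8},
    {"graph", 16}, {"lower", 32}, {"print", 64},  {"punct", 128},
    {"space", 256}, {"upper", 512}, {"xdigit", 1024},
};

/* OR together the values of the N_NAMES flag names in NAMES.
 * Returns 0 and stores the mask in *OUT on success; returns -1 and
 * leaves *OUT untouched if any name is unknown. */
int sketch_flags_to_uint(const char *const *names, size_t n_names,
                         unsigned int *out)
{
    const size_t n_flags =
        sizeof sketch_ascii_type / sizeof sketch_ascii_type[0];
    unsigned int bits = 0;

    for (size_t i = 0; i < n_names; i++) {
        size_t j;
        for (j = 0; j < n_flags; j++) {
            if (strcmp(names[i], sketch_ascii_type[j].name) == 0) {
                bits |= sketch_ascii_type[j].value;
                break;
            }
        }
        if (j == n_flags)   /* no table entry matched this name */
            return -1;
    }
    *out = bits;
    return 0;
}
```

For example, the names alpha and digit combine to 2 | 8 = 10. Nothing here needs GLib, which is the point of carrying the values in the IL.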

@spk121
Owner

spk121 commented Mar 26, 2022

OK, at this point, the split-parse-runtime branch has split libguile-gi in twain: a libguile-giparse and a libguile-gi. Anything having to do with gobject-introspection or libgirepository is in the former, removing the dependency on libgirepository from the latter.

At the moment, the parser calls the runtime directly, but, by calling set-il-output-port to some port, you can capture a list of function calls that you could then later feed to the runtime to (theoretically) load all the types and functions without having to link to girepository or parse the typelib.

From here, there are just ~140 GObject/GLib calls remaining on the runtime side. These are all present for SCM-to-C conversion for function arguments, or to do GType-to-SCM class conversion. It should be a rote task to dlopen/dlsym those at runtime after loading the user's chosen version of GObject/GLib.

@spk121
Owner

spk121 commented Apr 4, 2022

OK. The latest commit at https://github.com/spk121/guile-gi/tree/split-parse-runtime sketches out a solution to this bug. It is very rough, but the outline is all there. It passes most of make check.

  • libguile-giparse and (gi parser) can convert typelib into scheme modules using a sort of intermediate language. This library has to link to GIR, GObject, and GLib.
  • the gi-parse guild command can use libguile-giparse to make standard Guile modules for typelib
  • libguile-gi and (gi runtime) can load these scheme modules, parse this intermediate language and make the binding happen at runtime. libguile-gi uses its own FFI, and for the hundred or so GLib/GObject functions it does use, it loads those dynamically from the same version of GObject/GLib associated with the typelib it is loading, and not from a version of GObject/GLib that was present when it was built.

You can still use use-typelibs like in v0.3.2, I think. I'm not 100% sure if using both the parser and the runtime at the same time -- such as with use-typelibs -- creates the separation that Guix needs, but I have high hopes that splitting into separate parse and runtime steps should work.

But that tree is a huge mess. It has some ideas I started and later abandoned. I'm going to rewind, rework, and make a sensible patchset in a new branch.

From there, a few problems remain:

  • see what can be done with the compile time of these huge generated bindings files. Is it slow because of libguile-gi internals? or because the Guile compiler bogs down at a certain size? On my old laptop, compiling the scheme libraries for a full Gtk stack takes 30 minutes to an hour. Loading the full Gtk stack when a script starts up can take 10 seconds.
  • apply more thought to the case of when parent classes of a given class come from different typelib namespaces
  • and the dozens of bugs I probably created along the way

@LordYuuma
Collaborator Author

I think we might still be duplicating some work here in that we need to actually read and write data to disk a few times rather than handling things in memory. The guile language modules provide necessities to build a compile tower. We could hook into that and provide a language specification for gi-scheme, which compiles to either scheme or Tree-IL. This would correspond to what (gi runtime) is currently doing. (gi parser) and gi-parse should probably too sit on that tower with a compilation to gi-scheme being defined. The gi-parse command would then be a simple wrapper around Guile's compile[-file].

You are right in that use-typelibs itself would not provide this separation on its own. However, I hazard a guess that with (gi runtime) being built on just FFI, you could define a build process in which you first generate your necessary module descriptions and then compile everything to .go. That would work in Guix by adding Guile-GI as both native and regular input. As the gi-scheme descriptions themselves are hopefully architecture-independent, we could thus effectively work around that issue.

Long term however, it would be better to bring everything back into one compilation tower, with the gi-parse side implemented in pseudo-pure Scheme.

@spk121
Owner

spk121 commented Apr 10, 2022

> I think we might still be duplicating some work here in that we need to actually read and write data to disk a few times rather than handling things in memory. The guile language modules provide necessities to build a compile tower. We could hook into that and provide a language specification for gi-scheme, which compiles to either scheme or Tree-IL. This would correspond to what (gi runtime) is currently doing. (gi parser) and gi-parse should probably too sit on that tower with a compilation to gi-scheme being defined. The gi-parse command would then be a simple wrapper around Guile's compile[-file].

When I experimented, I found that compile-file makes valid .go but saving the output of compile to bytecode does not. So reading/writing to file in multiple steps may be necessary. The idea of using language is intriguing.

> You are right in that use-typelibs itself would not provide this separation on its own. However, I hazard a guess that with (gi runtime) being built on just FFI, you could define a build process in which you first generate your necessary module descriptions and then compile everything to .go. That would work in Guix by adding Guile-GI as both native and regular input. As the gi-scheme descriptions themselves are hopefully architecture-independent, we could thus effectively work around that issue.

This makes sense. I wonder if there are 32-bit/64-bit differences in typelib files. I don't know.

> Long term however, it would be better to bring everything back into one compilation tower, with the gi-parse side implemented in pseudo-pure Scheme.

One could be quite meta, and use the current gi-parse and (gi parser) to bootstrap a gi-scheme for GIRepository-2.0 and its dependencies, and then reprogram the whole of guile-gi's parser using Guile bindings to GIRepository.

@LordYuuma
Collaborator Author

LordYuuma commented Apr 10, 2022

> When I experimented, I found that compile-file makes valid .go but saving the output of compile to bytecode does not. So reading/writing to file in multiple steps may be necessary. The idea of using language is intriguing.

Note that compile-file uses the language printer of the target file and passes #:to-file? #t.

> You are right in that use-typelibs itself would not provide this separation on its own. However, I hazard a guess that with (gi runtime) being built on just FFI, you could define a build process in which you first generate your necessary module descriptions and then compile everything to .go. That would work in Guix by adding Guile-GI as both native and regular input. As the gi-scheme descriptions themselves are hopefully architecture-independent, we could thus effectively work around that issue.

> This makes sense. I wonder if there are 32-bit/64-bit differences in typelib files. I don't know.

They do actually describe their file format [1,2]. Only the endianness appears to make a difference, and in a cross-compiling architecture that ought to be the target endianness.

> Long term however, it would be better to bring everything back into one compilation tower, with the gi-parse side implemented in pseudo-pure Scheme.

> One could be quite meta, and use the current gi-parse and (gi parser) to bootstrap a gi-scheme for GIRepository-2.0 and its dependencies, and then reprogram the whole of guile-gi's parser using Guile bindings to GIRepository.

I don't quite know how to interpret this. Do you mean we'd only implement enough GI parsing to load GIRepository and then hand things off from there (similar to format, which only supports a smaller number of features until (ice-9 format) is loaded)? If so, I'm unsure if there is such a thing as a convenient, mostly incomplete bootstrap core. I'd rather go with a mostly complete side implementation instead.

But before we're tacking on features upon features, I think it is time to refactor and make what we have currently work in the way we want. This would at the very least also include a lot of (shell) tests for the gi-parse part. Integration tests would also be nice, but I don't think we could put those into CI, can we?

[1] https://developer-old.gnome.org/gi/unstable/gi-GITypelib-Internals.html
[2] https://gnome.pages.gitlab.gnome.org/gobject-introspection/girepository/gi-GITypelib-Internals.html
