Consider using zig cc to make prebuilt binaries #231

Open
creationix opened this issue Mar 25, 2020 · 27 comments

@creationix
Member

https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html

I think this might enable us to build all the prebuilt luvi binaries using a single linux host such as our CI servers.

@rphillips
Member

Looks extremely promising.

@squeek502
Member

Was thinking about this as well. Would be interesting to try.

@zhaozg
Member

zhaozg commented Mar 26, 2020

+1

@squeek502
Member

squeek502 commented Mar 26, 2020

Played around with it a bit. Was able to get luv to compile but the resulting LuaJIT fails to load the resulting luv.so.

git clone https://github.com/luvit/luv.git
cd luv
export CC="zig cc"
export ASM="zig cc"
make
./build/luajit tests/run.lua
PANIC: unprotected error in call to Lua API (error loading module 'luv' from file './luv.so':
        ./luv.so: undefined symbol: lua_status
stack traceback:
        [C]: at 0x002998b0
        [C]: in function 'require'
        ./lib/tap.lua:1: in main chunk
        [C]: in function 'require'
        tests/run.lua:6: in main chunk
        [C]: at 0x00218770)

zig cc is super new so I'd guess something from ziglang/zig#4784 is the cause.


EDIT: zig cc-compiled LuaJIT does not include any symbols:

$ nm -gD ./buildgcc/luajit | grep " T " | wc -l
153
$ nm -gD ./build/luajit | grep " T " | wc -l
0

(./buildgcc/luajit is built with gcc, ./build/luajit is built with zig cc)
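The symbol-count comparison above can be wrapped in a small helper for reuse. This is a minimal sketch, not part of the luvi build: the function name `count_exported_text_symbols` is made up for illustration, and it reads `nm -gD` output on stdin so it can be tried without a binary at hand.

```shell
#!/bin/sh
# Hypothetical helper (not from the luvi repo): count exported text (code)
# symbols in `nm -gD` output read from stdin.
count_exported_text_symbols() {
    # nm prints "<address> <type> <name>"; type "T" marks a global symbol
    # defined in the text section. Undefined symbols have no address field,
    # so their second field is the name and they are not counted.
    awk '$2 == "T" { n++ } END { print n + 0 }'
}

# Usage (paths taken from the comparison above):
#   nm -gD ./build/luajit | count_exported_text_symbols
```

A result of 0 for a LuaJIT binary would suggest that the dynamic symbol table was not populated, which matches the failure to resolve lua_status when loading luv.so.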


EDIT#2: My guess is that it's the unfinished "passing args to the linker" bullet point, as the end-result LuaJIT compile command ends up being:

/usr/local/bin/zig  cc -O2 -g -DNDEBUG  -Wl,--export-dynamic -rdynamic CMakeFiles/luajit.dir/deps/luajit/src/luajit.c.o CMakeFiles/luajit.dir/deps/luajit/src/ljamalg.c.o CMakeFiles/luajit.dir/lj_vm.S.o  -o luajit  -lm

so I would bet that --export-dynamic is not being passed to the linker.


EDIT#3: Running the above directly gives the output:

warning: unsupported linker arg: --export-dynamic

coming from here, presumably.


EDIT#4: Manually added --export-dynamic to the linker args in link.cpp just to get past the error above. Now I'm running into a problem with mismatched glibc versions (zig cc is compiling against a newer one than my system glibc; specifically, it's trying to use fcntl64, which my glibc doesn't have). Zig has the ability to target specific glibc versions (e.g. -target native-native-gnu.2.23) but I'm not sure it affects zig cc.

This is worth noting because it will potentially affect the portability of zig cc-compiled luvi binaries (right now we use holy-build-box to maximize the portability of our Linux binaries).

@andrewrk

andrewrk commented Mar 27, 2020

I'm happy to work with y'all on this, give me a couple days to fix up that flag integration like @squeek502 noted above and then let's coordinate on this use case.

For glibc versions, as long as you know which glibc version you want to target and we add some way to communicate that info to zig cc then it should work. I'm sure we can figure it out.

holy-build-box readme says:

Binaries work on pretty much any glibc-2.12-or-later-based x86 and x86-64 Linux distribution released since approx 2011. A non-exhaustive list:

So you should be able to use -target x86_64-linux-gnu.2.12. That's zig-specific -target syntax, which is actually not supposed to work; I need to change it to be llvm syntax, but before making that change I'll make sure there's a way to specify the glibc version. It seems that two wrongs do make a right after all :)
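As a rough illustration of the target syntax mentioned above, the following sketch assembles the `-target` value from an architecture and a minimum glibc version. The helper name `zig_glibc_target` is hypothetical, and the `<arch>-linux-gnu.<version>` form is the zig-specific syntax quoted in this comment, which may change.

```shell
#!/bin/sh
# Hypothetical helper: build a zig -target value for a given architecture and
# minimum glibc version, using the syntax quoted in this thread.
zig_glibc_target() {
    arch=$1
    glibc=$2
    # Accept only digits and dots in the version, e.g. 2.12 or 2.28.
    case $glibc in
        ''|*[!0-9.]*)
            echo "invalid glibc version: $glibc" >&2
            return 1
            ;;
    esac
    printf '%s-linux-gnu.%s\n' "$arch" "$glibc"
}

# Usage:
#   export CC="zig cc -target $(zig_glibc_target x86_64 2.12)"
```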

@squeek502
Member

I have a feeling the tough part will be getting OpenSSL to compile. We use a CMake ExternalProject for it and call out to make/nmake and let OpenSSL do its (fairly complicated) thing. Our tiny builds (without OpenSSL and a few other things) will probably be pretty easy to get working by comparison.

@andrewrk

One problem with cmake I learned this week is that it does not have a separate concept of a "host compiler" and a "target compiler". So unless the upstream project adds support for it, there will be an issue when a project builds a binary and then tries to run it. A better build system would allow you to choose which binaries were native and which were for the target, so that you could build your native tools natively, and build your target code for the target.

@creationix
Member Author

creationix commented Mar 30, 2020

Instead of using holy-build-box libc, I'd prefer if we could do a static build with musl. Then it will work on even more linux versions than our old binaries. I had trouble before trying to build static luajit binaries, but I seem to remember that was exactly one of the examples in @andrewrk's blog post.

@squeek502
Member

Was actually able to build a static musl binary of the luvi tiny variant by doing:

export CC="zig cc -target native-native-musl -fno-sanitize=undefined"
export ASM="zig cc"
make tiny
make

(the -fno-sanitize=undefined is to get around Illegal instruction errors from generating vm_arch.h during the LuaJIT build process, see this comment).

However, when running the resulting luvi binary, I'm getting:

PANIC: unprotected error in call to Lua API ([string "return require('init')(...)"]:1: module 'init' not found:

meaning that something is going wrong with the embedding of src/lua/init.lua via CMake here.

@squeek502
Member

squeek502 commented Mar 31, 2020

Oh, just remembered that I tried making a musl build of luv/luajit and ran into:

PANIC: unprotected error in call to Lua API (error loading module 'luv' from file './luv.so':
        Dynamic loading not supported

That is, when compiled statically, musl only has a stub for dlopen (reasoning for this can be found in various places, e.g. here, here). This means that fully static builds are probably untenable for Luvi, since dynamically loading modules is definitely necessary.

@SinisterRectus
Member

Also https://github.com/phusion/holy-build-box/blob/master/README.md#why-statically-linking-to-glibc-is-a-bad-idea

@creationix
Member Author

This means that fully static builds are probably untenable for Luvi, since dynamically loading modules is definitely necessary.

True, anybody needing to load native modules will need this. I wonder if it's worth the effort to have a static build that disables loading native modules. Even ffi use cases often use dlopen to call the libraries.

@andrewrk

andrewrk commented Mar 31, 2020

The other option zig cc provides you is "static except glibc" builds, and if you pick an old enough glibc version (e.g. matching holy-build-box) then your binary tarballs will work on any linux distro that uses glibc and standard dynamic linker path (which is most). This is the $arch-linux-gnu target.

The nice thing about static tarballs is they work on all linux distros, even ones like NixOS which have non-standard dynamic linker paths, and alpine linux which uses musl instead of glibc.

@andrewrk

andrewrk commented Apr 2, 2020

Alright, I've done some polishing on this feature, and it's about 1 week until the 0.6.0 release. If you're still exploring this use case, now would be a good time to try it out and report issues that you run into. Let me know if I can help.

@squeek502
Member

squeek502 commented Apr 3, 2020

Being able to target a specific glibc version would be nice. I'll look into the embedded-Lua file issue mentioned here and try to figure out what's going wrong and how/if it's related to zig cc.

@andrewrk

andrewrk commented Apr 3, 2020

Being able to target a specific glibc version would be nice

I'm not planning to solve ziglang/zig#4911 before the release, so you can use this syntax:

-target x86_64-linux-gnu.2.12

@squeek502
Member

squeek502 commented Apr 3, 2020

-target x86_64-linux-gnu.2.12 doesn't seem to affect the entire build process:

cd luvi
export CC="zig cc -target x86_64-linux-gnu.2.12 -fno-sanitize=undefined"
export ASM="zig cc"
make tiny
make

results in:

lld: error: undefined symbol: fcntl64
>>> referenced by core.c
>>>               core.c.o:(uv__nonblock_fcntl) in archive deps/luv/deps/libuv/libuv_a.a
>>> referenced by core.c
>>>               core.c.o:(uv__nonblock_fcntl) in archive deps/luv/deps/libuv/libuv_a.a
>>> referenced by core.c
>>>               core.c.o:(uv__cloexec_fcntl) in archive deps/luv/deps/libuv/libuv_a.a
>>> referenced by core.c
>>>               core.c.o:(uv__cloexec_fcntl) in archive deps/luv/deps/libuv/libuv_a.a
>>> referenced by pipe.c
>>>               pipe.c.o:(uv_pipe_open) in archive deps/luv/deps/libuv/libuv_a.a
>>> referenced by process.c
>>>               process.c.o:(uv__process_child_init) in archive deps/luv/deps/libuv/libuv_a.a
>>> referenced by tty.c
>>>               tty.c.o:(uv_tty_init) in archive deps/luv/deps/libuv/libuv_a.a
>>> did you mean: fcntl64
>>> defined in: /home/ryan/.cache/zig/stage1/h/OPkcZeqDw6f_ToblL-yxpuotOgXOhqW3rELarafcu_hTHg9NvBniwDctKo9FnM-I/libc.so.6.0.0

EDIT: Can't seem to make a minimal reproduction:

test.c:

#include <fcntl.h>
#include <stdio.h>

int main() {
    int mode = fcntl(0, F_GETFL);
    printf("mode=%d\n", mode);
    return 0;
}

export CC="zig cc -target x86_64-linux-gnu.2.12"
$CC -o test test.c
./test

works fine and results in mode=33794 being printed
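One way to narrow down a mismatch like the fcntl64 one is to check which glibc symbol versions a built object or binary actually references. The sketch below is an assumption-laden helper, not something from the luvi or zig repositories: `max_glibc_version` is a made-up name, it expects `objdump -T` output on stdin, and it relies on a `sort` that supports `-V` (GNU coreutils does).

```shell
#!/bin/sh
# Hypothetical helper: report the newest GLIBC_x.y symbol version mentioned in
# `objdump -T` output read from stdin.
max_glibc_version() {
    # Pull out every GLIBC_<version> tag, strip the prefix, then version-sort
    # so that 2.28 ranks above 2.4 and 2.2.5.
    grep -o 'GLIBC_[0-9][0-9.]*' |
        sed 's/^GLIBC_//' |
        sort -V |
        tail -n 1
}

# Usage:
#   objdump -T ./build/luvi | max_glibc_version
```

If the reported version is higher than the one given in -target, some part of the build was presumably compiled without the intended target flags.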

@zhaozg
Member

zhaozg commented Apr 6, 2020

I've started learning "zig cc".

Building for macOS:

export CC="zig cc"
./Configure darwin64-x86_64-cc
make
Zig attempted to find the path to native system libc headers by executing this command:
zig cc -E -Wp,-v -xc /dev/null
Unable to link against libc: Unable to find libc installation: unable to spawn system C compiler
S

But cross-building openssl-1.0.2 for Windows on macOS succeeds with:

export CC="zig cc -target x86_64-windows-gnu"
export AR=x86_64-w64-mingw32-ar
./Configure no-shared mingw64
make
fd openssl.exe
apps/openssl.exe
apps/zig-cache/o/Jb6evqm5OTXTMdxGG9d_enVeByLdKBKqY7ynvUa8kdsu3lQ_pCCSjqVxKhbonni-/openssl.exe

Great work, @andrewrk. I'm hoping for a zig ar as well.

Edit
zig version: 0.5.0+701c03d08

OpenSSL 1.1.1f fails:

export CC="zig cc -target x86_64-windows-gnu"
export RC="x86_64-w64-mingw32-windres"
export AR="x86_64-w64-mingw32-ar"
./Configure no-shared mingw64
make

${LDCMD:-zig cc -target x86_64-windows-gnu} -m64 -Wa,--noexecstack -Qunused-arguments -Wall -O3 -L.   \
                -o apps/openssl.exe apps/asn1pars.o apps/ca.o apps/ciphers.o apps/cms.o apps/crl.o apps/crl2p7.o apps/dgst.o apps/dhparam.o apps/dsa.o apps/dsaparam.o apps/ec.o apps/ecparam.o apps/enc.o apps/engine.o apps/errstr.o apps/gendsa.o apps/genpkey.o apps/genrsa.o apps/nseq.o apps/ocsp.o apps/openssl.o apps/openssl.res.o apps/passwd.o apps/pkcs12.o apps/pkcs7.o apps/pkcs8.o apps/pkey.o apps/pkeyparam.o apps/pkeyutl.o apps/prime.o apps/rand.o apps/rehash.o apps/req.o apps/rsa.o apps/rsautl.o apps/s_client.o apps/s_server.o apps/s_time.o apps/sess_id.o apps/smime.o apps/speed.o apps/spkac.o apps/srp.o apps/storeutl.o apps/ts.o apps/verify.o apps/version.o apps/x509.o \
                 apps/libapps.a -lssl -lcrypto -lws2_32 -lgdi32 -lcrypt32
Build Dependencies...compiler_rt...Initialize...lld: error: undefined symbol: opt_init
>>> referenced by apps/asn1pars.o:(asn1parse_main)
>>> referenced by apps/ca.o:(ca_main)
>>> referenced by apps/ciphers.o:(ciphers_main)

opt_init should be in apps/libapps.a

@truemedian
Member

It's been a while since this has been touched on, but we can now obtain a runnable (but non-functional) luvi using zig.

Building on linux:

$ export CC="zig cc -target native-linux-gnu.2.28"
$ export CXX="zig c++ -target native-linux-gnu.2.28"
$ export ASM="zig cc -target native-linux-gnu.2.28"
$ make tiny
$ make

Luvi can be run:

$ ./build/luvi --version
./build/luvi v2.11.0-9-gc11154c
rex: 8.37 2015-04-28
libuv: 1.36.0
ssl: OpenSSL 1.1.1g  21 Apr 2020, lua-openssl 0.7.8

But the first test case (samples/test.app) panics with the following error:

$ ./build/luvi samples/test.app -- 1 2 3 4
PANIC: unprotected error in call to Lua API (7)

Which happens with the following traceback (using lldb):

  * frame #0: 0x000000000027c760 luvi`panic
    frame #1: 0x0000000000264140 luvi`lj_err_throw + 112
    frame #2: 0x000000000029e028 luvi`lj_trace_err + 56
    frame #3: 0x000000000029d4b8 luvi`lj_record_ins + 18232
    frame #4: 0x0000000000292f6b luvi`trace_state + 1627
    frame #5: 0x00000000002f1b40 luvi`lj_vm_cpcall + 77
    frame #6: 0x000000000026c846 luvi`lj_dispatch_ins + 326
    frame #7: 0x00000000002f3500 luvi`lj_vm_inshook + 49
    frame #8: 0x0000000000275d2b luvi`lua_pcall + 155
    frame #9: 0x00000000002463e7 luvi`main + 215
    frame #10: 0x00007ffff7e1b152 libc.so.6`__libc_start_main + 242
    frame #11: 0x000000000023791a luvi`_start at start.S:120

@zhaozg
Member

zhaozg commented Dec 1, 2020

I'm continuing this work with a checklist. After all builds pass, I will open a PR to finish the job.

  • toolchain: zig+cmake+make
  • build machine: macos 11
  • zig cc -target x86_64-macos-gnu
  • zig cc -target aarch64-macos-gnu
  • zig cc -target x86_64-windows-gnu zig@7452
  • zig cc -target i386-windows-gnu zig@7452
  • zig cc -target x86_64-linux-gnu.2.17 zig#5882
  • zig cc -target i386-linux-gnu zig#4926
  • zig cc -target aarch64-linux-gnu.2.17 zig#5882

Tips

@andrewrk

andrewrk commented Dec 1, 2020

Good news: thanks to @kubkon's efforts, you can soon add aarch64-macos-gnu to the list :)

@zhaozg
Member

zhaozg commented Jan 14, 2021

I have built luvi successfully, but it fails to run.
Running lldb -- luvi $LUVI_SRC/sample/test.app gives:

{ }
{ args = { [0] = '/Users/zhaozg/work/build/luvit/native/luvi/luvi' },
  bundle = { mainPath = 'main.lua', readfile = function: 0x0100789610,
    readdir = function: 0x0100789560, register = function: 0x01007873a0,
    base = '/Users/zhaozg/work/luvit/luvi/samples/test.app',
    paths = { '/Users/zhaozg/work/luvit/luvi/samples/test.app' }, uvi was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 66737 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.18
    frame #0: 0x00000001001b82f4 luvi`panic(L=0x0000000100763380) at lib_aux.c:313:19 [opt]
   310
   311  static int panic(lua_State *L)
   312  {
-> 313    const char *s = lua_tostring(L, -1);
   314    fputs("PANIC: unprotected error in call to Lua API (", stderr);
   315    fputs(s ? s : "?", stderr);
   316    fputc(')', stderr); fputc('\n', stderr);
Target 0: (luvi) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.18
  * frame #0: 0x00000001001b82f4 luvi`panic(L=0x0000000100763380) at lib_aux.c:313:19 [opt]
(lldb)

and sometimes I get:

(lldb) r
There is a running process, kill it and restart?: [Y/n]
Process 68446 exited with status = 9 (0x00000009)
1 location added to breakpoint 1
warning: (x86_64) /Users/zhaozg/work/build/luvit/native/luvi/luvi(0x0000000100000000) address 0x0000000100000000 maps to more than one section: luvi.__TEXT and luvi.__TEXT
warning: (x86_64) /Users/zhaozg/work/build/luvit/native/luvi/luvi(0x0000000100000000) address 0x000000010042d000 maps to more than one section: luvi.__DATA and luvi.__DATA
1 location added to breakpoint 1
Process 12501 launched: '/Users/zhaozg/work/build/luvit/native/luvi/luvi' (x86_64)
1 location added to breakpoint 1
warning: (x86_64) /Users/zhaozg/work/build/luvit/native/luvi/luvi(0x0000000100000000) address 0x0000000100000000 maps to more than one section: luvi.__TEXT and luvi.__TEXT
warning: (x86_64) /Users/zhaozg/work/build/luvit/native/luvi/luvi(0x0000000100000000) address 0x000000010042d000 maps to more than one section: luvi.__DATA and luvi.__DATA
1 location added to breakpoint 1
{ }
{ args = { [0] = '/Users/zhaozg/work/build/luvit/native/luvi/luvi' },
  bundle = { stat = function: 0x01007894c0, register = function: 0x01007873e8,
    paths = { '/Users/zhaozg/work/luvit/luvi/samples/test.app' },
    readfile = function: 0x01007895f0,
    base = '/Users/zhaozg/work/luvit/luvi/samples/test.app',
    readdir = function: 0x0100789540, uvi was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 12501 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.12
    frame #0: 0x00000001001b82f4 luvi`panic(L=0x0000000100763380) at lib_aux.c:313:19 [opt]
   310
   311  static int panic(lua_State *L)
   312  {
-> 313    const char *s = lua_tostring(L, -1);
   314    fputs("PANIC: unprotected error in call to Lua API (", stderr);
   315    fputs(s ? s : "?", stderr);
   316    fputc(')', stderr); fputc('\n', stderr);
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.12
  * frame #0: 0x00000001001b82f4 luvi`panic(L=0x0000000100763380) at lib_aux.c:313:19 [opt]
    frame #1: 0x000000010019fb1f luvi`lj_err_throw(L=0x0000000100763380, errcode=<unavailable>) at lj_err.c:523:5 [opt]
    frame #2: 0x0000000100211537 luvi`asm_fuseahuref [inlined] ra_alloc1(as=0x0000000100763548, ref=<unavailable>, allow=<unavailable>) at lj_asm.c:664:24 [opt]
    frame #3: 0x0000000100211526 luvi`asm_fuseahuref(as=0x0000000100763548, ref=<unavailable>, allow=<unavailable>) at lj_asm_x86.h:194 [opt]
(lldb)

I have no idea what causes this; my guess is that it's an incompatibility with the JIT feature.

@truemedian
Member

My latest attempt at building luvi.

TL;DR: I now have luvi building for x86_64-linux-gnu on Linux.

Successes

  • x86_64-linux-gnu.2.28 (tiny): working luvi, passes all tests
  • x86_64-linux-gnu.2.28 (regular): working luvi, passes all tests

Building

Using zig version: 0.8.0-dev.945+166e9ea68

Setup

zig-cc: zig cc -target $ZIG_TARGET $@

zig-asm: zig cc -target $ZIG_TARGET $@

zig-cxx: zig c++ -target $ZIG_TARGET $@

export CC="$(pwd)/zig-cc"
export CXX="$(pwd)/zig-cxx"
export ASM="$(pwd)/zig-asm"

CMAKE_FLAGS: -H. -Bbuild -DCMAKE_BUILD_TYPE=Release -DWithSharedLibluv=OFF -DCMAKE_C_COMPILER=$CC -DCMAKE_CXX_COMPILER=$CXX -DCMAKE_ASM_COMPILER=$ASM

make tiny "CMAKE_FLAGS=$CMAKE_FLAGS"
make
make test
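The three wrapper scripts described above could be generated as follows. This is a sketch under assumptions: the function name `write_zig_wrappers` is invented, the `#!/bin/sh` shebang and the quoting of `"$@"` are additions not shown in the comment, and the reason for using wrappers at all is presumably that CMAKE_C_COMPILER and friends expect a single executable path rather than a command with arguments.

```shell
#!/bin/sh
# Sketch: write zig-cc, zig-asm and zig-cxx wrapper scripts into a directory.
# Each wrapper forwards its arguments to zig with -target "$ZIG_TARGET".
write_zig_wrappers() {
    dir=${1:-.}
    for pair in "zig-cc:cc" "zig-asm:cc" "zig-cxx:c++"; do
        name=${pair%%:*}   # wrapper file name
        sub=${pair#*:}     # zig subcommand to forward to
        {
            printf '%s\n' '#!/bin/sh'
            printf '%s\n' '# ZIG_TARGET must be set by the caller.'
            printf 'exec zig %s -target "$ZIG_TARGET" "$@"\n' "$sub"
        } > "$dir/$name"
        chmod +x "$dir/$name"
    done
}

# Usage:
#   write_zig_wrappers "$(pwd)"
#   export ZIG_TARGET=x86_64-linux-gnu.2.28
#   export CC="$(pwd)/zig-cc" CXX="$(pwd)/zig-cxx" ASM="$(pwd)/zig-asm"
```

Quoting `"$@"` keeps arguments containing spaces intact, which the unquoted `$@` form would split.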

Targets

Much Bug

  • mips64el-linux-gnuabi64: zig: unable to build C object: FileNotFound

  • mips64el-linux-gnuabin32: lld: testCCompiler.o incompatible with elf64ltsmip

  • mips64-linux-gnuabi64: zig: unable to build C object: FileNotFound

  • mips64-linux-gnuabin32: lld: testCCompiler.o incompatible with elf64btsmip

  • mipsel-linux-gnu: zig: unable to build C object: FileNotFound

  • mips-linux-gnu: zig: unable to build C object: FileNotFound

  • powerpc64le-linux-gnu: 'sysdeps/unix/sysv/linux/powerpc/sysdep.h' not found

  • powerpc64-linux-gnu: 'sysdeps/unix/sysv/linux/powerpc/sysdep.h' not found

  • powerpc-linux-gnu: zig: unknown target CPU 'ppc32'

  • x86_64-linux-gnux32: libunwind: UnwindCursor<> does not fit in unw_cursor_t

Less Bug

Some of these are probably caused by a lack of system support (see emulation)

  • aarch64_be-linux-gnu: lld: unknown emulation: aarch64_be_linux

  • aarch64_be-windows-gnu: zig mingw: only win32 supported

  • aarch64-linux-gnu: dlopen not found

  • aarch64-windows-gnu: '../bsd_private_base.h' not found

  • armeb-linux-gnueabi: lld: unknown emulation: armebelf_linux_eabi

  • armeb-linux-gnueabihf: lld: unknown emulation: armebelf_linux_eabi

  • armeb-windows-gnu: cmake: backend data layout mismatch target description

  • arm-linux-gnueabi: 'arm-features.h' not found

  • arm-linux-gnueabihf: 'arm-features.h' not found

  • arm-windows-gnu: cmake: backend data layout mismatch target description

  • i386-linux-gnu: lld: undefined hidden symbol: __x86.get_pc_thunk.bx

  • i386-windows-gnu: dlopen not found

  • x86_64-linux-gnu: versioned fcntl64: use gnu.2.28

  • x86_64-windows-gnu: dlopen not found

  • x86_64-macos-gnu: dlopen not found

The biggest issue for major systems right now is linking dlopen
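A per-target status list like the one above could be regenerated with a small loop. The following is only a sketch: `try_targets` and its calling convention are made up for illustration, and the build command passed in is assumed to read ZIG_TARGET from the environment, as the wrapper scripts in this comment do.

```shell
#!/bin/sh
# Sketch: run one build command per target and print a pass/fail line for each.
# $1 is the build command (a program, script, or shell function name);
# the remaining arguments are zig target triples.
try_targets() {
    build_cmd=$1
    shift
    for target in "$@"; do
        # Run in a subshell so the exported ZIG_TARGET does not leak out.
        if (ZIG_TARGET=$target; export ZIG_TARGET; $build_cmd) >/dev/null 2>&1; then
            echo "ok   $target"
        else
            echo "FAIL $target"
        fi
    done
}

# Usage (build-one.sh is a hypothetical script that cleans the build
# directory and then runs the make commands for $ZIG_TARGET):
#   try_targets ./build-one.sh x86_64-linux-gnu.2.28 aarch64-linux-gnu
```

Build output is discarded here to keep the summary short; a real run would likely want to keep a log per target to capture error messages such as those listed above.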

@creationix
Member Author

Would a version that disables dlopen be useful? It would break loading native modules as well as dynamic linking to libraries using ffi, but we would still have access to all the built-in native code and can still use ffi for ctypes type stuff.

@truemedian
Member

As seen above, @zhaozg did get cross compiling to work for most of the systems we care about, but I assume that's after modifying the build system to make proper assumptions about cross compilation. Not linking dlopen is actually an issue because of how luvi loads itself: it's compiled as a dynamic library that loads itself, which is how the precompiled LuaJIT stuff works (init, luvibundle, luvipath are all loaded this way). If we were to link without dlopen, those would no longer work and everything would break.

@creationix
Member Author

Bleh, I forgot about that part. At least we're making good progress.

@zhaozg
Member

zhaozg commented Feb 27, 2021

Seen above, @zhaozg did get cross compiling to work for most of the systems we care about.

That list is obsolete: some builds are broken, or the output crashes when run. I need to recheck it, and we should give Zig more time to mature.

7 participants