Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Continued incorrect rpath fixups for libraries symlinked into $PREFIX/lib #5198

Open
2 tasks done
robertmaynard opened this issue Feb 22, 2024 · 5 comments
Open
2 tasks done
Labels
type::bug describes erroneous operation, use severity::* to classify the type

Comments

@robertmaynard
Copy link

Checklist

  • I added a descriptive title
  • I searched open reports and couldn't find a duplicate

What happened?

Extension of #5179 and #5181

The computed RPATH entires that conda-build generates always uses the concrete library, and not what ever representation that exists in $PREFIX/lib. This is important as $ORIGIN is evaluated without going through symlinks.

So if you have $PREFIX/lib/library.so which symlinks to $PREFIX/lib/nested/deeply/library.so you will compute an RPATH entry of $ORIGIN/../../. When $PREFIX/lib/library.so is loaded it will now search for libraries outside the conda env. With extra manipulation and some luck, you can craft a RPATH entries that will load items from the users /lib ( mainly libgcc_s.so )

Here is a conda-build test based offf of _rpath_symlink that showcases the problem ( https://github.com/robertmaynard/conda-build/tree/add_cuda_toolkit_layout_symlink_test ):

{% set lib_file = "libthing.so.1.0.0" %}  # [linux]
{% set lib_file = "libthing.1.0.0.dylib" %}  # [osx]

package:
  name: rpath_symlink
  version: 1.0.0

build:
  skip: true  # [not (linux or osx)]
  rpaths_patcher: {{ rpaths_patcher }}
  script:
    - mkdir -p "${PREFIX}/lib"
    - mkdir -p "${PREFIX}/lib/sub/folder"
    - >
      < /dev/null ${CC} ${CPPFLAGS} ${CFLAGS} ${LDFLAGS}
      -x c - -nostdlib -s -o "${PREFIX}/lib/sub/folder/{{ lib_file }}" "-Wl,-rpath,${PREFIX}/lib"
      -shared -Wl,-soname,libthing.so.1  # [linux]
      -dynamiclib -install_name libthing.1.dylib  # [osx]
    - ln -s "${PREFIX}/lib/sub/folder/{{ lib_file }}" "${PREFIX}/lib/sub/folder/libthing.so.1"  # [linux]
    - ln -s "${PREFIX}/lib/sub/folder/{{ lib_file }}" "${PREFIX}/lib/sub/folder/libthing.1.dylib"  # [osx]
    - ln -s "${PREFIX}/lib/sub/folder/{{ lib_file }}" "${PREFIX}/lib/libthing.so.1"  # [linux]
    - ln -s "${PREFIX}/lib/sub/folder/{{ lib_file }}" "${PREFIX}/lib/libthing.1.dylib"  # [osx]
    - ln -s "${PREFIX}/lib/sub/folder/{{ lib_file }}" "${PREFIX}/lib/libthing.so"  # [linux]
    - ln -s "${PREFIX}/lib/sub/folder/{{ lib_file }}" "${PREFIX}/lib/libthing.dylib"  # [osx]
    - ln -s "${PREFIX}/lib/sub/folder/{{ lib_file }}" "${PREFIX}/lib/{{ lib_file }}"  # [linux]
    - ln -s "${PREFIX}/lib/sub/folder/{{ lib_file }}" "${PREFIX}/lib/{{ lib_file }}"  # [osx]

requirements:
  build:
    - {{ compiler("c") }}

test:
  requires:
    - py-lief
  commands:
    # Test that we get only a single entry that is the library's own directory.
    - |
      python -c '
      import os, lief
      lib = lief.parse(os.environ["PREFIX"] + "/lib/{{ lib_file }}")
      dynamic_entries = {str(entry) for entry in lib.dynamic_entries}  # [linux]
      assert {"$ORIGIN/."} == {e.rpath for e in lib.dynamic_entries if e.tag == lief.ELF.DYNAMIC_TAGS.RPATH}, "$ORIGIN rpath not found in: " + str(dynamic_entries)  # [linux]
      commands = {str(command) for command in lib.commands}  # [osx]
      assert {"@loader_path/"} == {command.path for command in lib.commands if command.command == lief.MachO.LOAD_COMMAND_TYPES.RPATH}, "loader path not found in: " + str(commands)  # [osx]
      '

I believe that the correct approach to rpath fixup on linux is that we expect the loading of libraries to always occur from $PREFIX/lib. So in that case we need something like a two pass approach:

  1. We fixup all libraries and symlinks in $PREFIX/lib to a RPATH entry of $ORIGIN. We record the fully resolved paths of each of these libraries
  2. We scan for all other libraries not fixed up in the above pass. These we compute the relative path to $PREFIX/lib and use that as our RAPTH entry.

Conda Info

No response

Conda Config

No response

Conda list

No response

Additional Context

No response

@robertmaynard robertmaynard added the type::bug describes erroneous operation, use severity::* to classify the type label Feb 22, 2024
@jakirkham
Copy link
Member

@mbargull would be interested in your thoughts here as you wrestled with this code path recently. How might we make this better?

@mbargull
Copy link
Member

So if you have $PREFIX/lib/library.so which symlinks to $PREFIX/lib/nested/deeply/library.so you will compute an RPATH entry of $ORIGIN/../../. When $PREFIX/lib/library.so is loaded it will now search for libraries outside the conda env.

I'd assume $ORIGIN is resolved to the path of the binary, not the symlink.
Please provide a test case in which a library that is in <path-to-symlink>/../../ but not <path-to-binary>/../../ is actually loaded.

@robertmaynard
Copy link
Author

So if you have $PREFIX/lib/library.so which symlinks to $PREFIX/lib/nested/deeply/library.so you will compute an RPATH entry of $ORIGIN/../../. When $PREFIX/lib/library.so is loaded it will now search for libraries outside the conda env.

I'd assume $ORIGIN is resolved to the path of the binary, not the symlink. Please provide a test case in which a library that is in <path-to-symlink>/../../ but not <path-to-binary>/../../ is actually loaded.

conda-forge/cuda-feedstock#10 ( and conda-forge/cuda-nvtx-feedstock#2 ) has background where incorrect system libraries are loaded due to bad RPATH fixup caused by this issue.

I'd assume $ORIGIN is resolved to the path of the binary, not the symlink.

Unforunately not.

├── lib
│   └── libcublas.so -> ../libcublas.so
├── libcublasLt.so
├── libcublasLt.so.12 -> libcublasLt.so
├── libcublas.so

1 directory, 3 files
rmaynard@rmaynard-dt:~/Work/temp/rpath$ readelf -a libcublas.so | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libcublasLt.so.12]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
rmaynard@rmaynard-dt:~/Work/temp/rpath$ readelf -a libcublas.so | grep RPATH
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN]

rmaynard@rmaynard-dt:~/Work/temp/rpath$ ldd libcublas.so
	linux-vdso.so.1 (0x00007ffc523fe000)
	libcublasLt.so.12 => /home/rmaynard/Work/temp/rpath/./libcublasLt.so.12 (0x00007fb8494c9000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fb849490000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fb84946d000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb849467000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb849318000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb8492f3000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb8490ff000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fb86f733000)
rmaynard@rmaynard-dt:~/Work/temp/rpath$ ldd lib/libcublas.so
	linux-vdso.so.1 (0x00007fff2937c000)
	libcublasLt.so.12 => /usr/local/cuda-12.0/targets/x86_64-linux/lib/libcublasLt.so.12 (0x00007fc3cf3d1000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fc3cf3c7000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fc3cf3a4000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fc3cf39e000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fc3cf24f000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fc3cf22a000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc3cf036000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fc3f566a000)

As we can see on the last line the $ORIGIN in libcublas.so is being resolved to the location of the symlink and not the library.

Now if we go into lib/ and construct a symlink we can see how it resolves:

rmaynard@rmaynard-dt:~/Work/temp/rpath$ cd lib/
rmaynard@rmaynard-dt:~/Work/temp/rpath/lib$ ln -s ../libcublasLt.so libcublasLt.so.12
rmaynard@rmaynard-dt:~/Work/temp/rpath/lib$ ldd libcublas.so 
	linux-vdso.so.1 (0x00007ffd47ddd000)
	libcublasLt.so.12 => /home/rmaynard/Work/temp/rpath/lib/./libcublasLt.so.12 (0x00007ff1a7511000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007ff1a74d8000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ff1a74b5000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007ff1a74af000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007ff1a7360000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007ff1a733b000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff1a7147000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ff1cd77b000)
rmaynard@rmaynard-dt:~/Work/temp/rpath/lib$ cd ..
rmaynard@rmaynard-dt:~/Work/temp/rpath$ ldd lib/libcublas.so 
	linux-vdso.so.1 (0x00007ffc89b23000)
	libcublasLt.so.12 => /home/rmaynard/Work/temp/rpath/lib/libcublasLt.so.12 (0x00007f98c78ac000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f98c7873000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f98c7850000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f98c784a000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f98c76fb000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f98c76d6000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f98c74e2000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f98edb16000)

@mbargull
Copy link
Member

Thanks for the details.
I have to admit I was a bit oblivious about the issues we can run into with this -- thanks for making me aware; much appreciated!

We can condense this to a self-contained test like so:

#> mkdir ./bin ./lib ./lib/sub
#> printf %s 'int status() { return 0; }' |
#>   ${CC} -x c - -s -shared -nostdlib -o ./lib/libstatus.so
#> printf %s 'extern int status(); int call() { return status(); }' |
#>   ${CC} -x c - -s -shared -nostdlib '-Wl,-rpath,$ORIGIN/..' -L./lib -lstatus -o ./lib/sub/libcall.so
#> ln -rs ./lib/sub/libcall.so ./lib/
#> printf %s 'extern int call(); int main() { return call(); }' |
#>   ${CC} -x c - -s '-Wl,-rpath,$ORIGIN/../lib' -L./lib -lcall -o ./bin/run
#> ./bin/run && echo ok || echo fail
ok
#> LD_TRACE_LOADED_OBJECTS=1 ./bin/run
	linux-vdso.so.1 =>  (0x00007fff4f1de000)
	libcall.so => /home/conda/work/bin/../lib/libcall.so (0x00007f36b977e000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f36b9000000)
	libstatus.so => /home/conda/work/bin/../lib/libstatus.so (0x00007f36b9776000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f36b9400000)
#> printf %s 'int status() { return 1; }' |
#>   ${CC} -x c - -s -shared -nostdlib -o ./libstatus.so
#> ./bin/run && echo ok || echo fail
fail
#> LD_TRACE_LOADED_OBJECTS=1 ./bin/run
	linux-vdso.so.1 =>  (0x00007fff73364000)
	libcall.so => /home/conda/work/bin/../lib/libcall.so (0x00007f741a081000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f7419a00000)
	libstatus.so => /home/conda/work/bin/../lib/../libstatus.so (0x00007f741a079000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f7419e00000)

It would make sense to put in some post processing check for symlinks to libraries and at least warn the user with a message suggesting how to alleviate potential issues, e.g., switch binary and symlink location to have a rpaths go down instead of up in the filesystem or keep the absolute path (i.e., skip binary relocation).

@robertmaynard
Copy link
Author

It would make sense to put in some post processing check for symlinks to libraries and at least warn the user with a message suggesting how to alleviate potential issues,

@mbargull I agree with this! That kind of diagnostic have ensured we avoided issues, and would help in making sure we ( and other packages ) don't generate bad RPATH entries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type::bug describes erroneous operation, use severity::* to classify the type
Projects
Status: 🆕 New
Development

No branches or pull requests

3 participants