When linking statically, native libraries should go inside --start-group/--end-group for robustness #76992

Open
joshtriplett opened this issue Sep 21, 2020 · 16 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug.

Comments

@joshtriplett
Member

When rustc links rlib libraries, it puts all the libraries inside a --start-group/--end-group pair, causing the linker to resolve backreferences from a library later in the command line to one earlier in the command line.

However, when rustc links native libraries, it puts libraries on the command line in whatever order it encountered them in code.

With shared linking, that works fine. But with static linking, the order has a crucial semantic significance: symbols in a library will get thrown away unless a library listed previously on the command line has a corresponding unresolved symbol.
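The ordering rule above can be illustrated with a toy model. This is only a sketch of traditional single-pass archive resolution, not rustc or linker code: the library names and symbols are hypothetical, and each archive is treated as a single unit rather than a collection of object files.

```python
# Simplified model of a left-to-right static link (GNU ld style).
# Assumption: each archive is one unit with a set of symbols it defines
# and a set of symbols it needs; real archives contain many members.

def link(order, libs, roots):
    """Return the symbols still undefined after one left-to-right pass.

    order: library names as they appear on the command line
    libs:  name -> (defined symbols, needed symbols)
    roots: symbols needed by the main object file
    """
    undefined = set(roots)
    defined = set()
    for name in order:
        provides, needs = libs[name]
        # An archive is pulled in only if it satisfies a symbol that is
        # undefined at this point; otherwise it is skipped for good.
        if undefined & provides:
            defined |= provides
            undefined |= needs
            undefined -= defined
    return undefined


# Hypothetical libraries: libfoo calls into libbar.
LIBS = {
    "foo": ({"foo_init"}, {"bar_helper"}),
    "bar": ({"bar_helper"}, set()),
}

print(link(["foo", "bar"], LIBS, {"foo_init"}))  # dependent first: everything resolves
print(link(["bar", "foo"], LIBS, {"foo_init"}))  # dependency first: bar_helper stays undefined
```

In the second call, libbar is visited before anything needs it, so it is dropped and the later reference from libfoo cannot be satisfied.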

Rust #[link] directives don't account for this, and crates in general don't worry about link order, precisely because dynamic linking is the default. When linking statically to multiple libraries, where one depends on another, this is very likely to result in symbol resolution failures at link time. In addition, if libraries have circular references (such as between glibc and libgcc), there is no order that will allow the libraries to link without duplicating at least one library, which seems like something #[link] directives should not need to account for.

I would propose that when linking statically, Rust should always put all libraries, both rlibs and native libraries, inside one large --start-group/--end-group pair, which will allow the linker to handle symbol references both backwards and forwards, as well as circular symbol dependencies.

@joshtriplett joshtriplett added C-bug Category: This is a bug. A-linkage Area: linking into static, shared libraries and binaries labels Sep 21, 2020
@petrochenkov
Contributor

petrochenkov commented Sep 21, 2020

Using this option has a significant performance cost. It is best to use it only when there are unavoidable circular references between two or more archives.

https://sourceware.org/binutils/docs/ld/Options.html

Circular dependencies are not a common case, and it's better to pass the libraries with circular dependencies multiple times like -lfoo -lbar -lfoo, which would do the same thing as --start-group / --end-group, but explicitly and more precisely.
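To make the comparison concrete, here is a rough sketch of how repeating a library and using a group behave for a circular dependency. It is a toy model with hypothetical archives and symbols, assuming that a group is simply rescanned until no new archive member is pulled in; it is not how any particular linker is implemented.

```python
# Toy model: archives are lists of members, each (defined, needed).
# A tuple on the command line stands for a --start-group/--end-group span.

def link(cmdline, archives, roots):
    """Return the symbols left undefined after processing cmdline."""
    undefined, defined, loaded = set(roots), set(), set()

    def scan(name):
        """Pull in members of one archive that define a wanted symbol."""
        pulled = False
        again = True
        while again:  # rescan this archive until nothing new is pulled
            again = False
            for idx, (provides, needs) in enumerate(archives[name]):
                if (name, idx) not in loaded and undefined & provides:
                    loaded.add((name, idx))
                    defined.update(provides)
                    undefined.update(needs)
                    undefined.difference_update(defined)
                    pulled = again = True
        return pulled

    for item in cmdline:
        if isinstance(item, tuple):
            # Group: keep cycling through its archives until a full
            # round pulls in no new member.
            while any([scan(name) for name in item]):
                pass
        else:
            scan(item)
    return undefined


# Hypothetical circular dependency: foo -> bar -> foo.
ARCHIVES = {
    "foo": [({"foo_a"}, {"bar_b"}), ({"foo_c"}, set())],
    "bar": [({"bar_b"}, {"foo_c"})],
}

print(link(["foo", "bar"], ARCHIVES, {"foo_a"}))         # foo_c is left undefined
print(link(["foo", "bar", "foo"], ARCHIVES, {"foo_a"}))  # repeating foo resolves it
print(link([("foo", "bar")], ARCHIVES, {"foo_a"}))       # so does a group
```

In this model both approaches end with no undefined symbols. The difference is that the repeated form requires the author of the command line to know where the cycle is.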

@petrochenkov
Contributor

I guess this issue is motivated by linking libc and libgcc or something.
If I remember correctly, gcc uses the grouping directives for them, but clang emits a -lfoo -lbar -lfoo-style sequence.

@joshtriplett
Member Author

joshtriplett commented Sep 21, 2020

@petrochenkov The issue is that most crates don't account for static linking at all, and while #[link] does emit multiple library links when specified multiple times, that isn't specified, and isn't particularly intuitive behavior. People are used to the behavior of dynamically linked libraries, which handle reverse/circular dependencies automatically. (And lld already handles circular dependencies automatically, without needing any special option to do so.)

I've tested linking with and without --start-group/--end-group, and as far as I can tell, "significant performance cost" is not accurate. I haven't measured any performance difference.

Based on that, I think it'd be reasonable to do this by default, to make static linking more user-friendly and less fiddly. And to be clear, I'm only suggesting emitting this when linking statically.

(It's possible to make libc/libgcc/libgcc_eh work without this. This isn't on the critical path to support static linking of glibc. I'm proposing doing it to improve static linking in general.)

@mati865
Contributor

mati865 commented Sep 21, 2020

It could have been a "significant performance cost" 2 decades ago.
Rust already puts 13 libraries into --start-group/--end-group:

List of the libraries
"-Wl,--start-group"
"-Wl,-Bstatic"
"libstd-<hash>.rlib"
"libpanic_unwind-<hash>.rlib"
"libobject-<hash>.rlib"
"libaddr2line-<hash>.rlib"
"libgimli-<hash>.rlib"
"librustc_demangle-<hash>.rlib"
"libhashbrown-<hash>.rlib"
"librustc_std_workspace_alloc-<hash>.rlib"
"libunwind-<hash>.rlib"
"libcfg_if-<hash>.rlib"
"liblibc-<hash>.rlib"
"liballoc-<hash>.rlib"
"librustc_std_workspace_core-<hash>.rlib"
"libcore-<hash>.rlib"
"-Wl,--end-group"

I'm wondering if this is an observable slowdown.

@petrochenkov
Contributor

I guess someone needs to check the perf impact of wrapping everything into a group with BFD linkers (for both ELF and COFF) in common scenarios like "no cyclic dependencies" or "one cyclic dependency between libc and libgcc at the end of a long library list".

Functionally this should be ok in practice since LLD is always doing this implicitly and reports no issues.

@petrochenkov
Contributor

petrochenkov commented Sep 21, 2020

Also, I have no idea about grouping in macOS linkers.

(Other linkers that we support are wasm-ld which is lld, and PTX linker which is a Rust project.)

@mati865
Contributor

mati865 commented Sep 21, 2020

The issue with the LLD approach is visible when you use a 32-bit LLD binary on Windows: it has a hard time linking huge projects like LLVM.
But yeah, after a quick look at the code, --{start,end}-group is ignored by the MinGW and ELF drivers.
Other drivers do not seem to accept --{start,end}-group.

On Linux we should test BFD and GOLD and on Windows just BFD.

@mati865
Contributor

mati865 commented Sep 21, 2020

I ran rustc hello.rs -C save-temps -Z print-link-args and put that output, looped 200 times, into 2 scripts; one of them was modified so that it had no "-Wl,--start-group" and "-Wl,--end-group".
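For anyone who wants to repeat this kind of measurement, a comparison along these lines could be sketched as follows. This is not the script used here: the helper names and the shortened link line are hypothetical, and a real argument list would come from the -Z print-link-args output.

```python
import subprocess
import time

# The two group markers as rustc passes them through the compiler driver.
GROUP_FLAGS = {"-Wl,--start-group", "-Wl,--end-group"}


def strip_group_flags(args):
    """Return the link arguments with the group markers removed."""
    return [arg for arg in args if arg not in GROUP_FLAGS]


def time_runs(cmd, runs=200):
    """Run cmd `runs` times and return the total wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


# Hypothetical, heavily shortened link line for illustration only.
link_args = [
    "cc", "hello.o",
    "-Wl,--start-group", "libstd.rlib", "libcore.rlib", "-Wl,--end-group",
    "-o", "hello",
]
print(strip_group_flags(link_args))
```

Comparing time_runs(link_args) with time_runs(strip_group_flags(link_args)) on a real link line would give the two totals to compare.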

My results:

  • Ubuntu 20.04 VM:
    • LLD: ~23s, no measurable difference between the scripts
    • BFD: ~30s, no measurable difference between the scripts
    • GOLD: ~22s, no measurable difference between the scripts
  • Windows 10, MSYS2:
    • BFD: ~63s, no measurable difference between the scripts
    • LLD: ~25s, no measurable difference between the scripts

This was the ideal case: all libs were in the proper order. Apparently there is no overhead from --{start,end}-group in this case.
I've shuffled the libraries in the script (and confirmed that it fails without --{start,end}-group) and it still gives me ~63s on Windows.

I think we would need a huge project with a few dozen libs in --{start,end}-group to observe overhead on a modern machine in a real-world example.

@joshtriplett
Member Author

@mati865 Thanks for getting numbers on this. That sounds like a pretty compelling argument that there's no longer any measurable performance loss.

I think it'd be a good idea to make this change, to simplify static linking.

@alexcrichton
Member

This came up tangentially in the Cargo team meeting today and I wanted to leave a comment here with my thoughts. If --start-group and --end-group don't actually have much of a performance hit I think it'd be awesome if we could just pass them unconditionally. We struggled for the longest time to get the link order just right and we still have weird bugs show up where the answer is "oh just swap those two lines", and I'm not really sure that anyone benefits from actually swapping the two lines.

Basically AFAIK this is just a weird artifact of behavior from GNU LD which doesn't really make much sense in the modern era and is purely just a pain to work around. If that's actually the case it'd be cool if we could just always pass these options!

@mati865
Contributor

mati865 commented Nov 18, 2020

Still, it was a very small "hello world" example. IMO we would also have to prove there is no big difference for projects with dozens of libraries. Who knows, maybe the slowdown will be exponential for BFD?
LLD always behaves as if --{start,end}-group were passed (even when not specified), but it was designed with that in mind and there is no performance hit there.

@joshtriplett
Member Author

I've tested projects with a dozen libraries and gotten similar results.

@joshtriplett
Member Author

I recently ran into a new issue that would have been fixed by this. On aarch64, the LSE outline-atomics symbols recently added to compiler-builtins have a dependency on the getauxval function, but when rustc links compiler-builtins, it does so after the C library that provides getauxval. This causes:

/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/bin/ld: /home/josh/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-musl/lib/libcompiler_builtins-b4d3ea1e1230b563.rlib(cpu_model.o): in function `init_have_lse_atomics':
/cargo/registry/src/github.com-1ecc6299db9ec823/compiler_builtins-0.1.41/./lib/builtins/cpu_model.c:778: undefined reference to `getauxval'

Moving the link of the compiler_builtins rlib to inside the start-group/end-group fixes this error.

Given that, I think we should switch to using start-group/end-group by default.

@mati865
Contributor

mati865 commented May 21, 2021

I still have it on my long TODO list, but it's rather low on it, so I won't mind somebody beating me to it.

@joshtriplett
Member Author

Currently giving this a shot in #85805

@ehuss
Contributor

ehuss commented Jul 19, 2023

FWIW, I'm still running into getauxval issues on aarch64 as noted above. #89626 (comment) contains my investigation.
