Add glibc versioned symbols #2509

Closed
wants to merge 3 commits into from

Contributor

@LemonBoy LemonBoy commented May 16, 2019

Zig's way of linking against glibc is quite smart: it creates a decoy .so containing every symbol that a real libc.so.6 would export, so that the linker is happy.

But this approach has a big downside: the symbol references created this way are unversioned!

Glibc guarantees a stable ABI by exporting versioned symbols: whenever an interface changes incompatibly, a new symbol is exported with a higher version number. Old applications keep working as intended, since they specify which ABI version they expect, and new ones are free to use the improved and shiny stuff.

But what happens when the dynamic linker has to resolve an unversioned reference? Well, let me quote Ulrich Drepper here:

In case only the object file with the reference does not use
versioning but the object with the definition does, then the reference
only matches the base definition. The base definition is the one with
index numbers 1 and 2 (1 is the unspecified name, 2 is the name given
later to the baseline of symbols once the library started using symbol
versioning).

In other words, since the slots are assigned from 2 upward (0 and 1 are reserved) in increasing order of version (oldest to newest), we always end up picking the oldest symbol.
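
To make the lookup rule concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not glibc's real data structures: the `resolve` helper and the index table are made up for illustration, and it assumes the base definition is simply the one with the lowest index.

```python
from typing import Dict, Optional

# Toy table for one symbol: version index -> version name.
# Index 2 is the baseline, higher indices are newer (0 and 1 are reserved).
PTHREAD_CREATE_DEFS: Dict[int, str] = {
    2: "GLIBC_2.0",
    3: "GLIBC_2.1",
}


def resolve(defs: Dict[int, str], wanted: Optional[str]) -> str:
    """Return the version a reference binds to in this toy model."""
    if wanted is None:
        # Unversioned reference: only the base (lowest-index) definition matches.
        return defs[min(defs)]
    for version in defs.values():
        if version == wanted:
            return version
    raise LookupError(f"no definition with version {wanted}")


if __name__ == "__main__":
    print(resolve(PTHREAD_CREATE_DEFS, None))
    print(resolve(PTHREAD_CREATE_DEFS, "GLIBC_2.1"))
```

An unversioned reference lands on the oldest entry, while a reference that names a version gets exactly that one.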

Why should you care? Because sometimes the old symbol is not what you want.

Consider the spawnThread function in os.zig: it creates a new thread with pthread_create and sets the thread's stack address through the corresponding pthread_attr_t.

Calling the old symbol means that we end up running this code:

{
  /* The ATTR attribute is not really of type `pthread_attr_t *'.  It has
     the old size and access to the new members might crash the program.
     We convert the struct now.  */
  struct pthread_attr new_attr;

  if (attr != NULL)
    {
      struct pthread_attr *iattr = (struct pthread_attr *) attr;
      size_t ps = __getpagesize ();

      /* Copy values from the user-provided attributes.  */
      new_attr.schedparam = iattr->schedparam;
      new_attr.schedpolicy = iattr->schedpolicy;
      new_attr.flags = iattr->flags;

      /* Fill in default values for the fields not present in the old
	 implementation.  */
      new_attr.guardsize = ps;
      new_attr.stackaddr = NULL;
      new_attr.stacksize = 0;
      new_attr.cpuset = NULL;

      /* We will pass this value on to the real implementation.  */
      attr = (pthread_attr_t *) &new_attr;
    }

  return __pthread_create_2_1 (newthread, attr, start_routine, arg);
}

As you can see, the legacy code path kindly erases the stackaddr and stacksize values we set, but it does not bother to clear any unknown bit in flags, such as the one tied to the two parameters it has just erased. This leads to a nasty crash in __pthread_create_2_1.
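
The inconsistency can be shown with a small Python simulation of the conversion quoted above. This is a hedged sketch: the `Attr` class, the helper names, and the `FLAG_STACKADDR` value are illustrative stand-ins, not glibc's actual definitions.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative bit only: stands in for the internal "stack address was set" flag.
FLAG_STACKADDR = 0x0008


@dataclass
class Attr:
    flags: int = 0
    stackaddr: Optional[int] = None
    stacksize: int = 0


def legacy_convert(old: Attr) -> Attr:
    """Mimic the legacy path: flags are copied as-is, stack fields are reset."""
    return Attr(flags=old.flags, stackaddr=None, stacksize=0)


def is_consistent(attr: Attr) -> bool:
    """A set stack flag should always come with a real stack address."""
    return not (attr.flags & FLAG_STACKADDR) or attr.stackaddr is not None


# The caller sets a custom stack, as spawnThread does.
user_attr = Attr(flags=FLAG_STACKADDR, stackaddr=0x7F0000000000,
                 stacksize=8 * 1024 * 1024)
converted = legacy_convert(user_attr)
```

After the conversion the flag still claims a user-provided stack, but the address is gone, which is the state the newer implementation trips over.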

The patch itself is pretty simple. The new files you see were autogenerated from glibc's .abilist files with a simple Python script, and from an objdump -T point of view the results look similar to the files I have in /lib/i386-linux-gnu/.
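
The script is not shown in this thread, so the following is only a guess at its first step: reading .abilist lines into a per-symbol list of versions. It assumes the one-entry-per-line layout `<version> <symbol> <type> [size]`; the sample lines and the `parse_abilist` name are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# Sample input in the assumed "<version> <symbol> <type> [size]" layout.
SAMPLE_ABILIST = """\
GLIBC_2.0 pthread_create F
GLIBC_2.1 pthread_create F
GLIBC_2.0 malloc F
GLIBC_2.0 stderr D 0x4
"""


def parse_abilist(text: str) -> Dict[str, List[str]]:
    """Map each symbol name to the versions it is exported under, in file order."""
    versions: Dict[str, List[str]] = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        version, name = fields[0], fields[1]
        versions[name].append(version)
    return dict(versions)


if __name__ == "__main__":
    for name, tags in parse_abilist(SAMPLE_ABILIST).items():
        print(name, tags)
```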

This also means that we must decide what glibc version we want to impersonate. I've adopted a simple strategy:

  • Export as default only symbols that are not newer than GLIBC 2.17, so that programs compiled by Zig can easily run on older systems such as CentOS 7.
  • For compatibility with object files linked against newer glibcs, export the newer symbols as non-default.
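
One way to read this strategy, sketched in Python: for each symbol, the newest version at or below 2.17 becomes the default (`name@@VERSION`), and every other version is exported as non-default (`name@VERSION`). The helper names are hypothetical, and treating the older versions as non-default too is my reading of the two bullets, based on a symbol having a single default version.

```python
from typing import List, Optional, Tuple

CUTOFF = (2, 17)  # newest glibc version allowed to be a default


def version_key(tag: str) -> Tuple[int, ...]:
    """Turn 'GLIBC_2.2.5' into (2, 2, 5) so versions compare numerically."""
    return tuple(int(part) for part in tag.split("_", 1)[1].split("."))


def split_versions(tags: List[str]) -> Tuple[Optional[str], List[str]]:
    """Pick the default version (if any) and list the non-default ones."""
    eligible = [t for t in tags if version_key(t) <= CUTOFF]
    default = max(eligible, key=version_key) if eligible else None
    others = sorted((t for t in tags if t != default), key=version_key)
    return default, others


def versioned_names(name: str, tags: List[str]) -> List[str]:
    """Render entries in name@@VER (default) / name@VER (non-default) notation."""
    default, others = split_versions(tags)
    names = [f"{name}@@{default}"] if default else []
    names += [f"{name}@{t}" for t in others]
    return names


if __name__ == "__main__":
    print(versioned_names("memcpy", ["GLIBC_2.2.5", "GLIBC_2.14"]))
```

A symbol that only exists in versions newer than the cutoff ends up with no default at all, so it stays usable by existing objects but is never picked for a new unversioned reference.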

Sorry for the wall of text 🎉

@andrewrk
Member

Thanks for the writeup! I'll try to catch up on PRs tomorrow

@andrewrk
Member

OK I studied this PR and have come to the following conclusions:

  • we definitely need glibc versioned symbols
  • supporting --version-map is a good idea which is welcome to be done separately regardless of the outcome of this PR (and needs to be in the --help text, and zig build)
  • I do want those tools to be written in zig itself. I'm willing to do that work but it's gonna happen before merging.
  • I want to consider if these improvements can be made. I don't have solutions at this time but I'd like to mull it over in my mind and maybe brainstorm with people about it.
    • maybe there is some way to ship fewer bytes
    • I do want to support symbol versioning for all the libcs listed in zig targets

@andrewrk
Member

andrewrk commented Jul 7, 2019

See #2847

@andrewrk andrewrk closed this Jul 7, 2019
andrewrk added a commit that referenced this pull request Jul 7, 2019
This is the beginning of supporting minimum GLIBC version as part of the
target. See #2509 for the motivation.

The dummy libc zig files are removed. A future commit will build them
on-the-fly, using the text files generated by the new tool,
which are checked into source control and distributed along with zig.

These generated text files are, together, 142KB (20KB gzipped).
Compare that to a naive bundling of the .abilist files, which would be
2.2MiB (375KB gzipped).

This is based on glibc 2.29.