dlsym(RTLD_NEXT) doesn't always work #2

Open
jethrogb opened this issue Sep 9, 2015 · 5 comments

Comments

@jethrogb

jethrogb commented Sep 9, 2015

Sometimes you need to do some magic to find the entry point for the real function. Here's an example that uses dl_iterate_phdr when dlsym(RTLD_NEXT,...) fails: https://github.com/jethrogb/ssltrace/blob/bf17c150a7/ssltrace.cpp#L74-L112
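
For readers who do not want to follow the link, here is a minimal sketch of the general idea. It is not the ssltrace code: the names `find_real_symbol`, `collect_object`, and `struct object_list` are made up for illustration, and it assumes glibc (for `dl_iterate_phdr` and `RTLD_NOLOAD`). It tries `dlsym(RTLD_NEXT, ...)` first and, only if that fails, asks every already-loaded object for the symbol through its own handle.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>
#include <string.h>

#define MAX_OBJECTS 64
#define MAX_PATH_LEN 1024

/* Names of the loaded objects, filled in by collect_object(). */
struct object_list {
    char paths[MAX_OBJECTS][MAX_PATH_LEN];
    size_t count;
};

/* dl_iterate_phdr callback: only records names. dlopen is deliberately
 * not called here, because the loader holds a lock during iteration. */
static int collect_object(struct dl_phdr_info *info, size_t size, void *data)
{
    struct object_list *list = data;
    (void)size;
    /* The main program reports an empty name; skip it. */
    if (info->dlpi_name == NULL || info->dlpi_name[0] == '\0')
        return 0;
    if (list->count < MAX_OBJECTS && strlen(info->dlpi_name) < MAX_PATH_LEN) {
        strcpy(list->paths[list->count], info->dlpi_name);
        list->count++;
    }
    return 0; /* keep iterating */
}

/* Hypothetical helper: find a definition of `symbol` other than `self`
 * (the caller's own wrapper, or NULL). Not thread-safe. */
void *find_real_symbol(const char *symbol, void *self)
{
    assert(symbol != NULL);

    /* Normal path: ask the loader for the next definition. */
    void *addr = dlsym(RTLD_NEXT, symbol);
    if (addr != NULL)
        return addr;

    /* Fallback: look the symbol up through each loaded object's handle. */
    static struct object_list list;
    list.count = 0;
    dl_iterate_phdr(collect_object, &list);

    for (size_t i = 0; i < list.count; i++) {
        /* RTLD_NOLOAD only returns a handle if the object is already loaded. */
        void *handle = dlopen(list.paths[i], RTLD_LAZY | RTLD_NOLOAD);
        if (handle == NULL)
            continue;
        addr = dlsym(handle, symbol);
        dlclose(handle); /* drops our extra reference; the object stays loaded */
        if (addr != NULL && addr != self)
            return addr;
    }
    return NULL;
}
```

Note that this returns the first match in load order, which is only a guess at the function the caller would otherwise have reached when several loaded objects define the same name.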

@jethrogb changed the title from "dlsym doesn't always work" to "dlsym(RTLD_NEXT) doesn't always work" on Sep 9, 2015
@geofft
Owner

geofft commented Sep 9, 2015

Interesting. Looking at your commit history, this is specifically for when the library you're hooking was opened with RTLD_LOCAL? I am sort of surprised that the LD_PRELOAD even gets used in that case.

If I hack up your test case a bit:

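#define _GNU_SOURCE

#include <dlfcn.h>
#include <stdio.h>

/* Weak declaration, so the program links without libgnutls. */
__attribute__((__weak__))
void gnutls_handshake(void);

int main(void)
{
        /* RTLD_LOCAL: keep the library's symbols out of the global scope. */
        void *handle = dlopen("libgnutls.so", RTLD_LOCAL | RTLD_NOW);
        if (handle == NULL) {
                fprintf(stderr, "dlopen: %s\n", dlerror());
                return 1;
        }
        /* %p expects void *, so cast the function pointer. */
        printf("directly: gnutls_handshake = %p\n", (void *)gnutls_handshake);
        void *fp = dlsym(handle, "gnutls_handshake");
        printf("through handle: gnutls_handshake = %p\n", fp);
        fp = dlsym(RTLD_DEFAULT, "gnutls_handshake");
        printf("through RTLD_DEFAULT: gnutls_handshake = %p\n", fp);
        return 0;
}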

I get

directly: gnutls_handshake = (nil)
through handle: gnutls_handshake = 0x7fc32647be80
through RTLD_DEFAULT: gnutls_handshake = (nil)

but with this preload:

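#include <stdio.h>

/* Empty stand-in that interposes on the real gnutls_handshake. */
void gnutls_handshake(void) {}

/* Runs when the preload is loaded, before main. */
__attribute__((__constructor__))
void ctor(void) {
        fprintf(stderr, "address of preloaded gnutls_handshake is %p\n", (void *)gnutls_handshake);
}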

I get

address of preloaded gnutls_handshake is 0x7fbb22dbe730
directly: gnutls_handshake = (nil)
through handle: gnutls_handshake = 0x7fbb22524e80
through RTLD_DEFAULT: gnutls_handshake = 0x7fbb22dbe730

In other words, without the LD_PRELOAD, only dlsym through the handle actually works. With the LD_PRELOAD, a lookup through the handle still returns the library's own function rather than the preloaded one, so the preload never gets called and we don't even get to the point of caring about RTLD_NEXT.
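
For context, this is roughly what a preload would do once it does get called. It is a sketch only: it reuses the simplified `void gnutls_handshake(void)` prototype from the test case above (not the real GnuTLS signature), and `real_gnutls_handshake` is a name chosen here for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Cached pointer to the next definition; NULL until resolved. */
static void (*real_gnutls_handshake)(void);

/* Wrapper with the simplified prototype used in the test case. */
void gnutls_handshake(void)
{
    if (real_gnutls_handshake == NULL) {
        /* Ask for the next definition after the object containing this call. */
        real_gnutls_handshake =
            (void (*)(void))dlsym(RTLD_NEXT, "gnutls_handshake");
    }
    if (real_gnutls_handshake == NULL) {
        /* This is the failure mode the issue is about. */
        const char *err = dlerror();
        fprintf(stderr, "preload: no next gnutls_handshake: %s\n",
                err != NULL ? err : "unknown error");
        return;
    }
    /* Calling ourselves would recurse forever. */
    assert(real_gnutls_handshake != gnutls_handshake);
    real_gnutls_handshake();
}
```

When the lookup fails, the wrapper here just logs and returns; a real tool would have to decide whether to abort or fall back to another lookup strategy.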

What was your failing case? (Is there an actual GnuTLS-using program I can poke at?)

@jethrogb
Author

jethrogb commented Sep 9, 2015

Oh I had completely forgotten about that test case. I'm sorry, I don't remember the actual program that led me to write this code. I think it might have to do with program A loading a library B using RTLD_LOCAL, where that library B depends on library C (here C being GnuTLS).

@geofft
Owner

geofft commented Sep 18, 2015

OK, I can reproduce this (glibc 2.19, Debian 8.1 x86_64): if I have a library intermediate.so that depends on libgnutls.so, and a preload library preload.so that overrides gnutls_handshake, then if intermediate.so is loaded through RTLD_LOCAL, internal calls to libgnutls.so within intermediate.so will hit the preload, but the preload itself will get a null return from RTLD_NEXT. If intermediate.so is instead loaded through RTLD_GLOBAL, things work.

Oddly enough, directly asking for the address of gnutls_handshake via the handle to intermediate.so (regardless of RTLD_LOCAL vs. RTLD_GLOBAL) does not hit the preload.

$ LD_PRELOAD=./preload.so ./main
0x7ffa172cb750 from preload constructor
0x7ffa172cb750 from intermediate constructor
0x7ffa16830e80 from main through dlsym intermediate.so
0x7ffa172cb750 from intermediate function
(nil) from RTLD_NEXT inside preload
0x7ffa16830e80 from main through dlsym libgnutls.so
$ LD_PRELOAD=./preload.so ./main g
0x7f3edbf1b750 from preload constructor
0x7f3edbf1b750 from intermediate constructor
0x7f3edb480e80 from main through dlsym intermediate.so
0x7f3edbf1b750 from intermediate function
0x7f3edb480e80 from RTLD_NEXT inside preload
0x7f3edb480e80 from main through dlsym libgnutls.so

Source code in this gist.
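
The gist is not reproduced in this thread. As a rough idea of the driver side only, the following sketch shows the two pieces the transcripts rely on: opening a library with either scope, and printing a lookup through a handle in the style of the output above. The helper names `open_with_scope` and `report_symbol` are invented here, and reading the `g` argument as "use RTLD_GLOBAL" is an inference from the text, not something taken from the gist.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Open `path` with RTLD_GLOBAL if `global` is nonzero, else RTLD_LOCAL. */
void *open_with_scope(const char *path, int global)
{
    assert(path != NULL);
    return dlopen(path, RTLD_NOW | (global ? RTLD_GLOBAL : RTLD_LOCAL));
}

/* Look `symbol` up through `handle` and print the result, labelled with
 * the library name. A NULL handle is reported as a NULL address. */
void *report_symbol(void *handle, const char *symbol, const char *label)
{
    assert(symbol != NULL && label != NULL);
    void *addr = (handle != NULL) ? dlsym(handle, symbol) : NULL;
    printf("%p from main through dlsym %s\n", addr, label);
    return addr;
}
```

A driver along these lines might call `open_with_scope("./intermediate.so", argc > 1)` and then `report_symbol(handle, "gnutls_handshake", "intermediate.so")`.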

This smells like a glibc bug. On my FreeBSD 10.2 VM (swapping libgnutls.so and gnutls_handshake for libreadline.so and write_history, since it doesn't have GnuTLS by default):

# env LD_PRELOAD=./preload.so ./main
0x80081f570 from preload constructor
0x80081f570 from intermediate constructor
0x801815010 from main through dlsym intermediate.so
0x80081f570 from intermediate function
0x801815010 from RTLD_NEXT inside preload
0x801815010 from main through dlsym libreadline.so
# env LD_PRELOAD=./preload.so ./main g
0x80081f570 from preload constructor
0x80081f570 from intermediate constructor
0x801815010 from main through dlsym intermediate.so
0x80081f570 from intermediate function
0x801815010 from RTLD_NEXT inside preload
0x801815010 from main through dlsym libreadline.so

I think we should start by reporting this to the glibc maintainers and asking what the intended behavior is. I don't think there's a way to reliably determine in dl_iterate_phdr which of the various loaded libraries is really the next library to call: part of the usefulness of RTLD_LOCAL is to allow you to load multiple libraries that expose symbols with the same name, and get the right one. So I'd rather not go that route unless we have to.

@jethrogb
Author

Thanks for investigating this further!

I agree that there is some ambiguity in some cases. The ambiguity arises when multiple loaded libraries define the same symbol AND some references to that symbol are bound to the preloaded definition instead. I think this can happen in the following cases:

  • symbol versioning
  • soname versioning
  • different rpaths

I'll try to come up with test cases for each of these.
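
For the symbol versioning case, glibc offers `dlvsym`, which takes an explicit version string. The sketch below assumes glibc and assumes the wrapper somehow knows which version it wants; `lookup_versioned` is a made-up helper name, and any version string passed to it is up to the caller.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Return `symbol` at `version` from `handle`, or the default version
 * when `version` is NULL. Returns NULL if nothing matches. */
void *lookup_versioned(void *handle, const char *symbol, const char *version)
{
    assert(symbol != NULL);
    if (version == NULL)
        return dlsym(handle, symbol);        /* whatever the loader picks */
    return dlvsym(handle, symbol, version);  /* glibc extension */
}
```

This only helps when the version is known up front; it does not tell a wrapper which version the intercepted caller was actually bound to.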

Ideally a wrapper function would be able to identify the exact function that is being replaced, but I'm not sure if that's possible.
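
One partial aid is `dladdr`, which reports the loaded object an address belongs to. It does not identify the function being replaced, but it lets a wrapper or a test log where a candidate address came from. `object_providing` is a hypothetical helper name, and the sketch assumes glibc's `dladdr`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Return the path of the loaded object containing `addr`, or NULL if
 * the address does not belong to any loaded object. */
const char *object_providing(void *addr)
{
    Dl_info info;
    if (addr == NULL || dladdr(addr, &info) == 0)
        return NULL;
    /* On success dladdr fills in the object's file name. */
    assert(info.dli_fname != NULL);
    return info.dli_fname;
}
```

For example, printing `object_providing(fp)` next to each address in a test run shows which library each lookup actually landed in.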

@jethrogb
Author

Sorry, I got distracted by a bug in ld. This gist builds test cases for each of the above scenarios, as well as 7 test binaries: 6 binaries linked to two versions of each scenario, and 1 binary that only uses libdl. All binaries are meant to be run with LD_LIBRARY_PATH=..

An interesting test case is, for example, test-lsoname1 dl,now,libusesymver2.so, in which the dynamic linker completely ignores the symbol version request of libusesymver2.so. Fun fact: the library with the correct versioned symbol is mapped into memory.
