Saturday, December 18, 2010

RPATH, RUNPATH and LD_LIBRARY_PATH

Long time since my last post - my bad.

Recently I have been working on a renewed FWTools for linux - hopefully I'll follow up on that later. Much like in the past FWTools is intended to work across a variety of linux systems (ie. Ubuntu, Redhat, Suse, Gentoo) and to not interfere with the native packing system nor to require special permissions to install.

To that end we try to distribute as many of the required shared libraries as possible, and we use the LD_LIBRARY_PATH environment variable to ensure FWTools finds and uses these shared libraries in preference to any system provided versions which might be incompatible (though often aren't).

In my local testing I was getting frustrated as my shared libraries were not being used. Instead the system versions of things like libsqlite3.so.4 were getting picked up from /usr/lib64 despite my LD_LIBRARY_PATH pointing to a copy distributed with FWTools. I check this with the ldd command which shows what sublibraries an executable or shared library need and which is being picked according to the current policy.

I carefully read the man page for ld.so, which is responsible for loading shared libraries at runtime, and it indicated:

The necessary shared libraries needed by the program are searched for
in the following order

o Using the environment variable LD_LIBRARY_PATH
(LD_AOUT_LIBRARY_PATH for a.out programs). Except if the exe‐
cutable is a setuid/setgid binary, in which case it is ignored.
...

Well, my programs are not setuid/setgid nor are they a.out (very old style of linux binary), so what gives? Due to poor time synchronization on my systems and use of nfs, I noticed that some shared library dates were in the future. I thought perhaps some funky security policy prevented them from being used, but that theory did not pan out. I tried doing it on a local file system incase there were some rule about not trusting shared libraries from remote file systems in some cases. Still no go.

However, googling around the net I found that my local ld.so man page was not very comprehensive. Another ld.so man page gave me a hint, referring to DT_RPATH and DT_RUNPATH attributes being used before LD_LIBRARY_PATH. I was vaguely aware of shared libraries having an "rpath" attribute which could be set at link time and might have some sort of influence on searching. Distribution packaging folks have sometimes bugged me about it, especially back in the old days before libtool was used for GDAL. But I had never really understood it well and it just seemed like unnecessary stuff.

Further digging confirmed that ELF (the modern linux object file format) executables and shared libraries may have an RPATH attribute built right into the file. This is essentially a directory that should be searched first for shared libraries that this executable or shared library depends on. RUNPATH is similar, but is apparently evaluated after LD_LIBRARY_PATH. I got my critical information from this blog post by Trevor King.

My old favorite, the ldd command, does not show the rpath settings though it does honour them. I also use nm to dump the symbols in object files and shared libraries, but it doesn't address special attributes like rpath. Digging around I found the readelf command which can report all sorts of information about a shared library or executable, including the RPATH, and RUNPATH. It also helpfully lists the shared libraries that are needed, something that often gets obscured in ldd by the chaining of dependencies.

The rpath attribute is generally set by libtool when linking shared libraries. GDAL includes a build option to build without libtool, and I contemplated using that which would help alot. But it wouldn't help for other libraries in FWTools that were not directly pulled in by GDAL. My goal with the modern FWTools is to - as much as possible - use standard CentOS/RedHat rpm binaries as the basis for the product. So I don't want to have to rebuild everything from scratch. What I really wanted was a way to suppress use of the rpath, switch it to runpath or remove it entirely. Low and behold, I am not the first with this requirement. There is a command called chrpath that can do any of these "in place" to a binary (see chrpath(1) man page).

chrpath -d usr/lib64/libgdal.so

This command isn't installed by default on my system, but it was packaged and it works very nicely. I'll preprocess the shared libraries and binaries to strip out rpath so my LD_LIBRARY_PATH magic works smoothly. I had even considered using it to replace the rpath at install time with the path inside the FWTools distribution for libraries. This, unfortunately, does not appear to be possible as it seems chrpath can only replace rpath's with rpaths of the same or a shorter length. In any event, FWTools still depends on cover scripts to set other environment variables so doing this wouldn't have made it practical to avoid my environment setup magic.

I owe a lot to some other blog and mailing list posts turned up by google. This blog entry is partly an attempt to seed the net with helpful answers for others that run into similar problems. I will say it is very annoying that a company decided to name itself "rPath" - it made searching quite a bit harder!