Statically linking on macOS

March 2, 2022

I was experimenting with zlib, and I wanted to statically link against the version of the zlib library I built. I was having some issues and I ran across this gist by Charles Francoise.

It was informative and helpful.

I’m sharing it here as a mirror, in case he deletes it in the future. All credit goes to Charles (GitHub).

ld – Wading through Mac OS X linker hell

Intro

Friend: I tried looking at static linking in Mac OS X and it seems nearly impossible. Take a look at this http://stackoverflow.com/a/3801032

Me: I have no idea what that -static flag does, but I’m pretty sure that’s not how you link to a library. Let me RTFM a bit.

Minutes later…

Me: I’m gonna have to write this stuff down.

Reading the f…antastic manuals

Static linking vs. dynamic linking

From Wikipedia:

A static library or statically-linked library is a set of routines, external functions and variables which are resolved in a caller at compile-time and copied into a target application by a compiler, linker, or binder, producing an object file and a stand-alone executable.

A dynamically linked library is a library intended for dynamic linking. Only a minimum amount of work is done by the linker when the executable file is created; it only records what library routines the program needs and the index names or numbers of the routines in the library. The majority of the work of linking is done at the time the application is loaded (load time) or during execution (run time). Usually, the necessary linking program, called a “dynamic linker” or “linking loader”, is actually part of the underlying operating system.

clang

First things first, gcc isn’t the default compiler in Mac OS X anymore. Since Xcode 5, the Apple developer toolchain uses clang, and gcc only aliases to clang.

NOTE: All shell outputs in this document were produced with the default bash on Mac OS X El Capitan 10.11.3 with Xcode 7.2 installed.

$ gcc -v
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.3.0
Thread model: posix

So let’s look into clang.

$ man clang

clang is a C, C++, and Objective-C compiler which encompasses preprocessing,
parsing, optimization, code generation, assembly, and linking. Depending on
which high-level mode setting is passed, Clang will stop before doing a full
link. While Clang is highly integrated, it is important to understand the stages
of compilation, to understand how to invoke it.

So static linking is a concern of the linker stage of compilation. Let’s try the manual for ld, the Mac OS X linker.

ld

$ man ld

OPTIONS
   Options that control the kind of output
     -execute    The default. Produce a mach-o main executable that has file
                 type MH_EXECUTE.

     -dylib      Produce a mach-o shared library that has file type MH_DYLIB.

     -bundle     Produce a mach-o bundle that has file type MH_BUNDLE.

     -dynamic    The default. Implied by -dylib, -bundle, or -execute

     -static     Produces a mach-o file that does not use the dyld. Only used
                 building the kernel.

Now we have it. -static does not control how the output links to libraries; it controls the type of output produced by the linker. In this case, -static is used to indicate that no dynamic linking should occur with this binary. Ever. The only file to ever need this option is the kernel.

So how do we tell the linker which library to link against?

$ man ld

Options that control libraries
  -lx         This option tells the linker to search for libx.dylib or libx.a in
              the library search path. If string x is of the form y.o, then that
              file is searched for in the same places, but without prepending
              `lib' or appending `.a' or `.dylib' to the filename.

  -Ldir       Add dir to the list of directories in which to search for
              libraries. Directories specified with -L are searched in the order
              they appear on the command line and before the default search
              path. In Xcode4 and later, there can be a space between the -L and
              directory.

These are the only two options you need to know to link to most (static or dynamic) libraries. More can be said about lazy linking, frameworks, and many other aspects of linking and libraries but these topics are outside of the scope of this post.

-l is used to tell the linker the name of the libraries to look into for symbols. If you decide to compile a program that uses functions from libpng, you would add -lpng to the arguments passed to clang.

-L is used to tell the linker where to look for the library files. ld maintains a list of directories to search for a library to use. The default library search path is /usr/lib then /usr/local/lib. The -L option will add a new library search path.

At this point it is important to point out that, if the linker finds both a .a (static) and a .dylib (dynamic) file in the search paths, it will always choose the dynamic library.
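To make that lookup concrete, here is a rough shell sketch of the search. It is a simplified model, not ld's actual implementation: find_library is a hypothetical helper, and it assumes each directory is checked for the .dylib before the .a, in the order the directories are given.

```shell
# find_library NAME DIR...
# Print the first matching library file. Within each directory,
# lib<NAME>.dylib is preferred over lib<NAME>.a.
find_library() {
    name=$1
    shift
    for dir in "$@"; do
        for ext in dylib a; do
            candidate="$dir/lib$name.$ext"
            if [ -f "$candidate" ]; then
                echo "$candidate"
                return 0
            fi
        done
    done
    return 1   # nothing found in any directory
}
```

Called as find_library curl curl/lib/.libs /usr/lib /usr/local/lib, this would print the .dylib whenever one sits next to the .a in the first directory that has either.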

Example

UNIX man pages are infamous for being dry, to the point and absolutely impossible to read for first-timers. Let’s try an example to see how all this fits together.

Basic libcurl program

We’ll be using a very simple C program that uses libcurl to retrieve the content at http://google.com.

#include <curl/curl.h>
#include <stdio.h>

int main(int argc, char** argv) {
    CURL *curl;
    CURLcode res;

    curl = curl_easy_init();

    if (curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "http://google.com");
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
        res = curl_easy_perform(curl);

        if(res != CURLE_OK)
            fprintf(stderr, "curl_easy_perform() failed: %s\n",
                curl_easy_strerror(res));

        curl_easy_cleanup(curl);
    }

    return 0;
}

Dynamically linking against default libcurl

libcurl is installed by default with Mac OS X, so we don’t need to provide headers or libraries to compile this program.

$ ls -1 /usr/include/curl
curl.h
curlbuild.h
curlrules.h
curlver.h
easy.h
mprintf.h
multi.h
stdcheaders.h
typecheck-gcc.h

$ ls -1 /usr/lib/libcurl*
/usr/lib/libcurl.3.dylib
/usr/lib/libcurl.4.dylib
/usr/lib/libcurl.dylib

So let’s try to compile our program!

$ clang -o curl_example-system curl_example.c

OH NO! We get a bunch of “Undefined symbols” errors. Why?

We simply forgot to specify that we wanted to link against libcurl at the linker stage. Let’s add the suitable -l option.

$ clang -o curl_example-system -lcurl curl_example.c
$ ./curl_example-system
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="fr"
><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta
content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop=
"image"><title>Google</title>...

Great! We have a working program that links against libcurl.

So did we link dynamically or statically? When we inspected the contents of /usr/lib, we only found .dylib files, so we can expect that this executable links against libcurl dynamically. We can confirm this by using otool.

otool is a command-line tool to display different parts of object files. In our case, we want to see which libraries the executable links against, so we use the -L option.

$ otool -L curl_example-system
curl_example-system:
    /usr/lib/libcurl.4.dylib (compatibility version 7.0.0, current version 8.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)

We can see our executable links against libcurl and libSystem.

That went pretty smoothly, so let’s see what happens when we try to link against a custom build of libcurl.

Dynamically linking against custom libcurl

First of all, let’s build our own version of curl.

$ git clone https://github.com/curl/curl.git
$ cd curl
$ ./buildconf
$ ./configure
$ make

Supposing everything went smoothly, you now have your own freshly brewed curl and libcurl.

$ cd ..
$ ls -1 curl/lib/.libs
libcurl.4.dylib
libcurl.a
libcurl.dylib
...

So inside curl/lib/.libs we have our libraries, both static and dynamic. Let’s try compiling our program linking against our brand new libcurl. We’ll have to give the linker the path to our new libraries using the -L option.

$ clang -o curl_example-dynamic -Lcurl/lib/.libs/ -lcurl curl_example.c

SUCCESS! We have a new executable. Let’s see which libraries this one links against.

$ otool -L curl_example-dynamic
curl_example-dynamic:
    /usr/local/lib/libcurl.4.dylib (compatibility version 9.0.0, current version 9.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)

We can definitely see that our new executable does not link against the system libcurl, but where does that /usr/local/lib path come from?

To answer this question, we need to inspect our brand new libcurl.

$ otool -L curl/lib/.libs/libcurl.dylib
curl/lib/.libs/libcurl.dylib:
    /usr/local/lib/libcurl.4.dylib (compatibility version 9.0.0, current version 9.0.0)
    /usr/local/opt/openssl/lib/libssl.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/local/opt/openssl/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
    /System/Library/Frameworks/LDAP.framework/Versions/A/LDAP (compatibility version 1.0.0, current version 2.4.0)
    /usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.5)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)

That 1st line looks familiar… It’s exactly the same as our executable! Actually, on Mac OS X, a dynamic library knows where it’s “expected” to be installed and uses that path as an identifier. When the linker created our executable, it looked into our libcurl.dylib and used the “install name”. So let’s run our new executable!

$ ./curl_example-dynamic
dyld: Library not loaded: /usr/local/lib/libcurl.4.dylib
  Referenced from: /Users/chrales/Desktop/./curl_example-dynamic
  Reason: Incompatible library version: curl_example-dynamic requires version 9.0.0 or later, but libcurl.4.dylib provides version 7.0.0
Trace/BPT trap: 5

What? Why?? NO! What happened here?

These types of errors are actually a consequence of “dynamic linking”. Remember that dynamic linking requires a program to link symbols at run-time. In Mac OS X, the dynamic linker is a part of the OS. We can still get some insight from the manual pages.

man dyld reveals a number of environment variables that can be used to affect the way symbols are dynamically linked. This one is of particular interest for us.

DYLD_PRINT_LIBRARIES
       When  this  is set, the dynamic linker writes to file descriptor 2
       (normally standard error) the filenames of the libraries the program is
       using.  This is useful to make sure that the use of DYLD_LIBRARY_PATH is
       getting what you want.

Seems like a great way to track what’s going on with our dynamic linking, and diagnose our previous error.

$ DYLD_PRINT_LIBRARIES=1 ./curl_example-dynamic
dyld: loaded: /Users/chrales/Desktop/curl_example/./curl_example-dynamic
dyld: loaded: /usr/lib/libcurl.4.dylib
dyld: unloaded: /Users/chrales/Desktop/curl_example/./curl_example-dynamic
dyld: Library not loaded: /usr/local/lib/libcurl.4.dylib
  Referenced from: /Users/chrales/Desktop/curl_example/./curl_example-dynamic
  Reason: Incompatible library version: curl_example-dynamic requires version 9.0.0 or later, but libcurl.4.dylib provides version 7.0.0
Trace/BPT trap: 5

That 2nd line tells it all. Since the dynamic linker did not find libcurl at the expected install path, it fell back to the system library, which is a different (and incompatible) version than the one used when building our executable.

How do we tell the dynamic linker to use our custom library? This is also solved by using an environment variable.

DYLD_LIBRARY_PATH
       This is a colon separated list of directories that contain libraries. The
       dynamic linker searches these directories before it searches the default
       locations for libraries. It allows you to test new versions of existing
       libraries.

Seems perfect for the job… and it is!

$ DYLD_LIBRARY_PATH=curl/lib/.libs ./curl_example-dynamic
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="fr"
><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta
content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop=
"image"><title>Google</title>...

In the future, should we decide to keep our own libcurl and use it, we could either install it to the expected path, or change the install name to reflect where our library will be located. This is a mildly annoying, but very powerful feature of Mac OS’s dynamic linker that we can leverage to distribute custom versions of libraries with our software, even when those libraries are already installed on the system.
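The second option (changing the install name) can be sketched with install_name_tool, which ships with Apple's developer tools. This is only a sketch under assumptions: it supposes a copy of libcurl.4.dylib is placed next to the executable, and it uses the @executable_path prefix that dyld resolves relative to the executable. The run and retarget_libcurl helpers are hypothetical names made up for this example.

```shell
# run: execute a command, or just print it when DRY_RUN=1 (handy for review).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# retarget_libcurl EXE: rewrite the libcurl path recorded in EXE so that dyld
# looks for the library next to the executable instead of in /usr/local/lib.
retarget_libcurl() {
    run install_name_tool -change \
        /usr/local/lib/libcurl.4.dylib \
        @executable_path/libcurl.4.dylib \
        "$1"
}

# On a Mac, with libcurl.4.dylib copied next to the executable:
#   retarget_libcurl curl_example-dynamic
```

After that, otool -L on the executable should list the @executable_path entry in place of /usr/local/lib/libcurl.4.dylib, and DYLD_LIBRARY_PATH should no longer be needed.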

Statically linking against custom libcurl

Dynamic linking and static linking both have distinct advantages and disadvantages. In our case, statically linking against a particular build of libcurl could be useful if we want to be absolutely sure that no other version of libcurl (especially the system version) is used when running our program, or if we want to distribute our program without having to distribute our libcurl along with it.

We know that the linker will choose the dynamic library rather than the static library whenever it finds it on the search paths. For this example, we’ll have to remove the dylib files from the build path.

$ rm curl/lib/.libs/*.dylib

Let’s try compiling.

$ clang -o curl_example-static -Lcurl/lib/.libs/ -lcurl curl_example.c
Undefined symbols for architecture x86_64:
  "_ASN1_INTEGER_get", referenced from:
      _ossl_connect_common in libcurl.a(libcurl_la-openssl.o)
  "_ASN1_STRING_data", referenced from:
      _ossl_connect_common in libcurl.a(libcurl_la-openssl.o)
  "_ASN1_STRING_length", referenced from:
      _ossl_connect_common in libcurl.a(libcurl_la-openssl.o)
...

That’s quite a large amount of undefined symbols! Why did this happen?

Remember when we used otool to display all the libraries our libcurl.dylib linked against? Remember libssl, libcrypto, libz and the LDAP framework were all mentioned? That’s where the missing symbols are. By linking statically against libcurl, we copy all the symbols from libcurl into our executable file, but we still need to link (statically or dynamically) against the symbols from these other libraries. However, since we didn’t have any problems running our dynamically linked version of curl_example, it means that the dynamic libraries can be found somewhere on the system. We can just tell the linker to link dynamically to the system libraries to find the missing symbols.

Remember that our libcurl links against libssl and libcrypto from /usr/local/opt/openssl/lib. There is another, incompatible version in /usr/lib; we want to avoid linking against that one, so we’ll have to add the path to the one our custom libcurl uses with a -L linker option.

$ clang -o curl_example-static -Lcurl/lib/.libs/ -L/usr/local/opt/openssl/lib/ -lcurl -lssl -lcrypto -lz -framework LDAP curl_example.c
$ ./curl_example-static
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="fr"
><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta
content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop=
"image"><title>Google</title>...

Voilà!
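The extra flags on that command line can be read straight off the otool -L output we got for libcurl.dylib earlier. Here is a rough sketch of that translation. deps_to_flags is a hypothetical helper, and it assumes the usual naming conventions (libNAME.VERSION.dylib becomes -lNAME, NAME.framework becomes -framework NAME). It skips libSystem, which clang links by default, and the library named as its argument.

```shell
# deps_to_flags SKIP: read `otool -L` output on stdin, print one linker flag
# per dependency, leaving out libSystem and lib<SKIP> itself.
deps_to_flags() {
    awk -v skip="$1" '
        NR == 1 { next }                      # header line naming the inspected file
        {
            n = split($1, parts, "/")
            base = parts[n]                   # last path component
        }
        $1 ~ /\.framework\// {
            print "-framework " base
            next
        }
        {
            sub(/^lib/, "", base)             # libssl.1.0.0.dylib -> ssl.1.0.0.dylib
            sub(/\..*$/, "", base)            # ssl.1.0.0.dylib    -> ssl
            if (base != "System" && base != skip) print "-l" base
        }
    '
}

# Fed with the output shown earlier in this post:
deps_to_flags curl <<'EOF'
curl/lib/.libs/libcurl.dylib:
    /usr/local/lib/libcurl.4.dylib (compatibility version 9.0.0, current version 9.0.0)
    /usr/local/opt/openssl/lib/libssl.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/local/opt/openssl/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
    /System/Library/Frameworks/LDAP.framework/Versions/A/LDAP (compatibility version 1.0.0, current version 2.4.0)
    /usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.5)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
EOF
```

This prints -lssl, -lcrypto, -framework LDAP and -lz, one per line. It says nothing about which -L paths are needed; those still have to be taken from the directories in the otool output.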

Let’s inspect our new executable using otool.

$ otool -L curl_example-static
curl_example-static:
  /usr/local/opt/openssl/lib/libssl.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
  /usr/local/opt/openssl/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
  /usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.5)
  /System/Library/Frameworks/LDAP.framework/Versions/A/LDAP (compatibility version 1.0.0, current version 2.4.0)
  /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)

No mention of libcurl, we did it! Notice that we still need to link dynamically (at run-time) against the other libraries. But libcurl is not required at run-time; we can say that libcurl is statically linked to our executable.
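There is also a way to get the same result without deleting the .dylib files, sketched here under the same setup: name the archive by its full path instead of using -lcurl. A .a file given by path is linked as is, so the linker never searches for libcurl at all. static_link_args is a hypothetical helper that only prints the linker inputs.

```shell
# static_link_args ARCHIVE: print the linker inputs, one per line, with the
# static archive passed by path rather than looked up through -L/-l.
static_link_args() {
    printf '%s\n' "$1" \
        -L/usr/local/opt/openssl/lib/ -lssl -lcrypto -lz -framework LDAP
}

# On a Mac with the build tree from this post (paths without spaces assumed):
#   clang -o curl_example-static curl_example.c \
#       $(static_link_args curl/lib/.libs/libcurl.a)
```

The remaining libraries are still resolved through -L and -l, so they stay dynamically linked exactly as before.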

Looking deeper

Inside curl_example-system

So what’s actually in our executables? We’ll use a tool called nm. nm lists and displays information about the symbols in our executable. Let’s try running it on our curl_example-system executable.

$ nm -m curl_example-system
                 (undefined) external ___stderrp (from libSystem)
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
                 (undefined) external _curl_easy_cleanup (from libcurl)
                 (undefined) external _curl_easy_init (from libcurl)
                 (undefined) external _curl_easy_perform (from libcurl)
                 (undefined) external _curl_easy_setopt (from libcurl)
                 (undefined) external _curl_easy_strerror (from libcurl)
                 (undefined) external _fprintf (from libSystem)
0000000100000e40 (__TEXT,__text) external _main
                 (undefined) external dyld_stub_binder (from libSystem)

Remember from using otool that curl_example-system links dynamically to libcurl and libSystem. We can see that every symbol linked from one of those libraries is listed as (undefined), meaning they aren’t defined in our executable, but rather in another object file, linked at run-time. The only two symbols that are defined in our executable (and located in the __text section of the __TEXT segment) are __mh_execute_header and _main. __mh_execute_header is the “mach header” of a Mach-O executable file, used by the OS to identify an executable file, and _main is the main function of our program. All other symbols are either symbols from the standard C library or from the cURL library.
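For longer listings, it can help to tally the undefined symbols per library. The following is a small sketch that assumes the nm -m output format shown above, where each undefined symbol ends with a (from ...) note. undefined_by_library is a hypothetical helper.

```shell
# undefined_by_library: read `nm -m` output on stdin and print, for each
# library, how many undefined symbols are expected to come from it.
undefined_by_library() {
    awk '
        /\(undefined\)/ && match($0, /\(from [^)]*\)/) {
            # Strip the leading "(from " and the trailing ")".
            lib = substr($0, RSTART + 6, RLENGTH - 7)
            count[lib]++
        }
        END { for (lib in count) print lib, count[lib] }
    ' | LC_ALL=C sort
}

# On a Mac (not run here):
#   nm -m curl_example-system | undefined_by_library
```

For the listing above, this would report 3 symbols from libSystem and 5 from libcurl.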

To learn more about what a “segment” and a “section” are and more info about the internals of an executable file, read Anatomy Of A Program In Memory. (Linux-specific, but a lot of generic information inside).

Inside curl_example-dynamic

Let’s try curl_example-dynamic.

$ nm -m curl_example-dynamic
                 (undefined) external ___stderrp (from libSystem)
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
                 (undefined) external _curl_easy_cleanup (from libcurl)
                 (undefined) external _curl_easy_init (from libcurl)
                 (undefined) external _curl_easy_perform (from libcurl)
                 (undefined) external _curl_easy_setopt (from libcurl)
                 (undefined) external _curl_easy_strerror (from libcurl)
                 (undefined) external _fprintf (from libSystem)
0000000100000e40 (__TEXT,__text) external _main
                 (undefined) external dyld_stub_binder (from libSystem)

(Un)surprisingly, we get exactly the same output, since both executables link dynamically to the same libraries. We know this from using otool. We also know from otool that a version and path are stored for each dynamic library we link to, making sure that -system uses the default libcurl and that -dynamic uses our custom build.
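To compare just those stored paths between two executables, the version details can be stripped from the otool -L output. This is a minimal sketch that assumes the output layout shown earlier (one header line, then one path per line). recorded_paths is a hypothetical helper.

```shell
# recorded_paths: read `otool -L <file>` output on stdin and print only the
# library paths, dropping the header line and the version details.
recorded_paths() {
    awk 'NR > 1 { print $1 }'
}

# On a Mac (not run here):
#   otool -L curl_example-system  | recorded_paths
#   otool -L curl_example-dynamic | recorded_paths
```

With the outputs from this post, the two lists would differ only in the libcurl line: /usr/lib/libcurl.4.dylib for -system and /usr/local/lib/libcurl.4.dylib for -dynamic.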

Inside curl_example-static

Finally, let’s look at curl_example-static.

$ nm -m curl_example-static
                 (undefined) external _ASN1_INTEGER_get (from libcrypto.1)
                 (undefined) external _ASN1_STRING_data (from libcrypto.1)
                 (undefined) external _ASN1_STRING_length (from libcrypto.1)
...
                 (undefined) external _SSL_CIPHER_get_name (from libssl.1)
                 (undefined) external _SSL_CTX_add_client_CA (from libssl.1)
                 (undefined) external _SSL_CTX_check_private_key (from libssl.1)
...
                 (undefined) external ___bzero (from libSystem)
                 (undefined) external ___error (from libSystem)
                 (undefined) external ___maskrune (from libSystem)
...
0000000100021cf7 (__TEXT,__text) external _curl_easy_cleanup
0000000100021e66 (__TEXT,__text) external _curl_easy_duphandle
000000010001b2d9 (__TEXT,__text) external _curl_easy_escape
0000000100021d9e (__TEXT,__text) external _curl_easy_getinfo
0000000100021a1d (__TEXT,__text) external _curl_easy_init
0000000100022098 (__TEXT,__text) external _curl_easy_pause
0000000100021b0c (__TEXT,__text) external _curl_easy_perform
000000010002213c (__TEXT,__text) external _curl_easy_recv
0000000100022017 (__TEXT,__text) external _curl_easy_reset
00000001000221f3 (__TEXT,__text) external _curl_easy_send
0000000100021a60 (__TEXT,__text) external _curl_easy_setopt
000000010002a776 (__TEXT,__text) external _curl_easy_strerror
000000010001b554 (__TEXT,__text) external _curl_easy_unescape
...
                 (undefined) external _inflate (from libz)
                 (undefined) external _inflateEnd (from libz)
                 (undefined) external _inflateInit2_ (from libz)
                 (undefined) external _inflateInit_ (from libz)
...
                 (undefined) external _ldap_err2string (from LDAP)
                 (undefined) external _ldap_first_attribute (from LDAP)
                 (undefined) external _ldap_first_entry (from LDAP)
...

Woah! That’s quite a load of symbols. Where do all these symbols come from?

First, let’s notice that all symbols from libcurl are now visible in the executable, and that they aren’t (undefined) anymore, but defined in (__TEXT,__text). This confirms that by linking statically to libcurl, we actually wrote the code from the library into our executable, dispensing with the need to find those symbols elsewhere at run-time.
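That observation can be turned into a quick check. The sketch below assumes the nm -m format shown above and treats a library as statically linked when at least one symbol with the given prefix is defined in (__TEXT,__text) and none is (undefined). linked_statically is a hypothetical helper, and matching on a symbol prefix is only a heuristic.

```shell
# linked_statically PREFIX: read `nm -m` output on stdin. Succeeds (exit 0)
# if symbols starting with PREFIX are defined in the executable itself and
# none of them are left undefined.
linked_statically() {
    awk -v prefix="$1" '
        index($0, " external " prefix) == 0 { next }   # not one of our symbols
        /\(undefined\)/     { undefined++; next }
        /\(__TEXT,__text\)/ { defined++ }
        END { exit !(defined > 0 && undefined == 0) }
    '
}

# On a Mac (not run here):
#   nm -m curl_example-static | linked_statically _curl_ && echo "libcurl is linked in"
```

Run against curl_example-dynamic instead, the same check should fail, since every _curl_ symbol there is (undefined).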

Now let’s look at all the other new (undefined) symbols. nm is nice enough to mention where the symbols are located. Remember from building our -static executable and from using otool that we had to specify all the libraries libcurl required on the command line for the linker. We can see now that we have a bunch of (undefined) symbols for all those libraries. Just like the symbols marked from libcurl in our previous builds, we need to have libcrypto, libssl (both part of OpenSSL), libz and the LDAP framework installed on the system for our program to run.

Further reading

While this article does not go in depth about static vs. dynamic linking, it does shed a lot of light on many lesser known developer tools in Mac OS X. To learn more about how Apple’s developer tools compile code into an executable, you can look into these links, which were used extensively while writing this article.

  1. clang User’s Manual
  2. man ld
  3. man otool
  4. man dyld
  5. man nm
  6. Anatomy of a Program in Memory - Gustavo Duarte
  7. Mach-O executable - objc.io