Archive for March, 2013

A tale of hidden symbols, weakness and gold

A few days ago I was profiling the startup time of the clang compiler. The callgrind report highlighted that a relatively high amount of time was being spent in _dl_lookup_symbol_x. A quick gdb inspection of the call stack showed that the dynamic linker was spending that time doing dynamic relocations. Tons of them.

Using objdump -R clang I found out that a whopping 42% of them actually contained the word “clang” in the mangled symbol names. Demangling the names made it clear that they were mostly definitions of the C++ virtual tables for clang internal classes.
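To make the counting step concrete, here is a minimal sketch of how such a percentage could be computed from the text that objdump -R prints. The helper names and the SAMPLE text are made up for illustration (they are not the real clang output), and the parser assumes the usual three-column OFFSET / TYPE / VALUE layout.

```python
import re

# One record line of `objdump -R` output: OFFSET  TYPE  VALUE
RELOC_LINE = re.compile(r"^([0-9a-fA-F]+)\s+(R_\w+)\s+(\S+)\s*$")


def relocation_symbols(objdump_output):
    """Return the symbol named by each dynamic relocation record."""
    symbols = []
    for line in objdump_output.splitlines():
        match = RELOC_LINE.match(line)
        if match:
            value = match.group(3)
            # Drop a trailing addend such as "+0x10" if present.
            symbols.append(value.split("+", 1)[0])
    return symbols


def share_containing(symbols, needle):
    """Fraction of symbols whose (mangled) name contains `needle`."""
    if not symbols:
        return 0.0
    hits = sum(1 for name in symbols if needle in name)
    return hits / len(symbols)


# Hypothetical sample, shaped like objdump -R output (not real data).
SAMPLE = """\
clang:     file format elf64-x86-64

DYNAMIC RELOCATION RECORDS
OFFSET           TYPE              VALUE
0000000001f3b0c8 R_X86_64_GLOB_DAT  __gmon_start__
0000000001f3b140 R_X86_64_64       _ZTVN5clang4DeclE
0000000001f3b148 R_X86_64_64       _ZTVN5clang4StmtE+0x0000000000000010
0000000001f3b0e8 R_X86_64_JUMP_SLOT  memcpy
"""

if __name__ == "__main__":
    names = relocation_symbols(SAMPLE)
    share = share_containing(names, "clang")
    print(f"{share:.0%} of {len(names)} relocations mention 'clang'")
```

On a real binary the same functions could be fed the captured output of objdump -R; piping the names through c++filt afterwards shows which ones are vtables.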

Basically 42% of the relocations happening at run time were actually looking for symbols which are obviously defined inside the clang executable itself. So what’s happening here?

Turns out the problem is a poor interaction between the traditional linker (ld, from the binutils package) and C++'s ODR rule.

ODR stands for One Definition Rule and says that any entity in C++ code must be defined only once. This is trivial to enforce for ordinary (non-inline) methods: if they are defined twice, even in different translation units, there will be an error at link time. There are, though, a few things that are implicitly defined by the compiler, for example the virtual tables, which are defined whenever a class using virtual methods is declared. (Here I’m speaking informally, I suspect the terminology is not 100% correct.) Since the vtable is potentially defined in more than one translation unit, the symbols for the vtable are flagged as weak. Weak symbols do not conflict with each other: all of them are discarded but one (any one is fine), and the surviving one is used in the end product of the linker.
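The weak-symbol behaviour can be sketched as a toy model. This is not how ld is implemented; it only mimics the rule described above, plus the standard ELF convention that a strong (global) definition wins over a weak one. The Definition tuple, the resolve function and the symbol names are all made up for the example.

```python
from collections import namedtuple

# A symbol definition as seen by a (toy) linker.
Definition = namedtuple("Definition", "symbol binding object_file")


def resolve(definitions):
    """Pick one definition per symbol, following weak/global rules."""
    chosen = {}
    for d in definitions:
        current = chosen.get(d.symbol)
        if current is None:
            chosen[d.symbol] = d
        elif current.binding == "weak" and d.binding == "global":
            # A strong definition replaces a weak one.
            chosen[d.symbol] = d
        elif current.binding == "global" and d.binding == "global":
            # Two strong definitions: the classic link-time error.
            raise ValueError(f"multiple definition of {d.symbol}")
        # Otherwise the new definition is a weak duplicate: discard it.
    return chosen


# Hypothetical input: the vtable of a class Shape emitted in two objects.
DEFS = [
    Definition("_ZTV5Shape", "weak", "a.o"),
    Definition("_ZTV5Shape", "weak", "b.o"),
    Definition("_Z4drawv", "global", "a.o"),
]

if __name__ == "__main__":
    for symbol, d in resolve(DEFS).items():
        print(symbol, "taken from", d.object_file)
```

Both copies of the vtable are accepted without complaint and only one survives, while a second global definition of _Z4drawv would raise the error.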

Unfortunately, the compiler does not know if the compiled object file will be used as part of a dynamic library or as part of a main executable. This means that it has to treat the vtable symbols like any library method, since (if the target is a dynamic library) such symbols may potentially be overridden by another definition in a library or in the main executable. This has to happen since the ODR rule must also apply across library boundaries for proper support of a few things, especially exception handling.

So at compile time there is no way around using dynamic relocation on the vtable symbols. A possible workaround would be to compile all the code with the -fvisibility=hidden flag. Unfortunately this is actually wrong since it may break the ODR rule!

At link time the linker has the chance to eliminate the dynamic relocation, since it does know if the target is a main executable, and symbols defined in an executable cannot be overridden by anything (not even by LD_PRELOADed libraries). Unfortunately the traditional ld linker does not apply this optimization.

The new gold linker, originally developed at Google, does, though! It is able to completely eradicate the dynamic relocations to internally defined symbols, effectively reducing the load time.

Moral of the tale: use the gold linker. It should work in most cases (I think kernel code is a notable exception) and generate faster-loading executables while consuming less memory and CPU time during linking.
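For anyone wondering how to actually switch linkers: both the GCC and clang drivers accept -fuse-ld=gold, provided gold is installed. Below is a small sketch of a build-script helper that assembles such a link command. The function name, the default driver and the file names are assumptions for illustration only.

```python
import shlex


def link_command(objects, output, use_gold=True, driver="clang++"):
    """Build the argv for a link step, optionally selecting gold."""
    cmd = [driver]
    if use_gold:
        # Ask the compiler driver to invoke gold instead of the default ld.
        cmd.append("-fuse-ld=gold")
    cmd += ["-o", output, *objects]
    return cmd


if __name__ == "__main__":
    # Hypothetical object files and output name.
    print(shlex.join(link_command(["main.o", "util.o"], "app")))
```

The returned list can be handed to subprocess.run as is; comparing the objdump -R output of the two resulting binaries shows whether the relocations went away.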

And please, dear Debian/Ubuntu maintainers, link clang using gold.

