I am running simulation code that uses GEANT4 (a large Monte Carlo C++ simulation framework with lots of shared libraries). I compiled and linked GEANT4 and my application once with the gold linker and once with the standard BFD-based linker. The gold build seems to run a bit faster (1'47" vs 1'51"). Could someone shed some light on what might cause the difference? Setup: Ubuntu 15.04, 64-bit, GCC 4.9.2. I ran each test about 10 times and took the lowest time, with no other activity on the machine and a single terminal open.
Naturally, different linkers produce different results, just as different compilers do. The result depends mostly on which optimizations each linker offers and has enabled. Here is one possible reason for the difference you see, though there could be many others:
The GCC documentation describes its own Identical Code Folding (ICF) option, -fipa-icf, as follows: Perform Identical Code Folding for functions and read-only variables. The optimization reduces code size and may disturb unwind stacks by replacing a function with an equivalent one with a different name. The optimization works more effectively with link-time optimization enabled. Although the behavior is similar to the Gold linker's ICF optimization, GCC ICF works on different levels and thus the optimizations are not the same; there are equivalences that are found only by GCC and equivalences that are found only by Gold.
Last but not least: besides the actual binary content, many environmental factors can affect the run time. For example, cache thrashing can have a considerable effect on execution time. Also, a set of 10 executions is too small to draw statistical conclusions.
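To collect a larger sample than 10 runs, the timing can be scripted. The following is only a rough sketch: the function name time_command, the default of 30 runs, and the placeholder command are assumptions made for illustration, not anything from the original setup.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=30):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the program exits with a non-zero status.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Placeholder command; substitute the real simulation binary and arguments.
    samples = time_command([sys.executable, "-c", "pass"], runs=5)
    print(f"n={len(samples)} mean={statistics.mean(samples):.4f}s "
          f"stdev={statistics.stdev(samples):.4f}s min={min(samples):.4f}s")
```

Running the same script against the gold-linked and the BFD-linked binary gives two lists of run times that can then be compared.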
As far as the statistics go, the lowest time taken is not a valid measure. If you are really curious, you need to compute the average time to completion for each program, then divide the difference between the averages by the standard deviation of the pooled sample.
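A minimal sketch of that calculation, assuming "standard deviation of the pooled sample" means the usual pooled standard deviation of the two groups (which makes the result what is commonly called Cohen's d). The function names and the timing values are invented for illustration; they are not measurements from the question.

```python
import math
import statistics


def pooled_std(a, b):
    """Pooled standard deviation of two samples (assumed interpretation)."""
    var_a = statistics.variance(a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(b)
    numerator = (len(a) - 1) * var_a + (len(b) - 1) * var_b
    return math.sqrt(numerator / (len(a) + len(b) - 2))


def standardized_difference(a, b):
    """Difference of the means divided by the pooled standard deviation."""
    return (statistics.mean(a) - statistics.mean(b)) / pooled_std(a, b)


# Hypothetical run times in seconds, made up for this example.
bfd_times = [111.0, 112.5, 111.8, 113.1, 111.4]
gold_times = [107.2, 108.9, 107.6, 109.3, 108.0]

d = standardized_difference(bfd_times, gold_times)
print(f"standardized difference: {d:.2f}")
```

A value close to zero suggests the gap between the averages is small compared with the run-to-run noise; a large value suggests the difference is probably real.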
Suppose both programs had exactly the same average time to completion, but one always took the same amount of time while the other varied hugely. Picking the one with the single fastest completion would always choose the latter, even though the more consistent program is the one with better performance.
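That scenario is easy to reproduce with made-up numbers. In this small sketch the two lists and the helper summarize are hypothetical: both programs have the same mean, yet the minimum favours the erratic one.

```python
import statistics

# Hypothetical timings in seconds, invented to illustrate the point.
steady = [110.0, 110.0, 110.0, 110.0, 110.0]
erratic = [100.0, 120.0, 105.0, 115.0, 110.0]


def summarize(times):
    """Return the minimum, mean and sample standard deviation of a run set."""
    return {
        "min": min(times),
        "mean": statistics.mean(times),
        "stdev": statistics.stdev(times),
    }


for name, times in (("steady", steady), ("erratic", erratic)):
    print(name, summarize(times))
```

Comparing only the "min" entries would pick the erratic program even though the means are identical.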