Quantcast
Channel: Hacker News
Viewing all articles
Browse latest Browse all 25817

Measuring execution performance of C++ exceptions vs. error codes

$
0
0
In an earlier blog post we found out that C++ exceptions produce smaller executables than corresponding code using error codes. The most common comment to it was that executable size is not that important, performance is what matters. Since we have the tooling, let's do perf measurements as well.

Test setup

We generate code that calls a few other functions that call other functions and so on until we reach the bottommost function. In that function we either raise an exception or error or return success. This gives us two parameters to vary: the depth of the call hierarchy and what percentage of calls produce an error.

The code itself is simple. The C++ version looks like this:

int func0() {
    int num = random() % 5;
    if(num == 0) {
        return func1();
    }
    if(num == 1) {
        return func2();
    }
    if(num == 2) {
        return func3();
    }
    if(num == 3) {
        return func4();
    }
    return func5();
}

We generate a random number and based on that call a function deeper in the hierarchy. This means that every time we call the test it goes from the top function to the final function in a random way. We use the C random number generator and seed it with a known value so both C and C++ have the same call chain path, even though they are different binaries. Note that this function calls on average a function whose number is its own number plus three. So if we produce 100 functions, the call stack is on average 100/3 = 33 entries deep when the error is generated.

The plain C version that uses error objects is almost identical:

int func0(struct Error **error) {
    int num = random() % 5;
    int res;
    if(num == 0) {
        res = func1(error);
    } else if(num == 1) {
        res = func2(error);
    } else if(num == 2) {
        res = func3(error);
    } else if(num == 3) {
        res = func4(error);
    } else {
        res = func5(error);
    }
    /* This is a no-op in this specific code but is here to simulate
     * real code that would check the error and based on that
     * do something. We must have a branch on the error condition,
     * because that is what real world code would have as well.
     */
    if(*error) {
        return -1;
    }

    return res;
}

The code for generating this code and running the measurements can be found in this Github repo.

Results

We do not care so much about how long the executables take to run, only which one of them runs faster. Thus we generate output results that look like this (Ubuntu 16/10, GCC 6.2, Intel i7-3770):

CCeCCCCCCC
CCCCCCccec
eEcCCCeECE
cceEEeECCe
EEEcEEEeCE
EEEEEeccEE
EEEEEEEEEE
EEEEEEEEEE
EEEEEEEEEE
EEEEEEEEEE

In this diagram C means that plain C error codes are faster and E that exceptions are faster for that particular pair of parameters. For easier reading all cases where error codes win have been given a red background color. Each row represents a different number of functions. The top row has 50 functions (call depth of 16) and each row adds 50 so the bottommost row has 500 functions.

The columns represent different error rates. The leftmost column has 0% errors, the next one has 1% and so on. A capital letter means that the runtime difference was bigger than 10%. Each executable was executed five times and the fastest run was taken in each case. Full results look like the following:

GCC 6.2      Clang 3.8.1  Clang 4.0 trunk

CCeCCCCCCC  EecCcCcCcC  ECcECcEeCC
CCCCCCccec  CceEEeEcEC  EEeEECEcCC
eEcCCCeECE   EEEEEECEeE   EEEECECeCC
cceEEeECCe   EcEeEECEEE   EEEEceeEeC
EEEcEEEeCE   EEEeEEEEEe   EEEEEEEEEE
EEEEEeccEE   EEEEEEEEEE   EEEEEEEEEE
EEEEEEEEEE   EEEEEEEEEE   EEEEEEEEEE
EEEEEEEEEE   EEEEEEEEEE   EEEEEEEEEE
EEEEEEEEEE   EEEEEEEEEE   EEEEEEEEEE
EEEEEEEEEE   EEEEEEEEEE   EEEEEEEEEE


Immediately we see that once the stack depth grows above a certain size (here 200/3 = 66), exceptions are always faster. This is not very interesting, because call stacks are usually not this deep (enterprise Java notwithstanding). For lower depths there is a lot of noise, especially for GCC. However we can see that for small error rates exceptions are sometimes faster, especially for Clang. Because of the noise we redid the measurements several times. This changed the individual measurements but produced the same overall shape where small errors seem to be faster with exceptions but as the error rate grows error codes become faster.

On a Raspberry Pi the results look like this:

GCC 4.9.2

eeccCCCCCC
EeeeccccCC
Eeeeeecccc
Eeeeeeeccc
EEeeeeeeec
EEEEEeeeee
EEEEEEEEee
EEEEEEEEEE
EEEEEEEEEE
EEEEEEEEEE

Here the results are a lot clearer and consistent, probably due to the simpler architecture of the ARM processor. Exceptions are faster except for small call depths and large error rates. When using 50 functions error objects only become noticeable faster at 4% error rate.

Finally let's do the same measurements on an i5-6600K.

GCC 4.8.4    gcc 5.4.0   Clang 3.{4, 6, 8}

ECCCCCCCCCC eeccCCCCCC EeeccCCCCC
EecCCCCCCCC EEEeeeeccc EEEEEeeeee
EEecCCCCCCC EEEEEEeeee  EEEEEEEEEE
EEEeecCCCCC EEEEEEEEEe  EEEEEEEEEE
EEEEeecccCC EEEEEEEEEE  EEEEEEEEEE
EEEEEeeeccc EEEEEEEEEE  EEEEEEEEEE
EEEEEEeeeec EEEEEEEEEE  EEEEEEEEEE
EEEEEEEEeee  EEEEEEEEEE  EEEEEEEEEE
EEEEEEEEEee  EEEEEEEEEE  EEEEEEEEEE
EEEEEEEEEEe  EEEEEEEEEE  EEEEEEEEEE

Here we see that the compiler used can make a massive difference. GCC 4.8.4 favors error codes but 5.4.0 is the opposite as are all versions of Clang (which produce identical results). On this platform exceptions seem to be even more performant than on the Pi.

The top left corner seems to be the most interesting part of this entire diagram, as it represents shallow call chains with few errors. This is similar to most real world code used in, for example, XML or JSON parsing. Let's do some more measurements focusing on this area.

i5               Raspberry Pi

GCC    Clang     GCC    Clang
5.4.0  3.8       4.9.2  3.5.0

cCCCC eCCCC    eCCCC eCCCC
ecCCC EeCCC    ecCCC ecCCC
ecCCC EeccC    eccCC eccCC
eccCC EEecc    eecCC eecCC
eeccC EEEee     eeccC eeccC


In these measurements the parameters go from 10 functions at the top row to 50 in the bottom row in increments of 10 and the colums go from 0% error on the left to 4% on the right in increments of 1. Here we see that exceptions keep their performance advantage when there are no errors, especially when using Clang, but error codes get better as soon as the error rate goes up even a little.

Conclusions

Whenever a discussion on C++ exceptions occurs, there is That Guy who comes in and says "C++ exceptions are slow, don't use them". At this point the discussion usually breaks down on fighting about whether exceptions are fast or not. These discussion are ultimately pointless because asking whether exceptions are "fast" is the wrong question.

The proper question to ask is "under which circumstances are exceptions faster". As we have seen here, the answer is surprisingly complex. It depends on many factors including which platform, compiler, code size and error rate is used. Blanket statements about performance of X as compared to Y are usually simplifying, misleading and just plain wrong. Such is the case here as well.

(Incidentally if someone can explain why i7 has those weird performance fluctuations, please post in the comments.)

Thanks to Juhani Simola for running the i5 measurements.


Viewing all articles
Browse latest Browse all 25817

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>