C++17 Compiler Bug Hunt: Very First Results (12 bugs reported, 3 already fixed)

 
Job Title:Sarcastic Architect
Hobbies:Thinking Aloud, Arguing with Managers, Annoying HRs,
Calling a Spade a Spade, Keeping Tongue in Cheek
 
 

Preamble

Recently, I have run into quite a few issues with different C++17 compilers (actually, I found that none of the compilers-I-tried is really C++17 compliant yet); after the number of problems grew beyond half a dozen – I thought “Hey, if all I have is lemons – I have no choice but to make lemonade out of them!”.

Enter ithare::kscope

Enter [ithare::kscope] (“kscope” being short for “kaleidoscope”). ithare::kscope is a header-only library which can generate “kaleidoscoped” versions of certain pre-prepared code. In a nutshell, ithare::kscope works as follows:

  • you take ithare::kscope (which is very small BTW; at the moment it is about 5K LOC, nothing compared to the amount of trouble it has caused <wink />).
    • ithare::kscope consists of three distinct layers:
      • a header-only library which enables “kaleidoscoping” of the code (can be found in “/src/” folder)
        • I will explain what I understand as “kaleidoscoping”, in a jiff.
      • header-only “kaleidoscoped” libraries (can be found in “/kaleidoscoped/” folder).
        • These are essentially forks of the open-source code, adapted to enable “kaleidoscoping” (with LOTS of textual changes to enable “kaleidoscoping”, but trying to preserve the substance of the original).
        • Currently – it is only chacha20 from libressl which is kaleidoscoped, but it has already allowed me to report a dozen different bug-suspects across three different compilers.
          • I will work on extending the list of “kaleidoscoped” libraries, in particular, to allow for a minimal implementation of “kaleidoscoped” TLS.
          • BTW, as a side benefit – kaleidoscoped functions can be used as a compile-time constexpr (which means that you can have compile-time crypto[1], and in the future – things such as compile-time parsing, etc.).
      • test code (both .cpp and .sh/.bat scripts) to compile-and-run on target box under target compiler (most recent release versions of three most popular compilers are supported, see below for more details).
  • you compile the ithare::kscope test code, specifying a random value for -DITHARE_KSCOPE_SEED .
    • This is where the magic happens. ithare::kscope ensures that “kaleidoscoped” libraries are compiled to a different-but-supposedly-equivalent code with each different value of ITHARE_KSCOPE_SEED. More on mechanics of it below.
  • you run the randomly-“kaleidoscoped” instance of the code to make sure that unit tests run ok.
    • IF there are issues – it is a bug either in the ithare::kscope (which still happens, but remaining issues will hopefully be ironed out sooner rather than later), or in the compiler (which has already been observed in practice more than once).
    • Rinse and repeat – with different ITHARE_KSCOPE_SEED.
  • All of the above is done automagically by running /test/nix/randomtest.sh and /test/win/randomtest.bat scripts within the [ithare::kscope] project.
    • due to the random nature of the code – the longer we’re running the same randomized tests, the more chance there is to find a bug.
      • OTOH, as with anything else, a law of diminishing returns applies; in other words, there is a much better chance of finding a bug during randomized runs 1-10 than during the runs numbered 991-1000; however, I’ve already seen bugs manifested on run 700+, and will probably see more.
  • Also, ithare::kscope will allow writing your own extensions (as your own “injections” – more on them in “mechanics” section below).
    • In other words, if you suspect that a compiler works incorrectly with a certain specific construct (such as a certain intrinsic) – you can write your own “injection” with this intrinsic, and the magic of ithare::kscope will try to use your injection (alongside its own stock injections) to produce as many related variations as possible.
  • Currently-tested compilers/platforms (in alphabetical order):
    • Clang/Linux (“stock” Clang), 5.0.1 and top-of-the-trunk, both 64-bit and 32-bit
    • Clang/Mac (“Apple LLVM”) 9.0.0, both 64-bit and 32-bit
    • GCC/Linux 7.2.0, both 64-bit and 32-bit
    • MSVC/Win 19.12.25835 (coming from VS 15.5.5), both 64-bit and 32-bit
    • I hope to add Clang/C2 for Visual Studio at some point

[1] as noted above – only chacha20 for the time being, but the list is going to grow
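To give a feeling for what “compile-time crypto” means in practice, here is a minimal sketch of a ChaCha20 quarter round written as a constexpr function and checked against the RFC 7539 (section 2.1.1) test vector by the compiler itself. It is NOT code from ithare::kscope or from libressl; QuarterState, rotl32 and quarter_round are names made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; this is an illustration, not ithare::kscope or libressl code.
struct QuarterState { std::uint32_t a, b, c, d; };

// Rotate a 32-bit value left by n bits (valid for 0 < n < 32).
constexpr std::uint32_t rotl32(std::uint32_t v, int n) {
    return (v << n) | (v >> (32 - n));
}

// One ChaCha20 quarter round, as described in RFC 7539, section 2.1.
constexpr QuarterState quarter_round(QuarterState s) {
    s.a += s.b; s.d ^= s.a; s.d = rotl32(s.d, 16);
    s.c += s.d; s.b ^= s.c; s.b = rotl32(s.b, 12);
    s.a += s.b; s.d ^= s.a; s.d = rotl32(s.d, 8);
    s.c += s.d; s.b ^= s.c; s.b = rotl32(s.b, 7);
    return s;
}

// The whole computation happens during compilation;
// a mismatch would be a compile error, not a failed unit test.
constexpr QuarterState kOut =
    quarter_round({0x11111111u, 0x01020304u, 0x9b8d6f43u, 0x01234567u});
static_assert(kOut.a == 0xea2a92f4u && kOut.b == 0xcb1cf8ceu &&
              kOut.c == 0x4581472eu && kOut.d == 0x5881c4bbu,
              "quarter round must match the RFC 7539 test vector");
```

The same function can still be called at run time with non-constant arguments, which makes it possible to compare the compile-time and run-time results of the very same code.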

 

“kaleidoscoping” mechanics

The idea of the “kaleidoscoped” code is to have binary code change drastically, while keeping source code exactly the same. This is achieved by using ITHARE_KSCOPE_SEED as a seed for a compile-time random number generator, and [ithare::kscope] being a recursive generator of randomized code. Very very briefly, it can be described as follows:

  • At the top level – there are macros (such as ITHARE_KSCOPE_INT3) which instantiate templates (such as KscopeInt<>), using constexpr-functions-of __FILE__-and-__LINE__ to generate different values of the ‘seed’ template parameter for template classes
    • In ITHARE_KSCOPE_INT3, ‘3’ describes the ‘CPU cycle budget’ which we allow our recursive generator to use. The scale is logarithmic, so that ‘_INT1’ very roughly means ‘3 cycles’, ‘_INT2’ ~= ‘10 cycles’, and ‘_INT6’ ~= ‘1000 cycles’.
  • KscopeInt<> uses the seed template parameter (the one provided by the macro) to instantiate an ‘injection’ (KscopeInjection<>). Here, we’re speaking about ‘injection’ in a mathematical sense (~=”some reversible function”); as Wikipedia puts it: “In mathematics, an injective function or injection or one-to-one function is a function that preserves distinctness”.
    • To ensure maximum-possible “kaleidoscoping”, all the seeds in the compile are used only once – if there is a chance for a seed to be re-used – necessary compile-time pseudo-randoms are generated out of it (see ITHARE_KSCOPE_NEW_PRNG macro).
  • KscopeInjection<>, in turn, (pseudo-)randomly instantiates one of KscopeInjectionVersion<> classes.
    • There are currently 6 non-trivial injections provided (ranging from “multiply by an odd constant modulo 2^32” to a Feistel round), and this list of injections can be extended further with whatever-you-want (an example of such an extension can be found in src/kscope_sample_extension.h, and an example of using the extension file – in test/officialtest.cpp).
    • Most of the injections instantiate KscopeInjection<> within them again, causing recursion.
      • This recursive injection, however, has less allocated CPU cycles for it, which means that the recursion will naturally stop (as the remaining CPU cycle budget goes to zero) sooner rather than later.

IMO, it is pretty elegant, eh? <wink />
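To make the mechanics above a bit more tangible, here is a heavily simplified sketch of the same idea: a seed derived from __FILE__ and __LINE__, a compile-time PRNG, and a recursive chain of reversible functions which stops when the budget runs out. All the names (SKETCH_SEED, SKETCH_INJECTION, Injection<>, site_seed, and so on) are hypothetical and are NOT the real ithare::kscope API; the sketch has only two kinds of injection, and its ‘budget’ is just a recursion depth rather than an estimate of CPU cycles.

```cpp
#include <cassert>
#include <cstdint>

// NOTE: every name below is hypothetical; this is not the real ithare::kscope API.
#ifndef SKETCH_SEED              // stand-in for -DITHARE_KSCOPE_SEED=...
#define SKETCH_SEED 12345
#endif

// One step of a compile-time PRNG (a 64-bit LCG, picked here only for brevity).
constexpr std::uint64_t next_seed(std::uint64_t s) {
    return s * 6364136223846793005ULL + 1442695040888963407ULL;
}

// Mix the global seed with __FILE__ and __LINE__ (FNV-1a-style hash of the file name).
constexpr std::uint64_t site_seed(const char* file, int line) {
    std::uint64_t h = 14695981039346656037ULL ^ static_cast<std::uint64_t>(SKETCH_SEED);
    for (const char* p = file; *p != '\0'; ++p)
        h = (h ^ static_cast<unsigned char>(*p)) * 1099511628211ULL;
    return next_seed(h ^ static_cast<std::uint64_t>(line));
}

// Multiplicative inverse of an odd number modulo 2^32 (Newton iteration).
constexpr std::uint32_t inverse_mod_2_32(std::uint32_t a) {
    std::uint32_t inv = a;           // correct to 3 bits for any odd a
    for (int i = 0; i < 5; ++i)
        inv *= 2u - a * inv;         // each step doubles the number of correct bits
    return inv;
}

// A reversible function on uint32_t, chosen by Seed; Budget must be >= 0.
template <std::uint64_t Seed, int Budget>
struct Injection {
    static constexpr std::uint64_t r = next_seed(Seed);
    static constexpr std::uint32_t k = static_cast<std::uint32_t>(r >> 32) | 1u;  // odd
    static constexpr bool use_mul = ((r >> 31) & 1u) != 0;
    using Inner = Injection<next_seed(r), Budget - 1>;  // recursion with a smaller budget

    static constexpr std::uint32_t inject(std::uint32_t x) {
        return Inner::inject(use_mul ? x * k : x + k);
    }
    static constexpr std::uint32_t surject(std::uint32_t y) {  // inverse of inject()
        const std::uint32_t z = Inner::surject(y);
        return use_mul ? z * inverse_mod_2_32(k) : z - k;
    }
};

// Budget exhausted: the identity function ends the recursion.
template <std::uint64_t Seed>
struct Injection<Seed, 0> {
    static constexpr std::uint32_t inject(std::uint32_t x) { return x; }
    static constexpr std::uint32_t surject(std::uint32_t y) { return y; }
};

// Each use site gets its own seed, and therefore (most likely) its own chain.
#define SKETCH_INJECTION(budget) Injection<site_seed(__FILE__, __LINE__), (budget)>

using A = SKETCH_INJECTION(3);
using B = SKETCH_INJECTION(3);

static_assert(A::surject(A::inject(0xDEADBEEFu)) == 0xDEADBEEFu, "round trip must hold");
static_assert(B::surject(B::inject(42u)) == 42u, "round trip must hold");
```

Compiling this with a different -DSKETCH_SEED=... changes the constants and the choice of operations in every chain, while the source code and the round-trip guarantee stay the same; this is the property which the randomized tests rely on.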

ithare::kscope as an Automated Stress-Testing Tool for C++17 Compilers

Apparently, [ithare::kscope] is a serious tool for compiler testing – during just the first 3 weeks of its operation, it has already helped to find a dozen bugs across different compilers. From what I’ve seen so far, all the bugs found by ithare::kscope can be divided into three broad categories:

  • bugs due to extensive use of C++14/C++17 in ithare::kscope itself (including /kaleidoscoped/ folder). Indeed, ithare::kscope uses a LOT of C++14+ constructs (it especially heavily uses-and-abuses constexpr functions and variables, seriously recursive template instantiations using non-type template parameters, and so on). These bugs tend to manifest themselves pretty early during the testing; up to now, these bugs constitute a majority of all-the-bugs-reported.
  • bugs due to a specific random value of ITHARE_KSCOPE_SEED. Up to now, IIRC, there were only two such bugs reported, but I have my further suspicions, and will try to check them as the time permits. These bugs, in turn, can be one of the following:
    • bugs in compiler front-end (due to not-so-usual patterns during randomized template instantiations)
    • bugs in compiler back-end (due to not-so-usual patterns in the randomly-generated intermediate code)
  • Lack-of-obvious-and-expected-optimizations. While lack of optimizations is arguably a non-bug, there is LOTS of rhetoric in recent years which goes along the lines of “Hey, let’s just write the code and then The Almighty Compiler will do Everything-You-Might-Need and more!”.[2] Very preliminary results by ithare::kscope seem to indicate that there are certain cases when even such a trivial-and-expected-to-be-no-cost code-change as wrapping-some-function-in-another-layer-of-supposedly-inlined-function, can reduce performance of the compiled executable by a factor of 10x(!); whether compiler writers will consider it a bug or not – it is their call, but I am sure that the development community should know about such performance abominations (especially as compiler writers started to abuse UBs in the name of performance gains, I’d argue that before abusing UBs, they should fix those 10x-degradations-in-very-expected-cases).
    • Another similar (though not-directly-related) problem is the time required to compile ithare::kscope generated code (and amount-of-RAM necessary to do it); while I didn’t notice too much difference in compile times, there seems to be a significant difference in  RAM appetites of different compilers. I will certainly take a closer look at these behaviours.

[2] BTW, I would really like to live in a world where this holds; the only problem is that it doesn’t <sad-face />

 

For Compiler Writers

If, by any chance, you are a compiler writer and want to use ithare::kscope to test a new version (for example, to avoid your new version being reported as having a regression <wink />), the recommended way is to:

  • grab the last version from [ithare::kscope] master branch (it is supposed to stay compilable most of the time).
  • depending on your platform, run /test/nix/randomtest.sh or /test/win/randomtest.bat
  • if you want to perform some advanced testing – you may want to add your own ‘injections’, so you can test whatever-recent-optimizations-you-made (or whatever-intrinsics-you-have-added/changed) in different surroundings (for details – see below re. __asm__ testing for the rest-of-us).

While, due to its very nature, ithare::kscope cannot possibly provide any guarantees that a compiler is bug-free (it is just yet another tool to try to shake your compiler, hoping that some of the existing bugs will start falling out) – it has already allowed me to report a dozen bugs across 3 different compilers. Most of the bugs were at least acknowledged (and three of them have already been fixed – THANKS to everybody who contributed to these fixes!).

For the Rest of Us

For the rest of us (for(auto me: us){if(!(me.flags&is_compiler_writer)){…}}), all these mechanics are not really that important. However, there are still two things which might be of certain interest.

The first point of interest is testing our own code (especially GCC-style __asm__ code with constraints, which are notoriously easy to write incorrectly, with these incorrectly-written constraints sitting in the code for ages until they’re placed into a different context to start hurting us badly).

The idea is to use our __asm__ within ithare::kscope injections (see src/kscope_sample_injection.h for an example of adding your own injection to the mix) – and ithare::kscope works pretty well with regards to placing our __asm__ into very different surroundings, effectively testing constraints much better than we can do it manually (no warranties of any kind, but it might catch a bug or three).
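As an illustration of the kind of __asm__ we’re speaking about, here is a tiny hypothetical example (it is NOT taken from ithare::kscope, and add_asm is a made-up name). It assumes GCC or Clang on x86/x64 and falls back to plain C++ everywhere else. The comments point out the classic mistake: a wrong constraint can work for years in one context, and start producing wrong results as soon as the surrounding code (and therefore register allocation and inlining) changes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example; the function name is made up for this illustration.
inline std::uint32_t add_asm(std::uint32_t x, std::uint32_t y) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // "+r"(x): x is both read and written by the instruction.
    //          Writing "=r"(x) instead would tell the compiler that the old
    //          value of x is never read; such code may still appear to work
    //          until it is inlined into different surroundings.
    // "r"(y):  y is an input in any general-purpose register.
    // "cc":    the add instruction modifies the flags.
    __asm__("addl %1, %0" : "+r"(x) : "r"(y) : "cc");
    return x;
#else
    return x + y;  // portable fallback for other compilers/platforms
#endif
}
```

Wrapping such a function into your own injection is what lets the randomized builds exercise it under many different register pressures and inlining decisions, instead of the one-and-only context it happens to live in today.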

Current Observations

“Vantage number two!” said the Bi-Coloured-Python-Rock-Snake.

“You couldn’t have done that with a mere-smear nose.”

— Bi-Coloured-Python-Rock-Snake from The Elephant's Child

The second point of potential interest “for the rest of us”, is related to certain observations which were made during ithare::kscope development and testing (hey, we all know which compiler is the best from the point of view of C++17 compliance – and more generally, from the point of view of bugs-we’re-likely-to-encounter, with their respective workarounds). Here goes the table of the bugs reported against very different compilers during just a few first weeks of ithare::kscope project:

Reported-and-Fixed Bugs (GOOD)
  • Clang:
    • #36055 (fixed in trunk in 2 weeks after my report, fix reportedly scheduled to apply to Clang 6.0; THANKS to everybody who contributed to the fix!)
  • GCC[3]: (none)
  • MSVC:
    • #195484 (reportedly fixed in a week after my report, fix scheduled to apply to VS2017 15.7 Preview 2; THANKS to everybody who contributed to the fix!)
    • #195579 (reportedly, was fixed before my report in a recent preview)
    • #196885 (reportedly fixed in 3 days(!) after my report, fix scheduled to apply to VS2017 15.7 Preview 2; THANKS to everybody who contributed to the fix!)

Reported-and-Pending Bugs (BAD)
  • Clang:
    • #36333 (only Clang 5.0.x, upcoming Clang 6.0 seems to be unaffected)
  • GCC[3]:
    • #47488 (it seems that I ran into a new manifestation of the old bug, tried to bump it)
    • #84401 (an enhancement, don’t take it too seriously)
    • #84463
  • MSVC:
    • #195483 (reportedly already-known to MSFT internally but not fixed yet)
    • #195665 (reportedly already-known to MSFT internally but not fixed yet; OTOH, some MSFT ppl say this incompliance is actually a feature (really?))
    • #196900 (reportedly already-known to MSFT internally but not fixed yet)
    • #199554


[3] NB: at early stages of ithare::kscope development, I tested only with MSVC and Clang, so there is a chance I could have avoided running into some GCC bugs simply because I didn’t test with GCC at those early stages. In other words – this column being near-empty is NOT (yet) an indication that GCC has fewer bugs; neither is it (yet) an indication that GCC is less willing to fix their bugs.

 

With regards to the table above, I want to emphasise that

IMNSHO, it is NOT the number-of-bugs-found which matters the most; rather, it is the rate of bug fixing which is most important in the long run

Indeed, everybody has bugs in their code (even me <wink />); what is most important – is that reproducible bugs are fixed reasonably fast[4]. IMO, if reproducible serious[5] bugs are ignored for many months, it indicates one of three possibilities:

  • Nobody in the team cares about bug reports. Pretty much hopeless.
    • Nobody in the team cares about specific bug report (considering it too minor, too little benefit-for-work, etc.). While I can accept this excuse for some time and for one bug – as soon as it becomes a pattern, it basically leads us to the previous “nobody cares” point.
  • The code is so fragile nobody dares to touch it. Never a good sign for not-supposed-to-be-frozen codebases.
  • There are sooo many other bugs to fix, that the team has no time for supposedly-less-important stuff. Which is certainly not a good sign either.

As one of my former managers[6] has said: “each and every bug should be fixed on the day when it is reported”. Well, “the same day” is probably too much to ask for compilers, but having bugs sitting there for ages is IMNSHO even worse than having them there in the first place.


[4] BTW, all the response times listed in the table above do pass as “reasonably fast” with flying colors; actually, anything which is “within a few months”, isn’t too bad for compilers (with brownie points to teams which can fix things faster than that). To give an idea about “what is unreasonably slow” – I can remember reporting a compiler bug back in ’97, just for it to sit there until 2004(!); see also GCC’s bug #47488 above (first reported in 2011, and still counting).
[5] enhancements are not included, but compiler crashes, codegen issues, and standard incompliances IMNSHO do qualify as ‘serious’
[6] FWIW, he was recently reported to be a billionaire by Bloomberg

 

Larger Picture

Overall, the results above are very preliminary by design – and we have to see how the things will unfold in the future.  The key point of the whole ithare::kscope exercise is

to help making C++17 compilers as bug-free as possible (including both front-end and back-end)

And along this way, it doesn’t matter too much how many bugs compilers have now; what really matters is how many bugs they will have half a year from now (which, in turn, will depend on the rate of the bug fixing discussed above). In other words –

The race for the-least-buggy-C++17-compiler has just begun. Let’s see how the contestants will fare in the long run.

 

How You Can Help

If you have time to help me on this journey to make C++17 compilers a bit better – your help could be VERY important and VERY useful. In particular, at least three fields are of particular interest:

  • running existing ithare::kscope randomtest.sh/randomtest.bat tests – and be ready to report bugs to respective compiler writers too. To do it – find a box which is not-too-shabby in terms of RAM (at least 4G RAM+4G swap is recommended), then clone the latest master branch of [ithare::kscope] there, and you should be good to go. Running existing tests might help to reveal bugs under:
    • compilers-other-than-those-listed-above (will likely require some code changes)
    • versions of compilers other-than-those-listed-above (supposed to work, but you never know)
      • of special interest are regressions in top-of-the-trunk compilers
    • platforms other-than-those-listed-above (in theory, should be ok – but the less common the platform is, the more chance that bugs in it are remaining there for ages <sad-face />).
    • Oh, and if/when you find some bug (which may or may not happen):
      • try to check whether it can be a bug in the ithare::kscope (it can happen)
      • after reporting the bug to the respective compiler team – please drop me a line, I will add your bug report to the list-of-bugs-being-tracked too.
  • adding new injections (along the lines of src/kscope_sample_extension.h); they’re really important to have as-much-stuff-tested as possible. Ideally – there would be whole libraries of different injections, so even more scenarios can be tested.
  • for the most adventurous <smile /> – adding new kaleidoscoped functions, also to increase the number of the scenarios-being-tested, though in a completely different space. Currently, of the most interest are (a) not-system-related std:: functions (such as algorithms and containers), and (b) crypto-functions and parsers (in particular, from TLS); (a) will allow us to kaleidoscope things further, and (b) will allow us to have more of compile-time crypto (and things such as a compile-time ASN.1 parser, etc.). If you’re interested – take a look at the kaleidoscoped/ folder, and see whether you like it (this is NOT documented yet, but I am here to answer any questions you may have).
In any case – stay tuned! (FWIW, I hope to write about the progress with ithare::kscope on a monthly basis).

Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.


Comments

  1. says

    Nice work.

    > Indeed, everybody has bugs in their code (even me )

    Hah.

    > for(auto me: us){if(!me.flags&is_compiler_writer){…}}

    The ! operator has higher precedence than the & operator. You’re missing some parentheses.

    • "No Bugs" Hare says

      >> Indeed, everybody has bugs in their code (even me )

      > Hah.

      With my name being ‘No Bugs’, some may think I shouldn’t make any bugs at all ;-).

      >> for(auto me: us){if(!me.flags&is_compiler_writer){…}}

      > The ! operator has higher precedence than the & operator. You’re missing some parentheses.

      You’re right, fixed (which once again shows importance of our programs being _testable_ ;-)).
