(Not Really So) New Niche for C++: Browser!?

	Author:	“No Bugs” Hare
	Job Title:	Sarcastic Architect
	Hobbies:	Thinking Aloud, Arguing with Managers, Annoying HRs, Calling a Spade a Spade, Keeping Tongue in Cheek

[In chess annotation,] ‘!?’… usually indicates that the move leads to exciting or wild play but that the objective evaluation of the move is unclear

~ Wikipedia

For quite a long while, C++ had been losing popularity; for example, as reported in [Widman16], in 2016 it got 7% less of the listings on Dice.com compared with a year earlier; and according to [TIOBE17], from the C++ Golden Age in 2004 till 2017, the C++ share fell from ~17% to a measly 6%.

As all of us (as in, ‘hardcore C++ fans’) know, this has nothing to do with the deficiencies of C++; rather, it is related to the observation that the time of downloadable clients (which was one of the main C++ strongholds) has given way to the time of browser-based clients – and all the attempts to get C++ onto browsers were sooo ugly (ActiveX, anyone?) that this didn’t really leave a chance to use C++ there.

Well, it seems that this tendency is already in the process of being reverted:

C++ can already run on all four major browsers – and moreover, it has several all-important advantages over JavaScript, too.

And this – not too surprisingly – is what this article is all about.

A word of warning: please do NOT expect any revelations here; this article is admittedly long overdue – and quite a few people know MUCH more than I can fit here (and MUCH more than I know myself). Still, given the lack of such overviews intended for those of us who haven’t tried it yet, I am sure that such an article has its merits. In the article, I will try to provide a very high-level overview of Emscripten, of the technologies involved, of the performance which can be expected, of the APIs which can be used – and what we can gain from using it.

JavaScript to the rescue!

Attempts to get C++ on browsers were continuing all the time (such as (P)NaCl), but all of them were platform- (and/or browser-)specific, and (as a result) were very problematic for browser deployments. However, help for the C++ side of things has come from exactly the same rival which has been stealing the browser show for all these years – from JavaScript. It wasn’t easy, and took several all-important (and IMO ingenious) pieces of the puzzle to make it useful.

Piece I – asm.js

In 2013, so-called asm.js was released. Essentially, asm.js is just a very small subset of JavaScript, intended to simulate good old assembler. If we take a look at a real-world asm.js program (not hand-written, but compiled from C++), we’ll see something along the lines of Listing 1 [Resig13]:

function Vb(d) {
  d = d | 0;
  var e = 0, f = 0, h = 0, j = 0, k = 0, l = 0, m = 0, n = 0, o = 0, p = 0, q = 0, r = 0, s = 0;
  e = i;
  i = i + 12 | 0;
  f = e | 0;
  h = d + 12 | 0;
  j = c[h>>2] | 0;
  if ((j | 0) > 0) {
    c[h>>2] = 0;
    k = 0
  } else {
    k = j
  }
  j = d + 24 | 0;
  if ((c[j>>2] | 0) > 0) {
    c[j>>2] = 0
  }
...
}

As we can see, it is nothing like your usual high-level JavaScript, which deals with DOM and high-level onclick handlers. Instead (except for the if statements and function declarations) it directly translates into what we’d usually expect from an assembler language.

On taking a closer look, we can observe the following elements of more-or-less typical assembler in the code above:

registers (implemented as JavaScript vars)
ALU operations (via JavaScript doubles, but converting them into int32_t all the time via | 0)
Ability to access memory (as one huge array; in the example above – c[])
Control operations (if and function)

Well, that’s pretty much all we need to get the full-scale assembler rolling.

For our current purposes, we don’t really want to go any deeper, but hopefully I’ve managed to describe the idea behind asm.js: essentially, it is pretty much a simulator of a strange CPU with a strange instruction set. In other words, asm.js did NOT try to simulate any existing instruction sets (and doing so would make it fatally inefficient).

Instead, asm.js has invented its own instruction set, which can still be seen as an instruction set of a CPU, at least from the point of view of a C++ compiler.
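To make the ‘strange CPU’ idea a bit more tangible, here is a rough C++ reading of a few lines of Listing 1. It is only a sketch: the original C++ source behind that listing is not known, so the names Heap32 and clear_if_positive are made up for illustration, and the heap is simply assumed to be an array of 32-bit words addressed by byte offset (which is what c[h>>2] suggests).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the asm.js heap 'c[]': one big array of
// 32-bit words, indexed by byte address shifted right by 2.
struct Heap32 {
    std::vector<std::int32_t> words;

    explicit Heap32(std::size_t bytes) : words(bytes / 4, 0) {}

    // Equivalent of c[addr >> 2] in the listing.
    std::int32_t& at(std::int32_t byte_addr) {
        return words.at(static_cast<std::size_t>(byte_addr) >> 2);
    }
};

// Roughly mirrors this fragment of Listing 1:
//   h = d + 12 | 0;
//   j = c[h>>2] | 0;
//   if ((j | 0) > 0) { c[h>>2] = 0; k = 0 } else { k = j }
// In C++ the '| 0' coercions disappear, because the variables
// are already declared as 32-bit integers.
std::int32_t clear_if_positive(Heap32& heap, std::int32_t d) {
    std::int32_t h = d + 12;       // address of a field at offset 12
    std::int32_t j = heap.at(h);   // load
    std::int32_t k;
    if (j > 0) {
        heap.at(h) = 0;            // store
        k = 0;
    } else {
        k = j;
    }
    return k;
}
```

Read this way, d looks like a pointer to some structure and d + 12 like the address of one of its fields, which is exactly the kind of code a C++ compiler back-end is used to generating.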

Piece II – LLVM/Emscripten

The above observation has made it possible to write a back-end for the LLVM compiler, and this back-end has allowed the generation of asm.js out of our usual C++ (some restrictions apply, batteries not included). Moreover, such a compiler is not only possible, but it exists and is working: it is Emscripten.¹

Actually, the asm.js in the example above has been generated by Emscripten. Using Emscripten is indeed rather simple:² we just take our existing standard-compliant and not-using-platform-specific-stuff C++ code (hey, you DO write your code as cross-platform and standard-compliant, don’t you?), and compile it into asm.js. As long as our code is just ‘moving bits around’, it works near-perfectly (and what will happen when we need to interact with the rest of the world, we’ll discuss in the ‘APIs’ section below), producing asm.js code which looks similar to the example above.

¹ There are alternative compilers (formerly Mandreel, now cheerp) which compile C++ not into asm.js, but into other subtypes of compliant JavaScript; we’ll see in a jiff why compiling into asm.js is so important.

² After the usual jumping through the hoops to get stuff installed
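As an illustration of the ‘moving bits around’ kind of code, here is a minimal sketch of a portable source file. The file name checksum.cpp and the function fnv1a are my own choices for this example; the build line in the comment uses the emcc compiler driver which ships with Emscripten, and the exact options and output format depend on the Emscripten version you have installed.

```cpp
// checksum.cpp (hypothetical file name)
// Possible browser build:  emcc checksum.cpp -o checksum.html
#include <cstdint>
#include <cstdio>
#include <string>

// FNV-1a 32-bit hash: standard-only C++, no platform-specific headers,
// so nothing here should need any porting work.
std::uint32_t fnv1a(const std::string& s) {
    std::uint32_t h = 2166136261u;   // FNV offset basis
    for (unsigned char ch : s) {
        h ^= ch;
        h *= 16777619u;              // FNV prime, wraps modulo 2^32
    }
    return h;
}

#ifdef __EMSCRIPTEN__
// Entry point for the browser build; a native build would supply
// its own main() calling the same fnv1a().
int main() {
    std::printf("fnv1a(\"a\") = %08x\n", static_cast<unsigned>(fnv1a("a")));
    return 0;
}
#endif
```

The point is that fnv1a() itself contains nothing browser-specific; the same function can be compiled by a native compiler and by Emscripten without changes.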

Piece III – optimizations for asm.js

When looking at all the stuff above, a very natural scepticism goes along the lines of “Ok, this compiled piece of [CENSORED] stuff MAY work correctly, but how slow it is going to be???” And here is the point where the third piece of the C++-to-asm.js puzzle comes in. I’m speaking about asm.js-specific optimizations.

The thing is that with asm.js being this simple and restricted, it becomes possible to optimize it during a JIT compile. That’s it – we can have our cake (write in C++) and eat it (run it in asm.js with a reasonable speed) too!

As of now, all four major browsers (in alphabetical order: Chrome, Edge, Firefox, and Safari ³) at least try to optimize for asm.js. Results vary, but currently, most of the time, we’re speaking about a less than 2× performance degradation of asm.js compared to native C++ (say, compiled with Clang) [Zakai14]. While comparisons with native C++ are difficult to find (which BTW does make me raise an eyebrow), the few resources available seem to support this claim (see, for example, [AreWeFastYet17]). BTW, the Firefox results listed at that link are of special interest – in fact, Firefox manages to keep the performance of asm.js within a mere 20% of the ‘native’ performance – and while we cannot rely on such performance (hey, we don’t want to be restricted only to Firefox users), it still serves as an indication of what it is possible to achieve (well, if enough effort is spent on it).

BTW, one important property of asm.js is that

As asm.js is a strict subset of JavaScript – it will run even if there is no special support for asm.js in browser.

Sure, without special support asm.js will be pretty slow – but if we’re speaking about ‘glue code’, it still may fly even with asm.js support being unavailable/disabled.

³ Well, actually – WebKit

Restrictions

While Emscripten provides a full-scale and very usable environment, there are certain limitations due to the need to run from within browser. When you’re ready to go ahead with Emscripten, make sure to read [Emscripten.Porting]; the following is only a very short summary of the Emscripten restrictions and capabilities.

APIs

The most annoying restriction of Emscripten is (arguably) related to the provided APIs. First of all, we can use pretty much all the C++ standard libraries which don’t need to interact with the system – and that’s including STL (phew). boost:: libraries are not explicitly supported, but there are reports that some of them can be compiled too (not without some associated headaches); most of the header-only boost:: libraries are expected to work with Emscripten ‘out of the box’ (no warranties of any kind, batteries not included).

As noted above, libraries which interact with the rest of the world are a different story. Still, in general, all the stuff which we’d need to use on the client is present in the APIs; in particular, the following APIs are supported:

Network support (libc-style, non-blocking only(!))
File system access
Graphics (OpenGL ES – though it is better to restrict yourself to WebGL-friendly subset, as I’ve heard that emulation of the rest kinda suxx)
Audio, keyboard, mouse, joystick (SDL)
Integration with HTML5 (DOM, some of the events – including device orientation, touch, gamepad, etc.)

Threads and main loop

Due to the Emscripten runtime being run on top of the JS engine, threading in Emscripten is quite limited from the point of view of a C++ developer.

First of all:

Unless we’re speaking about ‘Workers’, everything within our app happens within a single ‘browser main loop’

In practice, this means a few things:

Our app MUST adhere to the ‘event processing’ model (i.e. if our code blocks for a while, it means that the whole page is blocked).
- APIs are built in a way to help us with this; in particular, network access being non-blocking only, is a Good Thing™ from this perspective.

If we have our own infinite loop (event processing loop, game loop, simulation loop, etc.), we’ll need to break it and re-implement it on top of the browser main loop. It is NOT as bad as it sounds – see [Emscripten.BrowserMainLoop] for details
Handling replies to asynchronous calls (such as replies to our requests which are coming from the server-side) can be a headache. For an overview of non-blocking handling techniques in C++ (though not taking Emscripten-specifics into account), see [NoBugs16] and [NoBugs17].
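The second point above (breaking our own infinite loop) can be sketched as follows. The Simulation struct and the function names one_frame and run are invented for this example; emscripten_set_main_loop() and emscripten_cancel_main_loop() are the calls described in [Emscripten.BrowserMainLoop], and the argument values used here (0 for ‘let the browser pick the frame rate’, 1 for ‘simulate an infinite loop’) are just one reasonable choice.

```cpp
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

// Hypothetical application state; a real game or simulation
// would keep much more here.
struct Simulation {
    int frame = 0;
    int max_frames = 3;
    bool done() const { return frame >= max_frames; }
    void step() { ++frame; }
};

static Simulation g_sim;

// One iteration of what used to be the body of 'while (!done) { ... }'.
// It must return quickly, otherwise the whole page is blocked.
void one_frame() {
    g_sim.step();
#ifdef __EMSCRIPTEN__
    if (g_sim.done()) {
        emscripten_cancel_main_loop();   // stop being called back
    }
#endif
}

void run() {
#ifdef __EMSCRIPTEN__
    // The browser main loop calls one_frame() for us.
    emscripten_set_main_loop(one_frame, 0, 1);
#else
    // Native build: the good old loop.
    while (!g_sim.done()) {
        one_frame();
    }
#endif
}
```

The loop body stays the same in both builds; only the thing which drives it differs, which keeps the platform-specific part down to a few lines.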

Personally, I do NOT think that this is really restrictive; in other words, I am arguing to write the code in such an event-driven manner (which I like to name ‘(Re)Actor-style’) in any case, even when there is no Emscripten in sight. Very briefly – considering I have been arguing that having thread sync at app-level is evil for years now (see [NoBugs10] and [NoBugs15]) – going for a bunch of event-driven (Re)Actors exchanging messages is a Good Thing™.
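To show what I mean by ‘(Re)Actor-style’, here is a deliberately tiny sketch. All the names (Event, AccountReactor, replay) are hypothetical and exist only for this example; a real (Re)Actor would also deal with timers, network replies, and so on.

```cpp
#include <vector>

enum class EventType { Deposit, Withdraw };

struct Event {
    EventType type;
    int amount;
};

// A minimal (Re)Actor: all state is private, and the only way to
// change it is to feed an event into react(). No mutexes in sight.
class AccountReactor {
public:
    void react(const Event& ev) {
        inputs_.push_back(ev);            // record inputs for later replay
        switch (ev.type) {
        case EventType::Deposit:
            balance_ += ev.amount;
            break;
        case EventType::Withdraw:
            if (ev.amount <= balance_) {  // silently ignore overdrafts
                balance_ -= ev.amount;
            }
            break;
        }
    }

    int balance() const { return balance_; }
    const std::vector<Event>& recorded_inputs() const { return inputs_; }

private:
    int balance_ = 0;
    std::vector<Event> inputs_;
};

// Feeding the same recorded inputs into a fresh instance must
// produce exactly the same state.
int replay(const std::vector<Event>& log) {
    AccountReactor fresh;
    for (const Event& ev : log) {
        fresh.react(ev);
    }
    return fresh.balance();
}
```

As the state depends only on the sequence of events, recording the inputs is enough to reproduce a run later, for example when testing or when analysing a problem after the fact.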

Using multiple cores

While I am all for event-driven single-threaded processing, I am the first one to admit that there are situations when one single thread (and as a result, a single CPU core) is not sufficient to do whatever we need to do. Which means that we do need a way to use multiple cores.

However, being able to use multiple cores DOES NOT necessarily imply the need to go into the traditional mutex- and atomics-ridden untestable nightmare. Rather, we can have more than one separate event processor a.k.a. (Re)Actors (in Emscripten-speak, additional (Re)Actors – that is, beyond the original one running within the ‘browser main loop’ – are called ‘workers’) and exchange messages with them. It provides several benefits compared to classical mutex-based shared-state synchronization models:

There is no need to think about thread sync when programming.
- While it comes at the price of headaches related to handling non-blocking calls, I am arguing that – in those scenarios when we need to handle intervening events anyway – non-blocking single-threaded handling is the least evil; for more discussion, see [NoBugs17].

Each of the (Re)Actors is deterministic. This, in turn, enables several all-important improvements (from testability and replay-based testing, to production post-mortem analysis), see [NoBugs17] for a detailed discussion.
This approach is Shared-Nothing and, as a result, it scales near-perfectly (though see the note below). This phenomenon (and problems with scaling shared states) is well-known; very briefly, each and every shared state (in other words, every mutex) carries a risk of becoming a very serious point of contention, causing severe degradation of scalability; moreover, in quite a few cases you may find that 90% of all your processing happens under one of the mutexes, which means that regardless of the number of cores, you cannot possibly scale to more than 1.1 cores.
- As discussed in [NoBugs17], the only case which I know when pure (Re)Actors-exchanging-messages are not scaling well is when we have a big unbreakable state with lots of calculations performed over it at the same time. This can be solved (and was solved for an AAA game Client too) without departing too much from the event-processing (Re)Actor-based ideology (using what I call (Re)Actor-with-Extractors). However, at the moment, (Re)Actor-with-Extractors is not supported by Emscripten, so there may be some issues on this way.

(Re)Actor-based systems tend to exhibit very good performance. Discussion of performance advantages of event-driven systems over thread-synced ones is well beyond the scope of this article, but very briefly, it boils down to the costs of thread context switches (which can take anywhere between 10K and 1M CPU cycles(!)), and event-driven systems tend to have far fewer of these switches. From a completely different point of view, there is a reason why event-driven non-blocking systems (such as nginx) tend to beat blocking systems (such as Apache) performance-wise.
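The ‘90% under one mutex’ figure in the scalability point above follows from Amdahl’s law: if a fraction s of the work is serialized, the speedup on n cores is 1 / (s + (1 - s) / n). A small sketch (the function name amdahl_speedup is mine) makes it easy to play with the numbers:

```cpp
#include <cmath>

// Amdahl's law: best-case speedup on 'cores' cores when
// 'serial_fraction' of the work cannot run in parallel
// (for example, because it happens under a single mutex).
double amdahl_speedup(double serial_fraction, int cores) {
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores);
}
```

With serial_fraction = 0.9, the result approaches 1 / 0.9 (about 1.11) however many cores we throw at it, which is where the ‘1.1 cores’ limit comes from.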

Pthread support

In theory, Emscripten has support for pthreads. However, the support is experimental – and moreover, it is Firefox-only. This, of course, makes its use for serious projects a non-starter; however, my rant about pthreads goes deeper than that:

Even in the long run, I would prefer support for (Re)Actor-with-Extractors to support for pthreads.

Sure, having full-scale pthreads, we can implement (Re)Actor-with-Extractors ourselves; however:

I have no idea how difficult it will be to push pthreads into all the browsers (from what I’ve seen, it can easily become an insurmountable task). (Re)Actor-with-Extractors should be easier to implement (while providing all the safety guarantees – and testability too).
- In addition, at least in some cases, (Re)Actor-with-Extractors may happen to be more efficient (it depends on specifics of pthreads implementation under each of the browsers, but in general, it might easily happen)

Enabling pthreads would bring us back into dark ages of massive usage of mutexes – and as you may have noticed, I am a very strong opponent of mutex-based thread sync at application level. I prefer to keep my code clean in this regard.

64-bit int and 32-bit float issues

As of now, the only numeric data type in JavaScript is 64-bit float; in addition, some operations (mostly bitwise ones) return 32-bit integer (which always fits into 64-bit float). As a result, any operations which are neither 64-bit float nor 32-bit integer are not 100%-efficient in asm.js. In particular:

32-bit floats need to be processed as 64-bit floats, which is rather slow compared to native 32-bit floats
64-bit integers need to be simulated from two 32-bit integers, which is pretty slow too.

There are some proposals to deal with it (see, for example, [Zakai14]) but as far as I know, these slowdowns still apply, so if you’re after best-possible performance, you need to keep them in mind.
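To get a feeling for why simulated 64-bit integers cost more, here is a sketch of 64-bit addition built from 32-bit halves. This is NOT what Emscripten actually emits; the names U64Parts, add64, split, and join are invented for this illustration, which only shows the kind of extra work involved.

```cpp
#include <cstdint>

// A 64-bit unsigned value represented as two 32-bit halves.
struct U64Parts {
    std::uint32_t lo;
    std::uint32_t hi;
};

// One 64-bit addition becomes two 32-bit additions,
// a comparison to detect the carry, and one more addition.
U64Parts add64(U64Parts a, U64Parts b) {
    U64Parts r;
    r.lo = a.lo + b.lo;                              // wraps modulo 2^32
    std::uint32_t carry = (r.lo < a.lo) ? 1u : 0u;   // wrapped => carry
    r.hi = a.hi + b.hi + carry;
    return r;
}

// Helpers to convert to and from a native 64-bit value (for checking).
U64Parts split(std::uint64_t v) {
    return { static_cast<std::uint32_t>(v),
             static_cast<std::uint32_t>(v >> 32) };
}

std::uint64_t join(U64Parts p) {
    return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}
```

Addition is the cheap case; multiplication and division built from 32-bit pieces need considerably more operations.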

Practical uses

As noted above, I haven’t used Emscripten for a serious project (yet). However, quite a few projects were reported as compiled and working, including:

Game Engines(!)
- UE3 (reported to be ported in 4 days)
- UE4
- Unity
  Unity is quite an interesting beast when it comes to its use of Emscripten; as it uses C# at the app-level, it first re-compiles the C# parts into C++ using the IL2CPP compiler, and then uses Emscripten to compile the result into asm.js. You won’t believe it – but it does work.

Games
- Quake 3
- Doom
- OpenDune

Libraries/Frameworks
- OpenSSL
- SQLite
- Pepper (via pepper.js)
- Quite a few of Qt demos

For a much more comprehensive list of ports and demos, please refer to [Emscripten.PortingExamples].

Competition: NaCl/PNaCl

An alternative way of running C++ code on browsers is NaCl/PNaCl by Google. It serves pretty much the same noble purpose of running C++ in the browser; however, it has the BIG problem of being restricted to Chrome. As (a) no other browser has followed suit, and (b) Chrome’s market share, while it grew to about 60%, slowed down its growth in 2016, I do NOT think that NaCl/PNaCl is a viable option (except for some very narrowly defined scenarios) – especially when comparing it to Emscripten+asm.js.

Moreover, I’ve got a feeling (no warranties of any kind) that Google itself has realized the futility of (P)NaCl and has slowed down development as a result; overall, my wild guess is that in a few years from now, (P)NaCl will be quietly abandoned in favor of asm.js (and Google is already working on support for asm.js optimizations) or in favor of WebAssembly (see below).

As a result, while the only thing which is certain is that nothing is certain yet, if faced with the task of developing/porting a new C++ Client for browser, I would clearly prefer Emscripten+asm.js.

Oh, BTW – if you already have a (P)NaCl client, there is a library pepper.js, which aims to provide a migration path from (P)NaCl to Emscripten; while I didn’t try it myself – well, it seems to be worth trying.

Ongoing development: WebAssembly a.k.a. wasm

As a next step in this development (and to compensate for certain problems such as asm.js parsing times on mobile devices), an alternative representation – known as WebAssembly or wasm – is being actively worked on.

The idea is to use (give or take) the same C++ source code as already can be used to compile into asm.js, and to compile it to a very different assembler (wasm). Then wasm will be loaded into the browser, where it will be JIT-compiled and then executed.

There seems to be quite significant momentum behind wasm – but as of now, it is too early to tell anything specific. What matters though is that

As app-level developers, we do NOT really care much whether it is asm.js or wasm which wins in the end. Rather, we can use asm.js right now, and hope that we won’t need to change our programs too much when re-compiling them into wasm (when/if it is widely available)

Whether these hopes will stand in reality, we’ll see, but as of now, it is IMNSHO by far the best option we have to try pushing our C++ Clients into browsers.

Practical uses: porting downloadable clients to the web

Well, all this stuff is certainly technically exciting, but what can we get from it in practice? Most importantly,

we can port our (well-written-enough) C++ Clients to the web.

Until two or so years ago, there was no way to port an existing downloadable Client into a web app. In other words, whatever we were doing with our C++, we weren’t able to avoid download and at least some warnings about how malicious our code can be from the browser – and this was the point where our potential users were dropping out the most.

So, for a long while, when deciding how to develop our Client,

we were facing a tough choice: either to develop it in JS-only (losing all the bells, whistles, and performance of C++ development) – or to have it in C++ but at the cost of dropping those users who don’t want to download.

With Emscripten and asm.js, these problems are gone. We can have our C++ cake and eat it on browsers too.

In addition, such an option opens a door for some things that are not really widely used yet – such as creating live demo versions which can be viewed in-browser without the need to download and install them; it looks very promising for reducing drop-out rates of potential customers (as showing a live demo tends to work orders of magnitude better than showing a screenshot, and if we can get a live demo without a download, we have a clear winner).

Of course, to achieve this holy grail of multi-platform clients with one of the platforms being ‘web browser’, we’ll need to re-learn how to write cross-platform programs (and apparently, with all the vendor efforts to lock us in, it is not an easy feat), but as soon as we do it (and some of us were doing it all the way regardless of Emscripten), we will be able to have one single C++ code base over all of the following: desktops, phones/tablets, and web (with AAA gamedevs being able to add consoles to the mix too).

Acknowledgements

This article was originally published in Overload Journal #138 in April 2017 and is also available separately on the ACCU web site. Re-posted here with the kind permission of Overload. The article has been re-formatted to fit your screen.

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.