Part VIIa: Security (TLS/SSL) of 64 Network DO’s and DON’Ts for Multi-Player Game Developers

	Author:	“No Bugs” Hare
	Job Title:	Sarcastic Architect
	Hobbies:	Thinking Aloud, Arguing with Managers, Annoying HRs, Calling a Spade a Spade, Keeping Tongue in Cheek

We are about to finalize our epic multi-part article on network support for multi-player game engines.

Previous parts:

Part I. Client Side
Part IIa. Protocols and APIs
Part IIb. Protocols and APIs (continued)
Part IIIa. Server-Side (Store-process-and-Forward Architecture)
Part IIIb: Server Side (deployment, optimizations, and testing)
Part IV: Great TCP vs UDP Debate
Part V. UDP
Part VI. TCP

After going through all the topics above (especially those related to TCP and UDP, which have caused quite a reaction on Reddit), there is still one Big Fat Thing to discuss. It is quite similar to morning exercise: everybody hates it, not that many people do it, and those who don’t do it regret it later. I’m speaking about network security.

Upcoming part:

Part VIIb. Security (concluded)

51. DO Think about Security

First of all, you need to realize that your game engine DOES need this security, whether you like it or not. Even if your game engine targets neither stock exchanges nor casinos, chances are that you still need some kind of network security incorporated.

PCI DSS: “The Payment Card Industry Data Security Standard (PCI DSS) is a proprietary information security standard for organizations that handle branded credit cards from the major card schemes including Visa, MasterCard, American Express, Discover, and JCB.” (Wikipedia)

If, by any chance, the game allows for (or will allow in the future) in-game purchases which require entering credit card numbers – you need not just to deal with security, but also to ensure your software allows for PCI DSS compliant deployments, which is quite a headache.

Fortunately, most of the games today delegate handling of credit cards and payment methods in general (usually to some kind of an AppStore, but there can be other payment processors). However, while it absolves you from protecting credit card numbers (a.k.a. PANs – this is what PCI DSS is all about), protecting in-game stuff still remains your responsibility.

One very common pattern arises when a game has some in-game items that are difficult-to-obtain and are of value to the player (99% of modern multi-player games do have such items). If there is something-of-value-which-is-difficult-to-obtain – then given enough players, some among them will be not-so-honest. These not-so-scrupulous guys will try to get this “something of value” bypassing game logic. Or (even worse for the company bottom line) – without paying for these items. Or (even worse for the company image) – these not-so-honest guys may want to “steal” these items from the legitimate owners. And as soon as those not-so-scrupulous guys appear – you do need security. And as the most obvious attack vector, which these guys are most likely to use, is via the network – here we are, speaking about Network Security.

In this article we won’t speak much about “how to deploy system to be secure” (which is a very large and separate topic); rather, our scope will be on “how to write software that can be deployed in a secure manner”.

52. If Security is of ANY concern – DO use Authoritative Servers

This subject has already been touched on in item #13 of Part IIb, but it is so important that we will repeat it once again here, in the Security part of the article:

If you want at least some kind of security, forget about non-authoritative servers.

The reason is simple: with non-authoritative servers, protecting the game from hackers is pretty much hopeless. While this is obvious to security professionals, it might not be that obvious to game engine writers, so here goes a brief explanation:

For non-authoritative servers, you’re bound to trust client apps.
The user has almost-perfect control of the client app. It means that the user can reverse-engineer the client, can debug it, can manipulate its memory, etc. etc.
It means that the only kind of protection we can get for the client is “Security by Obscurity”.
“Security by Obscurity”, given a sufficient number of attackers, generally provides very little protection. It is discouraged by standards bodies (and for a Good Reason), by security professionals, etc. etc. [WikiSecurityThroughObscurity]. The Big Reason behind it (as has already been mentioned in item #13 of Part IIb) is simple – you shouldn’t rely on outsmarting thousands of highly motivated hackers, especially when they’re playing on their own field (which is the case for the client¹).

Bottom line: if there is even the slightest chance that your game will ever have something-difficult-to-obtain that has significant value for players – DO use authoritative servers.

When making the server authoritative, we’re still facing the same thousands of highly motivated hackers, but, unlike on the client-side, we’re playing on our own field, where we establish the rules of the game. Just one example: for a hacker, running a debugger against your client is trivial. For the very same hacker, running a debugger against your server is not (well, if your admins are doing at least a somewhat decent job, but this belongs to the “how to deploy system in a secure manner” topic, which we’re leaving out of scope for the time being). This (both in theory and in practice) makes a world of difference: while protecting the client is next-to-impossible, protecting the server is a game-which-can-be-fought-on-more-or-less-even-footing.
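To make the difference concrete, here is a minimal Python sketch of what “authoritative” means for one single request. All the names in it (AuthoritativeServer, handle_buy, the PRICES table) are hypothetical and exist only for illustration; the point is that the client sends an intent, while balances and prices are looked up on the server and nowhere else.

```python
from dataclasses import dataclass, field

# Hypothetical price table; it lives on the server only.
PRICES = {"sword": 100, "shield": 60}


@dataclass
class Player:
    credits: int
    items: list = field(default_factory=list)


class AuthoritativeServer:
    """Owns all game state; clients only send requests ("intents")."""

    def __init__(self):
        self.players = {}

    def handle_buy(self, player_id: str, item: str) -> bool:
        # Balance and price are looked up server-side; nothing the client
        # claims about its own state is ever trusted.
        player = self.players.get(player_id)
        price = PRICES.get(item)
        if player is None or price is None or player.credits < price:
            return False  # rejected, whatever the client thinks
        player.credits -= price
        player.items.append(item)
        return True
```

A hacked client can still send any request it likes, but the worst it can achieve here is a rejected purchase.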

¹ There are cases when you as a game engine developer will need to rely on Security by Obscurity (notably – for the purposes of bot fighting, which is a subject for another article), but using Security by Obscurity for non-authoritative servers is very different for two reasons. First, the impact of one single hacked client when non-authoritative servers are used is usually fatal (while the impact of one single bot is unpleasant, but containable). Second, unlike with bot fighting, you can do better than Security-by-Obscurity with servers – just make your servers authoritative.

53. DON’T send money-related transactions unencrypted.

This one might look obvious, but apparently unencrypted money-related transactions do happen in real world. In short – DON’T DO IT. NEVER EVER. If the unencrypted transaction is a credit card transaction – you can be hit Really Badly by PCI DSS (this kind of negligence may easily cost the game company an ability to process credit cards at all). But even if the transaction is not credit-card related – the encryption is still a must.

For all the money-related stuff, use TLS (and not with ADH, see item #55b below!) over TCP or over reliable UDP. Don’t mess with DTLS (described in item #56 below) for money-related stuff: it is quite easy to make a mistake when handling packet loss/reordering, and you don’t want to make mistakes when it comes to money. It is not rocket science, but it needs to be done, it is as simple as that.

Now to the question: “what exactly is qualified as a money-related transaction?” My advice in this regard is the following:

As soon as somebody can get something-convertible-to-real-world-money by hacking your transaction – treat it as money-related and encrypt it.

A simple example: if the transaction in question is about trading your own in-game ‘credits’ for some in-game items, and those in-game ‘credits’ can be purchased for real-world money – to be on the safe side, you SHOULD treat the transaction as a money-related one.

Another example is a bit less obvious: let’s consider a transaction which is related to a trade of an in-game item, where such an item cannot be purchased for real-world money. Does it mean it is safe to leave it unencrypted? Not necessarily – for example, if this is a trade of an in-game artifact that cannot be purchased in-game, but which can be sold on a secondary market (such as eBay) for thousands of dollars, such transactions still make a very juicy target for hackers, and therefore must be protected (you can rely on players being outraged when such an artifact is stolen because of lack of security; heck, players tend to be outraged even if such a thing is stolen through their own fault).

Spoofing: “IP address spoofing or IP spoofing is the creation of Internet Protocol (IP) packets with a forged source IP address, with the purpose of concealing the identity of the sender or impersonating another computing system.” (Wikipedia)

A side note to those who will say that attacks on unencrypted-TCP require spoofing, and therefore are generally difficult-to-mount, and therefore are impractical. First of all, spoofing is not the only way to attack an unencrypted TCP channel (for example, any kind of DNS attack will open the door for a Man-In-The-Middle attack on an unencrypted TCP channel without spoofing). Second, spoofing itself is easy-to-mount under certain conditions (for example, when both attacker and victim share the same network segment, which can happen).

53a. Perceptional Security

And last but not least, there is a perceptional aspect of encryption:

If you don’t implement encryption, then even if your system is not broken in reality, a whole lot of the players will assume that you’re insecure, and will complain about lost items (including those items which have never existed in the first place).

While players will complain regardless of your engine implementing encryption, not having encryption will give them a valid reason to complain, which makes things much worse. If nothing else, writing that “we’re using industry-standard 128-bit TLS encryption for all our transactions”, while being quite meaningless to security professionals (and for a good reason, as there are many ways to implement it wrong), will still help to answer some of those complaints.

Overall, from my experience, having a system which is perceived to be secure is, unfortunately, way too often considered to be more important than having a secure one (or even the only thing that really matters). This kind of approach usually arises when management just wants to be able to write “we’re doing 128-bit encryption” on the web site, and developers have no clue about security and do the bare minimum to be able to write that phrase. This approach has been observed to be extremely short-sighted, and can easily cost the company its business if the software is hacked badly. However, perceptional security also has its place in the company bottom line (from a business perspective, it is not enough to make the system secure, it is important to be able to make users believe that the system is secure).

On the bright side, having a system which is perceived to be secure doesn’t prevent you from making it really secure. This is what you MUST aim for.

54. DON’T Design Security Protocols Yourself

WEP: “Wired Equivalent Privacy (WEP) is a security algorithm for IEEE 802.11 wireless networks. Introduced as part of the original 802.11 standard ratified in 1997, its intention was to provide data confidentiality comparable to that of a traditional wired network.” (Wikipedia)

Once upon a time, there was a bunch of IEEE guys who needed to develop a new security protocol for Wi-Fi. They did understand the need to develop the very best security protocol for Wi-Fi, aimed for it to be really secure, and even named it “Wired Equivalent Privacy” (WEP).

The problem with this protocol was that those IEEE guys made quite a few mistakes in the protocol design. One rather poor design choice was sharing the same key across the whole network, which is quite a bad practice to start with (though here they had a more-or-less valid excuse of end-user convenience). Another poor design choice, which has led to critical security problems, was related to extremely poor initialization-vector (IV) choices for the RC4 stream cipher (from the IV size being too small, to fatal underspecification of “how IVs should be generated”); this has led to IVs being repeated, which is a Big No-No for RC4 (and any stream cipher), and causes the XOR of two-packets-encrypted-with-the-same-IV to be the same as the XOR of two-unencrypted-packets; this opens a Really Big Hole in WEP. Yet another problem with WEP was using CRC instead of crypto-integrity-protection (and no, encrypted CRC doesn’t guarantee integrity); when combined with using RC4 (or any other stream cipher for that matter), this allows an attacker to mount a so-called bit-flipping attack, modifying packets in such a way that the modified packet passes the CRC check, even without knowing what’s inside (granted, in such an attack the attacker doesn’t know what exactly the modified packet means, but an ability to throw in modified packets which pass as valid ones is another Really Big Security Hole). As a result of the above weaknesses (most of which, though not all, should have been envisioned while WEP was designed), as early as 2001 there was a practical attack on WEP (known as the “Fluhrer, Mantin and Shamir Attack”), and by 2007 the attack had been improved to allow breaking a WEP key in as little as 1 minute [Schneier2007].
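Two of the WEP holes described above are easy to reproduce in a few lines of Python. The sketch below is a toy, not WEP itself: the keystream() function is a hypothetical SHA-256-based stand-in for RC4 (any stream cipher behaves the same way for this purpose), and the key, IV, and messages are made up. It shows (1) that reusing an IV leaks the XOR of two plaintexts, and (2) that an attacker who knows only which bits to flip can patch the encrypted CRC so that the forged packet still passes the check.

```python
import hashlib
import zlib


def keystream(key: bytes, iv: bytes, n: int) -> bytes:
    """Toy keystream (NOT RC4, NOT secure): same key + IV gives same stream."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def crc(data: bytes) -> bytes:
    return zlib.crc32(data).to_bytes(4, "big")


def wep_like_encrypt(key: bytes, iv: bytes, msg: bytes) -> bytes:
    """Encrypt msg || CRC32(msg) with the keystream, roughly as WEP did."""
    plain = msg + crc(msg)
    return xor(plain, keystream(key, iv, len(plain)))


def wep_like_decrypt(key: bytes, iv: bytes, ct: bytes):
    """Return (message, crc_is_valid)."""
    plain = xor(ct, keystream(key, iv, len(ct)))
    msg, tag = plain[:-4], plain[-4:]
    return msg, tag == crc(msg)


key, iv = b"shared-network-key", b"\x00\x00\x01"
m1, m2 = b"PAY 0010 TO BOB!", b"hello, world 123"

# 1. IV reuse: the keystream cancels out, leaking the XOR of the plaintexts.
c1, c2 = wep_like_encrypt(key, iv, m1), wep_like_encrypt(key, iv, m2)
leak = xor(c1, c2)[:len(m1)]

# 2. Bit-flipping: the attacker needs only the bit difference (delta), not
#    the key. CRC32 is affine, so crc(m ^ d) == crc(m) ^ crc(d) ^ crc(zeros),
#    which lets the attacker patch the encrypted CRC as well.
delta = xor(b"PAY 0010 TO BOB!", b"PAY 9910 TO BOB!")
patch = delta + xor(crc(delta), crc(bytes(len(delta))))
forged = xor(c1, patch)
forged_msg, crc_ok = wep_like_decrypt(key, iv, forged)
```

The receiver decrypts the forged packet to a different message and still sees a valid CRC, which is exactly why a keyed integrity check (a MAC or an AEAD cipher) is needed instead of a CRC.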

I certainly don’t intend the paragraph above to be a comprehensive analysis of the WEP failures (and even less intend to blame those guys who designed it); my task here is very simple – to demonstrate how many not-really-obvious things can go wrong when designing a security protocol, and to convince you that it is certainly not worth the trouble. As you can see, there are lots and lots of very subtle things which can (and will) go wrong with secure protocol design. And while WEP is one of the worst crypto-protocols ever designed, it is certainly not the only one. For more examples see, for instance, a talk by Prof. Daniel Bernstein [Bernstein]. His presentation is full of crypto-specific details which we don’t really need to understand at this point; however, what is really important for our purposes, is that

Design of secure protocols is a thing which even security professionals have lots of problems with.

One of the implications of this observation is that, for game engine developers (who’re rarely security professionals), developing a new secure protocol is much more likely to be a disaster than not.

55. DO use TLS both for TCP and for reliable-UDP

So, you shouldn’t design a security protocol yourself. But what should you use?

TLS: “Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), are cryptographic protocols designed to provide communications security over a computer network.” (Wikipedia)

TLS (Transport Layer Security) is an industry standard which, despite all the more-or-less recent issues widely reported over the Internet (most of them already patched), is pretty much the best thing which is widely supported out there. The only not-so-trivial thing TLS requires is a reliable channel. It means that TLS will work both over TCP sockets and over reliable-UDP libraries. For example, it is possible to make a TLS exchange on top of Unity3D’s RPC calls (expect quite a few round-trips when doing it this way; if this is unacceptable – use DTLS over unreliable-UDP, see item #56 below).

55a. DO use OpenSSL

TLS itself is just a standard; to use TLS, you need a library implementing it. TLS is so complicated that writing such a library on your own is not really an option. Fortunately, there is one free (both as in “free beer” and “free speech”) library which implements TLS: [OpenSSL]. Once again, despite the recent (and extremely unpleasant) Heartbleed bug, OpenSSL is a very solid library, and I recommend it over anything else. If your game is Windows-only, you might want to use the WinInet API (which does support TLS); however, I’d still suggest using OpenSSL over WinInet (even if currently all your players are Windows-only, you never know where your game will need to be ported a few months later).

55b. DON’T use ADH as TLS “Cipher Suite”

TLS allows for different ways of data encryption-and-authentication (“cipher suites” in TLS-speak) to be used; the problem is that there are soooo many of them, that choosing one becomes a problem. Most of these do pretty much what you would expect from their respective names (taking into account their respective key lengths), with one notable exception, Anonymous Diffie-Hellman (ADH).

Once I saw a competitor’s game with millions of real-world dollars transferred via their app, and this app used TLS with a cipher suite known as Anonymous-Diffie-Hellman (ADH) (and they even wrote about it in their ‘About’ box). There is one teensy problem with this cipher suite: it doesn’t provide protection from a man-in-the-middle attack and is therefore insecure regardless of how many bits are used for the encryption. OpenSSL disables it by default, but those guys had overridden the defaults to enable ADH.

Man-in-the-middle Attack: “Man-in-the-middle attack (often abbreviated to MITM, MitM, MIM, MiM or MITMA) is an attack where the attacker secretly relays and possibly alters the communication between two parties who believe they are directly communicating with each other.” (Wikipedia)

One scenario which breaks ADH in practice is some kind of DNS attack, which redirects client traffic to a malicious server; this server poses as a legitimate server to the client, and as a legitimate client to the real server, forwarding all the traffic but being able to see/modify it on the way. It is a very classical man-in-the-middle attack; for cipher suites other than ADH, there is a server certificate which can (and MUST, see item #55e below) be authenticated by the client. Such verification ensures that the client speaks with the correct server, and therefore prevents the attack; however, with ADH being certificate-less, there is nothing to check, and therefore there is no way to distinguish the real server from the fake one.

Quite simply – just DON’T use ADH. If you need security (and have no clue about it) – at least use OpenSSL defaults. And while ADH can simplify life for your admins (as you don’t need certificates for ADH), it cannot be considered secure (in short – ADH is inherently insecure exactly because you don’t need certificates for it).
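If you want to double-check that no anonymous suites have sneaked into your configuration, it can be done programmatically. The sketch below uses Python’s standard ssl module (a wrapper over OpenSSL) rather than the OpenSSL C API the rest of this article refers to; anonymous_suites() is a hypothetical helper name, and it relies on OpenSSL’s naming convention where anonymous DH suites start with “ADH-” and anonymous ECDH suites with “AECDH-”.

```python
import ssl


def anonymous_suites(ctx: ssl.SSLContext) -> list:
    """Names of enabled cipher suites that have no server authentication."""
    return [c["name"] for c in ctx.get_ciphers()
            if c["name"].startswith(("ADH-", "AECDH-"))]


# The library defaults already exclude anonymous suites.
default_ctx = ssl.create_default_context()
default_anon = anonymous_suites(default_ctx)

# If you build your own cipher string, exclude them explicitly with !aNULL.
strict_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
strict_ctx.set_ciphers("HIGH:!aNULL:!eNULL")
strict_anon = anonymous_suites(strict_ctx)
```

Both lists are expected to be empty; a check like this is cheap enough to run as a unit test on every build.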

55c. DO prohibit TLS < 1.2, and DO disable All the Unused Cipher Suites

When somebody like the Apache team builds a web server, they need to support all the reasonable TLS versions and all the reasonable TLS cipher suites (ADH being one exception, it is so unreasonable that it is disabled by default); this is necessary because there are tons of different browsers out there.

On the other hand, when you build your own game with your own client and server, you can make sure that only one cipher suite is used, disabling all the others. And in the security field, if you can disable something unused – you SHOULD do it (formally, it reduces attack surface, which is a Good Thing). One very recent example of unused-stuff-being-abused is the so-called “downgrade attacks”, such as the [FREAK] and [Logjam] attacks. As for older versions of TLS (this includes SSLv3 and SSLv2) – you SHOULD disable them for an even more compelling reason: they have known vulnerabilities.

Bottom line: choose one cipher suite (such as TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256; further discussion on cipher suites is well beyond the scope of this article²), enable only it on both client and server, and don’t forget to disable such stuff as SSLv2 (usually already disabled, but it never hurts to check), SSLv3, TLSv1.0, and TLSv1.1 (to do it, look for SSL_OP_NO_TLSv1_1 and similar flags for SSL_CTX_set_options(); for an example of calling SSL_CTX_set_options(), see [OpenSSLExample]). The whole thing will take just an hour or so (to figure out how and where to do it), but it will save you a LOT of trouble down the road.

² For stock exchanges you’re probably better off with the more conservative TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384; it is slower, but as of now is usually considered more secure.
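As an illustration of the bottom line above, here is a minimal sketch using Python’s standard ssl module (which wraps OpenSSL) instead of calling SSL_CTX_set_options() from C. make_context() and ONLY_SUITE are hypothetical names. One caveat which postdates the advice above: in Python’s ssl module, set_ciphers() does not control TLS 1.3 suites, so they stay enabled and are filtered out of the check at the end.

```python
import ssl

# OpenSSL's own name for TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256.
ONLY_SUITE = "ECDHE-ECDSA-CHACHA20-POLY1305"


def make_context(protocol=ssl.PROTOCOL_TLS_CLIENT) -> ssl.SSLContext:
    """Context that refuses anything older than TLS 1.2 and pins one suite.

    Pass ssl.PROTOCOL_TLS_SERVER to build the server-side counterpart.
    """
    ctx = ssl.SSLContext(protocol)
    # Refuses SSLv3, TLS 1.0 and TLS 1.1 in one go.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Restricts the TLS 1.2 cipher list to a single suite.
    ctx.set_ciphers(ONLY_SUITE)
    return ctx


ctx = make_context()
# TLS 1.3 suite names start with "TLS_"; everything else is TLS 1.2 or older.
tls12_suites = [c["name"] for c in ctx.get_ciphers()
                if not c["name"].startswith("TLS_")]
```

Note that an ECDSA suite needs an ECDSA server certificate; with an RSA certificate the handshake will simply fail, which is the behaviour you want from a pinned configuration.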

55d. DON’T Use RC4

Even in year 2015 there still exist people who argue that “RC4 is the best crypto-algorithm” (probably based on the 20-year-old benchmarks which show that it is faster than 3DES). Do yourself and everybody else a favour, and DON’T use RC4. RC4 is outdated and has very poor security even as a part of TLS (and for what may happen if you’re using it outside of TLS – see item #54 above).

If speed is a concern (and for games it usually is), use Chacha20 instead of RC4. First, Chacha20 is significantly faster than RC4 (unfortunately, I don’t have direct OpenSSL benchmarks in hand, but [Crypto++] shows that Salsa20 – the predecessor of Chacha20 – is about 3x faster than RC4, and [eBACS] shows that Chacha20 is a tad faster than Salsa20). Second, Chacha20 is considered significantly more secure than RC4 (at least there are no known attacks against Chacha20, which we cannot say about RC4). If you’re really concerned about security (for example, your game being a stock exchange) – use good old AES (preferably AES-256) instead; it is not really likely that for a stock exchange you will be able to notice the performance difference between AES-256 and Chacha20, and in the stock exchange world AES has significantly better perceptional security.

55e. DO Verify Server Certificate on the Client

Even the most secure protocol is useless if it is not used the way it was designed to be used. For TLS to work properly (and to avoid man-in-the-middle attacks similar to the one described for ADH in item #55b above), the client app MUST validate that the server the app is speaking to is the server which the app expects to speak to. This is done by so-called “certificate verification”; the process of doing this verification for OpenSSL can be found in [OpenSSLExample].

One thing to note in this regard is that for verification to work, you need to let your client know what kind of servers it is supposed to speak to (otherwise all the servers will be indistinguishable). This is done by (a) embedding your trusted “root”/“CA” certificate into the client; (b) feeding this “root certificate” to a function such as SSL_CTX_load_verify_locations() (see [OpenSSLExample] for an example). And yes – this is exactly what browsers do to work with HTTPS: they have embedded (as a part of the browser install) a list of certificates of trusted entities (such as Verisign, Thawte, Equifax, and dozens of others), and these entities are considered trusted to issue server-certificates-which-browser-will-allow-connections-to.

Note that as long as you’re embedding your certificate into your client, it is ok for your CA to have a so-called “self-signed” certificate.³

³ If you don’t understand what it means – that’s fine; however, knowing it will help you to generate your very first server certificate.
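The two steps above (embed your CA certificate, then feed it to the verification machinery) can be sketched as follows. This is Python’s standard ssl module rather than the OpenSSL C API; its load_verify_locations() plays the role of SSL_CTX_load_verify_locations(). The function name make_verifying_context() and the idea of passing the embedded certificate as a PEM string are assumptions made for illustration.

```python
import ssl
from typing import Optional


def make_verifying_context(ca_pem: Optional[str] = None) -> ssl.SSLContext:
    """Client context that trusts ONLY the CA certificate embedded in the app.

    ca_pem is the PEM text of your own root/CA certificate (self-signed is
    fine). Without it the trust store stays empty, so every handshake fails,
    which is the safe default.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Both are already the defaults for PROTOCOL_TLS_CLIENT; they are set
    # explicitly so that nobody "simplifies" them away later.
    ctx.verify_mode = ssl.CERT_REQUIRED
    ctx.check_hostname = True
    if ca_pem is not None:
        ctx.load_verify_locations(cadata=ca_pem)
    # Deliberately NOT calling load_default_certs(): the game client should
    # trust your CA, not every CA the operating system happens to trust.
    return ctx


ctx = make_verifying_context()
```

When connecting, the client would then call something like ctx.wrap_socket(sock, server_hostname=...) with the name your server certificate was issued for; the handshake raises an exception if the certificate chain or the name does not match.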

55f. DO Consider Static Linking of OpenSSL

When using OpenSSL for games, I recommend considering static linking of the OpenSSL library on the client-side. I know that I will be (once again) beaten black-and-blue for this suggestion, this time by security guys. They will say that “if you static-link OpenSSL, you won’t be able to benefit from OS updates, which are always faster and better”, and they will have a point. However:

If you’re updating your game frequently enough (which is normally the case for games-which-are-still-actively-used), and you’re updating your OpenSSL when you’re updating your game, this argument doesn’t apply much (in fact, you may update your OpenSSL earlier than it would happen for an OS-provided-OpenSSL)
As has been observed in the Debian RNG Disaster ([WikiDebianRNG]), an OS-provided OpenSSL does have a chance to be much worse than the original one
On Windows, you cannot really use OS-provided OpenSSL, and will need to provide your own regardless, so the whole issue of using-OS-provided-SSL becomes moot

On the other hand, these considerations are fairly minor, and are presented only to illustrate that the answer to the question “use your own OpenSSL or OS-provided one” is arguable even when speaking about security in general.

When we’re speaking about games, there is one thing which is usually much more important than the considerations above (and which may be one Big Consideration in favour of statically-linked OpenSSL): it is bot fighting. While bot fighting as such is beyond the scope of this article (and I hope to write about it in the future), using OpenSSL as a DLL/.so is one of the Most Popular Attack Vectors for bot writers when they’re attacking TLS-based games.

When you have OpenSSL as DLL/.so, reverse-engineering your over-the-wire protocol is a cinch: just intercept calls to a DLL (exact mechanics of intercepting calls is beyond the scope, but is perfectly doable and is very-well known in hacking circles), and bingo – you have all the exchange unencrypted. While the whole bot fighting task is obviously in the realm of “Security by Obscurity”, in practice it is a Really Big Problem for many games, and is known to be solved with certain efficiency; in this fight, having OpenSSL as a DLL/.so has been observed to be a Really Big Hole. “Oh, they’re using openssl.dll, so writing a bot for them is easy” is a recurring phrase at least in some of the bot-writing circles.

55g. DO be careful when using Compression + Encryption

When you encrypt an already-compressed channel, and your users have a way to inject some data which they know (which happens all the time when dealing with such things as chat) – compression may become a problem 🙁 . At least in theory, under these conditions an attack similar to the [CRIME] and [BREACH] attacks can be mounted.
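The underlying leak is easy to see without any encryption at all, because encryption hides content but (to a first approximation) not length. In the toy Python sketch below, SECRET and wire_length() are made-up names, and the “session token” is a hypothetical piece of private data sharing one compressed stream with attacker-controlled chat text. A guess which matches the secret compresses better than one which doesn’t.

```python
import zlib

# Hypothetical private data travelling in the same compressed stream as chat.
SECRET = b"session_token=7f3a9c1e5b2d4806a1c3e5f7"


def wire_length(attacker_chat: bytes) -> int:
    """Length an eavesdropper would observe on the wire.

    Encryption is omitted: a stream cipher or AEAD only adds a constant
    overhead, so the compressed length is what leaks.
    """
    return len(zlib.compress(SECRET + b"|chat=" + attacker_chat))


# Two guesses of identical length, one right and one wrong.
right = wire_length(b"session_token=7f3a9c1e5b2d4806a1c3e5f7")
wrong = wire_length(b"session_token=0d8b6e2f4a9c7153e0b2d4f6")
```

A real attack refines the guess a few characters at a time and has to cope with noise, which is where the high cost of mounting it comes from.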

In practice, it is not as bad as it may sound for your usual game; first of all, this class of attacks only affects confidentiality, and for many games there isn’t that much confidential information (integrity and authentication being of much more concern than confidentiality); in particular, if all the data within the game is public – these attacks do not apply. Second, the cost of mounting such attacks will be really high (at least in terms of time spent on mounting the attack), and probably not worth it for the attacker.

However, if your game is Really Security-Critical (such as a stock exchange or a poker site) – then you’d probably better address this attack vector, but (unless you’re ready to give up compression altogether) it won’t be easy 🙁 . For example, if you have a clear separation between “public data” and “private data”, you can eliminate the potential for this class of attacks by compressing public data and leaving private data uncompressed (you will need to have some kind of “frames” at the level between compression and encryption, with some “frames” indicating that they’re compressed, and some indicating that they’re not). If you need to compress private data too, it can also be done in a completely safe manner, provided that the source of all the data within one single compression channel is the same.
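The “frames” idea can be sketched in a few lines of Python. The frame layout below (one flag byte, a 4-byte length, then the body) and the names encode_frame()/decode_frame() are assumptions made for illustration, not a prescribed wire format. Each public frame is compressed on its own, so no compression state is ever shared with private data; the resulting frames are what you would then hand over to TLS.

```python
import zlib

FLAG_RAW, FLAG_COMPRESSED = 0, 1


def encode_frame(payload: bytes, is_private: bool) -> bytes:
    """Build one frame: 1 flag byte + 4-byte big-endian length + body."""
    if is_private:
        # Private data is never compressed, so its length reveals nothing
        # about how well it matches attacker-supplied data.
        flag, body = FLAG_RAW, payload
    else:
        flag, body = FLAG_COMPRESSED, zlib.compress(payload)
    return bytes([flag]) + len(body).to_bytes(4, "big") + body


def decode_frame(frame: bytes) -> bytes:
    """Inverse of encode_frame(); raises ValueError on malformed input."""
    if len(frame) < 5:
        raise ValueError("frame too short")
    flag = frame[0]
    size = int.from_bytes(frame[1:5], "big")
    body = frame[5:5 + size]
    if len(body) != size:
        raise ValueError("truncated frame")
    if flag == FLAG_COMPRESSED:
        return zlib.decompress(body)
    if flag == FLAG_RAW:
        return body
    raise ValueError("unknown frame flag")
```

Compressing each public frame independently costs some compression ratio compared to one long-lived stream; that is the price of keeping the two kinds of data apart.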

56. DO use DTLS for Unreliable-UDP

If you don’t have a reliable channel – don’t worry too much; there are existing datagram-oriented protocols out there. For example, [SPINS] and the supposedly-security-equivalent [SASP] are very simple protocols which are based on symmetric keys; however, using them means that you’ll need to implement symmetric key exchange yourself, which is a headache, and also can be a source of security mistakes/bugs, so I cannot really recommend them for game engine purposes (especially as there is a better alternative, see below). It is not because these protocols are bad, but because integrating them into a game engine has too much potential for making a mistake-which-won’t-manifest-itself-until-it-is-too-late.

Fortunately, there is a kinda-port of TLS to datagram-oriented protocols, including UDP; it is named [DTLS]. It is more or less TLS-over-UDP, and OpenSSL supports it, so it can be considered pretty much solid “by design”. When using DTLS, keep in mind that a DTLS session de-facto consists of 2 parts. The first part is a handshake, which is essentially a TLS handshake where DTLS more or less implements a “reliable UDP” just for the purposes of the handshake; it implies retransmits (implemented by DTLS itself “under the hood”), delays in case of lost packets, etc. However, as soon as the handshake is completed, DTLS starts working as-we-would-expect-from-UDP: i.e. packets are independent, if a packet is lost – it’s lost, and most importantly, there are no delays due to retransmits.

To be concluded…

Tired hare: To avoid being too much of TL;DR, Part VII of our article, Security, has been split. Stay tuned for the last instalment in this network-for-game-engines series.

EDIT: The series has been concluded, with the following parts published:
Part VIIb. Security (concluded)


Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.