MOGs: Hacks and Hackers - IT Hare on Soft.ware

	Author:	“No Bugs” Hare
	Job Title:	Sarcastic Architect
	Hobbies:	Thinking Aloud, Arguing with Managers, Annoying HRs, Calling a Spade a Spade, Keeping Tongue in Cheek

Hackers attacking! IDA Pro, Cheat Engine, Hexinator, WinAPIOverride

Chapter 29(b) from “beta” Volume VIII

If you know the enemy and know yourself

you need not fear the results of a thousand battles.

– Sun Tzu, The Art of War, circa 5th century B.C. –

Before we even start devising how to protect from a certain threat, we have to realize what (and who) exactly we’re facing; this is especially true when dealing with hackers. Otherwise, we risk falling prey to oversimplified-solutions-which-don’t-really-work such as “hey, let’s ‘encrypt’ our code, that’s all we’ll ever need to do”.

These days, our primary adversary is somebody who has already hacked a dozen competitors’ games, and he is keen to add our game to the ever-increasing roster of his wins. For the time being, we can ignore whether he is doing it for money, or just for fun; this, while being important in the Grand Scheme of Things™ (especially if we’re speaking about “when he decides to give up” and “how we can get hold of his hack to defend against it”), doesn’t change the technical means of the attacks too much.

What is clear though is that we (as MOG developers) do NOT care about hacks by government agencies and antivirus companies;¹ while, strictly speaking, the latter folks are indirectly in the picture (as in the course of their work against malware they can produce methods and tools which defeat anti-reverse-engineering protections), they rarely publish their methods in sufficient-to-use-in-practice detail (because of the inherent danger of their published methods being analyzed and counteracted by malware writers) <phew />.

¹ if these folks want to hack us for whatever reason – they will do it regardless of our protection; moreover, as for the former, we’d better be hackable by them so they won’t use other means to get the information they need.

A Bit of History

Back in the early 2000s, the whole thing was in its infancy; moreover – pretty simple techniques were sufficient to throw attackers off the scent. In [Dodd], the authors describe how they managed to do it by adding a checksum to their game, so that they could be sure the game had not been modified. Yes –

back in the early 2000s even such simple solutions were enough to delay hackers for 2 months.

Since then, attacker tools have improved greatly. In particular, the advent of the concept of “interactive disassembling”, represented by the IDA Pro hacking suite, has helped attackers a Damn Lot™. While the attackers did improve their attacks, changes on the defending side have been much more modest, and defenders are mostly still trying to deal only with the DRM-inspired question of “how to hide the real code from prying eyes” – and with only very limited success too.

This, in turn, has led to a situation where, as of 2017,

All the popular protection methods lag well behind the capabilities of a dedicated attacker <sad-face />

Still, hope is not lost – it is just that we have to look beyond what-is-popular (looking for weaker spots within the attacks).

Hackers: Tools of the Trade

These days, a serious attacker comes to break our C/C++ game² armed with the following set of tools:

IDA Pro. IDA Pro can be seen as a comprehensive suite designed exactly to hack executables. It provides both a static analyzer and a debugger; while different attackers MAY prefer to use an alternative debugger, for static analysis IDA Pro is the tool of the hacker’s trade.
- Basic stuff such as flow graphs, custom loaders to deal with “encrypted” executables, support for structs/unions, etc. is already there.
- It is interactive too – after it has done its job, you can say “hey, I identified this function (structure, union, you-name-it)” – and it will use the information-you-found throughout the code. This, in turn, improves readability and allows the attacker to identify further things.
- One of the nastier-for-us features of IDA Pro is the so-called F.L.I.R.T. The idea of F.L.I.R.T. is nothing magic – they just have a list of antivirus-style “signatures” of well-known functions, and as soon as they see a “signature” of a function – they name it to show that “hey, this is actually an openssl_get_publickey() function!”. This alone can help to break our nice monolithic-and-unreadable executable into manageable-for-hacker pieces <sad-face />.
- If you’re about to do bot fighting yourself – make sure to read [Eagle]; getting a feel for the ways hackers think is of extreme importance for writing efficient defences.
As for hacking-oriented debuggers, the choice is wider than just IDA; among those which are seen to be used in the wild, besides IDA, there are OllyDbg, Cheat Engine, and, of course, WinDbg.
- What we can rely on, is that whatever-debugger-is-in-use, on each step/breakpoint, it will provide the attacker at least with the following information:
  - Stack frames of the functions which are currently on the stack.
  - RTTI information (if present). This includes no less than those-class-names-right-from-the-source-code(!).
  - VMT pointers. Even if RTTI is disabled, they will identify those-classes-with-virtual-functions via their VMT pointers (lacking names, but if the hacker has identified such a class once – the debugger will identify it further automatically based on the VMT pointer). For reference, per Wikipedia: a virtual method table (VMT), virtual function table, virtual call table, dispatch table, vtable, or vftable is a mechanism used in a programming language to support dynamic dispatch (or run-time method binding).
  - Let’s note that while this information is highly compiler-dependent, any decent debugger will report it for any popular-enough compiler.
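
To get a feeling for how much RTTI and VMT pointers give away, here is a minimal sketch. The classes Spell, FireballSpell, and HealSpell and the helpers rtti_name() and vmt_pointer() are hypothetical names invented for this illustration; reading the first pointer-sized word of an object as its VMT pointer is an assumption which holds on the common ABIs (Itanium C++ ABI, MSVC), but is not guaranteed by the C++ standard.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <typeinfo>

// Hypothetical game classes; the names are illustrative only.
struct Spell {
    virtual ~Spell() = default;
    virtual int damage() const = 0;
};

struct FireballSpell : Spell {
    int damage() const override { return 40; }
};

struct HealSpell : Spell {
    int damage() const override { return -25; }
};

// With RTTI enabled, the name of the dynamic type is stored in the binary.
// The exact string is implementation-defined (for example, mangled on
// GCC/Clang), but it normally contains the class name from the source code.
std::string rtti_name(const Spell& s) {
    return typeid(s).name();
}

// Reads the first pointer-sized word of the object. ASSUMPTION: on the common
// ABIs this word is the VMT pointer for a class like Spell; the C++ standard
// does not guarantee this layout.
std::uintptr_t vmt_pointer(const Spell& s) {
    std::uintptr_t p = 0;
    std::memcpy(&p, static_cast<const void*>(&s), sizeof p);
    return p;
}
```

All FireballSpell objects share one VMT pointer value, and HealSpell objects share a different one, so once a hacker has labelled a class by hand, every other instance can be recognized automatically, with or without RTTI.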

In addition to static analysis and debugging, there are lots of OS-level tricks which an attacker may use to gather information about our program and/or modify its behavior in desired-for-them ways. Such tricks include:

DLL injections
Hooks
And if somebody gets really adventurous when fighting your sophisticated detection – they can go all the way into rootkits, and even further into Blue Pill/Red Pill-style thin hypervisors (per Wikipedia, Blue Pill is the codename for a rootkit based on x86 virtualization).

BTW, we can (and often should) try to detect such intrusions and take counter-measures; however, as a Big Fat Rule of Thumb™, it will stop only relatively-novice attackers (which still has its merits, but is far from being a comprehensive protection).

As a side note: for Wirth’s sake, please don’t implement your protection as a single call to IsDebuggerPresent() and think that you have already protected your executable from debugging; such “protections” are disabled way too simply, and any-reasonable protection is much more sophisticated than that. Overall, the only valid reason I know to have IsDebuggerPresent() in your code – is to show a dialog box notifying the hacker that he violates your Terms and Conditions (which might have positive implications from a legal point of view – though make sure to discuss it with your legal department). Still, it doesn’t count as protection; the point where you wrote it is not the end of writing protection, but rather the very beginning. For real protection stuff – keep reading; as we’ll see, there are LOTS of different things to do in this regard.

² Hacking games written in other programming languages is different, but generally is easier (usually – much easier <sigh />), see also Vol. II’s chapter on Client-Side Architecture

Most Popular Attack Vectors

Now that we have described the tools hackers are armed with, we can discuss the most popular attack vectors on the game Clients. There is a long list of such attacks, so to put them into some kind of perspective I’ll try to categorize them; on the other hand – please keep in mind that this categorization is neither strict nor widely accepted, and should be taken with an even larger pinch of salt than usual.

Attacks on Silly Deficiencies

There are quite a few attack vectors which exist only because we (as developers) allow them to exist; still, if we’re silly enough to allow them – we’ll be hacked (and I won’t even blame hackers for doing it). Out of these, the most popular rather-silly-from-anti-bot-perspective things which game developers are very prone to doing are the following:

Leaving traffic unencrypted. This enables all kinds of proxy bots, which sit between our Client and our Server, and sniff/modify things as they wish. The worst thing about proxy bots is that they can be made fundamentally undetectable. When we’re dealing with a bot-which-sits-on-the-Client-box – we have a fighting chance to detect it; however, for proxy bots we’re denied even this fighting chance.
- Unfortunately, just encrypting our traffic (as we’d do it for a web browser or any other app) is not sufficient for games. The next silly thing in this regard is using a DLL for encryption (it can be standard APIs such as sspi.dll or 3rd-party DLLs such as openssl.dll). Very briefly – there is no chance to hide DLL calls, which means that we’re handing the attacker our unencrypted protocol (and therefore, an ability to write a proxy bot) on a plate. Moreover, such attacks can easily be mounted even in real-time (see, for example, [Aiko]).
  - After we eliminate DLLs from the picture – we’ll still be potentially vulnerable to F.L.I.R.T., but fighting it goes beyond simple removal of the silly stuff (more on it in [[TODO]] section below).
- Moreover, if we’re silly enough not to check certificates in the Client, or if we’re using Anonymous Diffie-Hellman for communications³ – the attacker can mount a classical MITM attack, setting up a crypto-proxy which essentially pretends to be a Server to the Client (this connection will use the attacker’s own pair of public-private keys, which will work because of the lack of certificate checks on the Client-Side). At the same time, the crypto-proxy will pretend to be a Client to the Server; this will allow the crypto-proxy to forward all the traffic in both directions. As a result – the attacker will be able to get his hands on our unencrypted protocol, allowing him to sniff/modify it to his heart’s content. For reference, per Wikipedia: in cryptography and computer security, a man-in-the-middle attack (MITM) is an attack where the attacker secretly relays and possibly alters the communication between two parties who believe they are directly communicating with each other.
Using standard APIs to draw important text. The worst thing I’ve seen in this regard was a bunch of poker apps using standard Windows controls to show the chat window. For poker Clients, the chat window includes information such as cards dealt, player actions, and so on, so using standard controls for this sensitive information automatically enables both bot writing and data mining, with obtaining the current state of the game being as simple as a call to GetWindowText().
Having one point where our protection can be disabled. This is especially silly when using obviously-visible standard API functions such as IsDebuggerPresent(); this combination enables very simple mountable-in-5-minutes attacks such as the one shown in [DutchTechChannel].
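
To illustrate why encryption-via-DLL hands our plaintext over on a plate, here is a deliberately simplified toy model of an import-table hook. It is a sketch only: ImportTable, dll_encrypt_and_send(), client_send_move(), attacker_hook(), and install_hook() are hypothetical names made up for this example, the “encryption” is a one-byte XOR used purely as a placeholder (NOT real crypto), and a real attack would patch the process’s actual import table or inject a DLL rather than assign to a C++ struct member.

```cpp
#include <string>
#include <vector>

// Toy stand-in for "the wire": everything the Client has sent so far.
std::vector<std::string>& wire() {
    static std::vector<std::string> w;
    return w;
}

// Everything the "attacker" has managed to capture in plaintext.
std::vector<std::string>& captured() {
    static std::vector<std::string> c;
    return c;
}

// Hypothetical DLL export. The XOR is a placeholder for real encryption.
void dll_encrypt_and_send(const std::string& plaintext) {
    std::string ciphertext = plaintext;
    for (char& ch : ciphertext) ch = static_cast<char>(ch ^ 0x5A);
    wire().push_back(ciphertext);
}

// Toy import table: the Client reaches the DLL through a function pointer,
// which is conceptually what happens with calls into a real DLL.
struct ImportTable {
    void (*encrypt_and_send)(const std::string&) = &dll_encrypt_and_send;
};

ImportTable& imports() {
    static ImportTable t;
    return t;
}

// Client-side game code; it has no idea whether the pointer was replaced.
void client_send_move(int x, int y) {
    imports().encrypt_and_send("MOVE " + std::to_string(x) + " " + std::to_string(y));
}

// Attacker's hook: records the plaintext, then forwards to the original
// function so that the game keeps working as usual.
void attacker_hook(const std::string& plaintext) {
    captured().push_back(plaintext);
    dll_encrypt_and_send(plaintext);
}

// One pointer write is all it takes to install the hook in this toy model.
void install_hook() {
    imports().encrypt_and_send = &attacker_hook;
}
```

After install_hook(), every message still reaches the wire encrypted, while the attacker reads “MOVE 3 4” and friends before any encryption happens; the DLL boundary marks the exact spot where plaintext can be found.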

³ yes, I’ve seen a successful-game-doing-it(!)

Specific Attacks

In addition to exploiting silly things, there are a bunch of attacks which go after one very-specific aspect of the game; such attacks are rather game-specific, but there are still some common ones, including the following:

Texture replacement, which is used to make walls transparent, make opponents better visible against the background, etc. To deal with it – keeping track of texture checksums would do the trick (but we still need to put in LOTS of the effort discussed below to make sure that these texture checksums are protected and the checks are not disabled).
Self-MITM attack. Even if we didn’t make any of the silly crypto-related mistakes discussed above, attackers can still try getting their hands on our unencrypted protocol, mounting a self-MITM attack.
- Self-MITM is a variation of a classical man-in-the-middle (MITM) attack which was briefly discussed above; as a rule of thumb, self-MITM goes along the following lines:
  - Just as for a classical MITM, attacker generates his own pair of private-public keys for TLS encryption.⁴
  - Attacker finds our certificate-stored-in-Client which is normally used to prevent classical MITM. Then, within the Client, he replaces our certificate (or public key) with his own one (the one generated above).
  - Bingo! From this point on, he can make a crypto-proxy and mount a MITM attack (even though our Client checks the certificate, it will check against the attacker’s certificate, enabling MITM).
- BTW, using OS-installed certificates to validate TLS connection is usually even worse than having a certificate embedded into our Client (because it is usually trivial to add a root certificate to the list of OS-recognized ones).
- To protect from self-MITM, we (after avoiding doing all the silly things discussed above) have to hide our certificate better, and to check its checksum too (again, hiding the checksum and checks well). As for “how to hide things” – this is going to be a huuuge subject, discussed in sections starting from [[TODO]] below.
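
The checksum idea (both for textures and for the certificate embedded into the Client) can be sketched as follows. This is a minimal illustration under loud assumptions: fnv1a() is the well-known 32-bit FNV-1a hash, chosen here only for brevity (it is not cryptographic, and a real check would likely use a stronger hash), while asset_is_untouched() is a hypothetical helper name. The sketch shows only the check itself; it does nothing to hide the expected value or the check, which is the hard part.

```cpp
#include <cstdint>
#include <vector>

// 32-bit FNV-1a over a byte buffer. Non-cryptographic; used here only to
// keep the example short.
std::uint32_t fnv1a(const std::vector<std::uint8_t>& data) {
    std::uint32_t h = 2166136261u;   // FNV offset basis
    for (std::uint8_t b : data) {
        h ^= b;
        h *= 16777619u;              // FNV prime
    }
    return h;
}

// Hypothetical helper: true if the asset (texture bytes, embedded
// certificate, ...) still hashes to the value recorded at build time.
bool asset_is_untouched(const std::vector<std::uint8_t>& asset,
                        std::uint32_t expected_checksum) {
    return fnv1a(asset) == expected_checksum;
}
```

If an attacker swaps our certificate for his own, asset_is_untouched() will start returning false; of course, if both the expected checksum and the call to this function sit in one easy-to-find place, he will simply patch them too.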

⁴ BTW, even if you’re not using TLS, the attack still stands

Mother of All Hacks – “Brute Force” Reverse-Engineering

And last but certainly not least, we should mention “the mother of all hacks” – which is what hackers will usually do when none of the simpler ways (discussed above or otherwise) work. In a sense – it is full-scale reverse engineering (of that-stuff-which-hacker-needs-at-the-moment). This class of attacks applies to any program, but on the other hand requires lots of effort; that’s the reason why I tend to name this family of techniques “brute-force reverse engineering”.

One common attack scenario which is next-to-impossible to prevent completely (but is still possible to delay quite a bit), goes along the following lines:

As we’re writing an MOG, we have to call system-level socket functions (such as WinSock functions); there is no way around it. With this in mind, an attacker can identify our calls to such socket functions (as we’ll see below, there is no chance to hide system function calls completely, and even partial hiding can prove to be problematic as it might trigger AV heuristics to report us as “bad guys”).
- Alternatively, the attack can start not from a system call, but from a standard library function identified by F.L.I.R.T. (with whatever-SSL-library-you’re-using, being one of the nastier targets).
Attacker gets call stack at the point where WinSock (or F.L.I.R.T.-ed) function is called.
Then, he traces the stack all the way up to the interesting part which causes the call.
- In the process, to identify “what is interesting”, he has to use whatever structure/VMT/RTTI data he has access to, to understand the data we’re dealing with in those-functions-currently-on-the-stack.
  - This happens to be of paramount importance for the attackers; in fact, I don’t know of any successful attack which doesn’t use any of such data-related information to hack the program via brute-force.
  - In particular, message formats are of special interest to the hackers. I lost count of how many games were hacked using “the first 2 bytes of the message represent message type” convention. BTW, having the first field in your class Message as int type is also very popular – and is very vulnerable too <sigh />.
- Then, attacker dissects that function-of-interest, and modifies it (or gets protocol information out of it) to get the desired result.
  - Once again, data analysis (such as structures, VMT pointers, RTTI) happens to be of paramount importance in this process.
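
To see why the “first 2 bytes are the message type” convention is such a gift to attackers, consider the following sketch. The MsgType values, make_message(), and sniff_type() are hypothetical names invented for this example, and the little-endian layout is an assumption; the point is only how little work the attacker’s side needs.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical message IDs; real games will have their own.
enum class MsgType : std::uint16_t { Move = 1, Chat = 2, Attack = 3 };

// Builds a message the "popular" way: a 2-byte little-endian type field
// followed by the payload.
std::vector<std::uint8_t> make_message(MsgType type,
                                       const std::vector<std::uint8_t>& payload) {
    const auto v = static_cast<std::uint16_t>(type);
    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Everything an attacker needs to classify each captured buffer: two byte
// reads. Returns 0 for buffers which are too short to carry a type field.
std::uint16_t sniff_type(const std::vector<std::uint8_t>& buf) {
    if (buf.size() < 2) return 0;
    return static_cast<std::uint16_t>(buf[0] | (buf[1] << 8));
}
```

Once the buffers passed to the socket function are sorted by sniff_type(), the attacker can concentrate on one message type at a time, which makes figuring out the rest of the format much easier.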

In a sense, brute-force is the “ultimate” form of attack which can be tried regardless of whatever-we-do. Moreover, given time – any code will give in to a persistent and proficient attacker. However, as we already mentioned above – for MOGs, there is a way to deny this time to the attacker (so that by the time he’s past our defences, the time is up and he has to start anew); how to do it – will be discussed over the course of the rest of this chapter.

[[To Be Continued…

This concludes beta Chapter 29(b) from the upcoming book “Development and Deployment of Multiplayer Online Games (from social games to MMOFPS, with stock exchanges in between)”.

Stay tuned for Chapter 29(c), where we’ll start discussing practicalities of anti-reverse engineering protection]]


Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.

Comments

Dahrkael says

November 30, 2017 at 5:10 pm

One of the nasty products from Hex Rays that makes obfuscation difficult is their assembler decompiler into almost working C code. This lowers the bar *a lot* regarding static executable analysis letting way more people to tinker with the game logic. No way around it as far as I know

- "No Bugs" Hare says
  
  December 1, 2017 at 10:42 am
  
  > No way around it as far as I know
  
  Please wait until a discussion on Declarative Polymorphic Code+Data Obfuscation which will end up spread all over the code (coming soon)… VERY briefly – if we can generate 500 pieces of obfuscation code on each build, which 500 pieces will be spread all over the code – finding real stuff in these will become extremely difficult (and if 500 isn’t enough – 5000 will do it almost for sure). And if adding new obfuscation is just replacing type int with type obf<int> – well, adding those 5000 pieces of obfuscation won’t be difficult (and won’t kill readability of the source code either(!)). Adding an observation that for an MOG we’ll re-build the whole thing once per 2-3 weeks anyway (and that with a technique above, all 500-5000 obfuscations will be re-generated at no cost to us, but at huge cost to the attacker) – the whole thing does turn the tables.
  
  Of course, it should be accompanied by other stuff (such as force-inlines, templates, and shortened call stacks); in particular force-inlines and templates tend to make C-code-which-can-be-reconstructed-from-disassembler very different from original C++ code, and this doesn’t help attacker either (to put it mildly). With force-inlines, I can keep source readability, while making 1K-size functions in binary – and reading those 1K-monolithic-monsters will be pretty bad even with reconstructed-C-code (especially if 70% of it is obfuscation code – and with Data-level obfuscation, it tends to look _really_ nasty).
  
  Stay tuned (this part is expected to come within 2-3 weeks or so)…
  
  - Dahrkael says
    
    December 1, 2017 at 11:49 pm
    
    Sounds really nice for sure, but how bad would that hit in performance?
    Being 70% obfuscation sounds like a lot of lost cpu cycles!
    
    Looking forward to that part
    
    - "No Bugs" Hare says
      
      December 2, 2017 at 9:02 am
      
      > Sounds really nice for sure, but how bad would that hit in performance?
      > Being 70% obfuscation sounds like a lot of lost cpu cycles!
      
      Good question, but (fortunately for me) I have a good answer to it ;-). As a Big Fat Rule of Thumb(tm), we DON’T need to obfuscate performance-critical stuff such as vertex-level code etc. (at the very least, retrieving information from vertexes is a major challenge by itself, and in all my collection of real-world attacks, I don’t have any vertex-level ones). What we DO need to obfuscate is communication protocols, and game/business-level decisions (these are very local, and very easy to get valuable information from); however, as these non-graphics-related things usually take <1% of overall CPU cycles, increasing them even by 3x won't break our CPU bank ;-).
      
      Or, more generally: (a) 5% of the code tends to take 95% of the CPU cycles (this is common for ANY project); (b) it happens that the most-performance-critical-code is already in the least need of obfuscation. Hence, it is sufficient to obfuscate those 95% of non-performance-critical code.
      
      Disclaimer: it is NOT going to be a "silver bullet", and it CAN be defeated. The best we can hope for - is to get back into better-shield-withstands-older-sword game, with a _temporary_ edge over attackers; still, in real-world it IS a Big Deal(tm) (nobody in a sane mind argues that we shouldn't make better bulletproof vests because a perfect bulletproof vest is impossible; still, this line of argument is used to say "hey, we cannot make a perfect anti-bot protection, so we shouldn't even try").