Outline for Chapter on Bot Fighting and Anti Reverse Engineering

	Author:	“No Bugs” Hare
	Job Title:	Sarcastic Architect
	Hobbies:	Thinking Aloud, Arguing with Managers, Annoying HRs, Calling a Spade a Spade, Keeping Tongue in Cheek

[rabbit_ddmog vol=”8″ chap=”Outline of Chapter 29 from “beta” Volume VIII”]

As I am speaking at CppCon 2017 on Friday, I didn’t have time to prepare the next chapter of “Development and Deployment of Multiplayer Online Games”. Still, I have a fairly interesting piece of information to share: a planned outline for one of the most controversial chapters in all 9 volumes, a chapter on Bot Fighting and Anti Reverse Engineering.

Some time ago, in one of my previous posts, I wrote a little bit about obfuscation and anti-reverse engineering – and was criticised for just scratching the surface of this admittedly huge subject. Well, it wasn’t the end of my story about reverse engineering and avoiding it. Here is just an outline for an upcoming Chapter 29 – I hope it is comprehensive enough this time <wink />:

Bot Fighting

Sometimes even a hare sets a trap for a wolf.

— L. Solovyov, “Disturber of the Peace, or Hodja Nasreddin in Bokhara” —

Landscape
- Why bot fighting is important for MOGs (refer to Vol. I)
  - NOT because of piracy (where the case can indeed be lost). “Bear in mind that software protection is not economically efficient” [Hacker Disassembling Uncovered by Kris Kaspersky, p.117]
  - We as MOG developers have obligations to protect non-cheating players(!)
  - “Don’t write abusable games“ non-argument. Intersection of Turing Test with games => empty set in practice (FWIW, chess and Go are abusable).
  - “It is hopeless” misperception -> actually, we just fell behind in the arms race (more on it below).
  - It applies ONLY to MOGs – where we DO have Authoritative Server. Helps a LOT (in particular – we can disable old versions; even with Apple Store, we can get away with keeping at-most-2-3-months-back).
- Actions
  - Ban
  - Red Flag (example actions on red flags: manual review, enable captcha, start running deterministic consistency tests, etc.)
  - General: DON’T REVEAL INFO (in particular, how you got him). => Ban Waves
- Legal
  - To ban, we need to have right to ban. Big Company example (cannot ban). Another Big Company example (can enforce). Refer to Legal Team. ToC. Burden of Proof (!).
  - TODO: [Carpenter], [Mansoor]
  - TODO: DMCA
  - [Hogle]
  - [Riddle]
- Most Popular Attacks
  - In 2001, it was enough to have checksums to cause confusion [Dodd]. Now we’re behind, but see below…
  - Our Adversary: guy armed with IDA Pro and debugger (IDA, OllyDbg, WinDbg, Cheat Engine).
    - We don’t care about AV guys, or government guys… (we still care about AV false positives <sad-face />)
  - Most popular attack vectors:
    - Protocols: openssl.dll (same applies to Win32 API TLS calls – TODO).
      - Any other DLL for that matter.
    - Data: Windows WM_GETTEXT.
    - Texture replacement (to make walls transparent, to make opponent more visible, etc.)
    - Code: IDA
      - For both “encrypted” and really-encrypted code: dumping+IDA or custom_loader+IDA
      - FLIRT(!)
    - Debugging+patching: disabling protection (naïve video on [mausy131])
    - Debugging: Backtracing stack (cannot really hide system calls -> finding who has called them)
    - Debugging: Data analysis: RTTI, VMT pointers [SabanalYason]
    - Protocol-level: Self-MITM
    - Hooks (see, for example, [Aiko])
What do we want to achieve
- Security-by-obscurity
  - “Lost cause” argument (“Then finally, there is that question of code privacy. This is a lost cause.”).
    - Funny: both academic and hacker communities sing it in unison. Not funny: companies using it to say “hey, technically it doesn’t work, so the only way to handle it is legal -> pay us” (ref to GDC talk).
  - As for “protecting IP”, the cause might indeed be lost, but with MOGs and updates in mind – it can be recovered.
    - Weak point of the “given time, everything can be broken” argument: “given time”… B&M example: given time, any lock can be broken (but limiting available time to “until the police come” does help quite a bit).
  - Strategic Goals
    - Increasing attack costs
      - Every bit counts! (running faster than bear, cumulative effects; refer to Vol. I)
    - Multi-layer helps (multiple skill sets, discourage).
    - Economy of hack: aiming costs on our side to become << costs on the hacker’s side
      - Changing binaries cheap-and-often – attacking “given time” clause
    - Complicate collaboration between different hackers/teams
      - Again, changing binaries cheap-and-often
    - Overall – having to break 1000 different obfuscations every 2 weeks is an insurmountable task. In theory, tools can be developed to deal with it – but as of now, such tools don’t exist (not even close); when it happens – another round of the arms race. => We’re back in the game!
    - NOT being identified as malicious etc. (real: Sony scandal, perception: “false positive” by some antivirus)
      - It is heavy system-level stuff which tends to cause trouble. As long as we’re staying within “pure logic” – we’re perfectly fine.
      - DON’T blindly copy malware-originated techniques (most egregious example is trying to kill AV process)
Protection Philosophy
- Have to use techniques typical both for malicious programs (anti-reverse-engineering, polymorphism) and anti-malicious (antivirus-like: signature detection, suspicious hook detection, rootkit detection etc.) ones
  - Beware of AV false positives
- Weakest-link principle.
  - Scripting languages on the Client-Side <sad-face />.
    - DON’T confuse with a non-standard byte-code (which can easily be a protection).
- Tactical Goals
  - Difficult to reverse-engineer and disable
    - No single-point-of-attack
  - Both code protection and data protection
  - Keeping our source code readable
    - ABSOLUTELY NOT “Obfuscated C Contest”
    - It is binary which needs to be obfuscated, not source.
- Principles
  - DON’T reveal info (beyond absolutely necessary)
    - DON’T enable introspection
  - DO produce Monolith
    - DON’T help to split it (not using DLLs follows from it, as do inlining and force-inlining)
  - Protection MUST spread over our executable. One single point of protection doesn’t fly (too easy to find and disable). IsDebuggerPresent() API example.
  - Frequent completely automated code changes (to “change often and cheap”)
  - Kinda-polymorphic executables
    - “Compile-time polymorphism”. Very different executables even with zero source changes.
    - Goals: anti-FLIRT, knowledge from hacking previous version thrown away, much more difficult collaboration between hacking teams.
    - Danger of differential analysis (mostly theoretical now)
    - Real (runtime) polymorphism: unclear (where modifications should come from? If from binary-level algorithmic changes – it is ok, but effort/reward is unclear. If from a pre-existing library of equivalent pieces of code – has potential to reduce protection – however, if Server-driven with real encryption and key coming from Server – it might fly).
  - Polymorphic data (~=”the same data looks differently on each run”)
    - At least in theory, can extend to “looks differently after each access” (see more on obfuscating classes below).
  - 3^rd-party protection
    - in theory – any of the below, in practice, most of the time – merely code “encryption”
      - actually – code scrambling, and with single-point-of-attack
      - Also – rise in entropy can lead to false positives by AV [Balci]
    - con: juicier target (unless exclusive for you). “class breaks” [Schneier], aggravated with “as it is security by obscurity, attack is possible by definition”. Examples: ScyllaHide/TitanHide (see below).
    - con: usually, source-level information is not used.
    - Bot fighting is one case when DIY obviously rulezzz. “Do not employ ready-to-use retail protection packages” [Hacker Disassembling Uncovered by Kris Kaspersky, p.117]
    - Under “every bit counts” principle, 3^rd-party stuff might still be good to use as a complement to DIY source-level obfuscation, provided that we (a) understand what they’re doing, and (b) use those systems which indeed provide additional value. Examples:
      - system-level protection (being-debugged etc., see below)
        - integrity detection/protection
      - binary-level obfuscation (including “encryption”)
    - Special Case: VMProtect (there are also sources claiming Themida is doing the same thing – but it is unclear whether it really does TODO: check)
      - Usual binary-level obfuscation is still inferior to the source-level one (less information available)
      - However, virtualizing whole CPU is Damn Interesting™ (and hated by hackers too [Antivirus Hacker’s Handbook]). Expected to be especially efficient if we combine it with our source-level obfuscation discussed below (different obfuscation – code vs data, different information used to obfuscate). Chances of being flagged by false-positive AV: unclear at the moment.
      - As such, unlikely to work for emscripten – but in general, the “own VM with its own randomized instructions” approach should be generalizable there.
DIY Protection
- Where does a wise man hide a leaf? In the forest.
  But what does he do if there is no forest? He grows a forest to hide it in.
  — G.K. Chesterton, The Innocence of Father Brown —
- Don’t feed the hacker (MUST do before everything else)
  - DON’T release Linux/x86 or Linux/x64 version (where the best toolset for hacking exists, and which is usually not interesting monetization-wise). I love Linux but objectively – running there increases chances for being attacked <sad-face />.
  - No DLLs (and no exported symbols)
  - Anti-FLIRT: Limit the use of standard libraries as much as possible
    - At the very least – recompile them with your own compiler settings; much better – run it through src-2-src compiler (see below) ensuring compile-time polymorphism.
    - Template libraries are usually ok – but it is still better to add force-inlining here and there.
  - No Sensitive Text as Windows Controls
    - Not even DrawText etc.
      - => owner-draw everything (and not only on Windows)
    - Encrypt your traffic (otherwise – self-MITM attack)
    - Minimize OS calls (can’t be really hidden => give away information). What do we really need? Usually – only (a) inputs, (b) graphics, (c) network, (d) file reads/writes, that’s it(!). Anything else is just feeding the hacker (and now check dependencies of your .EXE).
- Bot/modification Detection
  - What do we need to deal with? If grinding bots -> may want to use Captcha
  - Client-Side detection
    - System-level, see below
    - Honey pots, see below
- Server-Side Statistics detection (TODO: elaborate!)
- Player Complaints -> tools 4 security team
- System-level protection
  - Using some-information-which-is-external-to-code (mostly system calls)
  - For our purposes – Windows & emscripten (very few system-level protections applicable for emscripten <sad-face />)
  - Time-Based Protections (delays) – one of the very few somewhat-cross-platform protections
    - Aims:
      - Detecting debugger; nonblocking_code<>, using obfuscation techniques for (end-start)>MAX_DELAY. Account for context switches! For RDTSC – also check for normally-impossible (end-start)==0.
      - potential to detect VMs
    - Tricks: RDTSC, QueryPerformanceCounter(), GetTickCount(), the latter also via SharedUserData (0x7FFE0000); coherency between the two (otherwise – kernel debugger or Scylla present). TODO: timeGetTime()
      - measure relative speeds “normally” and around exceptions [Falliere]
      - RDTSC: combine with function integrity checks.
      - Obfuscating RDTSC [pedram] – really weird and risky, but is still interesting…
      - emscripten_get_now(), EM_ASM and calling JS performance.now or Date().getTime() directly. NB: danger of JS being interrupted when hidden – and iOS seems to exhibit this behavior; on desktops it does seem to work now, but see [Antony@StackOverflow]
    - Avoid direct checks – just use result for obfuscator instead… (such as adding/xoring (end-start)/MAX_DELAY, varying MAX_DELAY a bit to add variability).
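A minimal sketch of the "avoid direct checks" idea above, using portable `std::chrono` instead of RDTSC so that it compiles anywhere. The names `timing_taint`, `deobf` and the 50 ms value of `MAX_DELAY_NS` are assumptions made for illustration, not part of any real protection. The point is only that there is no `if` to patch: the quotient (end-start)/MAX_DELAY is 0 in a normal run and silently corrupts the data otherwise.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical budget for the guarded region; it has to be generous enough
// to survive ordinary context switches (50 ms is an arbitrary assumption).
constexpr std::uint64_t MAX_DELAY_NS = 50'000'000;

// Runs the guarded code and returns (end-start)/MAX_DELAY:
// 0 when the region ran within budget, non-zero when it was too slow.
template <class F>
std::uint64_t timing_taint(F&& guarded) {
    const auto start = std::chrono::steady_clock::now();
    guarded();
    const auto end = std::chrono::steady_clock::now();
    const auto ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
    return static_cast<std::uint64_t>(ns) / MAX_DELAY_NS;
}

// Toy de-obfuscation step: the taint is added to the key instead of being
// compared, so the value is recovered correctly only when taint == 0.
inline std::uint32_t deobf(std::uint32_t obfuscated, std::uint32_t key,
                           std::uint64_t taint) {
    return obfuscated ^ (key + static_cast<std::uint32_t>(taint));
}
```

In a real Client the taint would feed generated obfuscation code rather than a single XOR, and MAX_DELAY would vary a bit between builds.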
  - Executable integrity (checksum etc.)
    - Function-level checksum [Kulchytskyy] TODO: calculation of checksums
    - Avoid direct checks – mix into obfuscation instead…
    - Multi-level checksums (Skype, [BiondiDesclaux])
    - TODO: emscripten
  - Debugger Detection/Prevention
    - [Ferrie], some source in [LordNoteworthy@github]. Most common/interesting ones:
    - IsDebuggerPresent(), CheckRemoteDebuggerPresent() etc. (quite silly, mostly as a kinda-decoy)
      - OS calls are not 100% obfuscatable => using them (unless they’re actually inlines or macros) is a Bad Idea™ (Bad Example: [zer0fl4g@github]). IF using them – obfuscate system calls and literals (such as obfuscating “OllyDbg” for FindWindow(), and obfuscating “FindWindow” for GetProcAddress()); more on obfuscating system calls below
    - Not-so-obvious system calls, such as OpenProcess(“csrss.exe”), OutputDebugString(), UnhandledExceptionFilter()
    - FindWindow() (silly, but…)
    - Memory reads. NtGlobalFlag, heap flags, KdDebuggerEnabled, GetLastError() (cmp fs:[ebp+34h], ebp, cmp gs:[rbp+68h], ebp), TODO – anything else? Reading from RAM without function call(!).
      - DON’T use directly for comparisons; instead – use as a part of data obfuscation (in particular, will look similar to ‘global read of known value’ used to prevent optimizing out). Effective partial compares when using for data obfuscation (using &mask1 in one place, |mask2 in another place).
      - More devious: use the value to generate decryption key, then try to decrypt several pieces of code (with one decrypted by “correct” key, and another by “being-debugged” key, other combinations of “being-debugged” flags also can be accounted for). Then use this code to communicate to the server – which now can distinguish clients which are being debugged (gotcha!).
      - Even more devious: use the value to generate encryption key, which is used to encrypt a well-defined constant, which is sent to the server – which then can try different keys to decrypt (gotcha!)
    - “self-debug” (actually – debug a copy of the process). Only one ring 3 debugger allowed at least in Windows.
    - Hiding thread from debugger: NtSetInformationThread, NtCreateThreadEx (reportedly used by Steam at least at some point)
    - MOV SS
    - INT 2D
    - “check within TLS callback” trick
    - NB: using Zw* counterparts [TODO – elaborate]
    - Messing with debuggers:
      - BlockInput; not really detection, but…
      - REP <some-op>
    - [Kulchytskyy] Most interesting techniques (beyond [Ferrie])
      - NtCreateThreadEx to hide threads
      - Asm to set SEH handlers (32-bit only); on table-based SEH in x64 Windows – see [NTInsider]
      - KiUserExceptionDispatcher
    - [Falliere]. Techniques going beyond previous refs:
      - PUSH SS/POP SS (actually, it is described in [Ferrie], but IMO explanation here is better)
      - ICE breakpoint (0xF1); not to be confused with SoftICE.
      - Scanning for INT 3 (0xCC). False positives. Also should scan for 0xFA [Falliere] and probably others. Checksums are generally preferred.
    - [OpenRCE] Techniques going beyond previous refs:
      - LOCK CMPXCHG8B as an invalid instruction to raise SEH
      - Lots of debugger-specific trickery
    - [Tully]. Techniques going beyond previous refs:
      - Removing PE Header
      - Messing with debuggers:
        - OutputDebugString Exploit for OllyDbg (TODO: is it still up-to-date?)
    - SoftICE detection (doesn’t make much sense now, esp. if your program is 64-bit, but some ideas might be applicable to other debuggers): lots of discussion in [Crackproof your software]
    - – Maybe there’s still hope?
      – Nope
      — Garfield the Cat —
    - ScyllaHide [nihilus@github], TitanHide [mrexodia@github]: very good examples of “class break”. Still MIGHT want to do anti-debug but don’t overplay it (and more importantly – don’t overrely on it). Candidates to bypass: reading TickCount directly from RAM, RDTSC [[TODO: something else?]]
  - Obfuscating system calls:
    - Not really possible, but we should still try as much as we can (beware of triggering AVs though).
      - AV false positives: selective obfuscation (such as Internet/sockets), see on false positives below
    - LoadLibrary()+GetProcAddress() with all the literals being obfuscated(!).
    - Ideally – should obfuscate everything (except for LoadLibrary() and GetProcAddress()); at the very least – obfuscate all the socket calls (traditionally, they’re one of the very primary targets for MOTs) TODO: analysis for using MSVCRT’s socket wrappers over WS2_32.
    - See [Wardman] – still imperfect, but not that obvious (especially if combined with obfuscating literals etc.)…
    - Using custom DLL loader (such as [pasztorpisti@codeproject]) for system DLLs – unlikely <sad-face /> (still, might be worth trying). Any ideas on improving the loader are very appreciated
    - emscripten: EM_ASM(eval()), self-modifying code [Elliot]
  - Avoiding emulators – see Chapter 8 in [Antivirus Hacker’s Handbook]
    - the main idea is to run in a sandbox and see any changes in the environment. [F-Secure]: “When execution stops, the sandbox is analysed for changes.”. MAY be misused by attackers…
    - core count [Balci] – questionable (can cause false positives)
    - not that big deal (we are NOT malicious, so the sandbox won’t change(!))
  - “encryption” (actually, scrambling):
    - probably the best one: HARES [Torrey]. No relation to ‘IT Hare’ ;-).
      - Hypervisor (thin, but still <ouch! />)
      - TLB split (clever but x64-only…)
      - Still single-point-of-attack
    - Guard pages [Tully], simple on-demand code descrambling.
    - “Stolen bytes” [Tully] – not much effect, really.
  - DLL injection detection/prevention.
    - Process debugs itself (actually – a copy of itself)
    - TLS hook
    - Thread count <smile />.
    - TODO: dealing with “reflective DLL injection”
  - VM detection, Sandbox detection, WINE detection
    - [Cannell]
    - [Wójcik]
    - drivers
  - Hacking PE files
    - SizeOfImage [Tully]
    - messing with PE sections [MachinesCanThink]
    - Several things for Linux [Baines]
    - Beware: obvious and highly suspicious for AVs
  - Known-Bad-for-Us Processes
    - Legality claims – IMO unlikely, but check with legal
    - Antivirus-like analysis/techniques
  - Implementation: In-process -> separate process (bad) -> service -> driver (questionable) -> rootkit (Big No-No, refer to “Sony rootkit scandal”)
    - Usually – Single Point of Attack
    - Obfuscate Communications (see below)
- Data obfuscation basics
  - Only theory for the time being (practicalities will be discussed later)
  - Obfuscation: DEOBF(OBF(X)) === X
  - Obfuscation Generators (!) – same as “Pocket Generators” from [Hare.1] – [Hare.3]
    - Automatically generated from true random number on each build(!!)
    - Generalization of “White-Box Encryption” [Whiteboxcrypto]; cover the spectrum from XOR to “White-Box Encryption”
      - Advantages of Generators: (a) can be made much more lightweight and inlinable than “White-Box Encryption” (!), (b) allows for MUCH wider variety of algorithms (actually any bijection will do), (c) increase space for potential algos multiple-fold, (d) => cannot be found via techniques such as [XuEtAl], [LestringantEtAl], and [Calvet]
        - In many cases can avoid “crypto loop” entirely (!).
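To make DEOBF(OBF(X)) === X concrete, here is a sketch of what one generated pair might look like: a short chain of bijections on 32-bit values (XOR, ADD mod 2^32, rotation, multiplication by an odd constant), undone in reverse order. All constants and the names `obf`, `deobf`, `inv_odd` are arbitrary illustrations; a real generator would pick both the constants and the chain itself from true random data on each build.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for constants a build-time generator would emit.
constexpr std::uint32_t K_XOR = 0x9E3779B9u;
constexpr std::uint32_t K_ADD = 0x7F4A7C15u;
constexpr std::uint32_t K_MUL = 0x2545F491u;  // must be odd to be invertible
constexpr unsigned      K_ROT = 13;           // must be in 1..31

constexpr std::uint32_t rotl(std::uint32_t x, unsigned r) { return (x << r) | (x >> (32 - r)); }
constexpr std::uint32_t rotr(std::uint32_t x, unsigned r) { return (x >> r) | (x << (32 - r)); }

// Multiplicative inverse of an odd c mod 2^32 (Newton iteration).
// c*c == 1 mod 8, so x = c is correct to 3 bits; each step doubles that.
constexpr std::uint32_t inv_odd(std::uint32_t c) {
    std::uint32_t x = c;
    for (int i = 0; i < 5; ++i) x *= 2u - c * x;
    return x;
}

constexpr std::uint32_t obf(std::uint32_t x) {
    x ^= K_XOR;
    x += K_ADD;
    x = rotl(x, K_ROT);
    x *= K_MUL;
    return x;
}

// Each step of obf() undone in reverse order.
constexpr std::uint32_t deobf(std::uint32_t y) {
    y *= inv_odd(K_MUL);
    y = rotr(y, K_ROT);
    y -= K_ADD;
    y ^= K_XOR;
    return y;
}

static_assert(deobf(obf(0u)) == 0u, "round trip must be the identity");
static_assert(deobf(obf(0xDEADBEEFu)) == 0xDEADBEEFu, "round trip must be the identity");
```

Note that there is no loop and no key schedule here, so both directions inline into a handful of instructions.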
  - Primitives, and more primitives:
    - Bijections
    - Injections (increase size but are still reversible). No relation to “DLL injection”
    - Examples:
      - XOR, ADD mod 2^n
        - XOR MAY be undesirable due to false AV positives [RaabeBallenthin]
      - Bit-wise rotations
      - Bit-wise shift followed by XOR-with-original
      - Permutations (swapping bytes or even bits); AES permutations as one example.
      - *c mod 2^n where c is odd
      - Galois arithmetic (a.k.a. “finite field arithmetic”)
        - Both addition and multiplication are reversible
        - GF(p), where p>2^n. First operation effectively converts from 2^n space into p space, so it is injection – increases size! In exchange, reverse operation provides kinda-checksum (which can be used to detect strange things happening)
          - Where p is pseudo+Mersenne (speeds things up; see [IvanchykhinEtAl] for details)
        - GF(2^n) field. Bijections all the way. Multiplication (example: S-box; table-based reverse).
          - Two representations: ‘normal’ and ‘polynomial’. Not sure whether a reasonably fast conversion exists between the two <sad-face /> (I should get out of my depths somewhere <wink />)
        - GF(p^n) field, where p!=2 (example: GF(3^n))
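The GF(p) injection with its "kinda-checksum" can be sketched as follows, assuming n=16 and p=2^31-1 (a Mersenne prime, picked here only because it is well above 2^16). The constants `A_INV` and `B` and the function names are made up for illustration. A 16-bit value is mapped into a 31-bit space; on the way back, anything which does not land within 0..65535 indicates that something strange has happened.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

constexpr std::uint64_t P     = 2147483647ull;  // 2^31 - 1, prime
constexpr std::uint64_t A_INV = 1234567891ull;  // arbitrary non-zero value below P
constexpr std::uint64_t B     = 12345ull;       // arbitrary additive constant

// Modular exponentiation by squaring; operands stay below 2^62.
constexpr std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp) {
    std::uint64_t r = 1;
    base %= P;
    while (exp) {
        if (exp & 1) r = r * base % P;
        base = base * base % P;
        exp >>= 1;
    }
    return r;
}

// Fermat's little theorem: the inverse of A_INV is A_INV^(P-2) mod P.
constexpr std::uint64_t A = pow_mod(A_INV, P - 2);
static_assert(A * A_INV % P == 1, "A and A_INV must be inverses mod P");

// Injection: 16 bits in, a value in [0, P) out.
inline std::uint32_t obf(std::uint16_t x) {
    return static_cast<std::uint32_t>((A * x + B) % P);
}

// Reverse mapping; an empty result means the input was never produced by obf().
inline std::optional<std::uint16_t> deobf(std::uint32_t y) {
    if (y >= P) return std::nullopt;
    const std::uint64_t v = (y + P - B) % P * A_INV % P;
    if (v > 0xFFFFu) return std::nullopt;  // the "kinda-checksum" failed
    return static_cast<std::uint16_t>(v);
}
```

With these sizes only 2^16 out of roughly 2^31 possible values decode successfully, so a blindly modified value is very likely to be noticed. As with the time-based checks, the failure is better mixed into further obfuscation than tested directly.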
      - PRNGs:
        
        Mostly for streams (for ints etc. – see Feistel below).
        
        Salt!
        
        From LCG and LFSR (better to make sure to generate constants to ensure maximum period) to CPRNG. Can use one-way functions within.
        
        One-way: can use floating-point math for intra-executable obfuscation
        
        Determinism required: beware runtime floating-point settings (see [Dawson] for details). Relief: for intra-Client, we’re speaking about “same-executable determinism”
        
        Mix with integer ops
        
        Do use the-same-stuff-you’re-using-for-3D graphics (sin(), cos() – the same versions as you use for 3D!)
        
        SSE
        
        LFSR and pseudo-block-cipher-in-CTR-mode – O(1) calculation of arbitrary point in stream.
        
        Mixing function: not only XOR (actually – most of obfuscation primitives)
        
        Approximate floating-point calculations
        
        “Smooth” functions (actually – limited 1^st derivative). Example – sin(). Avoid too-low or too-high 1^st derivative values (good example – (-1,1) range for sin()). Potentially – uneven distribution of inputs to compensate.
        
        Sin() – similarities to CORDIC [Bertrand]
        
        Function-specific obfuscation (example for sin(): add alpha on OBF, add 2*pi-alpha on DEOBF; example2 for sin(): add alpha on OBF, add pi-alpha, and then negate on DEOBF).
        
        Compound (examples: sin(sh(x)), sin(sin(x)))
        
        Using different libs for OBF/DEOBF (such as your-usual-library on OBF, Taylor approximation-or- on DEOBF).
        
        Approximations: Taylor, Chebyshev, polynomial [Chou], [zfedoran@stackoverflow]
        
        Specific sin() approximations: Bhaskara I, CORDIC, integer [porgarmingduod@stackoverflow]
        
        Conversion into approximated 3D space, using reversible operations in that space such as movements, rotations etc. (rotations, in turn, will use matrices&quaternions). Fun stuff, but it is difficult to control reversibility when multiple approximate operations are involved <sad-face />.
        
        Crypto-primitives:
        
        Feistel round: allows to use one-way functions (somewhat-similar to PRNG, but more suitable for fixed-size data).
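A sketch of why a Feistel construction is attractive here: the round function never has to be inverted, so anything goes (including the float-based one-way functions mentioned above). The round function `round_f` and the four round keys below are arbitrary illustrations, and this is an obfuscation primitive, not a cipher with any claimed strength.

```cpp
#include <cassert>
#include <cstdint>

// Any function works as the round function, invertible or not.
inline std::uint16_t round_f(std::uint16_t half, std::uint16_t key) {
    const std::uint32_t t = (static_cast<std::uint32_t>(half) + key) * 0x9E37u;
    return static_cast<std::uint16_t>(t ^ (t >> 11));
}

// Stand-ins for per-build generated round constants.
constexpr std::uint16_t KEYS[4] = {0x1F3D, 0xA2C4, 0x5B77, 0xE019};

// Forward direction: each round maps (l, r) to (r, l ^ F(r, k)).
inline std::uint32_t feistel_obf(std::uint32_t x) {
    std::uint16_t l = static_cast<std::uint16_t>(x >> 16);
    std::uint16_t r = static_cast<std::uint16_t>(x & 0xFFFFu);
    for (int i = 0; i < 4; ++i) {
        const std::uint16_t t = static_cast<std::uint16_t>(l ^ round_f(r, KEYS[i]));
        l = r;
        r = t;
    }
    return (static_cast<std::uint32_t>(l) << 16) | r;
}

// Reverse direction: same round function, keys applied in reverse order.
inline std::uint32_t feistel_deobf(std::uint32_t y) {
    std::uint16_t l = static_cast<std::uint16_t>(y >> 16);
    std::uint16_t r = static_cast<std::uint16_t>(y & 0xFFFFu);
    for (int i = 3; i >= 0; --i) {
        const std::uint16_t t = static_cast<std::uint16_t>(r ^ round_f(l, KEYS[i]));
        r = l;
        l = t;
    }
    return (static_cast<std::uint32_t>(l) << 16) | r;
}
```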
        
        S-Box (actually – a specific case of GF(2^n))
        
        RSA-like encryption+decryption (we can use MUCH smaller sizes, like 64-bit or even 16-bit <wink />)
        
        Discrete logarithm (like using half of Diffie-Hellman key exchange with g^a mod p being pre-shared). Can use MUCH smaller sizes, like 64-bit
        
        TODO: more
        
        Any crypto (including broken crypto such as TEA and reduced-rounds crypto)
        
        LOTS of other stuff should exist too <smile />
      - Making reverse code difficult-to-find
        
        Using the same constant in several unrelated places. Using system-wide constants (such as 0x7FFE0000) instead of random ones liberally.
        
        Trivial example: composite additions/multiplications (beware of smart compiler optimizing it out! – see below)
        
        Less-trivial-example: ADD constant as 3 smaller ADDs combined via bitshifts etc. (if elaborated enough – very little risk of compiler optimizing it out). Generalization – any multiple-precision calc from Knuth Vol. 2 (and not necessarily along word lines(!)).
        
        Relying on trivial math such as ~x === -x-1, x >> 4 === x / 16 (for unsigned x), x / 5 === x * 0xCCCCCCCCCCCCCCCD mod 2^64 (as long as x is uint64_t and a multiple of 5; for arbitrary x it is the upper 64 bits of the 128-bit product, shifted right by 2) and so on. Beware: modern compilers usually know it too and optimize the less-optimal one (but if we throw in volatile or external var – things will change).
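A few such identities spelled out, as a sketch. The constant 0xCCCCCCCCCCCCCCCD is both the inverse of 5 mod 2^64 (which gives exact division for multiples of 5) and (2^66+1)/5 (which gives division of any uint64_t via the 128-bit product shifted right by 66). Function names are made up; `unsigned __int128` is a GCC/Clang extension and is an assumption about the toolchain. The constant is read through a volatile so that the compiler cannot fold the expression back into a plain division.

```cpp
#include <cassert>
#include <cstdint>

// volatile: the value has to be loaded at run time, so it cannot be folded.
volatile std::uint64_t g_magic = 0xCCCCCCCCCCCCCCCDull;

// ~x written as -x-1 (two's complement identity).
inline std::uint64_t not_via_neg(std::uint64_t x) { return 0 - x - 1; }

// x / 16 written as a shift (valid for unsigned x).
inline std::uint64_t div16(std::uint64_t x) { return x >> 4; }

// Exact division by 5: correct only when x is a multiple of 5.
inline std::uint64_t div5_exact(std::uint64_t x) { return x * g_magic; }

// Division by 5 for any x: take the 128-bit product and shift right by 66.
inline std::uint64_t div5(std::uint64_t x) {
    const unsigned __int128 wide = static_cast<unsigned __int128>(x) * g_magic;
    return static_cast<std::uint64_t>(wide >> 66);
}
```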
        
        Using floats to calculate integer stuff (asm.js-style)
        
        Integer SSE
        
        Using alternative representations:
        
        Built-in types or boost::multiprecision (beware: the latter can happen to be sssslllooowww compared to math-based alternatives such as Montgomery or pseudo+Mersenne)
        
        Montgomery representation for GF(p). (both modular addition and multiplication are ok in Montgomery)
        
        Modular representation. Use Chinese remainder theorem to produce “modular representation” as a pair of remainders given co-primes A and B, where A*B>2^n (injection!). Reverse conversion: use extended Euclidean algorithm to get Bezout coefficients and restore the value back. Addition/multiplication in “modular representation” are trivial (see Knuth Vol. 2 on Modular Arithmetic).
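The modular representation described above can be sketched like this, assuming n=16 and the co-primes A=257, B=263 (so A*B=67591>2^16). The struct and function names are illustrative only. Addition and multiplication work residue-by-residue, and the value is restored from the Bezout coefficients; results of arithmetic are meaningful as long as the true result stays below A*B.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t A = 257, B = 263;  // co-prime, A*B = 67591 > 2^16

struct Residues { std::int64_t a, b; };   // (x mod A, x mod B)

inline Residues to_mod(std::uint32_t x) { return {x % A, x % B}; }

inline Residues add(Residues p, Residues q) { return {(p.a + q.a) % A, (p.b + q.b) % B}; }
inline Residues mul(Residues p, Residues q) { return {(p.a * q.a) % A, (p.b * q.b) % B}; }

// Extended Euclid: returns gcd(x, y) and sets u, v so that u*x + v*y == gcd.
inline std::int64_t ext_gcd(std::int64_t x, std::int64_t y,
                            std::int64_t& u, std::int64_t& v) {
    if (y == 0) { u = 1; v = 0; return x; }
    std::int64_t u1 = 0, v1 = 0;
    const std::int64_t g = ext_gcd(y, x % y, u1, v1);
    u = v1;
    v = u1 - (x / y) * v1;
    return g;
}

// Chinese remainder theorem: v*B is 1 mod A and 0 mod B, u*A is the opposite.
inline std::uint32_t from_mod(Residues r) {
    std::int64_t u = 0, v = 0;
    ext_gcd(A, B, u, v);  // u*A + v*B == 1
    const std::int64_t m = A * B;
    const std::int64_t x = (r.a * v * B + r.b * u * A) % m;
    return static_cast<std::uint32_t>((x + m) % m);  // fix up a negative remainder
}
```

In generated code the Bezout coefficients would of course be precalculated constants rather than recomputed on each call.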
        
        For GF(p), GF(p^n) – using redundant “partially normalized” representations [IvanchykhinEtAl] – especially good for accumulations.
        
        For GF(p) where p is pseudo+Mersenne – using special properties of pseudo+Mersenne
        
        Different ways to calculate exponent: linear, by squaring, using Montgomery.
        
        All constants can be further obfuscated too (beware of compilers optimizing obfuscation out(!))
        
        Universal difficult-to-reverse over 8 bits or so: some math one way, table another way.
        
        multithreaded obfuscation (still deterministic(!))
        
        busy loops (don’t forget about memory barriers(!) – or just wait long enough <wink />)
        
        nice interaction with avoiding optimizing out – volatile
        
        end not by flag, but by some checksum within result being valid (128 bit of a good checksum is enough)
        
        not so busy loops (sleep(), SwitchToThread(), etc.)
        
        Win32 Events, C++ cond_vars
        
        Even more other stuff should exist too <smile />
      - Generated obfuscation should NOT be a part of the regular source code tree(!) (important to avoid leaks-of-everything-at-once)
      - Mixing in external data (such as “being debugged” in memory, which is supposed to be always-0) – plays well with external vars <smile />
      - for non-MOGs: in theory, can be used to identify leaks (generated stuff is a “fingerprint” for the Client)
- Obfuscating at source code level
  - Difference between programming languages
    - Table from Vol. II’s chapter on Client-Side Architecture. Obfuscation-wise, C++ rulezzzz.
    - We’ll discuss C++, though some of the findings will apply across the board
  - C++ source-level protection (this is what we’ll actually speak about <wink />)
    - Applicable to any C++, including emscripten (a bit less protected, but still…)
    - JS/Java-style obfuscation is useless for C/C++
      - Except for class names in RTTI, but we should disable RTTI anyway
    - Remove Dbg info
    - Compiler options (high optimization levels, maybe unroll loops)
    - On avoiding to optimize out (for some of obfuscations, especially literal obfuscations, to work):
      - volatile (see [Epp@stackoverflow], [Keil], and [Regehr]); seems to apply to Java but not to other languages
      - ‘xor with a global non-static var which happens to be always-zero’ (also seems to apply to Java/JS/…).
      - system call with a predefined result (like reading some bytes from your own file which never changes); also applies to Java/JS/…
      - | (x+1==x) and other impossible math (not really reliable in the long run, especially for obfuscating literals)
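A sketch of the first two tricks, applied to hiding the literal 42. The names are hypothetical. Note the mix of ADD and XOR: with two XORs, the compiler would be free to reassociate (C1 ^ z) ^ C2 into (C1 ^ C2) ^ z, putting the plain 42 right back into the binary. The non-volatile variant relies on the compiler not being able to prove that nobody modifies the global, which is an assumption that whole-program optimization might break.

```cpp
#include <cassert>
#include <cstdint>

// Non-static, non-const global which "happens to be" always zero.
std::uint32_t g_always_zero = 0;

// volatile variant: every read has to be performed at run time.
volatile std::uint32_t g_volatile_zero = 0;

constexpr std::uint32_t C2 = 0x5A5A5A5Au;  // arbitrary mask
constexpr std::uint32_t C1 = 42u ^ C2;     // only this masked value is stored

// Evaluates to 42 as long as the global is zero; ADD followed by XOR
// cannot be folded without knowing the value of the global.
inline std::uint32_t literal_42() {
    return (C1 + g_always_zero) ^ C2;
}

inline std::uint32_t literal_42_volatile() {
    return (C1 + g_volatile_zero) ^ C2;
}
```

The always-zero value can just as well be something external, such as a "being debugged" flag, which turns the same expression into an implicit check.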
      - Asm:
        
        Empty asm [jrmymllr]
        
        Generated asm
        
        “normal” GCC-style-with-constraints asm with intermediate registers etc.
        
        DO allow compiler to choose registers <evil-grin />
        
        does it work for Clang/Win?
        
        “Native code permutations” [Tully]
        
        Weird asm, such as “stack machine working in reverse polish notation”.
        
        Metamorphism [Strehovsky] can be used at this level.
        
        NB: using exotic instructions (such as x86 BCD operations: DAS/AAS/…, which are not even valid in 64-bit mode) is controversial (as they’re not used anywhere else – they shout “I’m obfuscated”, so probably better to avoid). _rotl etc. is less exotic, but its benefits for obfuscation are still not really obvious.
        
        Push ss/pop ss (see above)
        
        REP <some op> (see above, via asm __emit)
        
        compiler-dependent: GCC attributes+pragmas (TODO, NOT recommended)
    - Obfuscating Data
      - On importance of obfuscating data
        
        Example: iterating 1-100 directly, and in A*C+B mod 2^32 space
        
        Disrupt obvious relations; “white noise” data
        
        Make information “hey, 1^st byte in message is its type” useless.
      - Obfuscating variables of basic types:
        
        Integers, strings, pointers(!)
        
        obf<> template, OBF() macro (the latter – ONLY for strictly-intra-process stuff).
        
        Handling ints without size spec: [kennytm@stackoverflow]
        
        “Honey pots”
        
        Can be done in templates ([NevesAraujo], but use our much more generic obfuscation generators)
        
        Still prefer codegen: add-only
        
        Further improvement: obfuscate<type, performance_effect, obfuscate_ID>::read()<instance_ID>; instance_ID causes equivalent-but-different-at-asm-level reads.
        
        Representations enabling “shortcut” operations (without reverting back)
        
        Examples: A*X+B mod 2^32, variable-radix no ‘*’, non-standard-radix
        
        Examples for obfuscating timing protection such as (end-start)>MAX_DELAY || (end-start)==0.
        
        Depending on the nature of operations
        
        Implementation is usually more difficult – need to know operations for specific instance (Clang).
        
        Example: ++ by adding an odd constant mod 2^n, == by direct comparison with pre-calculated constant. Barely readable at almost-zero cost.
        
        Even less readable: ++ as multiplication in GF(3^n), == by direct comparison with pre-calculated constant. No chance in hell to find out in advance when the loop will stop.
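The "iterating 1-100 in A*X+B mod 2^32 space" example can be sketched as below. The loop counter lives only in obfuscated form: ++ becomes adding the odd constant A, and the end condition is a comparison with a precalculated constant. The constants and helper names are arbitrary illustrations, and a sufficiently smart optimizer may still simplify such a loop, so in practice this would be combined with the avoid-optimizing-out tricks discussed above.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t A = 0x6C8E9CF5u;  // must be odd
constexpr std::uint32_t B = 0x3C6EF372u;

// Inverse of an odd c mod 2^32 (Newton iteration, bits double each step).
constexpr std::uint32_t inv_odd(std::uint32_t c) {
    std::uint32_t x = c;
    for (int i = 0; i < 5; ++i) x *= 2u - c * x;
    return x;
}
constexpr std::uint32_t A_INV = inv_odd(A);
static_assert(A * A_INV == 1u, "A_INV must invert A mod 2^32");

constexpr std::uint32_t enc(std::uint32_t i) { return A * i + B; }
constexpr std::uint32_t dec(std::uint32_t c) { return (c - B) * A_INV; }

// Sums 1..100 without ever holding the plain counter in the loop variable.
inline std::uint32_t sum_1_to_100() {
    constexpr std::uint32_t first = enc(1);
    constexpr std::uint32_t last  = enc(101);  // precalculated end marker
    std::uint32_t sum = 0;
    for (std::uint32_t c = first; c != last; c += A)  // "++" is "+= A"
        sum += dec(c);  // decode only where the real value is needed
    return sum;
}
```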
    - Replace RTTI => virtual functions, disable RTTI in options
    - Obfuscating heap: ASLR (if you do your own (Re)Actor allocator – randomize yourself; keeping determinism in this case)
    - Obfuscating call stack
      - Trampolines (elaborated ones such as those in [Newger])
        
        No exceptions over trampolines
        
        Using nanomites (INT3-based jumps [Tully]) for function calls
      - keeping stacks as shallow as possible
        
        (Re)Actors
        
        More (Re)Actors; Logic (Re)Actor->Protocol-Level Obfuscator (Re)Actor ->TLS (Re)Actor ->post-TLS obfuscation (Re)Actor ->Socket (Re)Actor. Good luck tracing it under time constraints (especially when obfuscated queues are involved – see below).
        
        Using obfuscated<int,1234> etc. as call parameters – helps quite a bit.
    - Obfuscate objects(!)
      - TODO: describe
      - Including obfuscation of VMT pointer (preventing related attacks from [SabanalYason])
      - If PRNG-based – salt (to avoid identical VMT pointers being identical after OBF); pointer as salt (implies “have to reobf on move”).
      - Can avoid salt if using block-cipher-like constructs at least for first 16 (better 32) bytes.
      - Partial deobf for field access (using LFSR O(1) or equivalent). LCG (pre-calculated coeffs(!))
      - For a good measure: make sure to throw in a different OBF of p itself.
      - Whether it is worth the trouble – depends, but is a valid technique.
    - Randomized data obfuscation (different representations of the same data on each run)
      - sync between ends (both for values and for messages)
      - encoded into message
    - Hiding leaf in the forest
      - Fake VMT pointers. TODO: best-way-to-copy
      - Fake RTTI (simulating [SabanalYason]).
      - Using generated asm with registers such as ecx/eax widely.
  - Obfuscating code
    - Obfuscating literals: integers, pointers(!), strings(!!)
      - Have to do much better than [Haephrati] (key-based, single enc/dec functions easy to locate and disable)
      - Codegen: TODO: check whether it is possible to avoid modifying codegen for string literals
      - Beware: literals have MUCH higher risk of being optimized out than variables. In general, have to add some non-guaranteed-to-be-constant stuff to be sure…
      - Kinda-steganography (lowering entropy, but not only)
        
        hiding within existing images, example: [xifeng27@github]
        
        hiding within existing text: [Ubuntu], DON’T use “as is” (even with password)
        
        adding pointless html (<div> etc.). NOT “CSS-based hiding” such as invisible text, navy-on-black, etc.!
        
        hiding within generated text (redundancy, injection). [Salomon]
        
        Funny stuff: pulling silly text from a facebook etc. page. Not really usable “as is” <sad-face />
        
        weird: code redundancy, example: [El-Khalil]
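For string literals, a bare-bones template-only sketch might look as follows (C++14 or later). All the names (ObfLiteral, make_obf, OBF_FORCEINLINE, kFieldName) and the per-byte key function are assumptions made for illustration, and a per-byte XOR is about as weak as it gets. What the sketch does show is that each literal gets its own Seed and its own force-inlined decoding, so there is no single enc/dec function to locate and disable.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

#if defined(_MSC_VER)
#  define OBF_FORCEINLINE __forceinline
#elif defined(__GNUC__)
#  define OBF_FORCEINLINE inline __attribute__((always_inline))
#else
#  define OBF_FORCEINLINE inline
#endif

template <std::size_t N, std::uint32_t Seed>
class ObfLiteral {
    unsigned char data_[N];  // holds only the scrambled bytes

    // Hypothetical per-position key; a real generator would randomize this.
    static constexpr unsigned char key(std::size_t i) {
        std::uint32_t x = Seed + static_cast<std::uint32_t>(i) * 0x9E3779B9u;
        x ^= x >> 15; x *= 0x85EBCA6Bu; x ^= x >> 13;
        return static_cast<unsigned char>(x);
    }

public:
    // Scrambling happens in a constexpr constructor, i.e. at compile time
    // whenever the object itself is declared constexpr.
    constexpr explicit ObfLiteral(const char (&s)[N]) : data_{} {
        for (std::size_t i = 0; i < N; ++i)
            data_[i] = static_cast<unsigned char>(
                static_cast<unsigned char>(s[i]) ^ key(i));
    }

    // Descrambling is force-inlined at every point of use.
    OBF_FORCEINLINE std::string reveal() const {
        std::string out;
        out.reserve(N - 1);
        for (std::size_t i = 0; i + 1 < N; ++i)  // skip the terminating '\0'
            out.push_back(static_cast<char>(data_[i] ^ key(i)));
        return out;
    }
};

// Helper so that N is deduced and only Seed has to be spelled out.
template <std::uint32_t Seed, std::size_t N>
constexpr ObfLiteral<N, Seed> make_obf(const char (&s)[N]) {
    return ObfLiteral<N, Seed>(s);
}

// constexpr forces compile-time scrambling of this particular literal.
constexpr auto kFieldName = make_obf<0xC0FFEEu>("player_hp");
```

Whether the plain-text literal really disappears from the binary depends on the compiler and its settings, so it has to be checked on the actual build. The warning above about literals being optimized out applies to this sketch too.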
    - Force-inlining
      - Prerequisite for serious obfuscation
      - Discussion on out-of-order optimizations => interleaving => “shaking” signatures etc.
        
        What if we have 50% of the code obfuscated? 80%?
        Proverbial “needle in a haystack”. Will it break currently-existing signature-based techniques completely? There exists an amount of white noise which makes signal extraction impossible.
      - Force-inline obfuscators!!
      - Force-inline (custom) allocations and constructors/destructors! (preventing search for constructors/destructors from [SabanalYason])
      - Side bonus: defeat “There are many code cross-references to a function” heuristic from [RaabeBallenthin]
    - Compile-time polymorphism
      - Achieved automagically by using LOTS of random force-inlined obfuscators
        
        Explain optimization mechanics.
        
        Other stuff (see below)
    - Changing default calling convention (__thiscall for MSVC) into __fastcall, __cdecl, or __stdcall. Helps against ECX-reliant techniques such as those in [SabanalYason]. I don’t know how to change it globally (/Gd, /Gr, and /Gz won’t work for member functions), but specifying __fastcall etc. for the function itself is supposed to work. RANDOMCALL macro modifier.
    - Kinda-VMs (example: P-Code from [Newger])
    - asm, such as [Lyashko]. NB: I’d rather have it added (and randomized!) by a Src-2-Src compiler.
  - Obfuscating messages
    - Intra-Client only for the time being
    - Special attention: MsgIDs
    - Obfuscating streams (can be made stateful).
    - Instance_ID still applies
  - Massive Source-Level Obfuscation
    - Looks as Declarative-ONLY at programmer level (=”no obfuscation code in app-level code, only ‘please-obfuscate’ declarations”)
    - Variables/literals/messages/… as described above
    - Literally hundreds of places in code
    - Automated obfuscator re-generation for each build
    - Benefits large programs the most (making this technique not that useful for malicious programs)
  - Obfuscating Code+Data (data via removing VMT pointers): templates+inlines(!).
    - Replacing virtual functions with a template. Two cases for dynamic dispatch: (a) real collections of polymorphic objects (real dynamic dispatch, cannot easily replace with a template); (b) type is known, virtual function used as abstraction (static dispatch is enough, can be replaced with template). Example: Reactor::react().
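Case (b) above can be sketched as follows. The class and function names (ReactorBase, LobbyReactor, and so on) are hypothetical; only react() comes from the outline.

```cpp
#include <type_traits>

// Before: virtual function used purely as an abstraction.
// Every object carries a VMT pointer, and every call goes through it.
struct ReactorBase {
    virtual ~ReactorBase() = default;
    virtual int react(int event) = 0;
};
struct LobbyReactorVirtual : ReactorBase {
    int react(int event) override { return event + 1; }
};
inline int run_virtual(ReactorBase& r, int event) { return r.react(event); }

// After: the concrete type is known at compile time, so a template is enough.
// No VMT pointer in the object, and react() becomes a candidate for inlining.
struct LobbyReactor {
    int react(int event) { return event + 1; }
};
template <class Reactor>
int run_static(Reactor& r, int event) { return r.react(event); }

static_assert(std::is_polymorphic<LobbyReactorVirtual>::value,
              "virtual version has a VMT pointer");
static_assert(!std::is_polymorphic<LobbyReactor>::value,
              "template version has no VMT pointer");
```

For case (a), where objects of different types really do sit in the same collection, this replacement is not available, and VMT pointer obfuscation is what remains.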
    - Obfuscating communication between different modules
      - IDL
      - Obfuscated queues (arbitrary switch to another queue <he-he />). Don’t forget to force-inline queue functions and to obfuscate pointers within(!!).
      - Obfuscation can be applied either to function calls or (even better) to messages (works very well with (Re)Actors).
      - Real key exchange (RSA, Diffie-Hellman – but can be MUCH smaller than usual, such as 64-bit).
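As an illustration of “MUCH smaller than usual”, here is a toy Diffie-Hellman sketch over a 61-bit prime. The parameters are assumptions picked for convenience: P is the Mersenne prime 2^61-1, and G = 3 has not been checked for being a generator of a large subgroup. At this size it is obfuscation between modules, not cryptography.

```cpp
#include <cstdint>

constexpr std::uint64_t P = (std::uint64_t{1} << 61) - 1;  // Mersenne prime 2^61-1
constexpr std::uint64_t G = 3;  // assumed base; subgroup order NOT verified

// (a*b) mod P via double-and-add; all intermediates stay below 2^62,
// so no 128-bit arithmetic is needed.
constexpr std::uint64_t mul_mod(std::uint64_t a, std::uint64_t b) {
    std::uint64_t result = 0;
    a %= P;
    while (b != 0) {
        if (b & 1u) { result += a; if (result >= P) result -= P; }
        a += a; if (a >= P) a -= P;
        b >>= 1;
    }
    return result;
}

// (base^exp) mod P via square-and-multiply.
constexpr std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp) {
    std::uint64_t result = 1;
    base %= P;
    while (exp != 0) {
        if (exp & 1u) result = mul_mod(result, base);
        base = mul_mod(base, base);
        exp >>= 1;
    }
    return result;
}

// What each module sends to the other one.
constexpr std::uint64_t dh_public(std::uint64_t my_secret) {
    return pow_mod(G, my_secret);
}

// What each module computes from the other side's public value.
constexpr std::uint64_t dh_shared(std::uint64_t their_public,
                                  std::uint64_t my_secret) {
    return pow_mod(their_public, my_secret);
}
```

Both sides end up with the same 61-bit value, which can then seed whatever obfuscation is applied to the queue or the messages between the modules.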
    - Self-modifying code:
      - Hurts static analysis
      - Change constants on the fly
        
        C++ (function pointer->find constant->replace)
        
        JS/emscripten [Elliot]
        
        emscripten: can use both
      - Change code on the fly
        
        C++: have to know what-it-is-likely-to-compile-into (or to use asm in source code). Example with embedding RDTSC check.
      - Beware: conflicts with integrity checks
      - Beware: page protection, anti-virus false positives.
  - Generalization to other languages
    - TBH – obfuscation-wise C++ rulezz… (whether C is better, is an open question; if C++ avoids RTTI and VMT pointers, it IS better – due to XYZ)
    - TODO: IL2CPP
    - Variable/class names (usual source-level obfuscation)
    - Along the same lines as C++: literals (TODO:check details), variables (TODO:check details), messages, (Re)Actors
      - Avoiding optimizing out can be limited (ref above)
    - Force-Inlines, templates-which-clone-code: by Src-2-Src compiler <ouch! />
    - TODO
  - Src-2-Src compile:
    - Additional Obfuscations:
      - Compile-time polymorphism:
        
        random inlines/force-inlines
        
        Shuffling functions/data definitions, struct/class fields, order of files in make
        
        Adding junk code [CodeMorph], [StarForce] – not too useful (randomized data obfuscation is much better).
        
        Function substitution [Wójcik]
        
        Randomizing asm-level patterns (along the lines of [Lyashko] and [Bremer])
        
        “Table interpretation” [Eilam], chapter 10
      - Trampolines [Newger]
      - Nanomites [Tully], example in [drew77@rohitab]
      - TODO: more?
    - Implementation: using Clang to modify code: public example of doing some modifications [ariel19@github] (use ONLY as an example of “how to use Clang to modify source code”). TODO: more examples
  - Beyond source code
    - Binary code obfuscation (see, for example, [yellowbyte@github])
      - Metamorphism
      - “native code permutations” [Tully]
      - Implementation:
        
        binary post-processing (latter – available as 3rd-party TODO:find/list; Armadillo is interesting because of interplay with source code [Kotik]);
        
        Using LLVM to modify LLVM bytecode [Merlini]
    - Custom compiler
      - Holy grail, but huuuuge headache.
        
        Even with custom compiler, source-code-level stuff is necessary.
      - Custom/obfuscated stack frames
      - Custom/obfuscated calling conventions (dealing with ecx/eax conventions and related analysis from [SabanalYason])
      - Custom/obfuscated VMT pointers
      - Anti-disassembling techniques such as jump within instruction (see also p.144 in [Antivirus Hacker’s Handbook])
      - MAY also include all the source-level obfuscation generation (though it still needs to be marked in source to understand performance implications)
      - Implementation: Clang (now also works for MSVC)
  - Obfuscating distributed messages/protocols. Versioning
    - Intra-Client – trivial to re-generate each time
    - Server-2-Server – not about bot fighting, but does help with security; not too difficult as long as we restart everything at once (patches are ok via re-using obfuscators); if necessary versioning is possible and simple.
    - Client-2-Server – a Major Headache, but Extremely Useful
      - MUST-have: blocking older versions(!!)
        
        Simplistic “check protocol version” won’t fly (single point of attack) => have to spread it over the code.
      - IDL-Based Versioning
      - Single-Client-Multiple-Servers
      - Backward Compatibility
        
        Format-Converting Front-End Servers (code generated by versioned IDL; ability to turn off quickly if necessary; more quarantine etc.)
  - Dynamic protection
    - Area of active research (=”your guess is as good as mine”)
    - Server-Side online hacking detection (time-based)
      - Keys from Server-Side to decrypt certain logic (can be decrypting wrong logic <evil-grin />)
        
        Implementing decryption on Windows
        
        [Lyashko2]
        
        Linker-map based
        
        Our-own-DLL+custom loader (DON’T create DLL in file system – load it directly instead)
        
        DO “sign” DLLs (actually – still obfuscation, as subject to self-MITM, but a rather good one)
        
        Own loader: [pasztorpisti@codeproject] can improve things greatly (but beware of false AV positives)
        
        obfuscating DLL calls (see, for example, [Wardman])
        
        Loading real code (see above re. kinda-DLL). Example: critical payment stuff for free accounts.
        
        Whole (Re)Actors with different obfuscation
        
        1000 different versions of obfuscation loaded on-demand (with check on server-side that client matches whatever-version-he-was-given). <very-evil-grin />
    - Server-controlled honey pots
      - Normally – valid non-obfuscated copy of the data (“accidentally left there” <grin />). When suspecting this player – a flag is sent to modify/remove it, so the bot will stop working (and you’ll be able to see that play has stopped, or that player started to play worse all of a sudden)
      - If you feel really evil – you can make the data look plausible, so the bot will make things worse for the bot-aided player.
      - About the same is possible for protocols (messages carrying two copies of some vital data – one encrypted, and another unencrypted, normally the same but sometimes…).
    - Deterministic redundant information in Client->Server packets. Example: player clicks etc. (usual for Authoritative Server), plus (ID of network packets received, some piece of state), sufficient to reconstruct the state on the Server-Side relying on determinism.
      - Specialized (Re)Actor (or part of (Re)Actor) for this purpose (with obfuscation etc.)
        
        Make sure to mix in obfuscated messages and state fields
        
        Cascading information from previous (Re)Actors.
        
        Acts as a “kinda-signature” for the stream
        
        Time MAY be excluded
      - Main (Re)Actor
        
        Cross-platform determinism is necessary; really ugly(!) as soon as floating-point is involved <sad-face />. Recording to run checks semi-manually on the same executable is still possible.
        
        Very limited floating-point: last-4-or-so-bits transferred for FP fields of the state (or more-or-less linear integer fields affected by FP calculations).
        
        Time (MAY have to roughen to milliseconds since previous one)
        
        All the unacknowledged event history is sent Client->Server until ack – to avoid re-sync
        
        Re-sync happens only when state is sent from the Server
        
        More devious: request desired piece-of-state from the Server-Side
        
        Checks MAY be selective (“red flag”) to save Server-Side resources; recording for semi-manual checks later.
        
        Extremely powerful (~=”ongoing integrity verification”); with proper obfuscation added – is likely to stand on its own for a while even without system-dependent stuff => enables emscripten (still, beware of floating-point)!
        
        Still, usually doesn’t protect against certain attacks such as texture replacement.
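The deterministic-redundancy idea above can be sketched in a few lines. Assumptions: the state is integer-only (no floating point, so cross-platform determinism comes for free), and instead of a “piece of state” the Client sends an FNV-1a digest of the whole state, which is a simplification swapped in for brevity. All names (GameState, ClientPacket, and so on) are hypothetical.

```cpp
#include <cstdint>

// Integer-only state, so the same inputs give the same state on any platform.
struct GameState {
    std::int32_t x = 0;
    std::int32_t y = 0;
    std::uint32_t tick = 0;
};

// Deterministic state transition shared by Client and Server.
inline void apply_input(GameState& s, std::uint8_t input) {
    switch (input & 3u) {
        case 0: s.x += 1; break;
        case 1: s.x -= 1; break;
        case 2: s.y += 1; break;
        default: s.y -= 1; break;
    }
    ++s.tick;
}

// 32-bit FNV-1a, fed one 32-bit field at a time (low byte first).
inline std::uint32_t fnv1a_feed(std::uint32_t h, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        h ^= (v >> (8 * i)) & 0xFFu;
        h *= 16777619u;
    }
    return h;
}

inline std::uint32_t state_digest(const GameState& s) {
    std::uint32_t h = 2166136261u;  // FNV offset basis
    h = fnv1a_feed(h, static_cast<std::uint32_t>(s.x));
    h = fnv1a_feed(h, static_cast<std::uint32_t>(s.y));
    h = fnv1a_feed(h, s.tick);
    return h;
}

// Client->Server packet: the usual input plus redundant information.
struct ClientPacket {
    std::uint8_t input;
    std::uint32_t digest;
};

// Client side: apply the input locally, attach the digest of the new state.
inline ClientPacket client_step(GameState& client_state, std::uint8_t input) {
    apply_input(client_state, input);
    return ClientPacket{input, state_digest(client_state)};
}

// Server side: replay the same input on a shadow copy and compare.
// A mismatch is a "red flag", not a proof of cheating.
inline bool server_verify(GameState& shadow_state, const ClientPacket& p) {
    apply_input(shadow_state, p.input);
    return state_digest(shadow_state) == p.digest;
}
```

In a real system both the digest calculation and the packet would be obfuscated (and re-generated for each build), otherwise a bot can simply run an unmodified copy of the logic next to the modified one.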
- What to obfuscate?
  - - Not at vertex level! But it is not that much of a problem in practice.
      - 3D model level is still attackable – but MUCH less performance-critical => can be obfuscated.
    - More generally – we have to obfuscate those 95% of code which take 5% of execution time.
    - Do we want to have detection/enforcement on Low-Risk-for-hacking games?
      - Disabling protection for such players/games.
      - Going further – use server-driven code decryption/loading to hide such code completely.
- Will it help?
  - If you want to hide your Client-Side intellectual property this way – probably not.
    - For MOGs with Authoritative Servers – most of IP is Server-Side anyway.
  - Economics: cost of maintaining obfuscated code on our side << costs of breaking every 2 weeks
    - Reverse engineering is an “inverse problem” [TODO: wikiquote], and inverse problems are generally more difficult than forward ones
  - If we want to protect integrity of our game universe, the best we can hope for is to engage in a new-armor-resists-old-shell / new-shell-beats-old-armor battle (instead of giving up).
    - Armor-vs-Shell arms race. Do we really want to abandon all the armor just because no perfect armor exists?
  - Advice above will probably help, but only for some time.
    - Temporary competitive advantage is very likely (also “you don’t need to run faster than the bear”).
    - Both theoretical (such as differential analysis) and implementation flaws may exist => be ready to adjust.
- False positives by AV
  - Heuristic Engines. TODO: elaborate [Schmall], [Balci], [F-Secure]; list: [Balci]
    - Scary stuff (but doesn’t materialize in practice): anti-reverse-engineering===malicious [Antwerp]
    - Detecting decryption loop [ZhangEtAl], [Szor], [SabanalYason]
      - non-zeroing XOR [RaabeBallenthin]. Ok, with obfuscation we can use ADD <wink />.
      - Avoiding decryption loop: unrolling+recursion (won’t last but…)
    - Undocumented APIs (don’t use)
    - Guarded memory regions (jury is out on this one…)
    - Rumored: “suspicious” function calls (LoadLibrary(), GetProcAddress(), VirtualAllocEx()) – IMO cannot be too bad…
    - Internet/socket APIs, “potential IP address in memory” (MIGHT be better to obfuscate)
    - PE manipulations (SizeOfImage, entry point manipulations, section manipulations) [F-Secure] – to be avoided
    - Entropy increase
      - per-function code obfuscation
      - extreme case: kinda-steganography
    - Non-issues: killing antivirus processes, injections, elevating privileges, modifying proxy settings, sleeping for a looong while (just DON’T do it)
    - Potential conflicts with detection: querying process information (hide?), installing hooks [TODO: more]
    - Quasi-issues: dropping executable files (MAY be needed for upgrading – but from what I’ve seen, AVs are smart enough to see the manifest)
  - TODO: IEEE Taggant (TODO: is it really honored by AVs?)
  - Grade our approaches [static code obfuscation, data obfuscation, and RDTSC are ‘low’; obfuscating pointers and obfuscating calls to OS are ‘medium’; code encryption and manipulation are ‘high’]. Exception: emscripten <smile />
  - Comparison Table for Different Techniques (Platforms, Effect, false positive risks).
Overall Bot Fighting Efforts
- Bot Fighting Team
  - Inserting obfuscation into existing code
    - Testing, more testing and even more testing. We don’t want to jeopardize obfuscating effort by crashing the whole thing. OTOH – will help to find well-hidden bugs (and bad practices, such as messing with memory) in non-obfuscated code.
    - Testing against AV engines.
  - Monitor public and not-so-public forums (the latter will require infiltrating them) where attacks-on-your-system are discussed, so you can adjust and counter-act. Let attackers play whack-a-mole.
  - Reverse engineering existing attacks
- Unknown Attacks – as discussed above
- Complaints – analysis tools required. Deterministic stuff(!).
- Known Attacks
  - Monitor and Obtain (including infiltration of private forums)
  - Analyze
  - Fight Back
Summary
- The best we can do is to engage the attacker in an arms race
  - At the moment, we’re behind. Doesn’t mean it is hopeless.
    - “Given time, everything can be broken” => we should not give time
  - Most attacks rely on data, while 99% of protections don’t even try to protect data.
- Need to spend LOTS of time on it. Special bot fighting team (often known as Security Team, which is a misnomer).
  - Systemic efforts. Just adding one “protection” won’t help. Weakest link principle, but every bit counts.
  - MUST use basic precautions (including channel encryption to avoid self-MITM)
  - MUST make source-level effort to obfuscate (most of the time DIY, in theory can be assisted by 3rd-party libs and tools; Armadillo example)
    - MUST be limited to non-semantic-changing actions such as replacing int with OBF(int,OBFFAST), and adding FORCEINLINE.
      - Changes such as replacing ‘<’ with ‘!=’ for upper bound of the loop, MIGHT be considered as semantically-equivalent, BUT still require extreme care.
    - MUST be massive (and expanding)
    - MUST use randomized codegen (template-based implementation might be ok, as long as it is still pseudo-randomized on a truly random seed generated at build time).
    - MUST obfuscate protocols, both intra-Client and Client-2-Server (for Client-2-Server versioning is a headache, but is worth the trouble).
  - SHOULD use system-level protection (DIY or 3rd-party).
    - SHOULD be integrated with source-level effort.
    - SHOULD include some kind of “encryption” (actually – scrambling).
    - Beware of AV false positives.
  - MUST have Server-Side statistics with “red flags”
  - MUST have a way (tools+processes+team) to manually analyse player complaints and automated “red flags”
  - MUST monitor published/commercial attacks on your game and issue updates.

[[TODO: interactions with (Re)Actors replays – both as an attack vector and as a way to prove innocence for player]]

[[To Be Continued…

Phew. With just an outline being over 5000 words, this is going to be a huge chapter. I’ll get to the substance of it a few weeks later.

Meanwhile, stay tuned for further parts of Chapter 27, where we’ll continue our discussion on DB optimizations (going into the strange field of app-level caches and app-level replicas)]]

[+]References

Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.
