| Author: | “No Bugs” Hare Follow: |

Job Title: | Sarcastic Architect | |

Hobbies: | Thinking Aloud, Arguing with Managers, Annoying HRs, Calling a Spade a Spade, Keeping Tongue in Cheek | |

[rabbit_ddmog vol=”8″ chap=”Chapter 29(f) from “beta” Volume VIII”]

After we discussed Bot Fighting 101–103 (which were about techniques which are commonly used, but are mostly inefficient against serious hackers), we can proceed to a much more interesting topic: how to increase resilience of our Client-Side program to those Brute-Force Attacks discussed in [[TODO]] section above. To re-iterate – Brute-Force Attacks tend to start with identifying system calls (which are pretty much impossible to hide *completely*) – in the context of MOG it is usually a socket call; after a system call is identified – the attacker goes up the call stack to get to the-place-of-interest-to-him (usually – starting from that-point-where-the-message-to-be-sent-over-the-socket-is-prepared, and going up from there).

At this point, we’ll discuss one serious anti-hacking measure, which is aimed to:

- Keep our
*source*code perfectly readable (and with minimal changes); - Make our
*binary*code as-unreadable-as-possible - Obfuscate in-memory data
- Make sure that even if
*source*code stays the same –*binary*code changes*a lot*on each build. In other words – we’re aiming to have build-time code polymorphism.

It may sound too-good-to-be-true, but apparently, it is perfectly achievable at least in C++; let’s take a closer look at this technique.

## Preliminaries – How Our Source Code Will Look

To illustrate our approach, let’s take an extremely simplistic function which we want to obfuscate:

//Listing 29.factorial.orig uint32_t factorial(uint32_t x) { uint32_t ret = 1; for(size_t i=1; i <= x ; ++i) { ret *= i; } return ret; }

*NB: in our exercises, we’ll be using MSVC 2017 (in Release mode) – as it is probably the most popular compiler for the Client-Side; for other compilers, the results will be similar, though certainly not identical. *

*NB2: for my tests, I used __declspec(noinline) for factorial() to make sure that compiler doesn’t inline it (which would complicate our analysis); it is a TEST-ONLY feature, and as a Bit Fat Rule of Thumb™, in production code we should be very aggressive with inlining, and avoid any __declspec(noinline).*

When we compile our *factorial() *function, we get the following asm:^{1}

;Listing 29.factorial.orig.asm push esi mov esi,1 mov edx,esi cmp ecx,esi jbe l2 nop dword ptr[eax]; aligning next instruction l1: imul esi,edx; ret *= i inc edx; ++i cmp edx,ecx jb l1 l2: mov eax,esi pop esi ret

Here, in spite of all the attempts of the compiler to make things less clear, there is an obvious loop starting from label l1, with a multiplication inside. As the result of the multiplication is multiplied again and again with an ever-incremented counter – it is relatively easy to guess that we’re dealing with calculating factorial here.

Now, let’s see what we can do to make the code less obvious to the attacker, while preserving clarity of our source code. Let’s introduce an *obf<> *template, defined as follows:

//Listing 29.obf template<class T, int ID> //Here, ID is an ‘identifier’ of // the instance of our obfuscation // We DO want to keep them different, don’t we? class obf { T val; public: obf() {} template<class T2> obf(T2 v) { val = v; } T value() const { return val; } operator T() const { return value(); } template<class T2> obf& operator *=(T2 t) { val *= t; return *this; } //NB: in this context, I tend to prefer templatized member functions // to free-standing friends, because it is generally easier // to address issues from improper ordering, than from type // mismatches. And having templatized free-standing friends // is not an option due to ambiguities. OTOH, if you prefer // free-standing non-templatized friends – they’re ok too. //do the same for =, +=, -=, /=, %= // and IF you need them – for bitwise ops too obf& operator ++() { val++; return *this; } obf operator ++(int) { return obf(val++); } //do the same for –- template<class T2> bool operator <(T2 b) { return val < b; } //do the same for +, -, *, /, % // and IF you need them – for bitwise ops too };

Now, we can rewrite our factorial() function into the following:

//Listing 29.factorial.obf obf<uint32_t,123> factorial(obf<uint32_t,124> x) { obf<uint32_t,125> ret = 1; for(obf<size_t,126> i=1; i < x ; ++i) { ret *= i; } return ret; } //BTW, we’re NOT required to have ALL the variables // obfuscated; in fact – it should be done on case-by-case // basis

*NB: let’s keep in mind that while obf<uint32_t> is very close to uint32_t, it is not a ‘perfect wrapper’ (in particular, because of ‘1 built-in + 1 user-defined’ rule for implicit conversions). It means that some changes MIGHT be necessary to the code after we replace type with obf<type> in some declaration – but they’re usually solvable very easily by replacing variable with variable.value(). Also – make sure NOT to add any casts when enabling obf<> types; adding casts opens a huuuuge can of worms able to introduce LOTS of subtle bugs; fortunately – from what I’ve seen, 99% of the code works after replacing type with obf<type> ‘as is’, and adding .value() allows to deal with the rest. *

As we didn’t really change anything, this code still produces the same results when we run it; apparently, asm did change a bit, but the inner loop (the one which reveals all the stuff to the attacker), is still exactly the same.^{2}

Now, we can go a bit further and avoid writing those IDs manually, making an OBF() macro to wrap our obf<> type:^{3}

#define OBF(t) obf<t,__LINE__> //If you feel like it and use C++17, // you can even use constexpr function of (__FILE__,__LINE__) // as an ID; in C++14 it may be possible // but is hardly worth the trouble

Then, our *factorial()* function will start to look as

//Listing 29.obf.macro OBF(uint32_t) factorial(OBF(uint32_t) x) { OBF(uint32_t) ret = 1; for(OBF(size_t) i=1; i < x ; ++i) { ret *= i; } return ret; } //Again, we’re NOT required to have ALL the variables // obfuscated

IMO, it looks *reasonably *close to the original; moreover, I’d argue that it is probably *the best *thing we can have in this regard. In general, we DO need to specify which variables/parameters have to be obfuscated, and moreover – have to specify *how performance-critical *they are. In practice, I usually suggest to use template classes such as obf0<>, obf2<>,… obf5<> etc. (with ‘obf0<>’ meaning ‘obfuscate lightly’, and ‘obf5<>’ meaning ‘obfuscate heavily’), and corresponding macros OBF(), OBF2(),…, OBF5().

By now, we’re done with the changes to our *source *code;^{4} now, let’s see how these (IMO rather minor) changes allow to make the life of the hacker *by orders of magnitude more difficult*.

^{1}All asm in this Chapter uses Intel notation – opposed to AT&T notation; Intel one is also the one provided by MSVC.

^{2}saving for using different registers

^{3}in MSVC17, for some reason __LINE__ is not a constexpr; however, a workaround is possible (see [[TODO]] for implementation)

^{4}defined as “whatever we’re writing manually”

## Take 1 – Naive XOR

Let’s say that we have:

- a script, which extracts a list of all the instances of obf<> and OBF() out of our source files (what we need is just a list of (type,ID) tuples, which is trivial to get)
- another script, which takes this list of (type,ID) tuples and generates randomized instances of obf<type,ID> wrappers.

One extremely simplistic example of such a generated wrapper would look as follows:

//Listing 29.obf.xor //GENERATED RANDOMIZED CODE //CHANGES ON EACH BUILD //DO NOT MODIFY template <> class obf<size_t,126> { //NB: I am using full template specialization here, // but it can be partial specialization too using T=size_t; T val; template<class T2> void init(T2 v) { val = v ^ 0x28f3472a; } public: obf() {} template<class T2> obf(T2 v) { init(v); } T value() const { return val ^ 0x28f3472a; } operator T() const { return value(); } template<class T2> obf& operator *=(T2 t) { init(value() * t); return *this; } obf& operator ++() { init(value()+1); return *this; } obf operator ++(int) { obf ret = obf(value()); init(value()+1); return ret; } //do the same for –- template<class T2> bool operator <(T2 b) { return value() < b; } //do the same for +, -, *, /, % // and IF you need them – for bitwise ops too };

Now, let’s take a look at disassembly. There, we’ll see that *while in Debug mode all the XORs with 0x28f3472a are present – in Release mode they’re completely eliminated by compiler, so our inner loop is still exactly the same, with only imul and inc inside* (!).

Does it mean that at this point, we failed to obfuscate our data and code? Sure. Does it mean that any such obfuscation is hopeless? Certainly not.

## Take 2 – Adding Global volatile

If we go a bit further and generate our wrapper along the following lines:

//Listing 29.obf.xor.global //GENERATED RANDOMIZED CODE //CHANGES ON EACH BUILD //DO NOT MODIFY volatile size_t global126 = 1; template <> class obf<size_t,126> { using T=size_t; T val; template<class T2> void init(T2 v) { val = (v ^ 0x28f3472a) * global126; } //Order is IMPORTANT here: if we first multiply, // and then XOR – compiler will still be able to // eliminate paired XORs //Also, using DIFFERENT operations is IMPORTANT too; // otherwise, compiler may reorder and eliminate things public: obf() {} template<class T2> obf(T2 v) { init(v); } T value() const { return val ^ 0x28f3472a0; } operator T() const { return value(); } //other operators are the same as above };

Holding our breath, we take a look at the asm, and can see that *now *we’re talking! Our innermost loop looks as follows:

;Listing 29.obf.xor.global.asm l1: imul eax,ecx inc ecx xor ecx,28F2372Ah imul ecx,dword ptr[global126] xor ecx,28F2372Ah cmp ecx,dword ptr[x]; on-stack variable jb l1

While it is certainly not obfuscated enough, and the idea behind the code can be still seen rather clearly, we have already managed to get our obfuscation code in.

## Take 3 – ADD instead of XOR

Bijection In mathematics, a bijection, bijective function, or one-to-one correspondence is a function between the elements of two sets, where each element of one set is paired with exactly one element of the other set, and each element of the other set is paired with exactly one element of the first set.— Wikipedia —Now, let’s see what we can do with our *obf<>* class to obfuscate things better. First, let’s note that XOR is actually only one of the ways to obfuscate things (and is pretty poor at that, BTW – as XOR is rarely used in normal code,^{5} using XOR for obfuscation means every XOR will scream loudly ‘I am obfuscation!!’); also – such non-zeroing XORs *may *cause AV heuristics to cause false positives [RaabeBallenthin]. However, there are *lots *of other techniques to make things much less obvious.

Technically, what we’re looking for here, is any kind of *bijection*; we’ll use this *bijection *to convert our data from one representation into another one (and as it is a *bijection*, we can revert it later).^{6}

One very simple example of such bijection is just replacing XOR some-random-constant with ADD some-random-constant; then, our *init() *function will look as val = (v + (size_t)0x28f3472a) * global126 and *value() *function as return val – (size_t)0x28f3472a0.^{7} After compiling such ADD-based obfuscation (mathematically – addition modulo 2^32), The innermost loop in our generated code will be already less obvious than the one with XORs:

;Listing 29.obf.add.global.asm l1: inc ecx imul eax,edx imul ecx,dword ptr[global126] lea edx,[ecx-28F2372Ah] ; a counterpart operation ; was moved out of the loop cmp edx,dword ptr[x]; on-stack variable jb l1

Here, MSVC compiler did some serious optimization relying on the interrelations between our obfuscating addition modulo 2^32^{8} + multiplication-by-global, and intra-loop increment+comparison; it certainly didn’t make things simpler-to-understand, nice job obfuscating our intentions!^{9}

Our Listing 29.obf.add.global.asm already looks significantly better than original, but TBH, we didn’t even start any serious obfuscation stuff. Most importantly –

^{10}

^{5}except for XOR EAX,EAX etc. which is a very obvious special case

^{6}actually, we can even use more generic

*injections*, but it is a little bit more complicated, see [[TODO]] section below

^{7}make sure to use ONLY

*unsigned*types here, as signed overflow is apparently Undefined Behaviour in C++ (and starting from around 2015 or so, compilers started to abuse it, so relying on it

*can*lead to really nasty bugs <very-sad-face />). To obfuscate the signed types – just cast them into unsigned counterpart first.

^{8}or modulo 2^64, depending on sizeof(size_t)

^{9}Of course, obfuscating wasn’t a goal of the compiler, but it just so happens that serious optimizations at least for x86/x64 command set tend to result in serious obfuscation too. This includes lots of

*blending*of different inlined functions – in the example above we cannot really say where one function ends, and another one begins; perfect! <wide-smile />

^{10}for time-critical code, there are also performance considerations, more on them in [[TODO]] section below.

## Take 4 – Throw In Multiply-by-Odd

Our next very-simple-but-efficient obfuscation is related to multiplication-by-an-odd-constant-modulo-2^32^{11} (and reverse operation is a multiplication-by-another-odd-constant-modulo-2^32). Explanation *why *it is a bijection, is beyond the scope of this book (very very sketchy – it relies on a fact that *any* odd-constant and 2^32 are co-primes).

*How *to calculate inverse-constant (given original constant), is not-so-trivial, and we’ll discuss it in [[TODO]] section below, but for the time being, we’ll use a pre-calculated one. Let’s observe that 13*17*97*315643*2829487 mod 2^32 = 1.

This means that we can write our *init() *function using *any *of the 5-numbers-listed-above, and our *value() *function – using *all *the remaining 5 numbers, for example:

//Listing 29.obf.mul //GENERATED RANDOMIZED CODE //CHANGES ON EACH BUILD //DO NOT MODIFY template <> class obf<size_t,126> { using T=size_t; T val; static_assert(sizeof(size_t)==4,””); template<class T2> void init(T2 v) { val = v *(size_t)17*(size_t)97*(size_t)315643; } public: obf() {} template<class T2> obf(T2 v) { init(v); } T value() const { return val * (size_t)13*(size_t)2829487; } operator T() const { return value(); } //other operators are the same as above };

When compiled it leads to the following asm code in the innermost loop of our *factorial() *function:

;Listing 29.obf.mul.asm l1: mov esi,ecx inc edi imul esi,eax add ecx,1F0620CBh imul eax,esi,23144E3h cmp edi,edx jb l1

Well, as we can see – even without globals it doesn’t look too obvious; sure – there is a loop, *and *there is a multiplication within the loop, but the other stuff (such as the *second *multiplication) and strange-looking constants, is not really obvious (and while it is merely obfuscation, it is not “dead code” either, so it cannot be easily eliminated by dead-code analysis).

^{11}of course, 2^16, 2^64 etc. will also do

## Take 5 – (kinda)-Feistel Round

The last primitive which we’ll discuss today (with *many *more to follow in [[TODO]] section below), is a (kinda)-Feistel-round, known from cryptography. Classical Feistel round looks as follows: we take our variable, split it into two equal parts bit-wise, then we feed the “left” part to *some *(not-necessarily-bijection(!)) “round function” f(), and then XOR it with the “right” part; the result consists of the “right” part (unmodified), and “left” part (XOR-ed with *f(right_part)*). We’ll do *almost *the same, just replacing too-obvious-in-the-code-XOR with addition modulo 2^32 (and using x^2 as our “round function” f()):

//Listing 29.obf.Feistel //GENERATED RANDOMIZED CODE //CHANGES ON EACH BUILD //DO NOT MODIFY template <> class obf<size_t,126> { using T=size_t; static_assert(sizeof(size_t)==4,””); uint16_t right; uint16_t left; uint16_t f(uint16_t x) const { return x*x; } template<class T2> void init(T2 v) { right = (uint16_t)v; left = (uint16_t)(v >> 16) + f(right); } public: obf() {} template<class T2> obf(T2 v) { init(v); } T value() const { return (uint32_t)((left-f(right)) << 16) + (uint32_t)right; } operator T() const { return value(); } //other operators are the same as above }; static_assert(sizeof(obf<size_t,126>)==4,””);

With this code, the inner loop of our *factorial() *function happens to look as follows:

l1: mov edx,esi imul edx,esi movzx esi,si movzx edx,dx sub ebx,edx shl ebx,10h add esi,ebx imul eax,esi lea edi,[esi+1] movzx edx,di mov esi,edi imul esi,edi mov dword ptr[ebp-4],edx mov edx,edi shr edx,10h add edx,esi movzx ebx,dx movzx edx,si mov esi, ebx sub esi,edx movzx edx,di shl esi,10h add esi,edx cmp esi,ecx mov esi,dword ptr[ebp-4] jb l1

Good luck finding our original *factorial()* logic here (there is still a loop-with-imul-inside, but the rest is not exactly clear to put it mildly)…

BTW, it is still NOT the-result-we’re-aiming-for; rather – it is just an illustration of the building blocks for the real stuff; stay tuned for the rest! <wink />

[[TODO: data protection vs common attacks such as [https://blog.badtrace.com/post/reverse-engineering-a-csgo-cheat/] ]]

## [[To Be Continued…

This concludes beta Chapter 29(f) from the upcoming book “Development and Deployment of Multiplayer Online Games (from social games to MMOFPS, with social games in between)”.

Stay tuned for Chapter 29(g), where we’ll proceed to the question of “how to obfuscate things in C++17 alone, without any external compiler” – and will get ORDERS OF MAGNITUDE more sophisticated than the above – while keeping our source code readable along the lines above]]

**Comment↯ below**. You do?! Please share:

### References

### Acknowledgement

Cartoons by Sergey Gordeev^{} from Gordeev Animation Graphics, Prague.

Goose says

Is it Sergey I. or Sergey G. who think up these images?

"No Bugs" Hare says

Usually, it is ‘No Bugs’ who comes up with a _textual_ description, and Sergey G who translates it into lines-on-paper. Kinda as a movie screenwriter and director ;-).