Bot Fighting 201, part 3. ithare::obf: An Open Source Data+Source Randomized Obfuscation Library

 
Author:  Follow: TwitterFacebook
Job Title:Sarcastic Architect
Hobbies:Thinking Aloud, Arguing with Managers, Annoying HRs,
Calling a Spade a Spade, Keeping Tongue in Cheek
 
 
Obfuscation: Growing Forest to Hide a Leaf

[rabbit_ddmog vol=”8″ chap=”Chapter 29(h) from “beta” Volume VIII”]

What a fiendish cipher it must be, to talk about it left, right, and center!

— from Cyberiad by Stanislaw Lem —

With all those considerations which we discussed over the course of last 2 posts, an obvious question of “how to use it in practice” arises. To illustrate those points I’ve made, I wrote a pure C++ template-based obfuscation library, and placed it to Github [NoBugs]. Note that it is a pure C++ solution, i.e. without an external code generator; while abilities of the pure C++ are limited compared to the code generator-based solutions, I was surprised how much it is possible to achieve while staying entirely within bounds of C++.1

DISCLAIMER: as of now, library is in pre-alpha stage; while I am reasonably confident that if it compiles, it will produce consistent results (i.e. output with obfuscation will be the same as output without obfuscation), chances of it failing to compile in certain use cases are still pretty high <sad-face />. Still, as a proof-of-concept it IS interesting to see the results which can be obtained even with a pure in-language solution.


1 C++17, to be exact; don’t even try it without MSVC2017, or without /std:c++latest. While support for modern Clang is hopefully possible, it is not there yet.

 

Basic Idea

As it was discussed above –

I am a Big Fan™ of automated randomized code generation.2

In other words – the idea is NOT to write obfuscations manually. Instead – we should just declare which-things-we-want-to-obfuscate, and then obfuscation library should take care of the rest.

Example: factorial() function

To illustrate “how ithare::obf can be used to obfuscate things”, we’ll use the following very simplistic factorial() function:

//Listing 29.factorial.orig
__declspec(noinline) int64_t factorial(int64_t x) {
  if(x<0)
    throw MyException(“Negative value of factorial!”);

  int64_t ret = 1;
  for(int64_t i=1; i <= x; ++i) {
    ret *= i;
  }

  return ret;
}

Note: above, __declspec(noinline) is used merely to isolate our factorial() function from the rest of code, to see what is going on more clearly. With a few very narrow exceptions, noinline SHOULD NOT be used for production code (leave alone obfuscated production code).

 

Baseline

After compiling the code above in Release x64 mode, we get the following assembly (as shown by IDA Pro; other disassemblers may provide less information, but we should be ready to the attacker using the-best-available-tools):

;;Listing 29.factorial.orig.asm
;;single-; indicates comments by IDA Pro
;;double-;; indicate my own comments
var_38 = byte ptr -38h;;offset identifier created by IDA Pro
sub rsp,58h
mov rdx,rcx
test rcx,rcx
js l1
mov eax,1
mov ecx,eax
cmp rax,rdx
jg l2
nop dword ptr[rax+rax+00000000h];;aligning instructions
l3: imul rax,rcx
inc rcx
cmp rcx,rdx
jle l3
l2: add rsp,58h
retn
l1: lea rcx,[rsp+58h+var_38]
call sub_2
lea rdx,[__TI1?AVMyException@@]; throw info
                               ;    for ‘class MyException’
lea rcx,[rsp+58h+var_38]
call _CxxThrowException

sub_2:
push rbx
sub rsp, 20h
lea rax,off_1400036F90
mov rbx, rcx
mov [rcx], rax
lea rdx, aNegativeArgume ; “Negative argument to factorial!”
add rcx, 8
call sub_3
mov rax, rbx
add rsp, 20h
pop rbx
retn

It is very short, and is rather clear even in asm. NB: as we can see, MyException name is visible in the .exe – that’s in spite of me disabling RTTI (RTTI seems to be enabled at least for classes-being-thrown regardless of /GR- option being fed to MSVC compiler <sad-face />).

Even better for the potential attacker (and worse for us), is that if we feed this code to the IDA Pro decompiler, our intentions will be even more obvious:

//Listing 29.factorial.obf.decompiled.c
signed __int64 __fastcall sub_1(signed __int64 a1)
{
  signed __int64  v1;//rdx
  signed __int64 result;//rax
  signed __int64 i;//rcx
  char v4;// [rsp+20h] [rbp-38h]

  v1 = a1;
  if(a1<0)
  {
    sub_2(&v4);
    CxxThrowException(&v4, &_TI1_AVMyException__);
  }

  result = 1i64;
  for( i = 1i64; i <= v1; ++i )
    result *= i;
  return result;
}

_QWORD *__fastcall sub_2(_QWORD *a1)
{
  _QWORD *v1;// rbx

  v1 = a1;
  *a1 = &off_140036F90;
  sub_3(a1 + 1, “Negative argument to factorial!”);
  return v1;
}

Aren’t you amazed at capabilities of IDA’s reverse engineering? I certainly am; it indeed looks extremely close to the original code, and our intentions are indeed very clear. Even if we wouldn’t give away that obvious clue with our “Negative argument…” error string – the attacker still won’t have much problems realizing that our sub_1() is a function which calculates factorial (that for(i) loop in sub_1() gives it away almost-instantly).

This is the point where our obfuscation task starts…

Obfuscated-but-Still-Readable-and-Debuggable?

As it was stated above, our goal is to obfuscate the binary code, while keeping our source code as-readable-as-possible. Let’s see how ithare::obf from library [NoBugs] deals with it.

To enable automated randomized obfuscation, we have to say which variables and which literals we want to obfuscate, and to what extent (leaving generation of the actual obfuscation code to the library and compiler). Let me show it on example of our factorial() function:

//Listing 29.factorial.obf
__declspec(noinline) OBF6(int64_t) factorial(OBF6(int64_t) x) {
  if(x<0)
    throw MyException(OBF5S(“Negative value of factorial!”));

  OBF3(int64_t) ret = 1;//we could obfuscate literal '1' too, 
                        //  but for such common literals it rarely makes sense
  for(OBF3(int64_t) i=1; i <= x; ++i) {
    ret *= i;
  }

  return ret;
}

That’s it! As you can see, we didn’t do any real obfuscation here, merely declaring which variables have to be obfuscated (in fact, we’re replacing types-to-be-obfuscated with their obfuscated counterparts).

About those digits within OBF6() and OBF3(). In ithare::obf there is a convention that this-digit-after-OBF means “how much we want to obfuscate this type/literal”: OBF0() means “obfuscate using no more than 1 CPU cycle”, OBF1() – “using no more than 3 CPU cycles”, OBF2() – “no more than 10 CPU cycles”, and so on. More generally, OBFX() means “obfuscate using no more than 10^(X/2) CPU cycles”, so OBF6() is allowed to eat up to 1000 CPU cycles.

Obfuscating Parameters and Return Values

In our example above, we obfuscated not only internal variables, but also input parameters and return values of our factorial() function (and moreover, we used MUCH more severe obfuscation on entry and on exit, than inside). The reason for it is that

We want to obfuscate not only the nature of the function, but also its interfaces.

This provides two very significant benefits:

  • When debugging, attacker cannot really see real values of the function parameters (he will see only those obfuscated values). As call stack is one of major tools used by attackers, making it less obvious (and we’ll be making it much less obvious – see decompiled examples below) certainly qualifies a Good Thing™.
  • Hare thumb up:Even if the attacker knows what our function does, to invoke the function properly he still needs to deobfuscate the APIEven if the attacker knows what our function does, to invoke the function properly he still needs to deobfuscate the API (i.e. reverse-engineer current randomized incarnation of the code). And as this obfuscation changes automagically on each build (more on it below) – it means that each successful-attack-which-uses-this-API, still has to be rewritten for each new release of our Client. Sounds as a lot of tedious mundane work for the attacker – and this is certainly another Good Thing™ for us as defenders.3

2 That is, for obfuscation purposes.
3 Of course, in theory this process can be automated, but for the time being – there are no such tools. Moreover – even in theory there are limits to such automation, especially as boundaries between real code and obfuscation are far from being obvious(!). Those reverse-engineering guys I spoke to, when faced with such automation, were referring to using SAT solvers for this task, and with SAT solvers being NP-complete – well, it is a pretty hopeless mechanism of dealing with sizeable problems.

 

Debugging

As we can observe, our source code is not too obfuscated. Moreover, if we compile it for debugging (i.e. without ITHARE_OBF_SEED being defined) – it will stay debuggable too; here is a screenshot from MSVC with our factorial() function being debugged:

debugging obfuscated code in Visual Studio

As we can see, in this mode (without ITHARE_OBF_SEED), all the variables look more or less as their counterparts (that added ‘val=’ to the value is not TOO bad debugging-wise). Of course, we won’t be debugging our obfuscation this way – but when debugging our app-level logic, we have to assume that whatever-obfuscation-we-have works flawlessly anyway.4


4 BTW, even now, in pre-alpha, I am reasonably confident that as soon as obfuscated code compiles – it does work properly (and bugs causing compile failures, are being steadily ironed out).

 

Compile for Production: Enter ITHARE_OBF_SEED

Now, let’s try to compile the very same Listing 29.factorial.obf for production; to do it – we have to recompile exactly the same source code but with ITHARE_OBF_SEED #defined to be a large random number.5

At this point, we get an executable, which is functionally equivalent to the original one – but is randomly and severely obfuscated. Let’s see what current (as of January 2018) version of the ithare::obf will do with our factorial() function.

From assembler, the following can be observed:

  • Our “Negative value of factorial!” literal is no longer easily visible (it doesn’t show in “Strings” view of IDA Pro, and is not really easily detectable until the moment we decide to use it).
  • Our simplistic factorial() function now takes whopping 2 kilobytes(!).

As it is sooooo large (and is not-so-easy-to-comprehend) – I will provide only a portion of it (specifically the inner loop) here:

;;Listing 29.factorial.obfuscated.asm.InnerLoopOnly
loc_140002430:
mov r9, r15
mov r10d, 1Fh
sub r9, rax
imul rax, rcx, 1Fh
mov r8d, r9d
sub r10, rax
and r8d, 0FFFF0000h
movzx eax, r9w
add r8d, eax
mov eax, r9d
sub r8, rax
add r8, r9
imul r10, r8
mov edx, r10d
movzx eax, r10w
and edx, 0FFFF0000h
add edx, eax
mov eax, r10d
sub rax, rdx
mov rdx, cs:qword_14003FAA0
sub rax, r10
add rax, r15
movzx r8d, byte ptr [rdx+2]
imul r8, -1Fh
imul rdx, rcx, -1Fh
add r8, rbx
mov rcx, r8
mov r8d, 1Fh
inc rdx
imul rcx, rdx
imul rdx, rcx, 1Fh
sub r8, rdx
cmp r8, r11
jle short loc_140002430

As we can see – it is far from being obvious; while we do have an imul operation within – it doesn’t give up much, as actually we have six of such imuls, and it isn’t obvious whether they’re part of obfuscation, or not.

And even if we look at the our obfuscated factorial() function through the prism of almighty IDA Pro’s decompiler – all we get is the following:

//Listing 29.factorial.obf.decompiled.c
_DWORD *__fastcall sub_1(_DWORD *a1, __int64 a2)
{
  int v2; // er14
  __int16 v3; // ax
  int v4; // er9
  int v5; // er11
  __int16 v6; // r10
  int v7; // er13
  int v8; // eax
  unsigned __int64 v9; // r8
  unsigned __int64 v10; // rax
  signed __int64 v11; // rcx
  int v12; // er8
  int v13; // er11
  int v14; // edi
  int v15; // edi
  int v16; // er11
  unsigned __int64 v17; // r11
  int v18; // er10
  unsigned int v19; // ebx
  int v20; // edi
  int v21; // er11
  unsigned __int64 v22; // rsi
  unsigned __int64 v23; // r10
  int v24; // er9
  unsigned int v25; // edi
  int v26; // ecx
  int v27; // er10
  int v28; // er9
  int v29; // er11
  unsigned __int64 v30; // rdx
  _DWORD *result; // rax
  unsigned __int64 v32; // rdx
  __int16 v33; // ax
  int v34; // er8
  __int64 v35; // r8
  _DWORD *v36; // [rsp+48h] [rbp-51h]
  char v37; // [rsp+50h] [rbp-49h]
  char v38; // [rsp+54h] [rbp-45h]
  char v39; // [rsp+58h] [rbp-41h]
  char v40; // [rsp+5Ch] [rbp-3Dh]
  char v41; // [rsp+60h] [rbp-39h]
  __int64 v42; // [rsp+70h] [rbp-29h]
  __int64 v43; // [rsp+78h] [rbp-21h]
  char v44; // [rsp+80h] [rbp-19h]
  unsigned int v45; // [rsp+A8h] [rbp+Fh]
  int v46; // [rsp+ACh] [rbp+13h]
  unsigned int v47; // [rsp+B0h] [rbp+17h]
  int v48; // [rsp+B4h] [rbp+1Bh]
  int v49; // [rsp+B8h] [rbp+1Fh]
  int v50; // [rsp+BCh] [rbp+23h]
  unsigned int v51; // [rsp+C0h] [rbp+27h]
  unsigned int v52; // [rsp+C4h] [rbp+2Bh]

  v36 = a1;
  v2 = -256 * BYTE5(a2) - ((unsigned __int16)(15 - HIWORD(a2)) << 16) - BYTE4(a2) + 31;
  v3 = (unsigned __int64)((unsigned int)(unsigned __int16)(HIWORD(v2)
                                                         - (-256 * BYTE5(a2) - BYTE4(a2) + 31)
                                                         * (-256 * BYTE5(a2) - BYTE4(a2) + 31))
                        - 15
                        + (v2 << 16)) >> 16;
  v4 = (((unsigned __int16)(HIWORD(v2) - (-256 * BYTE5(a2) - BYTE4(a2) + 31) * (-256 * BYTE5(a2) - BYTE4(a2) + 31))
       - 15
       + (v2 << 16)) << 16)
     + (unsigned __int16)(v3
                        - (HIWORD(v2) - (-256 * BYTE5(a2) - BYTE4(a2) + 31) * (-256 * BYTE5(a2) - BYTE4(a2) + 31) - 15)
                        * (HIWORD(v2) - (-256 * BYTE5(a2) - BYTE4(a2) + 31) * (-256 * BYTE5(a2) - BYTE4(a2) + 31) - 15));
  v5 = (unsigned __int16)((HIWORD(v4) << 8) + (((HIWORD(v4) >> 8) - HIWORD(v4)) & 0xFF))
     + ((((unsigned __int8)(3
                          - 15
                          * ((unsigned __int16)(v3
                                              - (HIWORD(v2)
                                               - (-256 * BYTE5(a2) - BYTE4(a2) + 31)
                                               * (-256 * BYTE5(a2) - BYTE4(a2) + 31)
                                               - 15)
                                              * (HIWORD(v2)
                                               - (-256 * BYTE5(a2) - BYTE4(a2) + 31)
                                               * (-256 * BYTE5(a2) - BYTE4(a2) + 31)
                                               - 15)) >> 8)) << 8)
       + 65521
       + (unsigned __int8)(v3
                         - (BYTE2(v2) - (31 - BYTE4(a2)) * (31 - BYTE4(a2)) - 15)
                         * (BYTE2(v2) - (31 - BYTE4(a2)) * (31 - BYTE4(a2)) - 15))) << 16);
  v6 = (unsigned __int16)(31 - ((unsigned __int8)-HIBYTE(v5) << 8) - (unsigned __int8)(3 - BYTE2(v5))) >> 8;
  v7 = ((_DWORD)a2 << 16) + (unsigned __int16)(WORD1(a2) - a2); v8 = (unsigned __int64)(v7 + (unsigned __int16)(WORD1(a2) - a2 - 3) - (unsigned int)(unsigned __int16)(WORD1(a2) - a2)) >> 16;
  v9 = 3i64
     - 3
     * (((v7 + (unsigned __int16)(WORD1(a2) - a2 - 3) - (unsigned __int16)(WORD1(a2) - a2)) << 16)
      + (unsigned int)(unsigned __int16)(v8 - (WORD1(a2) - a2 - 3) * (WORD1(a2) - a2 - 3)))
     - ((unsigned __int64)(3
                         * ((unsigned __int16)((unsigned __int8)(3 - v6)
                                             + (((unsigned __int8)(1 - (31 - (3 - BYTE2(v5)))) + ((31 - (_WORD)v5) << 8)) << 8))
                          - (unsigned __int16)(15
                                             - ((unsigned __int8)(3 - v6)
                                              + (((unsigned __int8)(1 - (31 - (3 - BYTE2(v5)))) + ((31 - (_WORD)v5) << 8)) << 8)))
                          - ((unsigned __int8)(3 - v6)
                           + (((unsigned __int8)(1 - (31 - (3 - BYTE2(v5))))
                             + ((unsigned int)(unsigned __int16)(31
                                                               - ((HIWORD(v4) << 8) + (((HIWORD(v4) >> 8) - HIWORD(v4)) & 0xFF))) << 8)) << 8))
                          + 5)) << 32);
  if ( (signed __int64)((v9 << 32)
                      + HIDWORD(v9)
                      - (3
                       - 3
                       * (((v7 + (unsigned __int16)(WORD1(a2) - a2 - 3) - (unsigned __int16)(WORD1(a2) - a2)) << 16)
                        + (unsigned __int16)(v8 - (WORD1(a2) - a2 - 3) * (WORD1(a2) - a2 - 3))))
                      * (3
                       - 3
                       * (((v7 + (unsigned __int16)(WORD1(a2) - a2 - 3) - (unsigned __int16)(WORD1(a2) - a2)) << 16)
                        + (unsigned int)(unsigned __int16)(v8 - (WORD1(a2) - a2 - 3) * (WORD1(a2) - a2 - 3))))) < 0 ) { v45 = 983040 * (((unsigned int)(3 * dword_14003DB80) >> 16) + ((unsigned __int16)(3 * dword_14003DB80) << 16)) + (unsigned __int16)((15 * (((unsigned int)(3 * dword_14003DB80) >> 16)
                             + ((unsigned __int16)(3 * dword_14003DB80) << 16)) >> 16)
                           - abs(15 * ((unsigned int)(3 * dword_14003DB80) >> 16)));
    v46 = 31
        * (((unsigned __int16)-(signed __int16)(((unsigned __int16)-(signed __int16)dword_14003DB84 >> 8)
                                              + ((unsigned __int8)-(char)dword_14003DB84 << 8)) << 16) + (unsigned __int16)-((-65536 * (unsigned __int16)(3 - HIWORD(dword_14003DB84)) - (unsigned int)(unsigned __int16)(((unsigned __int16)-(signed __int16)dword_14003DB84 >> 8)
                                                               + ((unsigned __int8)-(char)dword_14003DB84 << 8)) + 1) >> 16));
    v33 = (unsigned __int64)(unsigned int)-dword_14003DB88 >> 16;
    v47 = 15
        * (((-65536 * dword_14003DB88
           + (unsigned int)(unsigned __int16)(v33 - -(signed __int16)dword_14003DB88 * -(signed __int16)dword_14003DB88)) >> 16)
         + ((unsigned __int16)(-(signed __int16)dword_14003DB88 * -(signed __int16)dword_14003DB88 - v33) << 16));
    v34 = ((((unsigned __int16)-HIWORD(dword_14003DB8C) << 16)
          + (unsigned __int16)(15
                             * ((unsigned __int8)dword_14003DB8C + ((unsigned __int8)(1 - BYTE1(dword_14003DB8C)) << 8)))) << 16)
        + (unsigned __int16)(((((unsigned __int16)-HIWORD(dword_14003DB8C) << 16)
                             + (unsigned int)(unsigned __int16)(15
                                                              * ((unsigned __int8)dword_14003DB8C
                                                               + ((unsigned __int8)(1 - BYTE1(dword_14003DB8C)) << 8)))) >> 16)
                           - abs(
                               15
                             * ((unsigned __int8)dword_14003DB8C + ((unsigned __int8)(1 - BYTE1(dword_14003DB8C)) << 8))));
    v48 = (v34 << 16)
        + (unsigned __int16)(HIWORD(v34)
                           - abs(
                               ((((unsigned __int16)-HIWORD(dword_14003DB8C) << 16)
                               + (unsigned int)(unsigned __int16)(15
                                                                * ((unsigned __int8)dword_14003DB8C
                                                                 + ((unsigned __int8)(1 - BYTE1(dword_14003DB8C)) << 8)))) >> 16)
                             - abs(
                                 15
                               * ((unsigned __int8)dword_14003DB8C + ((unsigned __int8)(1 - BYTE1(dword_14003DB8C)) << 8))))); v49 = -65536 * (unsigned __int16)(4 - HIWORD(dword_14003DB90)) - (unsigned __int16)(((unsigned __int16)(45 - 15 * dword_14003DB90) >> 8)
                           + ((unsigned __int8)(45 - 15 * dword_14003DB90 - 15) << 8))
        + 1;
    v42 = 0i64;
    v43 = 15i64;
    v41 = 0;
    v35 = (((unsigned __int16)-HIWORD(dword_14003DB94) << 16)
         + (unsigned int)(unsigned __int16)(15
                                          * ((unsigned __int8)dword_14003DB94
                                           + ((unsigned __int8)(1 - BYTE1(dword_14003DB94)) << 8)))) << 16;
    v50 = v35
        + (unsigned __int16)(((((unsigned __int16)-HIWORD(dword_14003DB94) << 16)
                             + (unsigned int)(unsigned __int16)(15
                                                              * ((unsigned __int8)dword_14003DB94
                                                               + ((unsigned __int8)(1 - BYTE1(dword_14003DB94)) << 8)))) >> 16)
                           - abs(
                               15
                             * ((unsigned __int8)dword_14003DB94 + ((unsigned __int8)(1 - BYTE1(dword_14003DB94)) << 8)))); v51 = -15 * (unsigned __int16)qword_14003DB98 - 983040 * ((unsigned int)qword_14003DB98 >> 16);
    v52 = (HIDWORD(qword_14003DB98) & 0xFFFF0000) + (unsigned __int16)(45 - 15 * WORD2(qword_14003DB98));
    sub_2(
      &v41,
      31i64,
      v35,
      &v45,
      __PAIR__(dword_14003DB94, dword_14003DB8C),
      __PAIR__(dword_14003DB90, dword_14003DB84),
      qword_14003DB98);
    sub_3(&v44, &v41);
    CxxThrowException(&v44, &_TI1_AVMyException__);
  }
  v10 = 14i64;
  v11 = 930i64 * *(unsigned __int8 *)(qword_14003FAA0 + 2) + 1190112520884487202i64;
  v12 = (unsigned __int64)((unsigned int)(unsigned __int16)(HIWORD(v2) - v2 * v2) - 15 + (v2 << 16)) >> 16;
  v13 = (((unsigned __int16)(HIWORD(v2) - v2 * v2) - 15 + (v2 << 16)) << 16)
      + (unsigned __int16)(v12 - (HIWORD(v2) - v2 * v2 - 15) * (HIWORD(v2) - v2 * v2 - 15));
  v14 = (unsigned __int16)((HIWORD(v13) << 8) + (((HIWORD(v13) >> 8) - HIWORD(v13)) & 0xFF))
      + ((((unsigned __int8)(3
                           - 15
                           * ((unsigned __int16)(v12 - (HIWORD(v2) - v2 * v2 - 15) * (HIWORD(v2) - v2 * v2 - 15)) >> 8)) << 8)
        + 65521
        + (unsigned __int8)(v12 - (BYTE2(v2) - v2 * v2 - 15) * (BYTE2(v2) - v2 * v2 - 15))) << 16);
  v15 = (unsigned __int8)(3
                        - ((unsigned __int16)(31
                                            - ((unsigned __int8)-HIBYTE(v14) << 8) - (unsigned __int8)(3 - BYTE2(v14))) >> 8))
      + (((unsigned __int8)(1 - (31 - (3 - BYTE2(v14)))) + ((unsigned __int16)(31 - v14) << 8)) << 8);
  v16 = ((_DWORD)a2 << 16)
      + (unsigned __int16)(WORD1(a2) - a2)
      + (unsigned __int16)(WORD1(a2) - a2 - 3)
      - (unsigned __int16)(WORD1(a2) - a2);
  v17 = 3i64
      - 3 * ((v16 << 16) + (unsigned int)(unsigned __int16)(HIWORD(v16) - (WORD1(a2) - a2 - 3) * (WORD1(a2) - a2 - 3)))
      - ((unsigned __int64)(3 * ((unsigned __int16)v15 - (unsigned int)(unsigned __int16)(15 - v15) - v15 + 5)) << 32);
  if ( 31 - 31 * v11 <= (signed __int64)((v17 << 32) + (unsigned int)(HIDWORD(v17) - v17 * v17)) )
  {
    v18 = (unsigned __int16)(HIWORD(v2) - v2 * v2) + (v2 << 16) - 15;
    v19 = ((v18 << 16) + (unsigned int)(unsigned __int16)(HIWORD(v18) - v18 * v18)) >> 16;
    v20 = (unsigned __int16)(((_WORD)v19 << 8)
                           + ((((unsigned __int16)(((v18 << 16) + (unsigned int)(unsigned __int16)(HIWORD(v18) - v18 * v18)) >> 16) >> 8)
                             - v19) & 0xFF))
        + ((((unsigned __int8)(3 - 15 * ((unsigned __int16)(HIWORD(v18) - v18 * v18) >> 8)) << 8)
          + 65521
          + (unsigned __int8)(BYTE2(v18) - v18 * v18)) << 16);
    v21 = (unsigned __int8)(3
                          - ((unsigned __int16)(31
                                              - ((unsigned __int8)-HIBYTE(v20) << 8) - (unsigned __int8)(3 - BYTE2(v20))) >> 8))
        + (((unsigned __int8)(1 - (31 - (3 - BYTE2(v20))))
          + ((unsigned __int16)(31
                              - (((_WORD)v19 << 8)
                               + ((((unsigned __int16)(((v18 << 16) + (unsigned int)(unsigned __int16)(HIWORD(v18) - v18 * v18)) >> 16) >> 8)
                                 - v19) & 0xFF))) << 8)) << 8);
    do
    {
      v23 = (15
           - v10
           + (unsigned __int16)(15 - v10)
           + ((15 - (_DWORD)v10) & 0xFFFF0000)
           - (unsigned __int64)(unsigned int)(15 - v10))
          * (31 - 31 * v11);
      v10 = ((unsigned __int16)(15 - v10) + ((15 - (_DWORD)v10) & 0xFFFF0000)) * (31 - 31 * (_DWORD)v11)
          - (unsigned __int64)((unsigned __int16)v23 + ((unsigned int)v23 & 0xFFFF0000))
          - v23
          + 15;
      v11 = (-31 * v11 + 1) * (1190112520884487201i64 - 31i64 * *(unsigned __int8 *)(qword_14003FAA0 + 2));
      v22 = 3i64
          - 3
          * (((((_DWORD)a2 << 16)
             + (unsigned __int16)(WORD1(a2) - a2)
             + (unsigned __int16)(WORD1(a2) - a2 - 3)
             - (unsigned __int16)(WORD1(a2) - a2)) << 16)
           + (unsigned int)(unsigned __int16)(((((_DWORD)a2 << 16) + (unsigned __int16)(WORD1(a2) - a2) + (unsigned __int16)(WORD1(a2) - a2 - 3) - (unsigned int)(unsigned __int16)(WORD1(a2) - a2)) >> 16)
                                            - (WORD1(a2) - a2 - 3) * (WORD1(a2) - a2 - 3)))
          - ((unsigned __int64)(3 * ((unsigned __int16)v21 - (unsigned int)(unsigned __int16)(15 - v21) - v21 + 5)) << 32);
    }
    while ( 31 - 31 * v11 <= (signed __int64)((v22 << 32) + (unsigned int)(HIDWORD(v22) - v22 * v22)) ); } v24 = ((15 - v10 + (unsigned __int16)(15 - v10) + ((15 - (_DWORD)v10) & 0xFFFF0000) - (unsigned __int64)(unsigned int)(15 - v10)) >> 32)
      * (((unsigned __int16)(word_14003DB70 - 16916) << 16)
       + (unsigned __int16)(word_14003DB68
                          + 31518
                          + (unsigned __int8)(93 * (word_14003DB68 + 30) - 67 * *(_BYTE *)(qword_14003FAA0 + 2) - 74)
                          - (unsigned __int8)(word_14003DB68 + 30)));
  v25 = (((((unsigned __int16)(21587 - word_14003DB64 - 1) << 16) + (unsigned __int16)(-31 * (-31539 - word_14003DB60))) << 16)
       + (unsigned __int16)(((((unsigned __int16)(21587 - word_14003DB64 - 1) << 16) + (unsigned int)(unsigned __int16)(-31 * (-31539 - word_14003DB60))) >> 16)
                          - -31 * (-31539 - word_14003DB60) * -31 * (-31539 - word_14003DB60)))
      * ((unsigned __int16)(-21845 * HIWORD(v24))
       + ((unsigned __int16)((unsigned __int8)v24 - (unsigned __int8)(15 - v24) - v24) << 16));
  sub_4(&v38, &v37);
  v26 = 31 * ((unsigned __int64)sub_5(&v40, &v39) + 2032094506);
  v29 = ((unsigned __int16)(HIWORD(v26) - word_14003DB6C) + ((unsigned __int16)(31 * v26) << 16))
      * (v28 + ((v27 + (unsigned __int16)abs((_WORD)v28)) << 16));
  v30 = ((unsigned __int64)((v29 & 0xFFFF0000) + (unsigned __int16)v29) << 32) + (v25 >> 16)
      + ((v25 + (unsigned __int16)abs(HIWORD(v25))) << 16)
      + 31;
  result = v36;
  v32 = ((unsigned __int64)(-1431655765 * (_DWORD)v30
                          + (unsigned __int16)(21845 * v30)
                          - (unsigned __int16)(-21845 * v30)
                          + (unsigned __int16)(15 - 21845 * v30)
                          - (unsigned int)(unsigned __int16)(21845 * v30)) << 32)
      + (unsigned int)-((HIDWORD(v30) << 16) + 65537 * (HIDWORD(v30) >> 16));
  *v36 = HIDWORD(v32);
  v36[1] = ((v32 >> 32) + ((unsigned __int64)(unsigned int)v32 << 32)) >> 32;
  return result;
}
This is all what the-best-available-decompiler was able to infer from our automagically-obfuscated code

Not much if you ask me <wink />; figuring out what this function does – is very non-obvious to say the least (of course, it still does compute factorial, it is just that finding it out will take some hours – if not days). And the function-which-deals-with-our-exception (and which previously gave away the “Negative…” literal) now looks as follows:

//Listing 29.factorial.obf.sub_3.decompiled.c
_QWORD *__fastcall sub_3(_QWORD *a1, unsigned __int64 *a2)
{
  unsigned __int64 *v2; // rbx
  _QWORD *v3; // rdi
  unsigned __int64 v4; // rdx
  unsigned __int64 v5; // rcx
  unsigned __int64 v6; // rdx
  unsigned __int64 v7; // rax

  v2 = a2;
  v3 = a1;
  *a1 = &off_140037FC0;
  sub_6(a1 + 1);
  v4 = v2[3];
  if ( v4 >= 0x10 )
  {
    v5 = *v2;
    v6 = v4 + 1;
    if ( v6 >= 0x1000 )
    {
      if ( v6 + 39 <= v6 || v5 & 0x1F || (v7 = *(_QWORD *)(v5 - 8), v7 >= v5) || v5 - v7 - 8 > 0x1F )
        sub_7();
      v5 = *(_QWORD *)(v5 - 8);
    }
    sub_8(v5);
  }
  v2[2] = 0i64;
  v2[3] = 15i64;
  *(_BYTE *)v2 = 0;
  return v3;
}

We can see no literal in plain sight anymore – which is certainly a Good Thing(tm)…

In addition, none of the code in our factorial() function is relatively-easily-detectable “dead code”: all the code can be executed (and the code along those branches which don’t raise exceptions, is executed during normal operation of the program).

Bottom line:

While it is not 100% incomprehensible (nothing is), we did manage to obtain a very poorly readable machine code out of our rather-readable source.

This means that using this type of obfuscation, we’re creating an asymmetry in our favour – while we’re taking relatively limited efforts on our side, we’re increasing the attacker’s efforts by orders of magnitude. This kind of asymmetrical efforts is exactly what-good-protection should do <smile />.

Ever-Changing Binary Code

But wait – we’re still not done with the benefits of ithare::obf. On each subsequent release build of our Client, we should use different ITHARE_OBF_SEED.6

Using this technique, for each Client release, we’ll generate different (but still functionally equivalent) obfuscation code; this, in turn, will make a large chunk of attacker’s-efforts-coming-from-previous-build, useless.7

Let’s compare two decompiles of the exactly same source code, but with different ITHARE_OBFUSCATION_SEEDs (screenshot from https://www.diffnow.com/):

Difference between two obfuscations generated from exactly the same source code

As we can see, obfuscated code on both sides uses the same principles, but the code itself is very different; this, in turn, means that for each new Client build at least a significant portion of reverse engineering efforts should be repeated.

Moreover,

If we manage to release new versions of our Client faster than attacker can reverse engineer our automagically-generated stuff – bingo! We won!8

Once again, what we’re trying to do here – is to create an asymmetry, when something-which-comes-to-us-for-free (namely, automagically generated different obfuscation for each build) causes significant trouble to the attacker.


5 currently – only 64-bit seeds are supported, but in the future, I hope to extend seeds to 128-bit ones.
6 Ideally – we should use crypto-strength random number as our ITHARE_OBF_SEED.
7 Among other things, techniques such as F.L.I.R.T. and F.L.A.R.E. are rendered useless for the seriously-obfuscated functions.
8 Well, we’ll also have to integrate our obfuscations into our Client-Server protocol, which can be not-so-trivial (but still doable), more on it in [[TODO]] section below.

 

Caveats

Of course, as any tool of this kind, there are certainly caveats (in addition to any bugs-due-to-such-complicated-library-being-pre-alpha).

Still Security-by-Obscurity <sad-face />

First, most importantly, all-the-information-attacker-needs, is still in the code <sad-face />; what we’re doing, is merely hiding things, which cannot be really secure. This means that as-a-way-to-protect-your-intellectual-property (~=”your offline program from being pirated”), this method still isn’t likely to work9 (not to mention that motivation for most of such protections is controversial).

Still, in the context of MOGs, IMNSHO there is considerable value in hiding things in the manner described above (especially if such obfuscation is applied consistently across lots of the code). In particular, as we noted above, if attackers cannot cope with the pace of our new releases – it means that we won’t have problem with bot fighters/cheaters anymore. Sure, at some point they will invent tools-which-can-automagically-deobfuscate-our-stuff, but first, it will take time, and second – after they do it, we’ll be in position to readjust (enabling us to go into arms race with the attackers, which is already a tremendous improvement over the current state of affairs); moreover – my gut feeling is that in this battle we do have an upper hand (just because the attacker has to solve an inverse problem compared to us solving forward problem, and solving inverse problems tends to be much more complicated than solving forward problems).


9 OTOH, if speaking about programs which go online at least occasionally – this technique can be used to make hacks much more difficult (how – is beyond the scope of this book, but a requirement for a program to go online will allow…)

 

Performance

As for performance – of course, convoluted obfuscated binary code does take its toll; that’s exactly why all the obfuscation primitives are supposed to mention maximum-number-of-CPU-cycles-we-can-spare. And of course, for 3D vertex-level code using any obfuscation is a fallacy.

On the other hand – it is well-known that 5% of code take 95% of CPU time (and conversely, 95% of code take only 5% of CPU time); this tends to stand for games too (in particular, script engines tend to be much less optimized than 3D graphics). As a result, our solution is rather obvious: let’s obfuscate non-performance-critical code.

Moreover,

It is non-performance-critical-code which tends to be of the most interest to the attackers(!)

Indeed, trying to extract usable information (such as “opponent is at 11 o’clock”) from vertex-level manipulations is a gargantuan task; it is usually much easier to get the opponent’s coordinates right after the network-packet-with-these-coordinates is parsed; however – as this parsing doesn’t happen too often, it can be obfuscated without causing too much of the performance penalty.

Let’s make a quick calculation. Our non-obfuscated factorial() function (when called to calculate factorial(20)), takes about 10ns of time. Our seriously-obfuscated version shown above, takes about 200ns – which is whopping 20x longer. However, let’s not be hasty to write it off on this “20x slower” basis. If we’re using some-function-obfuscated-in-a-similar-manner to parse our network packets – we’ll have to call it only once per network tick, i.e. 20 times per second; under these conditions – adding 200ns per network tick will take only 4 microseconds out of each second, or mere 0.0004%. In other words –

we can easily use as many as 10000 such-obfuscations-as-factorial()-above-uses per each network packet – and performance penalty will still be unnoticeable.

Sure, we DO need to be very careful about performance – but the division between “high-performance code” and “code in need for obfuscation” is there, so it is possible to exploit it.

Open Source Obfuscation is NOT an Oxymoron

Hare wondering if you are crazy:how can we possibly both obfuscate things and make them 100% public at the same time?At the first glance, saying “open source” and “obfuscation” in the same breath sounds as an oxymoron; how can we possibly both obfuscate things and make them 100% public at the same time? However, as soon as we say that we’re using randomized code generation – it starts to make sense.

While ithare::obf itself is open-source (and therefore can be validated by lots of people(!)), ITHARE_OBF_SEED parameter which was used to compile each of the builds, is certainly intended to stay out of sight, so that the attacker won’t know what exact obfuscation was generated for the executable he’s dealing with. This, in turn, means that

It IS possible to have ALL the code open, just ITHARE_OBF_SEED private – and still enjoy some protection provided by obfuscation

In a sense – it is similar to having encryption algorithms completely public, with just their keys being secret; here – ithare::obf plus our obfuscated-source-code is our public algorithm, and ITHARE_OBF_SEED is our key.

In particular (as we’ll discuss in [[TODO]] section below) we can try to use similar obfuscation to obfuscate Client-2-Server protocols in a 100%-open-source game (just not publishing specific ITHARE_OBF_SEED used for building a specific Server+specific Clients). This, in turn, would allow anybody to compile and run their own Server+Client; it is just that ITHARE_OBF_SEED will be unknown for each such Server+Client pair – so while all such Servers+Client pairs are functionally the same (and play the same game) – protocols for each such pair will be different, with the automagically-generated obfuscation code providing non-trivial protection from cheaters and bots (and moreover, for each such Server+Client pair obfuscation will need to be broken separately).

In addition, even now ithare::obf has a modular structure, where it is rather easy to add your own primitives (and ithare::obf-with-your-own-primitives is no longer 100% open source, which may both provide additional protection and give you more peace of mind). In the future, I expect to provide an “official” way to expand ithare::obf with your-own-primitives without the need to modify its source code.

Summary on ithare::obf

To summarize our findings so far:

  • It is possible to have a code which is both rather readable in source, and pretty much unreadable in binary
    • As a side benefit – it is possible to re-generate our binary code on each build at virtually zero cost for us, (while making the life of attackers much more complicated)
  • ithare::obf library is currently immature (and certainly NOT production-ready), but it does provide proof-of-concept implementation for these things
    • Even current state-of-the-art IDA decompiler is no match for even-current rather-basic implementation of ithare::obf
  • It is still not a silver bullet – but can be used as a building block to build very serious protection from cheaters (in fact, most of anti-cheater protection techniques would benefit from this kind of randomized obfuscation code).
  • Hare with smiley sign:Performance-wise, we can do A LOT of obfuscation per network tickPerformance-wise, we can do A LOT of obfuscation per network tick
    • And apparently, it is rather-performance-non-critical stuff which is the easiest to attack, so it is exactly the thing which we should obfuscate (alongside with any anti-cheating routines we want to have)
  • Open-source obfuscation is possible (as long as we keep seed out of sight); in a sense – it is about the same as having crypto-algorithm public, while keeping its key secret (with our ithare::obf directly corresponding to the crypto-algorithm, and our seed corresponding to the key).

[[To Be Continued…

Tired hare:This concludes beta Chapter 29(h) from the upcoming book “Development and Deployment of Multiplayer Online Games (from social games to MMOFPS, with social games in between)”.

Stay tuned for Chapter 29(i), where we’ll discuss issues related to protocol obfuscation (including versioning)]]

 

Don't like this post? Comment↯ below. You do?! Please share: ...on LinkedIn...on Reddit...on Twitter...on Facebook

Acknowledgement

Cartoons by Sergey GordeevIRL from Gordeev Animation Graphics, Prague.

Join our mailing list:

Comments

  1. Thomas says

    Hey I just wanted to write you a comment letting you know there is in fact interest in this 🙂

    We’re planning on using this in a game that’s currently being fixed up by me and some members of the community. It’s not a big game at all, it’s an old free MMO that desperately needs something like this.

    Just wanted to let you know that, based on your github request to do so. I also might grab the book, too. Thanks for your work on this!

  2. Jaydan Cowell says

    Very interesting and highlighted some key topics in the fight against crackers; I wish this project the best, thanks for the informative article.

Leave a Reply to Helmut Mülner Cancel reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.