Bot Fighting 203: (Re)Actors and Short Stacks

	Author:	“No Bugs” Hare
	Job Title:	Sarcastic Architect
	Hobbies:	Thinking Aloud, Arguing with Managers, Annoying HRs, Calling a Spade a Spade, Keeping Tongue in Cheek

[Chapter 29(k) from “beta” Volume VIII]

By now, we already have a very solid foundation for our anti-bot efforts (IMNSHO, especially important is our automagical randomized obfuscation discussed in Bot Fighting 201: Declarative Data+Code Obfuscation with Build-Time Polymorphism in C++). Still, there is yet another thing which can be improved, and it is related to my favorite-for-a-long-while-topic <wink />: (Re)Actors!¹

There was a whole huuuge chapter on (Re)Actors in Vol. II, so I won’t repeat myself too much, and will provide just a quick recap:

(Re)Actor is a thing (“object” in OO-speak) which has its own state, and a react() function.
react() function, not really surprisingly, reacts to incoming events – modifying (Re)Actor state if necessary.
All the processing within react() is serialized (~=”as if it happens in one single thread”).
- Most importantly – we DON’T need to think about thread sync while implementing our game/business logic within (Re)Actor.
- OTOH, there are certain tricks to “offload” some work to other (Re)Actors/threads, but thread sync within react() is still out of the question.²
It is essentially the same as “ad-hoc Finite State Machine”, “event-driven program”, Akka’s Actor, Node.js node, etc. etc.
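To make the recap a bit more tangible, here is a minimal sketch of a (Re)Actor in C++. All the names in it (Event, EventType, CounterReactor, run_until_empty) are made up for illustration and are not taken from Vol. II; the point is only that state is private, react() is the single way to change it, and the loop feeds events in one at a time.

```cpp
#include <cstdint>
#include <queue>

// Hypothetical event type, for illustration only.
enum class EventType { KeyPressed, TimerTick };
struct Event {
    EventType type;
    std::uint32_t value;  // key code or timestamp, depending on type
};

// A (Re)Actor: private state plus a react() which is the only way to modify it.
class CounterReactor {
public:
    void react(const Event& ev) {
        switch (ev.type) {
            case EventType::KeyPressed: ++keys_seen_; break;
            case EventType::TimerTick:  last_tick_ms_ = ev.value; break;
        }
    }
    std::uint32_t keys_seen() const { return keys_seen_; }
    std::uint32_t last_tick_ms() const { return last_tick_ms_; }

private:
    std::uint32_t keys_seen_ = 0;
    std::uint32_t last_tick_ms_ = 0;
};

// The loop hands events to react() strictly one after another, so calls
// never overlap and the logic inside react() needs no thread sync.
inline void run_until_empty(CounterReactor& r, std::queue<Event>& q) {
    while (!q.empty()) {
        r.react(q.front());
        q.pop();
    }
}
```

In a real Client the queue would be filled by other threads (and would need to be thread-safe), but that concern stays in the infrastructure code, outside react().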

Now, let’s see how (Re)Actors can help us to hide-things-from-the-bot-writer-even-better <smile />.

¹ If you feel I am spending too much time speaking about (Re)Actors – you may have a point, but IMO they do provide all-those-benefits-I-am-discussing – and even more <wink />.

² CAS-size nano-(Re)Actors discussed in [Hare17] are not in our current scope.

Separating System Calls from Game Logic

One thing which tends to give away A LOT of information about our code is system calls. And hiding system calls, while possible to some extent, doesn’t hold its ground against a dedicated attacker.³ In other words –

We should assume that each of our system calls is known to the attacker <sad-face />.

What we can hide though, is the reason why we’re issuing a certain system call (or more generally – we can hide the context of the system call). And (Re)Actors tend to help with it greatly. Let’s consider a typical MOG, which reads user inputs and communicates with the Server-Side.

In a naïve-from-bot-fighting-point-of-view implementation, we could simply poll the state of keyboard and mouse (using system calls) when we feel that we need it, then we’ll process the input, and then – as soon as we have all the information necessary – we’ll call socket functions to send player inputs to the Server-Side. With this approach, however, we’ll be giving away quite a bit of information to the attacker <sad-face />. More specifically, if our code looks like

//Listing 29.1
while(1) {
  do_something1();
  ReadKeyboard();
  do_something2();
  SendToSocket();
  do_something3();
}

– we’re giving away information about where exactly in our code we’re going to process the inputs, and where we’re producing the outputs. This tends to simplify the attacker’s job significantly.

Of course, whatever-we-do, all these things are still bound to happen; it is when and how they happen, which we can play with. Written in anti-bot (Re)Actor-style, the same thing would look along the following lines:

//Listing 29.2
while(1) {
  ReadKeyboard();
  
  //{ here force-inlined reactor.react() starts
  do_something1();//force-inlined
  do_something2();//force-inlined
  schedule_message_to_send_to_socket();
    //no system call here,
    //  so it is not immediately obvious
  do_something3();//force-inlined
  //} here force-inlined reactor.react() ends
  
  read_scheduled_message();//force-inlined for a good measure
  SendToSocket();
}

There are two significant problems which this approach creates for the attacker. First of all, we’re no longer revealing any information about what-is-going-on-around-those-easily-visible-system-calls. Indeed – with the code shown on Listing 29.2, our reactor.react() becomes a monolithic purely-moving-bits-around function, with no easy way to split it into different parts. And any time we can say something becomes monolithic – it is an inherently Good Thing™ for our fight against bot writers.
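A compilable take on the shape of Listing 29.2 might look as follows. It is only a sketch under assumptions: InputEvent, OutMessage, GameLogicReactor and loop_once are hypothetical names, and the two callables passed to loop_once() stand in for the real keyboard-polling and socket-sending system calls. What matters is that react() only appends to an outbox, and the system calls sit before and after it.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical message types; not taken from the book.
struct InputEvent { std::uint8_t key; };
struct OutMessage { std::uint8_t opcode; std::uint8_t arg; };

class GameLogicReactor {
public:
    // Purely moving bits around: no system calls in here.
    // Outgoing traffic is only *scheduled* by appending to the outbox.
    void react(const InputEvent& ev, std::vector<OutMessage>& outbox) {
        ++moves_;
        if (ev.key != 0) {
            outbox.push_back(OutMessage{0x01, ev.key});
        }
    }
    std::uint32_t moves() const { return moves_; }

private:
    std::uint32_t moves_ = 0;
};

// One loop iteration. read_input and send are the only places where the
// OS would be touched; both are outside react().
template <class ReadInput, class Send>
void loop_once(GameLogicReactor& r, ReadInput read_input, Send send) {
    InputEvent ev = read_input();         // stands in for ReadKeyboard()
    std::vector<OutMessage> outbox;
    r.react(ev, outbox);                  // no system calls inside
    for (const OutMessage& m : outbox) {  // stands in for SendToSocket()
        send(m);
    }
}
```

Passing the system calls in as callables also makes it easy to run the very same react() against recorded inputs, which is where determinism comes into play.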

Making sure that the react() function merely moves bits around, is going to be rather unusual for non-(Re)Actor developers. On the other hand, this whole subject happens to be extremely well-aligned with a discussion on deterministic (Re)Actors in Vol. II’s chapter on (Re)Actors (and deterministic (Re)Actors happen to be a Really Good Thing™ for many non-bot-fighting reasons – from an ability to debug a crash which happened in production(!), to replay-based regression testing etc.).

More formally – if we achieve determinism for our (Re)Actors, using any of the techniques discussed in Vol.II, except for “Call Wrapping” technique – our code will look pretty much like Listing 29.2 to the attacker.

Now, if we take a look at the table of “how-to-achieve-determinism-for-different-system-calls” in Vol.II, we’ll see that the only case when Call Wrapping is pretty much unavoidable, is related to measuring-time-spent-within-our-react()-call (time in a usual sense, which translates into “time-when-the-input-was-received” or “time-when-we-started-processing-of-the-input”, is not a problem). TBH, it is not a thing which we need too often in a usual game/business program – and even if we do need it, it is not going to tell the attacker too much. And for any other purpose – we can avoid Call Wrapping if we want to (and, as mentioned above, determinism-without-Call-Wrapping is going to keep our code along the lines of Listing 29.2).
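One way to read “time in a usual sense is not a problem” is that the timestamp travels inside the event, so react() never asks the OS for the clock. The sketch below illustrates this under assumptions: TimedInput, CooldownReactor, replay, the 500 ms cooldown rule and the key code are all invented for the example, and timestamps are assumed to be non-decreasing.

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint8_t  kFireKey    = 1;    // hypothetical key code
constexpr std::uint64_t kCooldownMs = 500;  // hypothetical game rule

// The event loop stamps each input OUTSIDE react(), at time of receipt.
struct TimedInput {
    std::uint64_t received_at_ms;
    std::uint8_t  key;
};

class CooldownReactor {
public:
    // Accepts a "fire" key at most once per kCooldownMs, using only the
    // timestamp carried by the event (no clock call, no Call Wrapping).
    bool react(const TimedInput& ev) {
        if (ev.key != kFireKey) return false;
        if (has_fired_ && ev.received_at_ms - last_fire_ms_ < kCooldownMs) return false;
        has_fired_ = true;
        last_fire_ms_ = ev.received_at_ms;
        ++shots_;
        return true;
    }
    std::uint32_t shots() const { return shots_; }

private:
    bool has_fired_ = false;
    std::uint64_t last_fire_ms_ = 0;
    std::uint32_t shots_ = 0;
};

// Replaying a recorded input log through a fresh (Re)Actor always ends in
// the same state, because react() depends on nothing but its inputs.
inline std::uint32_t replay(const std::vector<TimedInput>& log) {
    CooldownReactor r;
    for (const TimedInput& ev : log) r.react(ev);
    return r.shots();
}
```

Since the only system call involved (reading the clock) happens next to the input-reading one, the attacker sees nothing new around react() itself.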

As a result, my recommendation here is very clear (and goes perfectly in line with recommendations from Vol. II):

Use (Re)Actors from the very beginning – at least for your Game Logic/Business Logic.
Make your (Re)Actors deterministic⁴ before going into beta – it will allow you to debug very nasty things much more easily (also make sure to take a look at [Aldridge11], where they were using the same approach to optimize traffic under real-world conditions for Halo: Reach).
After you’ve launched and got your 10K+ players – make sure to start migration from Call-Wrapping to other methods-to-ensure-determinism. This will help your bot fighting efforts quite a bit – without changing your already-deterministic code too much.

³ While we can try to hide system calls from static analysis, when being run under debugger, they are still easily visible <sad-face />.

⁴ More formally – “self-executable deterministic”, see Vol. II for a relevant discussion.

Short-Stacking the Attacker⁵

The second problem which the code on Listing 29.2 is going to cause attackers, is related to the call stack. In practice, reverse-engineers-around-the-world are heavily relying on the call stack (or, more precisely – on automated analysis of stack frames and function call parameters). And given that we’re using a standard compiler, hiding current stack frames completely is pretty difficult; after all, both stack frames and calling conventions are defined in the respective ABI (and even if they weren’t – reverse-engineering them once for each compiler is not rocket science).

However, with the code shown in the Listing 29.2, the call stack will be very short – and moreover, there will be almost-zero information available via reading the stack frames. Indeed – as long as we’re within the force-inlined react() (and assuming that all the do_somethingN() functions are force-inlined too) – we will be at exactly the same stack frame (located just 2 or 3 levels from the system-level thread-creating function), and the stack frame won’t change as the attacker goes through our monolithic react() function.⁶
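For reference, force-inlining is spelled differently on different compilers. A minimal sketch, assuming MSVC or a GCC/Clang-compatible compiler (the FORCE_INLINE macro name, State, and the bodies of do_somethingN() are placeholders made up for this example):

```cpp
#include <cstdint>

// Compiler-specific spelling of "really inline this".
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fallback: merely a hint to the compiler
#endif

// Placeholder state and logic, for illustration only.
struct State { std::uint32_t a; std::uint32_t b; };

FORCE_INLINE void do_something1(State& s) { s.a += 3; }
FORCE_INLINE void do_something2(State& s) { s.b ^= s.a; }
FORCE_INLINE void do_something3(State& s) { s.a = (s.a << 1) | (s.b & 1u); }

// With everything force-inlined, the three calls below should leave no
// call instructions (and no extra stack frames) in the generated code.
FORCE_INLINE void react(State& s) {
    do_something1(s);
    do_something2(s);
    do_something3(s);
}
```

Even with these attributes, compilers may decline to inline in some cases (recursion being the obvious one), so it is worth checking the disassembly of a release build rather than taking the attribute on faith.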

In practice, we’re likely to have some of the function calls within our react() function non-inlined (though as of 2017, having force-inlined functions as large as 20K in size is perfectly possible). However, as long as each of our functions is a 20K-byte monolith, it is not going to provide too much improvement for the attacker –

As long as we’re seriously obfuscating the interfaces⁷ of such non-inlined functions

(preferably – using randomized techniques discussed in [[TODO]] section above).

Overall, dealing with 20K-byte monster functions which you have absolutely no idea about – and whose parameters and return values are severely obfuscated – is very high on the list of the worst-nightmares-even-for-a-serious-bot-writer.

Last but not least, let’s note that

We should obfuscate all the (Re)Actor parameters very thoroughly.

For this purpose – we can easily use about-the-same techniques which were discussed in [[TODO]] section above with regard to protocol obfuscation (though normally versioning won’t be required).

⁵ Has nothing to do with poker, nor with pancakes.

⁶ Of course, the CPU stack will change (as the compiler will use it to push/pop temporaries), but as these pushes/pops are completely unstructured – there will be no easy way to extract information-about-our-program-structure from them.

⁷ =”parameters and returns”

Confusing Things Further: Obfuscate-Only (Re)Actors

You don’t have to run faster than the bear to get away.

You just have to run faster than the guy next to you.

— Jim Butcher —

As soon as we use the tricks discussed above, we’re already pretty well positioned against the-guy-next-to-us.⁸ However, we don’t need to stop at this point. Let’s take a look at the classical (Re)Actor-based model discussed in Vol. II:

Here, everything is rather clear: we have a Game Logic Thread running its own (Re)Actor, an Animation&Rendering Thread running its own (Re)Actor, and a Communications Thread running its own (Re)Actor. In practice, it is Game Logic and Communication (Re)Actors which are usually most vulnerable.⁹ With this in mind, let’s see what we can do to obfuscate it further: let’s add three additional threads+(Re)Actors, which do nothing but obfuscate things:

These three new (Re)Actors do nothing – except for de-obfuscating incoming messages (using source-(Re)Actor-to-obfuscating-(Re)Actor obfuscation), and re-obfuscating them using the obfuscating-(Re)Actor-to-target-(Re)Actor obfuscation.
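In code, such an obfuscate-only (Re)Actor could be as small as the sketch below. Please note that LinkObfuscation here is a deliberately trivial placeholder (XOR with a key byte plus a bit rotation), NOT the randomized build-time obfuscation discussed earlier in this chapter; the class names and key values are assumptions made for illustration.

```cpp
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Placeholder per-link obfuscation: XOR with a key byte, then rotate each
// byte left by 3 bits. A real build would plug in generated obfuscations.
struct LinkObfuscation {
    std::uint8_t key;

    Bytes obfuscate(Bytes data) const {
        for (auto& b : data) {
            const std::uint8_t x = static_cast<std::uint8_t>(b ^ key);
            b = static_cast<std::uint8_t>((x << 3) | (x >> 5));
        }
        return data;
    }
    Bytes deobfuscate(Bytes data) const {
        for (auto& b : data) {
            const std::uint8_t x = static_cast<std::uint8_t>((b >> 3) | (b << 5));
            b = static_cast<std::uint8_t>(x ^ key);
        }
        return data;
    }
};

// Does nothing useful for the game: strips the source-to-here obfuscation
// and applies the here-to-target one.
class ObfuscateOnlyReactor {
public:
    ObfuscateOnlyReactor(LinkObfuscation in, LinkObfuscation out) : in_(in), out_(out) {}

    Bytes react(const Bytes& incoming) const {
        return out_.obfuscate(in_.deobfuscate(incoming));
    }

private:
    LinkObfuscation in_;
    LinkObfuscation out_;
};
```

The same message therefore looks different on each of the two queues it passes through, so matching bytes across queues no longer works for the attacker.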

Now, the whole thing became a bit less obvious to the attacker – but we still can do more than that <smile />.

⁸ Rough translation: “we’re already better-protected than other games, which means that those-attackers-looking-to-make-money-out-of-it are rather likely to pass on us”. For a discussion on economy of cheating, see Vol. I’s chapter on Cheating.

⁹ If it so happens that for your game, it is Animation&Rendering (most likely, only a part of it which is closest-to-the-Game-Logic) which is likely to be attacked – the same thing can be done for it too.

Obfuscating Queues

As a next step – we can obfuscate not only the data which go in the queues (this we have already done), but also the queues themselves. In other words – we can remove the usually-naturally-occurring guarantee that the-same-queue-is-used-for-communications between two specific (Re)Actors. This can be implemented as follows:

We can have a pool of the queues.
At start, a number of queues from the pool get randomly assigned to specific threads.
At random intervals, we can switch between different queues as follows:
- Receiver decides to change the queue, obtains “new queue to be switched to” from the pool, and starts to listen to messages from both queues.¹⁰
- Receiver sends a message to all its senders that they should start sending their messages addressed to it, to the new queue; these messages can go either directly, or indirectly (and timing of this message is not really important).
- On receipt of this message – each sender starts to use the new queue, and sends a confirmation message back to the receiver.
- As soon as receiver gets all the confirmation messages – it can return the old queue back to the pool (where it can be picked up by a different thread).
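The switching protocol above can be sketched as a single-threaded simulation. Everything here (QueuePool, Sender, Receiver, and their methods) is a hypothetical illustration rather than an existing implementation; real queues would be thread-safe, and the switch request and confirmation would travel as messages instead of direct calls. The sketch also assumes that the receiver drains the old queue before handing it back to the pool.

```cpp
#include <cstddef>
#include <deque>
#include <set>
#include <vector>

struct Message { int from; int payload; };
using Queue = std::deque<Message>;

// Pool of physical queues, addressed by index.
class QueuePool {
public:
    explicit QueuePool(std::size_t n) : queues_(n) {
        for (std::size_t i = 0; i < n; ++i) free_.insert(i);
    }
    std::size_t acquire() {  // assumes the pool is not exhausted
        const std::size_t id = *free_.begin();
        free_.erase(free_.begin());
        return id;
    }
    void release(std::size_t id) { free_.insert(id); }
    bool is_free(std::size_t id) const { return free_.count(id) != 0; }
    Queue& at(std::size_t id) { return queues_[id]; }
private:
    std::vector<Queue> queues_;
    std::set<std::size_t> free_;
};

struct Sender {
    int id;
    std::size_t target_queue;
    void send(QueuePool& pool, int payload) {
        pool.at(target_queue).push_back(Message{id, payload});
    }
    // Switch to the new queue; the return value stands in for the
    // confirmation message sent back to the receiver.
    int on_switch_request(std::size_t new_queue) { target_queue = new_queue; return id; }
};

class Receiver {
public:
    Receiver(QueuePool& pool, std::size_t n_senders)
        : pool_(pool), n_senders_(n_senders), current_(pool.acquire()) {}

    std::size_t sender_queue() const { return switching_ ? next_ : current_; }

    // Obtain a new queue from the pool and start listening to both.
    std::size_t begin_switch() {
        next_ = pool_.acquire();
        switching_ = true;
        confirmed_.clear();
        return next_;
    }
    void on_confirmation(int sender_id) { confirmed_.insert(sender_id); }

    // Drain old queue first, then the new one; once every sender has
    // confirmed, the (now empty) old queue goes back to the pool.
    std::vector<Message> poll() {
        std::vector<Message> out;
        drain(current_, out);
        if (switching_) {
            drain(next_, out);
            if (confirmed_.size() == n_senders_) {
                pool_.release(current_);
                current_ = next_;
                switching_ = false;
            }
        }
        return out;
    }
private:
    void drain(std::size_t id, std::vector<Message>& out) {
        Queue& q = pool_.at(id);
        out.insert(out.end(), q.begin(), q.end());
        q.clear();
    }
    QueuePool& pool_;
    std::size_t n_senders_;
    std::size_t current_;
    std::size_t next_ = 0;
    bool switching_ = false;
    std::set<int> confirmed_;
};
```

Note that while the switch is in progress, messages from the same sender may sit in both queues; draining the old queue before the new one keeps per-sender ordering in this simulation, but a multithreaded version would need to think this through separately.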

As a result, if the attacker monitors one specific physical queue (and queues are pretty difficult to hide) in an attempt to intercept all the messages between two specific parties – he is going to be quite surprised <evil-grin />.

¹⁰ Depending on technology in use, it can be rather tricky, but is always doable (in extreme cases – via creating an additional thread).

Tor-Like Network of (Re)Actors

Tor is free software for enabling anonymous communication. The name is derived from an acronym for the original software project name 'The Onion Router'.

— Wikipedia —

But even this is not necessarily the end of our obfuscation efforts. To make things even worse for the attacker, we can make a Tor-like network of (Re)Actors which play ping-pong with our messages on the way between parties, before the messages get to their destinations. Very briefly, Tor-like network can be built from our (Re)Actors as follows:

In addition to our three real (Re)Actors, we have a dozen purely-obfuscating (Re)Actors.
Source (Re)Actor, before sending a message to the destination, decides on the route which will be used for the message to go through – and uses a cascade of lightweight obfuscations for each of the intermediate nodes (effectively implementing “onion routing” characteristic of the Tor network).
- NB: we have to make sure that real (Re)Actors can work as intermediate nodes too.
Each of the intermediate obfuscation-only (Re)Actors, on receiving such a message, de-obfuscates it – and forwards to the next hop in the route (not forgetting to obfuscate it to-be-sent-over-the-queue <evil-grin />).
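The layering part of such a scheme can be sketched as follows. This is an illustration under heavy assumptions: the wire format (one next-hop byte in front of the inner blob), the kDeliverHere marker, and the single-byte XOR standing in for a “lightweight obfuscation” are all invented here, and node ids are assumed to be below 255. Per-queue re-obfuscation between hops is left out for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

constexpr std::uint8_t kDeliverHere = 0xFF;  // hypothetical "final hop" marker

// Placeholder for a real lightweight obfuscation: XOR with a per-node key.
// XOR is self-inverse, so the same function is used to add and remove a layer.
inline Bytes xor_with(Bytes data, std::uint8_t key) {
    for (auto& b : data) b = static_cast<std::uint8_t>(b ^ key);
    return data;
}

// Source side: wrap the payload so that each node on the route can peel
// exactly one layer. route lists node ids in travel order; keys[id] is the
// key of node id. Layers are built from the destination outwards.
inline Bytes build_onion(const Bytes& payload,
                         const std::vector<std::uint8_t>& route,
                         const std::vector<std::uint8_t>& keys) {
    Bytes blob = payload;
    std::uint8_t next = kDeliverHere;
    for (std::size_t i = route.size(); i-- > 0;) {
        Bytes layer;
        layer.push_back(next);                                // where to go after this node
        layer.insert(layer.end(), blob.begin(), blob.end());  // still-wrapped inner part
        blob = xor_with(layer, keys[route[i]]);
        next = route[i];
    }
    return blob;
}

struct Peeled {
    std::uint8_t next_hop;
    Bytes inner;
};

// Node side: remove own layer and learn only the next hop.
// Assumes blob is non-empty (build_onion always emits at least one byte).
inline Peeled peel(const Bytes& blob, std::uint8_t own_key) {
    const Bytes plain = xor_with(blob, own_key);
    return Peeled{plain.front(), Bytes(plain.begin() + 1, plain.end())};
}
```

With this layout, an intermediate node sees only the id of the next hop and a blob it cannot interpret, which is the property we are after; the actual strength depends entirely on what replaces the XOR placeholder.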

This way (just like with the Tor network), it will become very difficult to track how messages are exchanged. Moreover, if we’re careful enough –

We can make it very difficult to distinguish between Game Logic (Re)Actor and obfuscation-only (Re)Actor.

Even better – nothing prevents us from moving socket access from Communications (Re)Actor to a new Socket-Only (Re)Actor – which, in turn, will make Communications (Re)Actor very-difficult-to-distinguish from obfuscation-only (Re)Actors.

To be honest, this Tor-like obfuscation network has never been implemented (yet) – but I certainly see a LOT of potential in it <wide-smile />.

How Far Do You Wanna Go?

With all those options on the table, a million-dollar question is the following: “how far do we want to go down the road of obfuscation?” My current personal take on it goes as follows:

If your game is a fast-paced shooter – probably going beyond Short-Stacking the Attacker won’t be necessary. However, if you’re afraid of bots (and for any game with more than a few thousand simultaneous players you SHOULD be <sad-face />) I still suggest obfuscating the following:
- Communications (Re)Actor – both internals and inputs/outputs
- Game Logic (Re)Actor – both internals and inputs/outputs
- Animations&Rendering (Re)Actor – only those parts which are facing Game Logic (they are usually both not-so-performance-critical, and the-most-vulnerable).
  - Indeed, as soon as we’re done with the data in the form of “we have character C at position (X,Y)” and the processing has moved into vertexes and coordinates – it becomes extremely difficult to make sense out of it (and in real-time too!).
- This way, we’ll get a reasonably-obfuscated game, with the whole critical path of the vulnerable data being covered.
If your game is NOT really time-critical (which roughly starts from medium-paced poker games and goes all the way to banking apps) – we can go all the way to the full-scale Tor-like network out of obfuscation-only (Re)Actors (and obfuscating all the (Re)Actors to the maximum-extent-possible).
- For slower-paced games with not-so-much data, critical path tends to cover pretty much the whole game – and covering all the game is what we’re effectively achieving here <smile />.

[[To Be Continued…

This concludes beta Chapter 29(k) from the upcoming book “Development and Deployment of Multiplayer Online Games (from social games to MMOFPS, with social games in between)”.

Stay tuned for Chapter 29(l), where we’ll briefly discuss antivirus-style kind of protections against known bots]]


Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.