Bot Fighting 201. Part 4. Obfuscating Protocols. Versioning.

	Author:	“No Bugs” Hare
	Job Title:	Sarcastic Architect
	Hobbies:	Thinking Aloud, Arguing with Managers, Annoying HRs, Calling a Spade a Spade, Keeping Tongue in Cheek

Chapter 29(i) from “beta” Volume VIII

By now, we have seen how code, plus the data which goes between different pieces of code, can be obfuscated quite efficiently (and without too much effort on the app-level side). Now, we can observe that data-which-goes-between-different-pieces-of-code is not necessarily confined to one single process; moreover – it can go between Client and Server, so we can effectively obfuscate our protocols using about-the-same approaches. [[TODO: elaborate on it; also see reply to Dahrkael below]]

Issues and Solutions

Probably the most important issue with obfuscating protocols (the one which we did not encounter before) is the issue of versioning. When releasing a new build with a factorial() function obfuscated by ithare::obf, the-way-factorial()-itself-deobfuscates-its-input-parameters, and the-way-calls-to-factorial()-obfuscate-their-input-parameters, happen to stay consistent just because all our changes-due-to-different-seed are local to the executable being rebuilt, so the compiler will automagically update all the places where our obfuscated types are used.

In contrast, when obfuscating over-the-network protocols, we don’t have the luxury of the compiler doing the version matching for us; in turn, it means that we have to guarantee the correct matching of the obfuscation on both sides of communication ourselves. In addition, we’ll probably have a business/GDD-level requirement of supporting at least one version besides the current one (usually – more like “we have to support most-recent-versions-going-back-for-a-month-or-so”¹).

To solve the first issue (guarantee correct matching), we can do the following:

Specify obfuscation IDs for obfuscation-which-affects-over-the-wire-protocol manually, i.e. NOT using ever-changing automagical attributes such as __LINE__ and __COUNTER__.²
- Moreover, exactly the same obfuscation IDs MUST be specified for both Client and Server
  - In practice, I suggest to have a build script, which (a) generates crypto-quality random ID, and (b) uses it to build both Client and Server ³
- Also, our obfuscation library should generate exactly the same code for different platforms.⁴
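
As a rough illustration of the "manual, stable IDs" point above, here is a minimal sketch of deriving per-field obfuscation IDs from one build-wide seed plus a human-chosen tag. Everything named here (`PROTOCOL_OBF_SEED`, `protocol_obf_id()`, the tags) is hypothetical and is not part of ithare::obf; how such an ID would be fed into the obfuscation library is deliberately left out. The hash is plain FNV-1a, used only to spread one crypto-quality seed across many fields; it is not a crypto-hash.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical: the build script passes the SAME value to both Client and
// Server builds (e.g. -DPROTOCOL_OBF_SEED=0x...ULL). Not an ithare::obf name.
#ifndef PROTOCOL_OBF_SEED
#define PROTOCOL_OBF_SEED 0x0123456789abcdefULL  // placeholder for local builds only
#endif

// 64-bit FNV-1a over a stable tag, with the build-wide seed mixed into the
// initial state. Unlike __LINE__ or __COUNTER__, the result does not change
// when unrelated code is added or moved.
constexpr std::uint64_t protocol_obf_id(std::string_view tag,
                                        std::uint64_t seed = PROTOCOL_OBF_SEED) {
    std::uint64_t h = 14695981039346656037ULL ^ seed;  // FNV offset basis
    for (char c : tag) {
        h ^= static_cast<unsigned char>(c);
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Hypothetical protocol fields; the tags are chosen by hand and kept stable.
constexpr std::uint64_t kLoginUserIdObf = protocol_obf_id("LoginRequest.user_id");
constexpr std::uint64_t kLoginTokenObf  = protocol_obf_id("LoginRequest.token");

inline void example() {
    // Same tag and same seed give the same ID in every translation unit.
    assert(protocol_obf_id("LoginRequest.user_id") == kLoginUserIdObf);
    // A different build-wide seed changes the ID for the same tag.
    assert(protocol_obf_id("LoginRequest.user_id", 1) !=
           protocol_obf_id("LoginRequest.user_id", 2));
}
```

Because the function uses only fixed-width integer arithmetic, it should produce the same IDs on any platform with a conforming C++17 compiler; whether the obfuscation library then generates the same code from them is a separate question (see footnote 4).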

In addition, while it is not strictly required, I strongly suggest incorporating obfuscation/deobfuscation right into your IDL compiler (which, as it was discussed in Vol. I’s chapter on Communications, and Vol. IV’s chapter on Protocols&Marshalling, you SHOULD have regardless of obfuscation). This way, we can implement our protocol obfuscation almost for free for the app-level (the “almost” part is because we’ll still need to care about matching the version of the outgoing data with the version of the incoming one – more on it below). Still, even with an IDL compiler, we HAVE to ensure that obfuscation IDs for the generated protocol are the same on both sides (along the same lines as mentioned above).

This, strictly speaking, is sufficient to ensure compatibility of one single version of the Client with one single version of the Server. However, to ensure backward compatibility so that Server can support both current Client and one⁵-version-behind-current-Client, we have to go a bit further:

For each of the obfuscated versions of the protocol, we have to assign its own VersionID (be careful NOT to use the ITHARE_OBF_SEED itself as VersionID, though crypto-hash of ITHARE_OBF_SEED should be ok).
Then, Client should include VersionID in the communications.
- For TCP-based communications, it is sufficient to send VersionID only in the beginning of each TCP stream.
- For UDP-based communications, we could include VersionID into each packet, but if we (as it was discussed in the Vol. IV’s chapter on Marshalling and Protocols) are going to generate some kind of session key – it is sufficient to use Version ID only while we’re negotiating the key (we’ll still obfuscate all the communication after the key exchange, but VersionID won’t need to be explicit in each packet, as it can be derived from session ID which is usually present in such protocols).⁶
When compiling our Server, we should include ALL the obfuscations for the protocols of ALL the Clients-we-want-to-support.
Then, on receiving any piece of data from Client, Server always knows which-Version-ID-to-apply, and can perform all the necessary obfuscation/deobfuscation (as well as marshalling/unmarshalling) correctly.
- One thing to remember about in this regard, is that Client expects the data using exactly the same VersionID which it uses to send the data; in other words – our Server should send outgoing data using the same VersionID as found in its incoming data. In general, supporting this concept is rather straightforward, but depending on your code, can be rather tedious <sad-face />.
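
The Server-side bookkeeping from the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ProtocolCodec`, `Connection`, `Server` and `make_toy_codec()` are hypothetical names, and the "obfuscation" is a toy add/subtract of a per-version constant standing in for whatever the obfuscation library (or IDL compiler) would really generate.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
using VersionID = std::uint64_t;

// One codec per supported Client build (hypothetical structure).
struct ProtocolCodec {
    std::function<Bytes(const Bytes&)> obfuscate;    // applied to outgoing data
    std::function<Bytes(const Bytes&)> deobfuscate;  // applied to incoming data
};

// Toy stand-in for a generated obfuscation: add/subtract a constant per byte.
inline ProtocolCodec make_toy_codec(std::uint8_t k) {
    return ProtocolCodec{
        [k](const Bytes& in) {
            Bytes out(in);
            for (auto& b : out) b = static_cast<std::uint8_t>(b + k);
            return out;
        },
        [k](const Bytes& in) {
            Bytes out(in);
            for (auto& b : out) b = static_cast<std::uint8_t>(b - k);
            return out;
        }};
}

// Remembers which codec the stream negotiated, so that replies go out with
// the same VersionID the Client used for its incoming data.
class Connection {
public:
    explicit Connection(const ProtocolCodec& codec) : codec_(&codec) {}
    Bytes receive(const Bytes& wire) const { return codec_->deobfuscate(wire); }
    Bytes send(const Bytes& plain) const { return codec_->obfuscate(plain); }
private:
    const ProtocolCodec* codec_;  // owned by Server; std::map keeps it stable
};

class Server {
public:
    // Called at startup for ALL the Client versions we want to support.
    void support(VersionID v, ProtocolCodec c) {
        codecs_.insert_or_assign(v, std::move(c));
    }
    // Called once per TCP stream, with the VersionID read at its beginning.
    std::optional<Connection> accept(VersionID v) const {
        auto it = codecs_.find(v);
        if (it == codecs_.end()) return std::nullopt;  // unsupported version
        return Connection(it->second);
    }
private:
    std::map<VersionID, ProtocolCodec> codecs_;
};

inline void example() {
    Server server;
    server.support(0xA1, make_toy_codec(3));  // previous Client build
    server.support(0xB2, make_toy_codec(7));  // current Client build
    auto conn = server.accept(0xA1);
    assert(conn.has_value());
    Bytes plain{10, 20, 30};
    assert(conn->receive(conn->send(plain)) == plain);
}
```

Tying the codec to the connection (rather than looking it up per message) is one way to keep the "reply with the same VersionID" rule from leaking into app-level code; for UDP, the same lookup would hang off the session ID instead.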

Bingo! We’ve got our obfuscated protocol – and with proper versioning too, so we can handle several Client versions (each with its own obfuscation) with the very same Server.

¹ BTW, the worst real-world case of not-being-able-to-update-Client-as-fast-as-we’d-like-to, is Apple App Store, but even there, 99% of the time your update will get approved within 2-3 weeks maximum, so even with App Store in mind, we’re speaking about backward compatibility going back 1-2 months at most.

² Note that the-code-which-actually-marshals-the-data but doesn’t affect protocol, still MAY use OBF() macros based on __LINE__ and __COUNTER__.

³ Of course, if your Client and Server platforms are different, you’ll have to pass this random ID from one build box to another one, so it won’t be as simple as one single script – but is still possible.

⁴ For current ithare::obf, it is not the case, but I plan to ensure it in the future (see [Issue3]).

⁵ two, three, etc.

⁶ Note that if we want to protect our own protocol from security vulnerabilities (BTW, this kind of obfuscation can provide certain protection from zero-day attacks, such as “zero-day attack on TLS”) – we’d still have to include VersionID into each UDP packet (a brief overview of it is provided in [[TODO]] section below), but at this point we’re solving a different problem, where it is not necessary.

Benefits

If we manage to run our system with obfuscated protocols without causing trouble, we obtain the following benefits:

Even if the attacker has 100% of our source code (but doesn’t have ITHARE_OBF_SEED) – he cannot attack us until he deobfuscates the protocol.
- We have to beware of providing a non-obfuscated API which can be used to generate obfuscated messages from deobfuscated data; for example, if attacker can call a non-obfuscated function SendPaymentRequest() from inside of our executable – no amount of protocol obfuscation will really help us <sad-face />
  - In other words – we SHOULD obfuscate not just protocols, but all the critical paths within our Client.⁷
- If we manage to pull off this Holy Grail of obfuscation⁸ – we’ll be able to change the obfuscated stuff at near-zero cost to us (and at a huge cost to the attacker). This constitutes a very significant change to the whole economy of hacking, and will provide very significant protection for the time being.⁹
- Hackers will have significant difficulties sharing their findings-about-our-program beyond our single release.
- We can provide different versions to different players; this, in turn, can allow us to:
  - Reduce the chances of collaboration between different hacking teams even within one single release
  - If we write down versionID-sent-to-each-player – we can be 99.(9)% sure that the executable-player-uses, is the one we sent him. In other words, while we still cannot identify player’s device – we become able to identify our own executable.¹⁰ While certainly being not as good as ability to identify device – it can come handy in certain forensic-like scenarios.¹¹
- If we see that some versions (such as older-versions) get hacked more than the others – we can treat them as suspicious (raising some kind of “red flag”, which may cause things such as more captchas, more elaborated AV-like code sent to them, etc.)
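
The "write down versionID-sent-to-each-player" idea can be sketched as a small registry. This is a hedged illustration only: `BuildRegistry` and its methods are hypothetical names, and it assumes the simplest possible policy, where each build (VersionID) is handed out to exactly one player.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

using VersionID = std::uint64_t;
using PlayerID = std::string;

enum class BuildCheck { Match, IssuedToSomeoneElse, Unknown };

// Remembers which player each uniquely obfuscated build was sent to.
class BuildRegistry {
public:
    void record_download(const PlayerID& player, VersionID v) {
        owner_of_[v] = player;
    }
    // Compares the VersionID seen at login with our download records.
    BuildCheck check_login(const PlayerID& player, VersionID v) const {
        auto it = owner_of_.find(v);
        if (it == owner_of_.end()) return BuildCheck::Unknown;
        return it->second == player ? BuildCheck::Match
                                    : BuildCheck::IssuedToSomeoneElse;
    }
private:
    std::unordered_map<VersionID, PlayerID> owner_of_;
};

inline void example() {
    BuildRegistry registry;
    registry.record_download("playerB", 0x1001);
    // Player A logging in with the build we sent to player B suggests
    // that the two are somehow related.
    assert(registry.check_login("playerA", 0x1001) ==
           BuildCheck::IssuedToSomeoneElse);
}
```

A result other than `Match` is probably best treated as a "red flag" input for further analysis rather than as proof of anything.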

⁷ With “critical path” roughly defined as “a data path where the data is (a) simple enough to comprehend, and (b) is of potential interest to the hacker”. [[TODO: elaborate; consider adding GUI to the critical path (with implementation of GUI randomization along the lines of https://link.springer.com/chapter/10.1007/978-3-319-07536-5_30)]]

⁸ With all the critical paths sufficiently obfuscated.

⁹ at least until/unless AV folks will develop tools to break obfuscations automagically, but then we’ll see if we can deal with these tools. It is sad that we have to play against the people from AV industry, but on obfuscation front our legitimate interests as MOG developers sadly go against the interests of antiviruses.

¹⁰ formally, we can do it without obfuscation too, but then it will become rather easy to erase this information.

¹¹ for example, if player A has logged in from a Client which was distributed to player B – we can say that they’re somehow related.

Protocol Obfuscation and Zero-Day Attacks

One all-important property of protocol obfuscation is that:

In certain cases, protocol obfuscation can help to protect from zero-day attacks.

One such scenario occurs whenever:

(a) we’re using our own app to communicate with our server. Our own app means that we do NOT allow direct access by 3rd-party apps such as browsers (though we may have an in-browser app compiled with emscripten).
(b) one of the obfuscations we’re using, resides on top of TLS.

Then, if/when a zero-day bug is encountered in TLS¹² – our obfuscation does provide additional protection, because the attacker has to get through it before they can even reach the code with that zero-day vulnerability.

In particular (very briefly, for more detailed discussion see [[TODO: ref]]):

Economy of hack almost-invariably includes two very separate categories of hackers: (a) people who find an attack, and (b) people who buy the attack and mount it.
- For our obfuscated app, most of the hackers from (a) category won’t bother with breaking our obfuscation (there is too little glory in it), and most of the hackers from (b) category won’t have enough skills/time/…
While we’re one of the few apps using obfuscation – it doesn’t make economic sense to spend time on hacking us using this zero-day attack (there are soooo many juicy-and-easily-hackable-targets besides us). As Jim Butcher has said, “You don’t have to run faster than the bear to get away. You just have to run faster than the guy next to you.” – and it applies to businesses-running-away-from-hackers, in spades.
But even if everybody starts to obfuscate their protocols – overall resistance to zero-day attacks will be still significantly-better-than-now. Very briefly – if everybody obfuscates their protocols in unique manner – it means that zero-day attack as such effectively ceases to be “class break” as defined in [Schneier]. Indeed, even with zero-day attack on TLS in their hands, attackers still cannot phish around for vulnerable obfuscated servers; instead – they need to break obfuscation for each and every server-using-obfuscation, individually.

Of course, this approach has nothing to do with silver bullets (in particular, it won’t fly by definition for public protocols), but given the effect of modern zero-day attacks – erecting another (and individual!) barrier at least for Client-Side apps (especially as they can include in-browser apps via emscripten), certainly qualifies as a Good Thing(tm).

¹² such as Heartbleed

Ithare::obf and protocol obfuscation

I am planning to extend ithare::obf to protocol obfuscation too; however – it is going to take some time. If you’re interested in ithare::obf one way or another (for intra-executable obfuscations, or for protocol obfuscations) – please comment below so I have a bit more reason to work on it harder <smile /> – it is always encouraging to know if people are going to use whatever-I-am-writing…

[[To Be Continued…

This concludes beta Chapter 29(i) from the upcoming book “Development and Deployment of Multiplayer Online Games (from social games to MMOFPS, with social games in between)”.

Stay tuned for Chapter 29(j), where we’ll start discussing uses for timing in our anti-cheating efforts]]

Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.

Comments

Dahrkael says

January 16, 2018 at 3:22 pm

The next chapter will discuss the actual protocol obfuscation?
Because i’m not entirely sure how your executable obfuscation techniques correlates with protocol ones.

you can be sure your library will be used on some huge projects if it gets production ready 😉

- "No Bugs" Hare says
  
  January 17, 2018 at 5:57 am
  
  > The next chapter will discuss the actual protocol obfuscation?
  > Because i’m not entirely sure how your executable obfuscation techniques correlates with protocol ones.
  
  I re-read it and I agree that it is underexplained, so I added a [[TODO]] about it, THANKS! But in fact, it is pretty straightforward: as whatever-we-do-in-ithare::obf, obfuscates _both_ code _and_ data, it means that the _data_ obfuscation part will also work over the network. Or in more detail: ithare::obf relies on the notion of “Injection”; it is these Injections which are randomly (and recursively) generated, and do 99% of the job in ithare::obf. Now, we can apply the very same randomly-generated Injections to our data-being-marshalled (or after-the-data-has-already-been-marshalled). This way, we’ll perform some randomly-generated-injection (which, starting from certain number of cycles and unless you know the algorithm, is statistically indistinguishable from crypto) over the data before sending it out, and corresponding surjection on the data after it is received; that’s pretty much it :-). (NB: of course, we have to do real crypto too – in addition to any obfuscation we’re speaking about; I know that you know it, but want to be very clear for everybody-else :-)).
  
  EDIT: also, for larger messages, we’ll have to make some kind of PRNG-defined-by-code; such PRNG can be built, for example, using (some_crypto + random-Injection-over-crypto-block) as a crypto in CTR mode (i.e. basically encrypting incremented COUNTER using (some_crypto-using-hardcoded-key + random-Injection-over-crypto-block)); if our Injection is good enough, we can even drop crypto, but unless some deeper analysis is done, it is a bit risky. Another implementation of PRNG could use, say, SHA256(random-Injection(COUNTER)) as our PRNG.
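
A minimal sketch of the SHA256(random-Injection(COUNTER)) flavor of such a PRNG, with loud caveats: `toy_injection()` is a hand-written bijection standing in for a randomly generated Injection (in a real build it would come from the obfuscation library), and `toy_mix()` is the splitmix64 finalizer standing in for SHA256. Neither is cryptographic, and none of these names belong to ithare::obf.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Stand-in for a randomly generated Injection over a 64-bit counter.
constexpr std::uint64_t toy_injection(std::uint64_t x) {
    x = x * 0x9E3779B97F4A7C15ULL + 0x1234567ULL;  // odd multiplier: bijective mod 2^64
    return x ^ (x >> 29);                          // xor-shift: also invertible
}

// Stand-in for SHA256 (NOT cryptographic): the splitmix64 finalizer.
constexpr std::uint64_t toy_mix(std::uint64_t z) {
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

// XORs data with a keystream whose i-th 8-byte block is
// toy_mix(toy_injection(start_counter + i)). Because it is an XOR,
// the same call both obfuscates and deobfuscates.
inline Bytes apply_keystream(const Bytes& data, std::uint64_t start_counter) {
    Bytes out(data.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
        const std::uint64_t block = toy_mix(toy_injection(start_counter + i / 8));
        const auto key_byte = static_cast<std::uint8_t>(block >> (8 * (i % 8)));
        out[i] = static_cast<std::uint8_t>(data[i] ^ key_byte);
    }
    return out;
}

inline void example() {
    const Bytes message{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
    const Bytes wire = apply_keystream(message, 42);  // sender side
    assert(apply_keystream(wire, 42) == message);     // receiver side
}
```

Both sides have to agree on the starting counter for each message; how that is arranged (per-stream running counter, per-packet field, and so on) is outside this sketch, and real crypto is still needed on top of it.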
  
Ling Zhao says

January 16, 2018 at 4:04 pm

Sergey, one issue with protocol/critical path obfuscation is that it does not protect from hacking through GUI (either native client or browser). Unfortunately it has become more prevalent as it requires much less sophisticated knowledge to carry out, for example, to do credential stuffing attack. Another way of obfuscation is to present to each end user a different UI every time (see https://link.springer.com/chapter/10.1007/978-3-319-07536-5_30).

- "No Bugs" Hare says
  
  January 17, 2018 at 6:25 am
  
  > it does not protect from hacking through GUI (either native client or browser).
  
  Sure, it doesn’t. However, if we can restrict bot writers to GUI-level hacking, our job is already 90%-done. Protocol-level obfuscation does protect from a really wide range of attacks – ranging from “zero-day attack on the TLS”, via “forgotten parameter check on the Server-Side while Client never sends it”, all the way to “read all the data for the bot right after unmarshalling” (which allows to run GUI-less bots, and in case of proxy GUI-less bots, they can be made fundamentally undetectable).
  
  > Another way of obfuscation is to present to each end user a different UI every time (see https://link.springer.com/chapter/10.1007/978-3-319-07536-5_30).
  
  THANKS for the link (I didn’t know about it before); that being said, I know of at least one company 😉 which does use something along these lines (moving some GUI elements by a few pixels) for at least 10 years before the article was published ;-).
  
  In any case, GUI randomization is a strict _complement_ to any other obfuscation efforts discussed above (if we randomize GUI but keep internals unobfuscated – they will get us at protocol level – with a library published to get the data and allow writing bots).
  
Ling Zhao says

January 26, 2018 at 9:52 pm

One more question, supposing in the worst case we do need to maintain backward compatibility for 8 builds in over 10-month span, will this requirement significantly increase the complexity of the solution in terms of implementation and operational risks? Thanks.

- "No Bugs" Hare says
  
  January 27, 2018 at 6:05 am
  
  IMO – not really. “Infinite” would be a problem, but as long as it is just 8 builds, I don’t see TOO much hassle. Of course, there ARE complexities and there ARE risks (in particular, to make a silly mistake when implementing this inherently-high-risk-thing), but I don’t really feel 8 builds should be a deal-breaker.
  
  On the other hand, with such large time spans in mind, I’d suggest to think about the potential for the older protocol versions to be hacked – so probably I’d integrate a way to disable certain older protocols right in runtime and without restart; if you realize that you’re under an attack (security attack, bot attack, whatever-else-attack) which is coming only via much-older-protocols – it is usually better to drop players with these much-older-Clients than risk the whole population.
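
A rough sketch of such a runtime switch, under the assumption that the Server consults it whenever a Client presents its VersionID (`VersionGate` is a hypothetical name; a plain mutex is used for simplicity):

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <unordered_set>

using VersionID = std::uint64_t;

// Thread-safe set of protocol versions currently allowed to connect.
// An admin command can call disable() while the Server keeps running.
class VersionGate {
public:
    void enable(VersionID v) {
        std::lock_guard<std::mutex> lock(mu_);
        enabled_.insert(v);
    }
    void disable(VersionID v) {
        std::lock_guard<std::mutex> lock(mu_);
        enabled_.erase(v);
    }
    bool is_enabled(VersionID v) const {
        std::lock_guard<std::mutex> lock(mu_);
        return enabled_.count(v) != 0;
    }
private:
    mutable std::mutex mu_;
    std::unordered_set<VersionID> enabled_;
};

inline void example() {
    VersionGate gate;
    gate.enable(0xA1);   // much-older protocol
    gate.enable(0xB2);   // current protocol
    gate.disable(0xA1);  // attack is coming only via the older protocol
    assert(!gate.is_enabled(0xA1));
    assert(gate.is_enabled(0xB2));
}
```

Existing connections on a disabled version would still need to be dropped separately; the gate only covers new ones.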
  
  In an extreme case – you can even support those much-older-protocols only on separate “old-protocol-conversion” Servers (having your IDL compiler generate protocol-conversions for older protocols into current-protocols, and running those protocol-conversions on “old-protocol-conversion” Servers, placed into some kind of DMZ); I am not sure whether this particular feature is worth the trouble in your case, but if so – it can raise the bar for the attackers even further (in particular, if you have your TLS obfuscated, and they got a zero-day attack on TLS, _and_ managed to hack your older-obfuscation – they will still be able only to hack into “old-protocol-conversion” Server – but as this Server is only in the DMZ, it is still far away from the whole system being compromised).
  
  On the third hand ;-), as a side benefit of having-to-support-older-protocols – you can use those-MUCH-older-protocols as an automated “red flag” (especially so if it is a voluntary decision of the player, and not a platform restriction) – activating all the usual anti-bot actions associated with “red flags”. ~=”if somebody elects to run a MUCH older protocol on Windows – they should have reasons to do it, and these reasons can be either legitimate, or not-so-legitimate”.