Password Hashing: Why and How

 
 
 


Password hashing is a non-trivial topic, which has recently become quite popular. While it is certainly not the only thing you need to do to make your network app secure, it is one of those security measures every security-conscious developer should implement. In this article, we’ll discuss what it is all about, why hash functions need to be slow, and how password hashing needs to be implemented in your applications.

Salted Password Hashes

What is it all about?

Whenever we’re speaking about security, there is always the question: what exactly is the threat we’re trying to protect ourselves from? For password hashing, the answer is very unpleasant: we’re trying to mitigate the consequences arising from stealing the whole of your site’s password database. This is usually accompanied by the potential for stealing pretty much any other data in your database, and represents the Ultimate Nightmare of any real-world security person.

Some (including myself) will argue that such mitigation is akin to locking the stable door after the horse has bolted, and that security efforts should be directed towards preventing the database-stealing from happening in the first place. While I certainly agree with this line of argument, on the other hand implementing password hashing is so simple and takes so little time (that is, if you designed for it from the very beginning) that it is simply imprudent not to implement it. Not to mention that if you’re not doing password hashing, everybody (your boss and any code reviewers/auditors included) will say, “Oh, you don’t do password hashing, which is The Second Most Important Security Feature In The Universe (after encryption, of course).”

The most important thing, however, is not to forget about a dozen other security-related features which also need to be implemented (such as TLS encryption, not allowing passwords which are listed in well-known password dictionaries, limits on login rate, etc. etc. – see ‘Bottom Line’ section below for some of these).

Attack on non-hashed passwords

So, we’re considering the scenario where the attacker has got your password database (DB). What can he do with it? In fact, all the relevant attacks (at least those I know about) are related to recovering a user’s password, which allows impersonation of the user. Subtle variations of the attack include such things as being able to recover any password (phishing for passwords), or being able to recover one specific password (for example, an admin’s password).

If your DB stores your passwords in plain text, then the game is over – all the passwords are already available to the attacker, so he can impersonate each and every user. Pretty bad.

Attempt #1: Simple hashing

You may say, “Hey, let’s hash it with SHA256 (SHA-3, whatever-else secure hash algorithm), and the problem is gone!”, and you will be on the way to solving the problem. However, it is more complicated than that.

Let’s consider in more detail the scenario when you’re storing passwords in a form of

P’=SHA256(P) (*)

where P is user-password, and P’ is password-stored-in-the-database.

Dictionary attacks

So, an attacker has got your password DB, where all the passwords are simply hashed with SHA256 (or any other fast hash function), as described above in formula (*). What can he do with the database?

First of all, he can get a dictionary of popular passwords and, for each dictionary password, hash it with SHA256 and try to match it against all the records in your database (this is known as a ‘dictionary attack’). Note that using simple P’=SHA256(P) means that the same P will always have the same P’, i.e. identical passwords stay identical after hashing.

This observation allows the attacker to pre-calculate SHA256 hashes for all the popular passwords once, and then compare them to all the records in your DB (or any other DB which uses the same simple hashing). While this kind of attack is certainly much more difficult than just taking an unhashed password, and therefore simple hashing is clearly better than not hashing passwords at all, there is still a lot of room for improvement.
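To make the pre-calculation point concrete, here is a minimal Python sketch of a dictionary attack against the unsalted scheme (*). The user names, the passwords, and the four-word "dictionary" are all made up for illustration; a real attacker would use a list of millions of leaked passwords.

```python
import hashlib

def sha256_hex(password: str) -> str:
    """Unsalted scheme from formula (*): P' = SHA256(P)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical stolen DB: user -> P' (all values are invented for this sketch).
stolen_db = {
    "alice": sha256_hex("letmein"),
    "bob": sha256_hex("correct horse battery staple"),
    "carol": sha256_hex("letmein"),  # same password as alice, hence same P'
}

# Tiny stand-in for a real dictionary of popular passwords.
dictionary = ["123456", "password", "letmein", "qwerty"]

# Pre-calculated ONCE; reusable against any DB that uses the same scheme.
precomputed = {sha256_hex(word): word for word in dictionary}

def crack(db: dict[str, str]) -> dict[str, str]:
    """Return user -> recovered password for every P' found in the table."""
    return {user: precomputed[h] for user, h in db.items() if h in precomputed}

cracked = crack(stolen_db)
```

Note that the cost of building `precomputed` does not depend on the size of the stolen DB, and each lookup is a single dictionary access; this is exactly the property salting is meant to take away.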

Attempt #2: Salted hash

To deal with the issue when the hash is the same for the same password (which allows the pre-calculation for a dictionary attack, and also allows some other pre-calculation attacks, briefly mentioned below), so-called ‘salted hashes’ are commonly used.

The idea is the following: in addition to P’ (the password-stored-in-the-database), for each user we’re storing S – so-called ‘salt’. Whenever we need to store the password, we calculate

P’=SHA256(S||P)

where || denotes concatenation (as string/data block concatenation).

As long as S is different for each user, the hashes will be different for different users, even if their passwords are exactly the same. This, in turn, means that pre-calculation for a dictionary attack won’t work, making the life of an attacker significantly more complicated (at virtually no cost to us). In fact, ‘salted hashes’ as described above, defeat quite a few other flavours of pre-calculation attacks, including so-called ‘rainbow table’ attacks.

The next practical question is what to use as salt. First of all, S must be unique for each user. Being statistically unique (i.e. having collisions between salts for different users very unlikely) is also acceptable; it means that if you’re using random salts of sufficient length, you’re not required to check the salt for uniqueness.
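To see why "statistically unique" is good enough, one can bound the chance that any two users ever get the same salt. The sketch below uses the standard birthday (union) bound and assumes salts are drawn uniformly at random; the function name and the billion-user figure are illustrative only.

```python
from fractions import Fraction

def salt_collision_bound(num_users: int, salt_bits: int = 128) -> Fraction:
    """Upper bound on the probability that at least two of num_users
    uniformly random salts of salt_bits bits coincide:
    (number of user pairs) / 2**salt_bits."""
    pairs = num_users * (num_users - 1) // 2
    return Fraction(pairs, 2 ** salt_bits)

# A billion users with 128-bit salts (illustrative figure).
p = salt_collision_bound(10 ** 9)
print(float(p))
```

Even for a billion users the bound is many orders of magnitude below anything worth checking for, which is why an explicit uniqueness check on random 128-bit salts is unnecessary.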

Traditionally (see, for example, [CrackStation]), it is recommended that S should be a crypto-quality random number of at least 128 bit length, usually stored alongside the password in the user DB. And crypto-quality random numbers can be easily obtained from /dev/urandom on most Linux systems, and by using something like CryptGenRandom() on Windows.

Note that, despite a very confusing wording on this subject in [CppReference], none of the C++11 random number engines (LCG, Mersenne-Twister, or Lagged Fibonacci) can be considered good enough for cryptographic purposes – in short, they’re way too predictable and can be broken by a determined attacker, given enough output has leaked.¹ Overall, random number generation for security purposes is a very complicated subject, and goes far beyond the scope of this article, but currently the safest bet in this regard is to use Schneier and Ferguson’s [Fortuna] generator with OS events fed to it as entropy input (a comparable design, an entropy pool fed by OS events, is what sits behind /dev/urandom on most Linuxes), or (if you do not have the luxury of having entropy on the go, but can get a cryptographic-quality seed), the so-called Blum-Blum-Shub generator.

However, some people (such as [Contini]) argue that using a concatenation of a supposedly unique site-ID with a unique user-ID (which is already available within the database) as S, is about as good as using crypto-random S. While I tend to agree with Contini’s arguments in this regard, I still prefer to play it safe and use crypto-random S as long as it is easy to do so. At the very least, the ‘playing it safe’ approach will save you time if/when you need to go through a security review/audit, because you won’t need to argue about using not-so-standard stuff (which is always a pain in the neck).

So, the suggested solution with regard to server-side salting is the following:

  • to store S for each user in the same DB (there is no need to encrypt S, it can be stored as a plain text)
  • whenever a password P needs to be stored for a user, S (of at least 128-bit length) is taken from /dev/urandom² or from CryptGenRandom()
  • store a (S,P’) pair for each user, calculated as P’=SHA256(S||P), where || denotes concatenation. P must never be stored in the DB.
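The three steps above can be sketched in Python as follows. This is only an illustration of the salted-SHA256 construct, not a recommendation to stop here (the caveat is discussed next). `os.urandom` reads from the operating system's generator; the function names and the constant-time comparison via `hmac.compare_digest` are additions of this sketch, not something prescribed by the text.

```python
import hashlib
import hmac
import os

SALT_BYTES = 16  # 128 bits, the minimum length suggested above

def make_record(password: str) -> tuple[bytes, bytes]:
    """Return the (S, P') pair to store; P itself is never stored."""
    salt = os.urandom(SALT_BYTES)  # OS CSPRNG (/dev/urandom on Linux)
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute SHA256(S||P) for the attempt and compare it with P'."""
    candidate = hashlib.sha256(salt + password.encode("utf-8")).digest()
    # Constant-time comparison (an extra precaution added in this sketch).
    return hmac.compare_digest(candidate, stored)
```

Calling `make_record` twice with the same password yields two different P’ values, because each call draws a fresh S.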

As discussed above, this approach is much more solid than simple hashing, but… there is still a caveat.


¹ Within C++11, it is std::random_device which is intended to work as a non-deterministic source of randomness. However, as of beginning of 2016, there is no way to find out whether your std-library’s implementation of std::random_device is good enough (std::random_device::entropy, which is intended for this purpose, returns hardcoded 0 at least under gcc and clang [CppReference.entropy]). Moreover, existing implementations of std::random_device vary greatly in quality, so I do NOT recommend relying on them. In short – play it safe and use /dev/urandom or CryptGenRandom() directly.
² Strictly speaking, you need to double-check the documentation of your target distribution to be sure that /dev/urandom generates crypto-quality numbers (or uses [Fortuna]), which is common, but not guaranteed. However, I would argue that for the purposes of generating salt S such double-checking is not 100% required.

 

Prohibition on passwords in known dictionaries

Some may ask: “Hey, why bother with salting if we can simply prohibit users from using passwords from known dictionaries?” The answer to this question is the following:

You do need both to prohibit passwords from known dictionaries and to use salt as described above.

Prohibiting dictionary-passwords is necessary even if passwords are ‘salted’, because dictionary attack is still possible; if the attacker looks for one single password, he can still run the whole dictionary against this specific password, whether it is salted or not (what the salt does is increase many-fold the cost of ‘phishing’ of any password out of DB).

Salt is necessary even if passwords from dictionaries are prohibited, because besides a dictionary pre-computation attack, there is a whole class of pre-computation attacks, including ‘rainbow table’-based attacks. The idea behind pre-computed ‘rainbow tables’ is not trivial, and is beyond the scope of this article (those interested may look at [WikiRainbow]), but it is prevented by ‘salting’ in pretty much the same way as a pre-computed dictionary attack is.

Offline brute-force attacks on fast hash functions

Even after we have added ‘salt’ to our P’ as described above, and prohibited dictionary passwords, there is still a possible attack on our password DB. 🙁

This attack is known as an offline brute-force attack, wherein an attacker has the whole password DB; it is different from an online brute-force attack, when an attacker simply attempts to login repeatedly (and which can and should be addressed by enforcing a login rate limit).

To mount an offline brute-force attack, the attacker needs to have the password DB (or at least one entry out of it, including the hashed password P’ and the salt S). Then the attacker may simply take this P’ and S, and run all possible password variations through our SHA256(S||P) construct; as soon as SHA256(S||attempted-P) matches P’ – bingo! attempted-P is the real password P for this user.³
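A toy version of this attack fits in a few lines of Python. The salt, the 3-letter lowercase password, and the function names are invented for the sketch; the search space is deliberately tiny so that it finishes instantly.

```python
import hashlib
import itertools
import string

def salted_sha256(salt: bytes, password: str) -> bytes:
    """The SHA256(S||P) construct from the text."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()

def brute_force(salt: bytes, stored: bytes, alphabet: str, max_len: int):
    """Try every password of up to max_len characters over alphabet;
    return the first attempt whose hash matches, or None."""
    for length in range(1, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            attempt = "".join(chars)
            if salted_sha256(salt, attempt) == stored:
                return attempt
    return None

# Hypothetical stolen entry. The attacker knows S, as it sits in the DB next to P'.
salt = bytes.fromhex("00112233445566778899aabbccddeeff")
stored = salted_sha256(salt, "dog")

found = brute_force(salt, stored, string.ascii_lowercase, max_len=3)
```

The salt does nothing to stop this loop: it only forces the attacker to repeat it separately for every user.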

Brute-force attacks, such as the one described above, are practical only if the number of possible passwords is quite small. If we had 2^256 (or even a measly 2^128) different passwords for the attacker to analyze, a brute-force attack wouldn’t be feasible at all (i.e. all the computers currently available on the Earth wouldn’t be able to crack it until the Sun reaches the end of its lifetime).⁴

However, the number of possible passwords (known as the ‘size of search space’) is relatively low, which opens the door for a brute-force attack. If we consider a search space consisting of all 8-character passwords, then (assuming that both-case letters and digits are possible), we’ll get (26+26+10)^8 ≈ 2.2e14 potential passwords to try. While this might seem a large enough number, it is not.

Modern GPUs are notoriously good at calculating hashes; also note that the search task is inherently trivial to parallelise. In practice, it has been reported that on a single stock GPU the number of SHA256’s calculated per second is of the order of 1e9 [HashCat]. It means that to try all the 8-character passwords within our 2.2e14 search space (and therefore, to get an 8-character password for a single user for sure), it will take only about 2.5 days on a single stock GPU. 🙁 As mentioned in [SgtMaj], this means that the upper-bound of the cost of breaking the password is a mere $39. This is despite having used an industry-standard (and supposedly unbreakable) hash function, and despite the whole thing being salted. 🙁

Note that the attack above doesn’t depend on the nature of the hashing function. The attack doesn’t depend on any vulnerability in SHA256; the only thing which the attack relies on is that SHA256 is a reasonably fast hash function.


³ Strictly speaking, a matching attempted-P may represent a hash collision, if there is more than one attempted-P which corresponds to (S,P’) pair. However, for all intents and purposes attempted-P found in this way will be indistinguishable from the real password P; most importantly, it can be used for impersonation. Also after going through the full search space, the real P will be found too.
⁴ Rough calculation: 2^128 ≈ 3.4e38. Now let’s assume that there are a billion (1e9) cores on Earth, each able to calculate a billion hashes per second. It would mean that going through the whole 2^128 search space will take 3.4e38/1e9/1e9=3.4e20 seconds, or approx. 1e13 years. As the lifetime of the Sun is currently estimated at about 5e9 years, it means that the Sun will have enough time to die 2000 times before the search space is exhausted. And for 2^256, the situation becomes absolutely hopeless even if each and every atom of the Earth is converted to a core calculating a billion hashes per second.

 

Mitigation #1: Enforce long passwords

What can be done about these brute-force attacks? Two approaches are known in this field. The first approach is to enforce a minimum password length of longer than 8. This can be illustrated by Table 1, which shows that if we can enforce all users having relatively long passwords (at least 10–12 characters depending on the value of the information we’re trying to protect), we might be OK. However, with users being notoriously reluctant to remember passwords which are longer than 8 characters, this might be infeasible; moreover, with the power of computers still growing pretty much exponentially, soon we’d need to increase the password length even more, causing even more frustration for users. 🙁

Table 1

Password Length | Search Space | Time on a Single GPU⁵ | Upper-bound Cost of Brute-Force Attack
8               | 2.2e14       | ~2.5 days             | ~$40
9               | 1.3e16       | ~5 months             | ~$2,400
10              | 8.4e17       | ~27 years             | ~$150,000
11              | 5.2e19       | 1,648 years           | ~$9.3M
12              | 3.2e21       | ~100,000 years        | ~$580M
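The search-space and time figures in the table follow from two numbers used earlier: a 62-character alphabet and roughly 1e9 SHA256 hashes per second on one GPU. A small Python sketch to reproduce them (the function names are illustrative; the cost column is not reproduced, since it depends on pricing assumptions from [SgtMaj]):

```python
HASHES_PER_SECOND = 1e9        # order of magnitude quoted above for one stock GPU
ALPHABET_SIZE = 26 + 26 + 10   # both-case letters and digits

def search_space(length: int) -> int:
    """Number of distinct passwords of exactly this length."""
    return ALPHABET_SIZE ** length

def seconds_on_one_gpu(length: int) -> float:
    """Time to try the whole search space at HASHES_PER_SECOND."""
    return search_space(length) / HASHES_PER_SECOND

for length in range(8, 13):
    days = seconds_on_one_gpu(length) / 86_400
    print(f"{length:2d}  {search_space(length):.1e}  {days:,.1f} days")
```

Each extra character multiplies both the search space and the attack time by 62, which is where the steep growth in the table comes from.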

Going beyond a password length of 12 isn’t currently worthwhile; IMNSHO (in my not-so-humble opinion), any security professional who is trying to protect information which is worth spending half a billion to get with a mere password (i.e. without so-called ‘two-factor authentication’) should be fired on the spot.


⁵ Assuming pure brute force, without exploiting the skewed statistics of human-generated passwords

 

Mitigation #2: Use intentionally slow hash functions

As noted above, to mount a brute-force attack, an attacker needs our hash function to be reasonably fast. If the hash function is, say, 100,000 times slower than SHA256, then the attack costs for the attacker go up 100,000-fold.

That’s exactly what people are commonly doing to protect themselves from a brute-force attack on a stolen password DB – they’re using hash functions which are intentionally slow.

Several intentionally slow hash functions have been developed for exactly this purpose, with the most popular ones being PBKDF2, bcrypt, and (more recently) scrypt. As of now (mid-2015), I would suggest scrypt – which, in addition to being intentionally slow, is specially designed to use quite a lot of RAM and to run quite poorly on GPUs while being not-so-bad for CPUs. EDIT: after the original article was published, Argon2 was selected as the winner of the PHC. As a result, it became at least as worthy a contender as scrypt; which one to choose is arguable as of now (with the main argument against Argon2 being that it is not mature enough yet); in practice, I don’t expect too much difference.

All such intentionally slow functions will have some kind of parameter(s) to indicate how slow you want your function to be (in the sense of ‘how many computations it needs to perform’). Using these functions makes sense only if the requested number of computations is reasonably high.
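As an illustration of such parameters, here is a minimal sketch using scrypt as exposed by Python's standard `hashlib` (available when Python is built against a recent OpenSSL). The specific values of `n`, `r` and `p` below are placeholders, not recommendations; as discussed next, real values should come from measurements on your own hardware.

```python
import hashlib

# Illustrative cost parameters only.
SCRYPT_N = 2 ** 14            # CPU/memory cost; must be a power of two
SCRYPT_R = 8                  # block size
SCRYPT_P = 1                  # parallelisation factor
MAXMEM = 64 * 1024 * 1024     # cap on the memory this call is allowed to use

def slow_hash(salt: bytes, password: str) -> bytes:
    """Intentionally slow replacement for SHA256(S||P).
    With these parameters scrypt needs about 128 * n * r bytes = 16 MiB of RAM."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P,
        maxmem=MAXMEM,
        dklen=32,
    )
```

Raising `n` makes every hash, and therefore every attacker's guess, proportionally more expensive in both time and memory.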

The next obvious question is, ‘Well, how big is this ‘reasonably high’ number of calculations?’ The answer, as of now, is quite frustrating: ‘as high as you can afford without overloading your server’. 🙁
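One hedged way to turn "as high as you can afford" into a number is to time the function on the target hardware and pick the largest cost that fits a delay budget. The sketch below does this for scrypt's `n` parameter; the function name, the budget, and the search bounds are assumptions of this example, and a single timing run is noisy, so a real calibration would repeat measurements under realistic load.

```python
import hashlib
import time

def pick_scrypt_n(budget_seconds: float, r: int = 8, p: int = 1,
                  min_log2: int = 10, max_log2: int = 14) -> int:
    """Return the largest n = 2**k (min_log2 <= k <= max_log2) whose measured
    scrypt time on THIS machine stays within budget_seconds.
    Falls back to 2**min_log2 if even that is over budget."""
    salt = bytes(16)  # fixed dummy input; used for timing only
    chosen = 2 ** min_log2
    for k in range(min_log2, max_log2 + 1):
        n = 2 ** k
        start = time.perf_counter()
        hashlib.scrypt(b"benchmark", salt=salt, n=n, r=r, p=p,
                       maxmem=64 * 1024 * 1024, dklen=32)
        elapsed = time.perf_counter() - start
        if elapsed > budget_seconds:
            break
        chosen = n
    return chosen

n_for_50ms = pick_scrypt_n(0.05)
```

The budget itself has to come from the worst-case load considerations discussed next, not from a single idle-server measurement.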

Note that when choosing load parameters for your intentionally slow hash function, you need to account for the worst-possible case. As noted in [SgtMaj], in many cases with an installable client-app (rather than client-browser) this worst-case scenario happens when you’ve got a massive disconnect of all your users, with a subsequent massive reconnect. In this case, if you’ve got 50,000 users per server, the load caused by intentionally slow hash functions can be quite high, and may significantly slow down the rate with which you’re admitting your users back.⁶


⁶ While caching users’ credentials to avoid overload at this point is possible, it is quite difficult to implement such a feature without introducing major security holes, and therefore I do not recommend it in general.
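A back-of-the-envelope estimate of the massive-reconnect case can be written down directly. The 50,000 users per server come from the text; the 100 ms per hash and the 16 cores are made-up figures, and the model assumes hashing is spread evenly over all cores with nothing else competing for CPU.

```python
def readmission_seconds(users: int, hash_seconds: float, cores: int) -> float:
    """Time needed to hash every returning user's password once,
    assuming perfect parallelism over `cores` and no other load."""
    return users * hash_seconds / cores

# 50,000 users (from the text); 0.1 s per hash and 16 cores are hypothetical.
t = readmission_seconds(users=50_000, hash_seconds=0.1, cores=16)
print(f"{t / 60:.1f} minutes to re-admit everybody")
```

With these illustrative numbers it takes on the order of five minutes before the last user is let back in, which shows how directly the per-hash delay translates into reconnect time.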

 

Mitigation-for-mitigation #2.1: Client + Server hashing

To mitigate this server-overload in case of massive reconnects, several solutions (known as ‘server relief’) have been proposed. Most of these solutions (such as [Catena]), however, imply using a new crypto-primitive⁷, which is not exactly practical for application development (that is, until such primitives are implemented in a reputable crypto library).

One very simple but (supposedly) efficient solution is to combine both client-side and server-side hashing. This approach, AFAIK, was first described in a StackExchange question [paj28], with an analysis provided in [SgtMaj].


⁷ A basic number-crunching crypto-algorithm, acting as a building block for higher-level protocols. Examples of crypto-primitives include AES, SHA256, and [Catena]. The problem with introducing a new crypto-primitive is that they’re usually quite difficult to implement properly, so for an application-level programmer it is usually better to wait until a crypto library does it for you.

 

Client + Server hashing

The ‘Client + Server’ password hashing schema works as follows:

  1. User enters password P
  2. P’ is calculated (still on the client) as:
    client_slow_hash(SiteID||UserID||P)
    where SiteID is unique per-site string, UserID is the same ID which is used for logging in, || denotes concatenation, and client_slow_hash is any of the intentionally slow hash functions described above.
  3. P’ is transferred over the wire
  4. On the server side, P” is calculated as:
    server_slow_hash(S||P’)
    where server_slow_hash may be either the same as or different from client_slow_hash, and S is a crypto-random salt stored within user DB for each user.
  5. P” is compared to the P” stored within the DB. P’ is never stored in the database.

This approach shifts some of the server load to the client. While you still need to have both of your hash functions as slow as feasible, Client + Server hashing (when you have an installable client app rather than a browser-based app) may allow an increase from 10x to 100x of the brute-force-attack-cost-for-the-attacker [SgtMaj], which is not that small an improvement security-wise.
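The five steps can be sketched as follows, using scrypt from Python's `hashlib` for both `client_slow_hash` and `server_slow_hash`. Several details are assumptions of this sketch rather than part of the scheme as described: the `SITE_ID` value, passing SiteID||UserID and S through scrypt's salt argument (with a NUL separator to keep the concatenation unambiguous), the specific cost parameters, and the function names.

```python
import hashlib
import hmac
import os

SITE_ID = b"example.com"  # hypothetical per-site string

def _scrypt(data: bytes, salt: bytes, n: int) -> bytes:
    return hashlib.scrypt(data, salt=salt, n=n, r=8, p=1,
                          maxmem=64 * 1024 * 1024, dklen=32)

def client_hash(user_id: str, password: str) -> bytes:
    """Steps 1-2, run on the client: P' derived from SiteID, UserID and P."""
    salt = SITE_ID + b"\x00" + user_id.encode("utf-8")
    return _scrypt(password.encode("utf-8"), salt, n=2 ** 14)

def server_register(p_prime: bytes) -> tuple[bytes, bytes]:
    """Run on the server when a password is set: returns (S, P'') to store."""
    s = os.urandom(16)  # crypto-random per-user salt
    return s, _scrypt(p_prime, s, n=2 ** 12)

def server_verify(p_prime: bytes, s: bytes, stored: bytes) -> bool:
    """Steps 4-5: recompute P'' from the received P' and compare with the DB."""
    return hmac.compare_digest(_scrypt(p_prime, s, n=2 ** 12), stored)
```

In this sketch the client carries the heavier cost parameter and the server a lighter one; how to split the load is a tuning decision that depends on the slowest supported client and the worst-case server load.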

Note that while this Client + Server hashing might seem to go against the ‘no-double hashing’ recommendation in [CrackStation], in fact it does not: with Client + Server it is not about creating our own crypto-primitive (which is discouraged by CrackStation, and for a good reason), but rather about providing ‘server relief’ at minimal cost (and re-using existing crypto-primitives).

On the other hand (unless/until [WebCrypto] is accepted and widely implemented), this Client + Server hashing won’t be helpful for browser-based apps; the reason for this is simple – any purely Javascript-based crypto would be way too slow to create enough additional load to bother the attacker.

WebCrypto, more formally the Web Cryptography API, is a W3C candidate recommendation, essentially aiming to provide access from JavaScript to fast browser-implemented crypto-primitives.

What MIGHT happen in the future

In the future, things might change. Very recently, a ‘Password Hashing Competition’ was held [PHC], looking for new crypto-primitives which allow for better ways of password hashing; while they don’t seem to apply any magic (so being intentionally slow will still be a requirement for them), there is a chance that one of them will become a standard (and will be implemented by the crypto-library-you’re-using) sooner or later. When/if it happens, most likely it will be better to use this new standard mechanism.

EDIT: since original article was published, Argon2 has been selected as a winner in PHC.  

Bottom line

As long as a new standard for password-hashing is not here yet, we (as app developers) need to use those crypto-primitives we already have. Fortunately, it is possible and is reasonably secure using the approaches described above.

When implementing login for an installable client-app, I would suggest doing the following:

  • Encrypt the whole communication with your client. If you’re communicating using TCP, use TLS; if you’re communicating using UDP, use DTLS; for further details see [NoBugs]. Sending a password over an unprotected connection is something you should never do, NEVER EVER.
  • Implement Client + Server Hashing as described above, configuring both client-side and server-side functions to be as slow as feasible
    • As of now, scrypt is recommended to be used on both client-side and server-side. EDIT: since original article, Argon2 has won PHC, and it MAY be used instead of scrypt.  
    • For the client side, load parameters which are based on the maximum-allowable delay for the slowest-supported client hardware should be used.
    • For the server side, load parameters which are based on the maximum-allowable delay in the worst-possible-case (for example, in case of massive reconnect if applicable) for the server-hardware-currently-in-use.
  • Set the minimum password length to at least 8
  • Allow a “virtually unlimited” maximum password length (you will still need to put some limit on it, to avoid protocol-related attacks such as “transferring 100G ‘password’ just to bring your server down”, but, say, 1024 bytes should be good enough)
  • Prohibit passwords which are in well-known password databases (and enforce this prohibition)
  • Enforce password changes (which will be a separate and quite painful story)
  • Do think how you will provide ‘password recovery’ when you’re asked about it (and you will, there is absolutely no doubt about it). While ‘password recovery’ is a fallacy from a security point of view, there is 99% chance that you will be forced to do it anyway, so at least try to avoid the most heinous things such as sending password over e-mail (and if you cannot avoid it due to ‘overriding business considerations’, at the least limit the password validity time slot, and enforce that the user changes such a password as soon as she logs in for the first time).
  • Implement two-factor authentication at least for privileged users, such as admins.
  • Implement a login rate limit (to prevent online brute-force attacks)
    • With the precautions listed above, pretty much any reasonable limit will protect from brute-force as such (even limiting logins from the same user to once per 1 second will do the trick).
    • On the other hand, to avoid one user attacking another one in a DoS manner, it is better to have two limits: one being a global limit, and this one can be, say, one login per second. The second limit may be a per-user-per-IP limit, and this needs to be higher than the first one (and also may grow as number of unsuccessful attempts increases). With these two limits in place, the whole schema will be quite difficult to DoS.
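The length and dictionary rules from the list above can be sketched as a single check run whenever a user sets a password. The 8-character minimum and the 1024-byte cap come from the list; the function name, the tiny `BANNED` set, and the case-insensitive dictionary match are assumptions made for this example.

```python
MIN_LENGTH = 8     # characters; the minimum from the list above
MAX_BYTES = 1024   # generous cap, only there to bound the request size

# Stand-in for a real list of well-known passwords (assumed stored lowercased).
BANNED = {"password", "12345678", "qwertyuiop"}

def check_new_password(password: str, banned: set[str]) -> list[str]:
    """Return a list of policy violations; an empty list means acceptable."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append("too short")
    if len(password.encode("utf-8")) > MAX_BYTES:
        problems.append("too long")
    if password.lower() in banned:
        problems.append("listed in a known password dictionary")
    return problems
```

Note that the upper limit is checked in bytes rather than characters, since it exists to protect the protocol and the server, not to constrain what the user may type.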

Phew, this was quite a long list, but unfortunately these are the MINIMUM things which you MUST do if you want to provide your users with the (often questionable) convenience of using passwords. Of course, certificate-based authentication (or even better, two-factor authentication) would be much better, and if you can push your management to push your users to use it – it is certainly the way to go, but honestly, this is not likely to happen for 99% of the projects out there. Another way is to rely on Facebook/whatever-other-service-everybody-already-has logins – and this is preferable for most of the apps out there, but most likely you will still need to provide an option for the user to use a local account on your own site, and then all the considerations above will still apply. 🙁

For browser-based apps, the schema would be almost the same, except for replacing ‘Implement Client + Server hashing…’ with:

  • Implement Client + Server hashing without client_slow_hash, i.e. with P’=P. Configure server-side function to be ‘as slow as feasible’
    • As of now, scrypt is recommended to be used on the server-side. EDIT: make it “scrypt or Argon2”
    • For the server side, load parameters which are based on the maximum-allowable delay in the worst-possible-case (for example, in the case of a massive reconnect if applicable) for the server-hardware-currently-in-use.

Note that when/if WebCrypto is widely adopted, browser-based apps should also move towards fully implemented Client + Server hashing as described for installable client-apps.


Acknowledgements

This article has been originally published in Overload Journal #129 in October 2015 and is also available separately on ACCU web site. Re-posted here with a kind permission of Overload. The article has been re-formatted to fit your screen.

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.


Comments

  1. Scott says

    “Allow a maximum password length of at least 12, preferably 16”

    Is this recommending setting a max length for passwords? I would think if someone is willing to have a longer password (I use a 25 character password ala http://xkcd.com/936/), they should be lauded for it.

  2. Earl says

    Great article, very like it.

    One comment.
    ‘Allow a maximum password length of at least 12, preferably 16’ – limiting password max length may be annoying for the user, who want to put some mnemonics. I see some solutions that do silent pruning to some max allowed length, without generating the error. What do you think about such approach?

    • "No Bugs" Hare says

      I don’t really like it (what if all the entropy in your password is in the ignored part?) Also IIRC it was used in MS LM hash, and has caused problems. I’ve changed the wording.

      • Jamey Sharp says

        I don’t understand. Why would you ever limit the maximum length? You’re going to hash the password anyway, so you only have to store the size of the hash. I suppose at some point having someone do a 1MB HTTP POST because their password is a million characters long could lead to some resource exhaustion issues, but is that a serious concern?

        • "No Bugs" Hare says

          You’re right, with a clarification: there still should be SOME limit (ALL fields MUST have SOME kind of size limit as a protection from protocol-related attacks, like “I’m attacking you via transferring 100Gigabyte ‘password’ just to bring your server down”). Setting the maximum length of password to, say, 1K, is perfectly feasible, shouldn’t affect performance in any way, and is not really restrictive (I didn’t hear of users willing to type 1000 characters in on login; in fact, 100+ bytes is already very rare). I’ve changed the wording in that place once again.

  3. says

    Very informative for our team, thank you for Great article,
    I will suggest one thing to readers that always make your password strong with using some symbols and arabaic characters. It always good to make our password strong and length of your password must be 12 character.

  4. Dmitry says

    There are some cases when you need to have actual password, example is to be able to connect to RDBMS.

    What would your advice to handle storage of “plane password to be used”.

    • "No Bugs" Hare says

      It is a tough question without a good answer :-(.

      First of all, let’s note that this (storing password on one server to authenticate to another server such as RDBMS; in fact, it is server acting as a client) is VERY different from what this article is about (storing password DB on the server-side).

      Second: AFAIK, for storing passwords on server-acting-as-a-client, there is no really good solution :-(. A few recommendations in this regard include:

      – if your DB supports it, replace password-to-authenticate-to-your-DB with a certificate; won’t really help too much, but is one of those “best practices”

      – at the very least, make sure that nobody-except-your-service-account has permissions to access the file with the password

      – if possible, encrypt your password/certificate with another manually entered “master password”; in theory, it should work as follows: on each reboot, your service doesn’t start automatically, but waits for admin to come in and enter the “master password”; then this “master password” is used to decrypt stored password/certificate. Admittedly, this is a REAL pain in the neck, so personally I will NOT blame you if you can’t do it (but security guys will)

      – if it is not possible to encrypt your password/certificate with a manually entered “master password”, my current recommendation is to encrypt it with something-unique-on-the-same-server (like UUID of / filesystem on Linux, PLUS something-else of the same type). Note that this encryption is inevitably only a security-by-obscurity (but still preventing trivial attacks such as “copy the file with the password, that’s it”, so I am not buying a popular argument that “it is security by obscurity, so it is useless”). However, to keep the obscurity up (and associated kinda-protection-from-not-so-dedicated-hacker) you DO need to take more than one of such unique things and combine them (and not to tell anybody which ones you took or how you combined them).
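
To make the last item a bit more concrete, here is a minimal sketch of the key-derivation half only, assuming Python. The name `derive_local_key`, the choice of PBKDF2-HMAC-SHA256, and the length-prefixing scheme are all illustrative assumptions; which machine-unique values to feed in (and how to read them) is deliberately left to the caller, and the actual encryption of the stored password should be done with an authenticated cipher from a vetted crypto library, which is not shown here.

```python
import hashlib


def derive_local_key(identifiers, salt, iterations=200_000):
    """Combine several machine-unique byte strings into one 32-byte key.

    `identifiers` is a list of bytes values chosen by you (hypothetical
    examples: a filesystem UUID plus something else of the same type).
    """
    if len(identifiers) < 2:
        # Per the advice above: take more than one such unique thing.
        raise ValueError("combine more than one machine-unique value")
    # Length-prefix each value so that [b"ab", b"c"] and [b"a", b"bc"]
    # do not collapse into the same input.
    material = b"".join(len(i).to_bytes(4, "big") + i for i in identifiers)
    return hashlib.pbkdf2_hmac("sha256", material, salt, iterations)
```

The same function shape works for the “master password” variant: pass the manually entered password as the key material instead of the machine-unique values. In both cases the stored file holds only the salt and the ciphertext, never the key itself.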

  5. mlb says

    Good article. I found nearly all the answers I needed.

    But I have a problem securing my app on Windows with strong login authentication. I can’t use any network, so everything has to be done on the same machine.

    Any advice on the subject?

    Thx.

    • "No Bugs" Hare says

      > But I have a problem securing my app on Windows with strong login authentication.

      Can you elaborate on the purpose of this authentication (i.e. what is the thing you’re trying to protect)? Are you trying to authenticate the user to protect his data (but he’s already authenticated locally as he logged into Windows/Linux/…) – or are you trying to authenticate the copy of your software to protect yourself from copying (which is a very very different story), or something else?

      • mlb says

        Sorry for the delay.

        I am trying to protect text files and SQLite databases on a Win10 system. These files must be protected from users.
        You can access these files only through the software.

        My idea was to create another Windows account. Call it “Software Account” (the user doesn’t know its password). This account has all the permissions on the files that my software needs.
        I create a logon system running as a Windows Service. If the authentication is correct, I run the software with the rights of the “Software Account”. The user/pwd pairs are stored under a local system account, so a user can’t access them.

        With this system, a user doesn’t have rights on the files but can access them through the app.
        I can have multiple users who can log in through the login service, but only one can be logged in at a time.

        Another question is how to protect the communication between the user’s environment and the service? I was thinking of using SSL to protect the stream.

        thx,
        mlb

        • "No Bugs" Hare says

          So you’re trying to protect files/databases _from_ your own user, am I right?

          First of all, we need to realize that while we can _try_ to protect in such environments – it cannot possibly be a real protection, and a dedicated attacker will break it sooner or later (usually sooner).

          In other words – we’re only speaking about _obscurity_ (and not about real security).

          Another Windows account could work for this purpose – but only if you have means to disable admin access to the machine (otherwise taking over any account is trivial). In turn, disabling admin access is feasible only if you’re going to install your app ONLY to corporate machines in a company-which-already-enforces-no-admin-rights-for-users-policy.

          If you’re not in such an environment – separate account will fall very quickly and very obviously. If you need to try to protect your files/database from home users – TBH, you’re pretty much out of luck (if they want your data – they will get it); if you still want to put some protection – IMO your best bet would be to “encrypt” your files (and records within DB) – though, as you’ll need to have your encryption key within your app – protection won’t last long against a dedicated attacker :-(.

  6. mlb says

    Hi,

    Thanks for your reply.

    We provide the computers with the software on them, so we can manage the admin rights on them. Otherwise it’s not even possible to protect your data.

    How we tried to implement it:
    All the files are in the “user” session and a login service is running. We protect the encryption key and login/password under an admin account.
    If the login is successful, we return the key to our app to decrypt the data in memory.
    We just need to implement secure sockets between our app and the service.

    • "No Bugs" Hare says

      So you’re lucky, I guess 🙂 . Please keep in mind that strictly speaking, it still doesn’t protect your data (as the attacker can reverse-engineer your app and get the key from its memory) – so you’re still in the realm of security-by-obscurity. OTOH, with a very limited distribution (which is inherent to such deployments) – and if there isn’t _that_ much incentive to get the data (it isn’t nuclear access codes you’re protecting, right?) – well, it MIGHT fly :-).

      BTW, under such circumstances I’d rather advise NOT to give the key to the data to the unprotected user space, but rather run the logic in that service (which runs, I guess, under LocalSystem account) – and only visualize the data under the unprotected account. So, it will be the service (and ONLY the service) accessing those critical files, and only processed data will ever go to the unprotected space; think of this service as of a kinda-web-server (and of your Client – as of kinda-browser).

      By doing it this way – you’ll get two benefits: (a) it will be _much_ more difficult to get _all_ the data; (b) if you insert logging of accesses (and will send logs to some central location) into this service – you’ll be able to see that something fishy is going on (and take non-technical countermeasures).

      > Otherwise it’s not even possible to protect your data.

      As noted above – it is not possible in any case. In general – all the data which is displayed to the user – can be broken. However, it MAY be made more difficult for the attacker to get _all_ the data (the system I described above – goes in this direction).

      > We just need to implement secure sockets between our app and the service.

      Secure sockets are easy (just use OpenSSL – or whatever-other-TLS-implementation-you-like – and make sure to check Server certificate on Client side); what is not easy – is to establish a secure channel coming from an inherently insecure Client. In general – it all boils down to a question “what is that certificate which your Client trusts?”; very shortly: in such hostile environments – it SHOULD be a certificate embedded into your Client (and not one of root certificates in some system storage); moreover – this certificate SHOULD be obfuscated (scrambled, encrypted-with-a-key-stored-within-Client, etc.). It still doesn’t provide real protection – but does raise the attack cost over one of the most obvious attack vectors.
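
As a rough illustration of “trust only the certificate embedded into your Client”, here is a minimal sketch using Python’s standard `ssl` module. The helper names `new_pinned_context`, `pin_certificate` and `open_pinned_connection` are hypothetical, and the obfuscation of the embedded certificate discussed above is not shown: the PEM text is assumed to have been de-obfuscated already by the time it is passed in.

```python
import socket
import ssl


def new_pinned_context():
    """Create a TLS client context that trusts nothing yet.

    Constructing SSLContext directly (rather than via
    ssl.create_default_context) does not load the system root
    certificates, which is exactly what we want here.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # PROTOCOL_TLS_CLIENT already turns on certificate verification
    # and hostname checking; we only state it explicitly.
    ctx.verify_mode = ssl.CERT_REQUIRED
    ctx.check_hostname = True
    return ctx


def pin_certificate(ctx, embedded_cert_pem):
    """Make the embedded certificate (PEM text) the only trust anchor."""
    ctx.load_verify_locations(cadata=embedded_cert_pem)
    return ctx


def open_pinned_connection(ctx, host, port):
    """Connect and perform the TLS handshake against the pinned certificate."""
    sock = socket.create_connection((host, port))
    return ctx.wrap_socket(sock, server_hostname=host)
```

With hostname checking left on, the Server certificate has to carry a name matching `host`. And once again: this only raises the attack cost, since whoever controls the Client can still patch the embedded certificate.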

  7. says

    I feel like your article ought to at least mention the fact that the entropy is much lower for human-made passwords than the theoretical entropy, and password cracking software targets human-made passwords.

    • "No Bugs" Hare says

      IMO, it is not that directly related to the subject of “how to hash passwords”, but I added a footnote about it, thanks!
