3

From what I understand, in most sites, passwords are stored as a hash; not in their original form. This means that if the database is hacked into, the attacker will see the (nearly) useless hashes instead of the actual passwords.

For the website to authenticate your login attempt, it hashes your attempt of a password, and checks if it matches the stored hash. If they match, it logs you in, otherwise, it denies it.

I've read on many sites though that no hash algorithm can produce 100% unique hashes; collisions will always happen.

Does that mean that, theoretically, there exists another password that will give the same hash given a hash-function, which means there is potentially more than 1 password that could be used to log in with (although the other password would probably be near-random for a complicated hash-function)?

Carcigenicate
  • 2,634
  • 3
  • 24
  • 38
  • see also: [Randomized Hash function with no collisions](http://programmers.stackexchange.com/a/237114/31260) – gnat Jun 28 '15 at 17:37
  • The technical term for this is a "preimage attack", and yes it's possible. A hash function is considered "cryptographically secure" when it makes preimage and other attacks prohibitively difficult to carry out. If you want to learn more, there are many questions on security.stackexchange.com about which hash functions are more or less resistant to preimage attacks versus other kinds of attacks. – Ixrec Jun 28 '15 at 17:46
  • 1
    The answer to the question in the title is "Yes", but it doesn't matter. A decent hash function cannot practically be cracked by brute force, which means these alternative password cannot practically be found. – gnasher729 Jun 28 '15 at 20:54
  • @gnat The requirements for a collision resistant hash (e.g. SHA-2) and a password hash (e.g. PBKDF2, bcrypt, scrypt) are quite different. In particular the latter is slow, salted and only requires first pre-image resistance not collision resistance or second preimage resistance. So I'm not sure if it's appropriate to close a question about password hashes as a duplicate of a question about collision cryptographic/collision resistant hashes. – CodesInChaos Jun 29 '15 at 09:38

1 Answers1

9

There are in fact infinitely many passwords which produce the same hash. That is actually more or less the definition of what it means to be a hash function: reducing a larger (potentially infinite) input space into a smaller finite output space.

However, a "good" hash function will distribute the hash values among the input values in such a way that any similarity in the input does not translate into similarity in the output. For a cryptographic hash function, such as the ones used for password hashing, this requirement is even stronger.

What this means, basically, is that the passwords which have the same hash value tend to be very dissimilar. In particular, most tend to be very long, very unreadable, cryptic apparently random strings.

Jörg W Mittag
  • 101,921
  • 24
  • 218
  • 318
  • Infinity many is a bit of an exaggeration. – paparazzo Jun 28 '15 at 17:36
  • 8
    @Blam: it's no exaggeration. Strings have unbounded length, so there are countably infinite of them. Hashes have finite length, so there are finitely many. Thus, at least some hashes (all, with a "good" hash function) have countably infinite pre-images. – outis Jun 28 '15 at 17:42
  • 1
    Strings may have unbound length but passwords don't. Your typical password rule is 6 to 32. It may be 128. It will never be unbounded. You don't have unbounded bandwidth or memory. – paparazzo Jun 28 '15 at 17:45
  • 2
    The claim that infinitely many strings will produce the same hash is still accurate, even if only a finite number of those strings can be used as passwords in the real world. – Ixrec Jun 28 '15 at 17:50
  • 2
    @lxrec Maybe that is a valid statement but it is not a valid answer to the stated question. The stated question is not infinite number of strings the stated question is password. There are not infinitely many passwords as there is no authentication in practice that allows an infinitely long passwords. There is not even a computer that allows infinitely long strings. – paparazzo Jun 28 '15 at 17:59
  • Strings are not infinite length. Many languages (Java and PHP for example) limit them to 2GB. (signed 32 bit int max) Even if the language lifted this restriction, no computer in the world has infinite memory. Even if one did, it would take an infinite amount of time for a user (or some algorithm) to generate the infinite length string. – user949300 Jun 28 '15 at 18:33
  • @user949300: "infinitely many strings" isn't the same as "an infinitely long string". With hashes, there are a countably infinite number of strings of finite length that will get mapped to the same hash. – outis Jun 28 '15 at 19:41
  • Sorry @outis. As implied in your initial comment, if strings are __not__ infinitely long, then they are __countable__, and there is a __finite__ number of them. The phrase "infinitely many Strings" is impossible, though we are getting pretty far off topic here... But, technically, the correct answer is that there are __many__ Strings with the same hash, __not infinitely many__. – user949300 Jun 28 '15 at 20:36
  • @user949300: That's basic maths that you are getting completely wrong here. Countable = infinite. For example, the positive integers are countable. Countable means that you can go through the list of positive integers (or finite length strings) and you will encounter each possible integer or string at some point, but you are never going to finish. – gnasher729 Jun 28 '15 at 20:49
  • @gnasher729 Countable != infinite. Though, you are correct, I misused some terminology, countable _could_ mean infinite. e.g., Wikipedia, "A countable set is either a finite set or a countably infinite set" Anyway, the point is, there cannot be an infinite number of Strings that match a certain hash because there cannot be an infinite number of Strings. – user949300 Jun 28 '15 at 21:39
  • "_In theory, there is no difference between theory and practice. But, in practice, there is._" (Jan L. A. van de Snepscheut) – Idan Arye Jun 29 '15 at 00:35
  • @user949300: note I was explicit that "countable" meant "countably infinite", not "finite". 2nd: there are a countably infinite number of strings for the same reason there are countably infinite positive integers, since strings of digits are bijective with positive integers and a proper subset of strings. Since positive integers are countably infinite, strings must be at least countably infinite. To take a related approach, a string can be considered to be a sequence of bytes, which can be interpreted as a positive integer in base 256. Thus, strings are bijective with positive integers; QED. – outis Jun 29 '15 at 04:31
  • @Blam: again, the answer doesn't talk of infinitely long strings but an infinite number of finite strings. Even if a particular authentication system has a limit on password length, the question is about hash functions–which don't have this restriction–used for some unspecified, theoretical authentication system. Thus, that countably infinite strings will have the same hash value for any hash function is completely relevant. You could argue the question is ambiguous on the matter of password length limits, but as it's been closed, the ambiguity would be moot. – outis Jun 29 '15 at 04:38
  • Password hashes are not necessarily collision resistant. In fact PBKDF2 and scrypt have certain trivial collisions (every password longer than one block has an equivalent, trivial to find password that has the same has. – CodesInChaos Jun 29 '15 at 09:37
  • There being infinitely (countably, uncountably, whatever) many strings implies that there is *some* hash value that infinitely many strings hash to. It doesn't mean that *every* hash value has infinitely many strings hashing to it. – Reinstate Monica Jun 29 '15 at 22:02