77

I work in C# and MSSQL and as you'd expect I store my passwords salted and hashed.

When I look at the hash stored in an nvarchar column (for example, with the out-of-the-box aspnet membership provider), I've always been curious why the generated Salt and Hash values always seem to end in either one or two equals signs.

I've seen similar things while working with encryption algorithms. Is this a coincidence, or is there a reason for it?

Liath
  • 20
    As an aside, if you are storing Base64-encoded binary data in an NVARCHAR field, I weep for your 6x storage waste! For one, Base64 can only contain 64 of the characters in the lower half of ASCII (so you only need VARCHAR to save half). For two, Base64 explodes each byte of data into 1-4 characters. SQL Server already has a VARBINARY type that is quite capable of storing your hashes without bloat from encoding, and doesn't care about collation in its comparisons... :-) – jimbobmcgee Jun 17 '14 at 15:28
  • 1
    If you're "hashing" via Base64, that's not a hash alg at all. It's fully and easily reversible. (Maybe there are hash algs that pad with equals signs. Not sure.) –  Jun 17 '14 at 17:44
  • 3
    @WillieWheeler, the hash has to be stored somehow. Base64 might not be the perfect storage medium, but there's nothing inherently wrong with it. If `hash("my password")` produces the array `[1,2,3,4,5]` and I need to store those values in a database, there are worse choices than storing the string `AQIDBAU=` (Of course, if the hash function in use is _already_ producing a string, it seems a bit silly to then Base64 encode it.) – Brian S Jun 17 '14 at 20:42
  • @BrianS In general people hash and salt passwords for security purposes. Base64 provides no security since it's reversible using a standard algorithm with no secret. –  Jun 17 '14 at 21:27
  • 2
    @WillieWheeler I think you are missing the point. Re-read what Brian S wrote - he was not talking about properties of base64 for hashing - that would be absurd; base64 is not a hashing algorithm. He is saying that there is nothing wrong with storing a hash (produced by a hash function/algorithm) in base64 form. – Andrew Savinykh Jun 17 '14 at 21:29
  • 3
    The OP says that he stores the hash and it ends with equals signs. That suggests that he's confusing hashing with Base64 encoding. If the point is that it's fine to base64 encode a hash, then of course, but what does that have to do with anything? –  Jun 17 '14 at 21:37
  • @jimbobmcgee, the aspnet membership provider uses varchar, I tend to use varbinary but thanks for the advice! – Liath Jun 18 '14 at 06:38
  • 1
    @WillieWheeler, correct. Don't worry, I am hashing properly! The hash algorithm converts it to a binary array. I'm asking about the string representation of this array (or similar encoded data presented as a string). – Liath Jun 18 '14 at 06:40
  • 2
    Yeah, I finally realized what is going on. I looked in my SSH public and private key files and noticed that they also have the =/== endings. These are base64 encodings of byte arrays, like BrianS and zespri describe. Thanks guys. –  Jun 18 '14 at 06:47
  • 1
    @jimbobmcgee Fair point, but your math is faulty. Storing binary data as base64 in an nvarchar column is at most a 1.7x storage waste. In varchar, it would be a 0.3x storage waste (33%). – JLRishe Jun 18 '14 at 12:22
  • @JLRishe - not saying you're wrong but how do you get to 1.7x? If you are taking a byte (256 possible values) and encoding it into Base64 chars (64 possible values) you are going to need anywhere between 1 and 4 chars to represent your byte. On top of that, nvarchar stores 2 bytes per character. Nothing in the Base64 character set requires the extra byte, so you end up storing charbyte-nullbyte for each Base64 char. Assuming three chars per byte (base64) and two bytes per char (nvarchar), you get ~6x waste (plus a 2 byte uint for nvarchar length and one bit for nvarchar nullity!) – jimbobmcgee Jun 18 '14 at 18:43
  • @jimbobmcgee Where you've gone wrong is assuming three chars per byte. That's a wildly incorrect assumption. Base64 uses, on average, 1.333... characters per byte of data. This means that binary data stored in a varchar column will use 1.333... times the amount of original space (0.333x waste). nvarchar uses double that amount, making it 2.666... times the original space. Hence, 1.7x waste. – JLRishe Jun 19 '14 at 08:28
  • 1
    @JLRishe - Fair enough -- I was looking at it byte-by-byte and forgetting that Base64 encodes three bytes at a time (which is, ironically, why those `=` chars are there in the first place). I guess I've also seen some horribly-optimised storage of MIME files, in the past! – jimbobmcgee Jun 19 '14 at 11:13
  • This comment thread is sad. I weep for the human race. – Jim Balter May 08 '19 at 13:37

2 Answers

105

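These hashed strings are (usually?) encoded in the Base64 format, and the equals signs are padding. Base64 encodes the input three bytes at a time, turning each group of three bytes into four characters. When the number of input bytes is not divisible by three, the final group is short, and `=` characters are appended so that the encoded length is still a multiple of four. Wikipedia explains it pretty well: http://en.wikipedia.org/wiki/Base64.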
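As a rough illustration of the padding rule, here is a minimal Python sketch using the standard `base64` module. The helper names `encode` and `padding_for` are made up for this example. It reuses the `[1,2,3,4,5]` array from Brian S's comment, and I'd expect C#'s `Convert.ToBase64String` to produce the same strings, since both use the standard Base64 alphabet with padding.

```python
import base64


def encode(data: bytes) -> str:
    """Base64-encode raw bytes and return the result as text."""
    return base64.b64encode(data).decode("ascii")


def padding_for(n_bytes: int) -> int:
    """Number of '=' characters appended for an input of n_bytes bytes."""
    return (3 - n_bytes % 3) % 3


# Encode the byte arrays [1], [1,2], ... [1,2,3,4,5,6] and show how the
# padding cycles through "==", "=", none as the input length grows.
for n in range(1, 7):
    text = encode(bytes(range(1, n + 1)))
    print(n, "bytes ->", text, "(padding:", padding_for(n), ")")

# 1 byte  -> AQ==
# 3 bytes -> AQID
# 5 bytes -> AQIDBAU=
```

So the number of equals signs depends only on the length of the binary data (length modulo 3), not on its content.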

Caleb
svenslaggare
41

Could it be Base64 padding?

Quoting Wikipedia: the '==' sequence indicates that the last group contained only one byte, and '=' indicates that it contained two bytes. The linked section includes an example showing how truncating the input changes the output padding:

http://en.wikipedia.org/wiki/Base64#Output_padding
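This probably also explains why the salt and hash in the question "always" end in one or two equals signs: the byte lengths of common salts and digests are rarely multiples of three. The sketch below is only an illustration under assumed sizes (a 16-byte salt and the standard SHA digest lengths). The all-zero salt and the plain `hashlib` calls are placeholders to get byte arrays of typical length, not a recommendation for how to hash passwords.

```python
import base64
import hashlib


def b64(data: bytes) -> str:
    """Base64-encode raw bytes and return the result as text."""
    return base64.b64encode(data).decode("ascii")


def trailing_equals(text: str) -> int:
    """Count the '=' padding characters at the end of a Base64 string."""
    return len(text) - len(text.rstrip("="))


password = b"my password"  # illustrative input only
salt = bytes(16)           # placeholder 16-byte salt; a real salt would be random

samples = {
    "salt (16 bytes)": salt,
    "SHA-1 (20 bytes)": hashlib.sha1(salt + password).digest(),
    "SHA-256 (32 bytes)": hashlib.sha256(salt + password).digest(),
    "SHA-384 (48 bytes)": hashlib.sha384(salt + password).digest(),
    "SHA-512 (64 bytes)": hashlib.sha512(salt + password).digest(),
}

for label, raw in samples.items():
    encoded = b64(raw)
    print(f"{label}: {len(encoded)} chars, {trailing_equals(encoded)} '=' at the end")
```

Under these assumptions, 16 and 64 bytes leave one byte in the last group (`==`), 20 and 32 bytes leave two (`=`), and only the 48-byte digest divides evenly and needs no padding.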

Jaydee