One large financial client we do business with has a standardized, automated process for obfuscating data. We don't, so I have a few scripts where I do this by hand. The point is to leave reasonably realistic data (lengths of names, postal codes) while rendering the personally identifiable data irretrievably scrambled. Their system is far more complicated than this, but basically, whenever production data gets copied to development and QA environments, it is scrambled automatically. This way there is no potential for "forgetting" to do some of the scrambling.
Passwords:
Set them all to something test accounts use, such as Password1 or 1234567.
Tax ID numbers, Social Insurance Numbers, Social Security Numbers:
Keep the first 3 digits and generate random digits for the remainder. In the US, the first 3 digits of SSNs issued before mid-2011 were generally assigned based on where you lived when the SSN was issued, so not all combinations of first 3 digits are valid (newer SSNs are randomized, but some prefixes are still never issued). For EINs, keep the first 2 digits, as not all combinations of first 2 digits are valid. Adjust which digits get left alone if your country uses different rules.
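The "keep a prefix, randomize the rest" rule can be sketched roughly as follows. This is only an illustration, not the client's system or my actual scripts: the class name `DigitScrambler`, the method `RandomizeAfterPrefix`, and the `keepDigits` parameter are made up for this example, and `System.Random` is assumed to be good enough because the new digits are not derived from the original data.

```csharp
using System;
using System.Text;

public static class DigitScrambler
{
    // Shared generator (hypothetical choice; Random is not thread-safe,
    // so guard it or swap it out if you scramble in parallel).
    private static readonly Random Rng = new Random();

    // Keeps the first keepDigits digits of value, replaces every later digit
    // with a random one, and copies non-digit characters (dashes, spaces,
    // parentheses) unchanged so the formatting stays realistic.
    public static string RandomizeAfterPrefix(string value, int keepDigits)
    {
        var result = new StringBuilder(value.Length);
        int digitsSeen = 0;

        foreach (char c in value)
        {
            if (c >= '0' && c <= '9')
            {
                digitsSeen++;
                result.Append(digitsSeen <= keepDigits
                    ? c
                    : (char)('0' + Rng.Next(10)));
            }
            else
            {
                result.Append(c);
            }
        }

        return result.ToString();
    }
}
```

Calling `DigitScrambler.RandomizeAfterPrefix("123-45-6789", 3)` leaves `123-` and the second dash in place and randomizes the other six digits. Passing 2 for `keepDigits` would fit the EIN case, and the same helper with 3 would cover a phone number whose first three digits are the area code.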
Names:
Hash and base64-encode the first and last names separately. Take the first letter of the unhashed name, append the encoded hash to it, and truncate the result to the original name's length.
Example:
Name = "John Doe" (I am using SHA384)
So John Doe gets turned into Jnbn Dnh. It helps to keep the names the same length, as that may help to point out usability issues.
If you have rules such as "names cannot have digits", then you need to strip out the base64 characters that aren't valid and lowercase the subsequent letters (done in the sample code below).
Addresses:
Street names and city names get hashed the same way names are above. Numbers stay the same. State and zip stay the same.
So 1313 Mockingbird Lane becomes 1313 Mvtqiwtuqrd Lzzx
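As a rough sketch of the address rule, the line can be split on spaces, with number tokens left alone and every other word scrambled. The names `AddressScrambler`, `ScrambleAddressLine`, and `ScrambleWord` are invented for this example, and hashing each word separately is an assumption on my part, so the output will not necessarily match the sample above.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

public static class AddressScrambler
{
    // Scrambles each word of a street or city name.
    // Any token that contains a digit (house numbers, "5th") is kept as-is.
    public static string ScrambleAddressLine(string line)
    {
        var tokens = line.Split(' ')
            .Select(t => t.Any(char.IsDigit) ? t : ScrambleWord(t));
        return string.Join(" ", tokens);
    }

    // Hypothetical helper: first letter of the word, followed by the letters
    // of the base64-encoded SHA-384 hash, cut to the original word's length.
    private static string ScrambleWord(string word)
    {
        if (word.Length == 0) return word;

        using (SHA384 hasher = SHA384.Create())
        {
            byte[] hash = hasher.ComputeHash(Encoding.UTF8.GetBytes(word));
            string letters = word.Substring(0, 1)
                + Regex.Replace(Convert.ToBase64String(hash).ToLowerInvariant(), "[^a-z]", "");
            return letters.Substring(0, Math.Min(word.Length, letters.Length));
        }
    }
}
```

With this sketch, `AddressScrambler.ScrambleAddressLine("1313 Mockingbird Lane")` keeps `1313`, and the other two words keep their first letter and length. Because the result is derived from a hash, the same input always scrambles to the same output, which may be handy if the same street appears in several tables.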
Phone numbers:
Leave the area code the same and generate random digits for the rest of the number.
Credit Card Numbers:
You should not be storing these at all.
Here is some crude sample C# code for hashing and truncating (kept simple to show the concept):
    using System;
    using System.Security.Cryptography;
    using System.Text;
    using System.Text.RegularExpressions;

    public static class Scrambler
    {
        public static string ScrambleInput(string sInput)
        {
            // Nothing to scramble; this also keeps Substring(0, 1) from throwing.
            if (string.IsNullOrEmpty(sInput)) return sInput;

            // Keep the first character of the original value.
            string sReturn = sInput.Substring(0, 1);

            // Hash the input and base64-encode the hash.
            // UTF-8 is used so non-ASCII names are not all mapped to '?'.
            string sTemp;
            using (SHA384 hasher = SHA384.Create())
            {
                byte[] buff = Encoding.UTF8.GetBytes(sInput);
                sTemp = Convert.ToBase64String(hasher.ComputeHash(buff));
            }

            // Lowercase, then drop everything that is not a letter (digits, '+', '/').
            sReturn += Regex.Replace(sTemp.ToLowerInvariant(), "[^a-z]", "");

            // Truncate to the original length. Very long inputs can exceed the
            // letters available from one hash, hence the Math.Min guard.
            return sReturn.Substring(0, Math.Min(sInput.Length, sReturn.Length));
        }
    }