I'm in the process of writing a development philosophy document for a small group of developers (6.5 developers to be specific), but ideally it's a document that will set down the company's best practices as our team scales up. In the process of creating this document I've been interviewing each engineer to understand their development process, to make something that's a good mix of helpful in eliminating redundant work and not overly prescriptive.

Most of the issues I've found have pretty good solutions, but there's one that I assume must be common and that I can't seem to find a good solution for: managing environment variables that are required for local development. For instance, we've got a development key for AWS. It's required to use a lot of functions and features within the application we're developing.

At present the process is pretty bad. Obviously we can't commit an env file to Github as it's sensitive information that we don't want the entire org to have access to. Right now we have a secure S3 file that contains the env file required for development.

We've found that there are a lot of issues with keeping this up-to-date, both in terms of new env variables making it into this file and pulling it down during development. It's not very surprising, since it's the single piece of the development cycle that's not a part of GitHub, and it's something that only has to be updated or accessed pretty rarely. That said, when people run into issues it can be very frustrating, since realizing that you've got the wrong S3 key, or something similar, isn't a standard kind of error-checking.

Are there any solutions for managing environment variables that people have found work particularly well? We had an employee write a repo that is the best solution we've currently got (https://github.com/sihrc/privvy), but is there something better?

EDIT: The reason why private repositories are not the solution for this problem (we're already using private repos)

  1. These variables aren't limited to a single project. The development env file for instance is

  2. Version-controlling this file would make development difficult or impossible. If we roll an API key, that's the API key that we need to use forever more. If you need to revert to an older revision and the key changes, that's a problem

  3. It's not strange for us to have to add a third-party to a repo. It's VERY important that they be able to view the code, but not the credentials.

  • Easy: Don't use environment variables. They're not version controlled, they're stringly typed (i.e. they're not typed), and they're another dimension of configuration necessary to get a new machine going. – Alexander Jul 12 '17 at 18:54
  • @Alexander what would you suggest as an alternative? – Slater Victoroff Jul 12 '17 at 18:57
  • A version-controlled config file (`XML`, `JSON`, `YAML` or whatever), and a config manager that parses it and lets consumers query it in a strongly-typed manner (i.e. returning an `Int` where appropriate, rather than a string containing an `Int`). – Alexander Jul 12 '17 at 18:59
  • @Alexander that... doesn't solve my problem at all. The fact that it's an environment variable rather than an XML file is immaterial. It's about using privileged information in a development flow. Committing our AWS creds to git is exactly what we're trying to avoid. – Slater Victoroff Jul 12 '17 at 19:00
  • I failed to mention this earlier: it would require you have an internal git repository. – Alexander Jul 12 '17 at 19:06
  • Your very question boils down to "It's annoying to have this one component of the dev workflow that's not in git", to which the only real solution is "put that component of the dev workflow in git". The only way to do that securely is to host your own repo, which isn't particularly unreasonable. – Alexander Jul 12 '17 at 19:08
  • @Alexander the real question is: I have a file (we use docker so our env variables are all just one config file) that I need to keep secure, while the rest of the repo I do not have to keep secure. What is a good workflow for keeping a single file secure without introducing undue development friction. Hosting our own repo both doesn't actually solve this problem, and introduces more difficulty into the process than using privvy. – Slater Victoroff Jul 12 '17 at 19:12
  • @Alexander Additionally, there's no need for version control of this file. Commits or diffs on this file aren't going to be meaningful, and there should never be a reason to have anything other than the most recent version of this file, even if you're running old code. Changes are necessitated by things like rolling API keys, and happen apart from the rest of the development workflow. – Slater Victoroff Jul 12 '17 at 19:16
  • Totally disagree about version control and creds. You definitely want a history of who changed what creds and when. What you should be doing is using a separate repo from the actual code to store the config files and ensuring that access rights to the credential repo are on a need-to-know basis. During the build and deploy steps of your automated deploy system you should get the latest creds from the repo and combine them with the app code. Environment variables might be a good way to do that. But versioning credentials is very useful. – RibaldEddie Jul 12 '17 at 19:59
  • Additionally, some SCM systems, git for example, allow you to use a GPG key to sign commits. Strongly recommend having an authority in your org who signs commits for your credential repo, and having your automated deployment system validate the signature on the credentials before going ahead with deployment. There are attack surfaces where a rogue employee could change a credential temporarily or revoke and renew an API key in order to steal credentials. By using version control and signed commits you make it harder to roll creds unnoticed. – RibaldEddie Jul 12 '17 at 20:10
  • @SlaterTyranus: if you think you do not need the script for setting the environment variables under version control, what is stopping you from simply putting it in a shared network folder? You can use the security system of your environment to make sure only the right people can access it, and implement an automated process which runs this script for each user on certain events (like logging in to the network, or before the credentials are used). If you need to, use a fallback strategy for when the script is not accessible, for example asking the user to enter the credentials manually. – Doc Brown Jul 12 '17 at 20:45
  • ... and do yourself a favor and separate the confidential parts of the script from the other parts. If you have a situation where, as you wrote, "new env variables make it into this file" frequently, I assume you did not do this. I would not be astonished if you could put 80 to 90% of the contents of the script under version control simply by doing this separation. – Doc Brown Jul 12 '17 at 20:50
  • Related: [Where should you store/how should you control access to application secrets?](https://softwareengineering.stackexchange.com/questions/302820/where-should-you-store-how-should-you-control-access-to-application-secrets) – Doc Brown Jul 12 '17 at 20:53
  • @DocBrown Shared network folder is a great idea! I hadn't considered that. Some of the other suggestions here seem pretty decent as well, but in terms of ease of use and security I think this is the most likely one to implement. – Slater Victoroff Jul 12 '17 at 21:56
  • @DocBrown Also, there's no script. This is just the .env config file for a docker container. – Slater Victoroff Jul 12 '17 at 21:57
  • @SlaterTyranus: I don't know if .env files are flexible enough to allow the separation I suggested, but if not, consider creating a script which checks whether the file is in place and up-to-date. If it is not, the script generates the .env file on the fly. For this generation, it may take some kind of template file (which is under version control, as is the script itself), and fill in the missing credentials by getting them from the shared network folder. Calling this script might be part of the script which is used to start docker. – Doc Brown Jul 13 '17 at 05:28
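
A minimal sketch of the template idea from the last comment, under stated assumptions: the version-controlled template uses `${NAME}` placeholders for the confidential values, and the secrets live in a plain `KEY=value` file on the shared folder. The function names and both file layouts are hypothetical, chosen only for illustration; nothing here comes from privvy or from Docker itself.

```python
from pathlib import Path
from string import Template


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def render_env(template_text: str, secrets: dict[str, str]) -> str:
    """Fill ${NAME} placeholders in the template.

    Template.substitute raises KeyError when a placeholder has no
    matching secret, so a stale secrets file fails loudly instead of
    producing a half-filled .env.
    """
    return Template(template_text).substitute(secrets)


def generate_env(template_path: Path, secrets_path: Path, output_path: Path) -> None:
    """Read the template and the secrets file, then write the final .env."""
    secrets = parse_env(secrets_path.read_text())
    output_path.write_text(render_env(template_path.read_text(), secrets))
```

With something like this, the non-confidential settings and the list of required variable names stay in the repo, and only the secret values live outside it. A missing secret shows up as a `KeyError` naming the variable, which is easier to diagnose than a container that starts with the wrong key.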

1 Answer

Obviously we can't commit an env file to Github as it's sensitive information that we don't want the entire org to have access to.

The fact that the information is sensitive doesn't mean it shouldn't be under version control. Similarly, the fact that you're using GitHub doesn't mean all your information is public. GitHub Enterprise has private repositories, and within your company, individual repositories can be restricted to a subset of employees.

Right now we have a secure S3 file that contains the env file required for development.

That's not the right approach. Instead of having your configuration in one place (version control), you're putting information in two different locations, one of which is rather difficult to access. While S3 supports versioning, it's not as straightforward as in Git. Finally, unless you somehow configured S3 to use your corporate SSO server, you're forced to create additional accounts for every user who needs to access the S3 file. This is an open door to severe security issues (as in “Hey, I can't access the S3 file. Seems my account has an issue again!”; “Well, just use mine. The password is ...”).

We've found that there are a lot of issues with keeping this up-to-date, both in terms of new env variables making it into this file and pulling it down during development.

Obviously. As explained above, two sources increase complexity.

Are there any solutions for managing environment variables that people have found work particularly well?

Private repositories.

Another solution I've seen in several companies is to publish the settings into a public repository, but encrypt it. The problem with this approach is often the fact that the encryption key should be stored somewhere as well, and, more importantly, shared. This leads to two issues:

  • When a disgruntled employee has your keys, you have to decrypt and re-encrypt everything with a new key, and share the new key with everyone concerned.

  • Everyone shares the same key. It might be OK for a small team (although I don't see a reason why this would be the preferred approach), but will quickly become an issue if the key needs to be shared with outsiders: consultants, interns, or people from other teams.

Additionally, there's no need for version control of this file. Commits or diffs on this file aren't going to be meaningful

The first time your S3 file gets screwed up by a disgruntled employee, you'll find that having a permanent history of every change is essential.

Same for diffs. Spending hours figuring out that someone replaced the wrong key by mistake, and then trying to figure out how to get the right keys back, is much less painful when you can track all the changes step by step.
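
To illustrate what such a diff can look like for an env file, here is a minimal sketch. It assumes plain `KEY=value` lines, and `diff_env` is a hypothetical helper name. It reports only the names of the variables that were added, removed, or changed between two revisions, so the secret values themselves never end up in a terminal or a log.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def diff_env(old_text: str, new_text: str) -> dict[str, list[str]]:
    """Compare two revisions of an env file by variable name only."""
    old, new = parse_env(old_text), parse_env(new_text)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

Fed with two revisions pulled from version control, this narrows "something is wrong with the keys" down to the specific variables that differ.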

The reason why private repositories are not the solution for this problem (we're already using private repos)

  • These variables aren't limited to a single project. The development env file for instance is

GitHub Enterprise has a branch restrictions feature. Although it is not as granular as per-directory access control, it may do the job.

Otherwise, you may switch from GitHub to a service which makes it possible to restrict access per directory. Personally, I'm using an SVN repository which, among other things, contains the configuration of every virtual machine I deploy: one part is publicly available, but the other part, which contains confidential data such as the secret keys to Amazon AWS, Google and Twilio services, is restricted to named persons. It consists of a simple directory.

  • Version-controlling this file would make development difficult or impossible. If we roll an API key, that's the API key that we need to use forever more. If you need to revert to an older revision and the key changes, that's a problem

I don't think that there is an actual solution which works in every case. For instance, what if a revision you made two days ago migrated from Bing Maps to Google Maps? Meanwhile, you may have removed all the Bing Maps API keys; therefore, if someone reverts to an older version, it won't work, and there is nothing you can do.

Otherwise, when it comes to changes in API keys, these should be pretty rare. Unless you have a security breach, there is little use in changing those keys on a regular basis.

  • It's not strange for us to have to add a third-party to a repo. It's VERY important that they be able to view the code, but not the credentials.

Same here. Branch restrictions in GitHub Enterprise, or per-directory permissions in version control systems which support this feature.

Arseni Mourzenko
  • `publish the settings into a public repository, but encrypt it.` that just kinda pushes the problem back by one level. Now you have the same problem, but with regards to this key rather than the api keys (or other sensitive data) that it encrypts – Alexander Jul 12 '17 at 19:24
  • I think there's an important delta here, which is the difference between "the entire org" and the public. We're already using private repos. I'll add a couple of examples to the question since it's unfortunately not clear. – Slater Victoroff Jul 12 '17 at 19:27
  • @Alexander: exactly. This is why I don't like this approach. I listed it in my answer because this approach seems to be a current practice, but I don't encourage it. – Arseni Mourzenko Jul 12 '17 at 19:27
  • @SlaterTyranus: then what prevents you from giving access to the repository to only a team or some members of a team? – Arseni Mourzenko Jul 12 '17 at 19:27
  • @ArseniMourzenko See above. Often we need to give access to the repository to people outside the organization – Slater Victoroff Jul 12 '17 at 19:32
  • This. Upvoted. Private repos are the answer. Not sure why you would want to avoid keeping track of changes. – RibaldEddie Jul 12 '17 at 21:06