Don't keep encrypted secrets in your git repositories, if for no other reason than that it makes access revocation deceptively difficult --- but also because it encourages you to have a development team in which ordinary devs have a full complement of secrets on their laptops at all times.
Instead, keep secrets "out of band" and supply them to applications as part of your deployment process.
We ran into this problem with terraform and needed a fix quickly. The problem is that they recommend you check in your .tfstate files, which does make sense; they do need to be synchronized between everyone who might be working on the repo. However, we learned later down the line that in some cases the state files might contain secrets. So we rolled out git-crypt for all .tfstate files to future-proof ourselves against accidental check-in of secrets.
Terraform does support a number of out of band state management backends that I would prefer to use, but none of those backends support encryption at rest. Hopefully hashicorp will roll out support for vault as a backend at some point...
S3's "encryption at rest" is transparent (see: http://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-enc...). If someone (or some server) has access to the S3 bucket, they have access to all of the data. Control then is delegated to IAM roles and permissions, not to the crypto model.
> As long as you authenticate your request and you have access permissions, there is no difference in the way you access encrypted or unencrypted objects.
The only advantage of S3 encryption is that if someone walks out of the data center with a disk, they can't read the data on it.
All you need are the same IAM secrets that you already needed for terraform. Keeping the state out of repos and in encrypted buckets is definitely the way to go.
Personally I usually go with env vars for small, one-off deployments. For larger environments with multiple deployables/teams, a tool like Vault (https://www.vaultproject.io/) is pretty handy.
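For the env-var route, a minimal sketch of the consuming side (the variable name and error message are illustrative, not from any particular tool): the deploy process exports the secret, and the app reads it at startup so it never has to live in the repo.

```python
import os
import sys

def load_db_password() -> str:
    """Read a secret injected by the deployment process (name is illustrative)."""
    secret = os.environ.get("DB_PASSWORD")
    if secret is None:
        # Fail fast and loudly rather than running with a missing credential.
        sys.exit("DB_PASSWORD is not set; was this started by the deploy wrapper?")
    return secret
```

The point is simply that the application's only coupling to the secret store is an environment variable name, which any of the deployment mechanisms discussed here can populate.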
For many purposes passing in secrets by any of the above mechanisms can be fine, though as you noted there are caveats. Whether they matter depends strongly on your situation.
There are some things you can do outside of the physical means of passing the secrets that can help reduce risk:
- Give applications very constrained credentials that only give them access to what they actually need. This won't stop the credentials from getting compromised, but it at least reduces the risk of them doing so and -- assuming the credentials also serve as identification for the application -- allows you to trace back a compromise to the application it came from, which may make recovery easier.
- Use time-limited credentials that need to be renewed periodically. This is harder to implement without logic in the application, but it can potentially be done by having a file on disk (possibly ramdisk) that the application re-reads somewhat often and that gets updated by some other process.
- Pass an application a one-time-usable token via any of your given mechanisms and then have it reach out to some other service with that token to get the real secrets it needs. Since the token is invalidated on first use, the window of compromise is small and in particular this can help mitigate the child-process-related vulnerabilities as long as the app is careful to read its credentials before doing any other work.
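The second idea above (time-limited credentials refreshed from a file that some other process rotates) can be sketched without much logic in the application itself; everything here, including the path and interval, is illustrative:

```python
import json
import threading
import time

class RotatingCredentials:
    """Periodically re-reads a credential file that an external process rotates."""

    def __init__(self, path="/run/secrets/app-creds.json", interval=60):
        self.path = path          # ideally on a ramdisk, as suggested above
        self.interval = interval  # seconds between re-reads
        self._creds = {}
        self._lock = threading.Lock()

    def refresh(self):
        with open(self.path) as f:
            fresh = json.load(f)
        with self._lock:
            self._creds = fresh

    def get(self, name):
        with self._lock:
            return self._creds.get(name)

    def start(self):
        def loop():
            while True:
                try:
                    self.refresh()
                except OSError:
                    pass  # keep the last good credentials on a transient failure
                time.sleep(self.interval)
        threading.Thread(target=loop, daemon=True).start()
```

The application calls `get("db")` wherever it previously used a static credential; rotation happens entirely in the process that rewrites the file.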
Hashicorp Vault combined with the utility "consul-template" (which is now a bit of a misnomer since it talks to vault too) can be helpful building-blocks for implementing these ideas.
As noted above, whether this is warranted or suitable depends on your context and risk-tolerance. All security stuff requires cost/benefit analysis, since nothing is a magic bullet.
What app? They all take credentials in different ways. The ideal would be an unprivileged child talking to a privileged parent, where the child asks for the secret and is authorized by the parent, and then both the child and parent erase the [unencrypted] secret when not in use. If your app doesn't do this, executing some app that then feeds the secret via a file descriptor (pipe/socket/etc.) (and hopefully erases all traces of it afterward) is similar.
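A rough sketch of the file-descriptor variant (all names illustrative): the parent holds the secret and writes it to the child over a pipe, so it never appears in the child's argv or environment.

```python
import subprocess

def run_with_secret(cmd, secret: bytes) -> int:
    """Start cmd and feed it the secret on stdin; nothing lands in argv or env."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    try:
        proc.stdin.write(secret + b"\n")
        proc.stdin.close()
    finally:
        secret = b""  # drop our reference; true memory erasure is harder in Python
    return proc.wait()
```

The child simply reads one line from stdin at startup. The same shape works with a socketpair or an inherited pipe fd if stdin is already in use.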
The simplest hack is to use a configuration management tool to start your app, and have its pre-exec step provide the credential, and post-exec remove any trace of it (which, again, will not work everywhere). The config management tool can be authorized to pull the credential from a remote host by some public-key protocol and a hole in your firewall, or by a local cache of secrets pushed by the config management tool only to the hosts and accounts that specifically need them. These have security and reliability tradeoffs.
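As a toy model of that pre-exec/post-exec hack (the fetch step is stubbed out; in practice it would be your config management tool pulling the credential over an authenticated channel):

```python
import os
import subprocess
import tempfile

def fetch_credential() -> bytes:
    # Stub: in reality, pulled from a remote host or a local pushed cache.
    return b"example-credential"

def run_app(cmd) -> int:
    """Pre-exec: drop the credential into a private temp file for the app.
    Post-exec: remove the trace of it."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, fetch_credential())
        os.close(fd)
        os.chmod(path, 0o400)  # owner-read-only
        env = dict(os.environ, CRED_FILE=path)  # tell the app where to look
        return subprocess.run(cmd, env=env).returncode
    finally:
        os.remove(path)  # post-exec cleanup; won't help if the app copied it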
You can also use a credential management suite for a more robust solution, of which there are both commercial and free options.
> "The simplest hack is to use a configuration management tool"
> "You can also use a credential management suite for a more robust solution"
You have to start somewhere; this is why an automated system with credentials isn't a simple problem to solve.
Imagine this: you use consul to store your secrets, and you wrote up a script for Ansible to look things up, including secrets. Great, but you need to authorize, so you need to keep the token somewhere. It's okay if you have a person typing it in, but if you deploy through Jenkins on a regular basis automatically, you need to add that token as a password to Jenkins's data store.
But you provision and configure your Jenkins using your Ansible playbook, and your playbook needs to get to Consul to get to the secret so you can compare the key on Jenkins. Now this is getting trickier.
Okay, say you are on EC2: you can use an instance profile to grant the instance access to S3 / DynamoDB, where you put your ultimate secrets. But if you are paranoid, or for some reason you decided to do client-side encryption and keep the master key to yourself (you generate it and you keep it, never giving it out to AWS), then you are back to square one. Where the heck do you keep your master key safe? KMS? No? In your own Consul? Okay, where should we keep the token? Ugh.
What if you don't use AWS? Google Cloud?
You have to trust at least one place. If your repo can be trusted, let it be. There are three major factors in combating a system that has to put trust in a password/cert/key:
* keep rotating the credentials
* everyone should pull from the same credentials management infrastructure
* least privilege
But #2 takes a while to complete, mainly because now you need to develop / rewrite your scripts on Jenkins to not use Jenkins's password datastore and the password masking plugin. You have to develop your own masking plugin.
The most trusted place is where you will need to add extreme security measures. Assuming the encryption can't be broken in a reasonable amount of time, the only way to keep a secret unrecoverable is if the secret holder is dead ("two can keep a secret, if one of them is dead").
I don't think it's quite as complicated as you make it out to be.
1. Write a script that, when given a specific key, will decrypt a file, extract the specific credential that matches the key, and return it. (Credential Extractor-Writer)
2. Put the CEW file somewhere the devs don't have access. (Credential Store)
3. Make a script to connect to a specific host and port using PKI or public-key auth (HTTPS, SSH), and make it feed a key on stdout, and receive a credential on stdin. (Credential Retrieval Script)
4. Make a script to push new keys out to hosts/services regularly (Host-Service Key Pusher) and a script to change/rotate credentials on the CS (Credential Changer).
(all of that can be implemented with Bash, Openssh, GPG, and 7Zip/Rar/whatever)
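A toy version of step 1, with the GPG decryption replaced by a stand-in so only the lookup logic is visible (nothing here is real crypto; in practice the blob would be a GPG-encrypted archive and `decrypt_blob` would shell out to gpg):

```python
import base64
import json

def decrypt_blob(blob: bytes) -> dict:
    # Stand-in for `gpg --decrypt`; a real CEW would invoke GPG here.
    return json.loads(base64.b64decode(blob))

def extract_credential(blob: bytes, key: str) -> str:
    """CEW core: decrypt the store and return only the requested credential."""
    store = decrypt_blob(blob)
    if key not in store:
        raise KeyError("unknown credential key")
    return store[key]
```

The important property is that callers name a single key and get back a single credential, never the whole decrypted store.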
For your example, whatever manages Jenkins' configuration can simply call the CRS and save the resulting credential however Jenkins wants it. This covers just about everything except that single point of failure, the CEW file.
To protect that, keep each part of the CEW encrypted by an individual key, so only services requesting their key will get access to the credential, and distribute new keys when a credential is changed. Put expirations on key/credential pairs. Add key signing to everything. You can go farther and plug in google-authenticator and 2FA. Implement a two-man rule. Go crazy.
I am not arguing you can't implement a solution. You literally just repeated what I said by specifying a specific implementation. My point earlier is that you will never be able to keep anything secret. There is always a point of trust you have to give up, and you have to hope your extreme defense system doesn't crack / expose it.
Ah, I thought you were just saying it's too complicated. Of course nothing is totally secure. But there is a point where "extreme" mitigations are called for, and not considered extreme, and the scope of potential problems of trust can be limited.
>> "The simplest hack is to use a configuration management tool"
>> "You can also use a credential management suite for a more robust solution"
> Imagine this: you use consul to store your secrets, and you wrote up a script for Ansible to look things up, including secrets. Great, but you need to authorize, so you need to keep the token somewhere.
You see, the problem springs from you mistaking Ansible for a configuration management tool. It is not; it is somebody's deployment script. What you want is CFEngine, Puppet, or Chef, and you need it to download configuration templates and fill them with secrets on the server that you want configured. (CFEngine can easily do that; Chef probably can, since it's not a standalone tool but a Ruby framework, so you have ERB at your disposal; Puppet? No idea if it can use ERB in masterless mode.)
Now you need to solve the problem of how to get the secrets to the server being configured. This would be easy if you had a specialized database service that ships credentials, sending only the ones that are relevant to the server. It's a pity we don't have such a database (or at least I am not aware of any), but it's not exactly an intractable problem to solve; it's just juggling access lists and some simple log replication protocol.
> "You see, the problem springs from you mistaking Ansible for a configuration management tool. It is not; it is somebody's deployment script."
You lost credibility from my PoV. Ansible is capable of doing what Chef can do. Ansible can use dynamic inventory, which means Ansible can use a dynamic secret infrastructure. That's not the default mode of Ansible out of the box, but much like Chef and Puppet, Ansible is made up of somebody's deployment script as well. You and I didn't just write up Puppet out of nowhere, and there are community plugins available for use, and those plugins are just somebody else's deployment scripts/components.
So is gawk, except nobody tries to shoehorn it into that role or claim that it's designed for this. Ansible's very architecture and operation model cause trouble where there should be none, like applying configuration changes to hosts that happened to be temporarily down.
> [...] much like Chef and Puppet, Ansible is made up of somebody's deployment script as well.
I don't know the history of Chef, but I assure you that Puppet was never somebody's deployment script that happened to be published and gain traction. It was designed from the ground up as a configuration management tool, after its author deemed cfengine 2.x too troublesome, which again was a configuration management tool since its early 2.0 days (and probably since version 1.0). Configuration management requires a completely different operation mode than deployment; one is asynchronous and unattended, the other is a synchronous list of steps to take and often needs to stop on the first trouble to appear.
Git-crypt is but one solution, for storing secrets _next to_ the code that uses them. Credstash is another, for storing secrets in a separate, server-accessible location. Both can work with tools like Ansible (or Puppet or Chef or Salt).
From the tone of the above (and your other responses) you're not a heavy Ansible user. That's fine. Use the tool you're comfortable with. But don't try to turn people away from an equally usable alternative to something you _do_ use just because it's not your tool of choice.
Puppet can do templates just fine in masterless mode. Though nowadays you should use EPP instead of ERB for robustness (strong typing helps a lot).
Your puppet master can be your credential "database" if you use e.g. hiera+eyaml. You still need to secure your puppetmaster, obviously, but since it uses public key crypto, you don't even need to make it possible for developers to decrypt secrets. Master key leaks would still be a problem, though.
There are various caveats when working with sensitive data in Puppet (4.something added the "Sensitive" data type, which prevents things from leaking into catalogs), but since puppet clients are authenticated with client certificates, you can control which server should have access to what data.
IIRC this is part of why modern credential management stores provide one-time-use tokens for certain services. You can provide the same with quick hacks, it just takes more hacking.
It requires no additional ops or key management, protects against unwanted overwrites, has much simpler access controls than IAM, and provides a lightweight UI to manage everything.
It uses language SDKs to decrypt and inject config on process startup, so you don't even need an additional deployment step. You just add the 'envkey' library (on rubygems and npm so far), set a single environment variable, and then you can access all your config as if it was stored in local environment variables. The idea is to make this something that you rarely have to think about.
Suggesting environment variables or stdin is avoiding the issue to me.
For it to get into the environment or stdin it has to be stored somewhere.
Where is that somewhere and how do you access it?
Ideally the secrets, and the keys to access them would never be accessible directly by the person who initiated the deployment; they'd be fetched directly from where they're stored to where they're deployed.
git-crypt can be a part of this.
More likely you'd use something purpose-built, like many of the solutions mentioned in this discussion.
If you want something commercial then you might consider Secret Server (https://thycotic.com/).
If not, then there is an open source Python-based server which uses Shamir's Secret Sharing Scheme. (But apologies, I've lost the url.)
Those are different implementations of the same thing: the manner of passing secrets. While that sounds like a truism, saying it has value, because the issue isn't the passing of secrets but how to protect the secrets before you pass them. In other words, how to leverage external key material that unlocks your secrets.
This change takes the application from security through obscurity to something that uses protected secrets.
I actually am part of the torus / manifold team, so thanks for the shout-out :-). We manage all our infra using terraform and I love being able to do `torus run terraform plan` etc. No more .tfvars files.
You just have to name your env vars in torus TF_VAR_NAMEOFTHING so that terraform picks up on them.
Ordinary devs could have access to all the encrypted secrets without having access to the keys to decrypt them. If that's the case, the laptop is no more a risk than the git remote.
The series of articles the author intends to write is a great idea. Managing secrets is a fast-growing problem that needs more exploration.
However, I must say I disagree with the sentiment that separation of secrets from source code is a bad thing. Git-crypt and similar tools use git for versioning. While this sounds great, it is not desirable for key management. Software and secrets have different management cycles. You always want to keep a copy of past software versions, but this is not the case for secrets. For instance, what if you want to make sure a secret is deleted? It will be a hard challenge to remove this from the git history on all copies of the repo.
As commented before, separating your development workflow from your secret management flow is actually a must. Not only are the management cycles different, the access policies also differ. Giving only a few trusted individuals access to the encrypted bag inside your git repo may work for a very small team with a few similar servers. However, when you have multiple sets of people and services that need access to different sets of secrets, controlling who has access to what secrets with these encrypted data bags quickly becomes impractical.
I agree with tptacek, having a decoupled and secure place to manage and distribute secrets is more secure and scales better. At SecretHub we allow access control per secret or secret-group to solve this complex mapping problem. We believe it should be easy for developers to create secrets, but only easy for machines to use them. Developers rarely need access to secrets in production.
Several years ago, I worked on a proof of concept for storing secrets in a git repository (making use of smudge and clean config options) where each secret was encrypted using the public ssh keys (of each server's host key pair) of the servers that would be deployed to.
Once the code was pushed to those servers, the smudge filter would use the private ssh key of the host key pair to decrypt the secrets.
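For reference, the smudge/clean wiring for such a scheme looks roughly like this (the encrypt/decrypt helpers are hypothetical placeholders for the ssh-key-based scripts described above; git runs `clean` when content is staged and `smudge` when it is checked out):

```
# .gitattributes
secrets/** filter=hostcrypt

# one-time setup on each machine
git config filter.hostcrypt.clean  "encrypt-for-host-pubkeys"   # runs on commit
git config filter.hostcrypt.smudge "decrypt-with-host-privkey"  # runs on checkout
```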
Vault from hashicorp was built to solve the use case described in the article.
However, encrypting source code in general, especially on hosting services like github, is still a valid use case for many scenarios. GitZero attempts to solve it, but it provides no specific guarantees that the source code (for the free tier) is inaccessible to the hosting service.
No mention of blackbox by StackExchange? On a phone right now, but it is pretty useful for this type of stuff. The only bad thing about it is key management: new developers have to generate their own keys.