Does Git need encryption?
Git will soon support signing commits and tags with SSH keys. This is great news for the common developer who doesn't bother with PGP. It also brought back a thought I often have: Does Git need encryption?
A common issue I have with Git is the need to trust the remote server. I can create "Private" repositories, but nothing about them is private. At best they are "Not Public". GitHub can read my data. GitLab can read my data. They all can. And if there's a leak, that's even worse.
I could self-host a server, but that's a pain and doesn't make the solution easily available. So the question is: do we need a general solution baked into Git? What would that look like?
I only have a high-level understanding of how Git works and various encryption strategies. So I'm missing details. I'm sharing my current thoughts so that I can draw a clearer conclusion.
The first step is to change git init. Based on your git config, it will look up an encryption key; if there isn't one, it will ask for it. This is the repository's encryption key. Most data in .git is encrypted using it.
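A minimal sketch of that lookup, in Python for brevity. Everything here is an assumption for illustration: Git has no encryption.key setting, and resolve_repo_key and its ask callback are hypothetical names. The config is passed in as a plain mapping so the logic stays independent of how Git would actually store it.

```python
import secrets
from typing import Callable, Mapping

# Hypothetical setting name; Git itself has no such option.
CONFIG_KEY = "encryption.key"


def resolve_repo_key(config: Mapping[str, str], ask: Callable[[], str]) -> bytes:
    """Return the repository key from config, or ask the user when it is missing."""
    value = config.get(CONFIG_KEY)
    if not value:
        # No key configured: fall back to prompting, as "git init" would.
        value = ask()
    key = bytes.fromhex(value)
    if len(key) != 32:
        raise ValueError("expected a 32-byte key encoded as 64 hex characters")
    return key


# A key already present in the config is used directly.
configured = {CONFIG_KEY: secrets.token_hex(32)}
key_from_config = resolve_repo_key(configured, ask=lambda: "")

# With an empty config, the callback stands in for the interactive prompt.
key_from_prompt = resolve_repo_key({}, ask=lambda: "ab" * 32)
```

In a real tool the callback would read from a terminal or a credential helper rather than return a fixed string.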
Anyone wanting to read encrypted data will need the encryption key. It may be possible to run some git commands without it, where file contents aren't vital and hashes are enough.
When pushing, the server doesn't need the key. It will sync the repo as-is, in its encrypted state.
So how does the server handle things like conflicts? Hashes can still be available, which I think is enough, but it probably isn't. Everything else is encrypted: files, filenames, messages, tags, etc.
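Here is a rough sketch of what a key-less server could still do, under one assumption that the post doesn't settle: object IDs are SHA-256 hashes of the encrypted bytes. OpaqueRemote and object_id are hypothetical names. With only IDs and opaque payloads, the server can work out which objects it is missing and check that a payload matches its ID, but it can't look inside a file to merge it, so resolving conflicts would have to happen on the client.

```python
import hashlib
from typing import Dict, Iterable, List


def object_id(ciphertext: bytes) -> str:
    """Assumed scheme: the ID is the SHA-256 of the already-encrypted bytes."""
    return hashlib.sha256(ciphertext).hexdigest()


class OpaqueRemote:
    """A server that stores encrypted objects without ever holding a key."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def missing(self, ids: Iterable[str]) -> List[str]:
        # Comparing IDs is enough to decide what a push must transfer.
        return [oid for oid in ids if oid not in self.objects]

    def receive(self, oid: str, ciphertext: bytes) -> None:
        # Integrity can be checked without decrypting anything.
        if object_id(ciphertext) != oid:
            raise ValueError("object ID does not match payload")
        self.objects[oid] = ciphertext


# Two already-encrypted objects; the bytes are placeholders, not real ciphertext.
blobs = [b"\x01opaque-a", b"\x02opaque-b"]
local = {object_id(blob): blob for blob in blobs}

remote = OpaqueRemote()
remote.receive(object_id(blobs[0]), blobs[0])

# Only the second object still needs to be pushed.
to_send = remote.missing(local.keys())
for oid in to_send:
    remote.receive(oid, local[oid])
```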
Whenever something needs to be shown to the user, the data is sent back in its encrypted form and decrypted client-side.
How would web clients work? It's nice to browse a repo on the web without cloning it. And you still can, with client-side decryption. Of course, there's a risk that websites can grab your key. So web browsers would need to provide a place to store keys securely, and a way for web services to decrypt and verify data without ever seeing the key. This feature would also work nicely with WebVerify.
What happens when a private key leaks? The repository will need to be re-encrypted with a new key. The key can be shared so others can re-encrypt on their end or clone a new copy of the repository.
To make things less disruptive, each contributor can have their own keys. Repositories can store multiple public keys for those who have access. This list serves as access control. Remove a key and re-encrypt to remove access. Do the reverse to give access.
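One way to picture the access list is key wrapping: the repository has one content key, and a copy of it is stored encrypted for each contributor. The sketch below makes loud assumptions. Each contributor's secret is a symmetric stand-in for a public/private key pair, so the list holds secrets that a real design would never store. The toy_encrypt cipher (an HMAC-SHA256 keystream XORed with the data) is unauthenticated and only there to keep the example within Python's standard library, not something to use for real data. AccessList is a hypothetical name.

```python
import hashlib
import hmac
import secrets
from typing import Dict


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce plus counter. Illustrative only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class AccessList:
    """Keeps one wrapped copy of the repository key per contributor."""

    def __init__(self, contributor_keys: Dict[str, bytes]) -> None:
        # Stand-in: a real design would hold public keys here, not secrets.
        self.contributor_keys = dict(contributor_keys)
        self.repo_key = secrets.token_bytes(32)
        self.wrapped: Dict[str, bytes] = {}
        self._rewrap()

    def _rewrap(self) -> None:
        self.wrapped = {
            name: toy_encrypt(key, self.repo_key)
            for name, key in self.contributor_keys.items()
        }

    def unwrap(self, name: str, contributor_key: bytes) -> bytes:
        return toy_decrypt(contributor_key, self.wrapped[name])

    def revoke(self, name: str) -> None:
        # Remove the contributor, rotate the repository key, wrap it again.
        del self.contributor_keys[name]
        self.repo_key = secrets.token_bytes(32)
        self._rewrap()


alice, bob = secrets.token_bytes(32), secrets.token_bytes(32)
acl = AccessList({"alice": alice, "bob": bob})
key_seen_by_alice = acl.unwrap("alice", alice)

old_repo_key = acl.repo_key
acl.revoke("bob")
key_after_revoke = acl.unwrap("alice", alice)
```

Rotating the key is only half of removing access. The repository contents would still have to be re-encrypted under the new key, and that step is where the overhead comes from.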
At this point, things are getting complicated. In a large team and repository, client-side synchronisation and re-encryption will add a lot of overhead. Having each contributor use their own keys will also complicate other processes.
git-crypt lets you encrypt specific files listed in .gitattributes, and can share the key with collaborators using GPG. There are a few other tools which do something similar.
It looks like Fossil supports encryption but it involves buying an SQLite extension and creating a custom build.
Right now, as far as I can tell, adding encryption to Git is possible and won't break any of its workflows. However, for larger teams and projects, it will get slow and complicated.
It's probably best to leave Git focused on open-source workflows and use something else for private repositories. Mixing privacy and openness is bound to leave gaps.
Thanks for reading.