Wasn't really sure how to make the title descriptive enough without putting the full question there.
Anyway, I have been toying with the idea of creating an open-source web application. Open-source in the sense that I would host the product on my own servers, but the full source code would be available on GitHub (except, of course, authentication info, tokens and other things that shouldn't be under source control anyway). Anyone would also be able to set up a self-hosted version of the product, and if anyone wanted to contribute, I would accept pull requests, etc.
The idea behind that is that, since it would be targeted mostly at a tech-savvy crowd and would deal with personal information, I would like to establish some level of trust that I'm not doing anything sneaky or unexpected behind the scenes (like storing information I shouldn't be).
So basically I started wondering whether it is possible to give people a way to verify that the code they see in the GitHub repo is the same code that is running on the live hosted site. I will be working with Node.js, but I don't think the tech stack is too relevant here.
The client-side part, as I imagined it, would be relatively trivial to verify - run the build part locally, compare fingerprint of the live code and the local version.
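A minimal sketch of that client-side comparison, assuming the build is reproducible (the same source always produces byte-identical output) and that the live assets have already been downloaded. The names `fingerprint` and `mismatched_assets` and the sample file contents are hypothetical, chosen only for illustration:

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of one built asset."""
    return hashlib.sha256(data).hexdigest()


def mismatched_assets(local_build: dict[str, bytes],
                      live_assets: dict[str, bytes]) -> list[str]:
    """List every path whose fingerprint differs, or that exists on one side only."""
    local = {path: fingerprint(data) for path, data in local_build.items()}
    live = {path: fingerprint(data) for path, data in live_assets.items()}
    return [path for path in sorted(set(local) | set(live))
            if local.get(path) != live.get(path)]


# Hypothetical build output, keyed by asset path.
local_build = {"app.js": b"console.log('hi');", "app.css": b"body{}"}
live_ok = dict(local_build)
live_tampered = {**local_build, "app.js": b"console.log('hi'); track();"}

print(mismatched_assets(local_build, live_ok))        # no differences
print(mismatched_assets(local_build, live_tampered))  # app.js differs
```

This only works if the build step is deterministic; timestamps or random chunk names embedded in the output would make honest builds look different too.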
But my idea got stuck on the server side. Even if I made a server-side endpoint that returns the fingerprint of the live code, there would be no way for anyone to check what is actually going on: I could just as well return a static file with a hardcoded fingerprint.
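To make the problem concrete, here is a small sketch in which an honest and a dishonest server give a verifier exactly the same answer. Both endpoint functions and the source strings are hypothetical stand-ins, not real server code:

```python
import hashlib

# What the public repository contains, and what a dishonest host really runs.
PUBLISHED_SOURCE = b"def handle(request): return 'ok'"
ACTUAL_SOURCE = PUBLISHED_SOURCE + b"  # plus hidden logging"


def honest_fingerprint_endpoint(running_source: bytes) -> str:
    """Hash the code that is actually running."""
    return hashlib.sha256(running_source).hexdigest()


def dishonest_fingerprint_endpoint(running_source: bytes) -> str:
    """Ignore the running code and report the hash of the published code."""
    return hashlib.sha256(PUBLISHED_SOURCE).hexdigest()


# The verifier computes the expected value from the repository...
expected = hashlib.sha256(PUBLISHED_SOURCE).hexdigest()

# ...and both servers pass, although only one runs the published code.
honest = honest_fingerprint_endpoint(PUBLISHED_SOURCE)
dishonest = dishonest_fingerprint_endpoint(ACTUAL_SOURCE)
print(honest == expected, dishonest == expected)
```

Any check whose answer is produced by the software being checked has this weakness, which is why the measurement has to come from something the operator cannot rewrite.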
I'm sure someone has dealt with or thought about similar problems before, and I would be happy to hear some insights. Feels like this might be a pipe dream, but at least some level of verification would be nice to achieve.
We've been here before: this is Trusted Computing. You need a Trusted Platform Module (TPM) on your servers (thankfully you're picking the hardware, so you can make that a hard requirement). Your users can inspect your code and sign it with keys that they generate and keep on the client side (you never see them). More likely, they will sign a statement that they trust a particular third-party auditor. Either way, their data is uploaded encrypted with their keys, and only code they have signed will ever be allowed to decrypt it.
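The core idea, that a secret is released only to code whose measurement the user approved, can be illustrated with a toy simulation. This is not a real TPM API: the `ToyTPM` class is hypothetical, the XOR-based encryption is for illustration only, and a real TPM would seal against platform configuration register values in hardware rather than in Python:

```python
import hashlib
import hmac
import os


class ToyTPM:
    """Toy stand-in for a TPM: secrets are sealed to a code measurement."""

    def __init__(self) -> None:
        self._root = os.urandom(32)   # secret that never leaves the "chip"
        self._measurement = b""       # hash of the code currently loaded

    def measure(self, code: bytes) -> None:
        """Record the hash of the code that is about to run."""
        self._measurement = hashlib.sha256(code).digest()

    def _keys(self, measurement: bytes) -> tuple[bytes, bytes]:
        """Derive separate encryption and MAC keys for one measurement."""
        enc = hmac.new(self._root, b"enc" + measurement, hashlib.sha256).digest()
        mac = hmac.new(self._root, b"mac" + measurement, hashlib.sha256).digest()
        return enc, mac

    def seal(self, secret: bytes, approved_measurement: bytes) -> bytes:
        """Bind a short secret to the measurement the user approved."""
        if len(secret) > 32:
            raise ValueError("toy scheme only handles secrets up to 32 bytes")
        enc, mac = self._keys(approved_measurement)
        ciphertext = bytes(a ^ b for a, b in zip(secret, enc))
        tag = hmac.new(mac, ciphertext, hashlib.sha256).digest()
        return tag + ciphertext

    def unseal(self, blob: bytes) -> bytes:
        """Release the secret only if the current measurement matches."""
        tag, ciphertext = blob[:32], blob[32:]
        enc, mac = self._keys(self._measurement)
        expected_tag = hmac.new(mac, ciphertext, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected_tag):
            raise PermissionError("running code was not approved for this secret")
        return bytes(a ^ b for a, b in zip(ciphertext, enc))


audited_code = b"server build the user has inspected"
tpm = ToyTPM()
blob = tpm.seal(b"user data key", hashlib.sha256(audited_code).digest())

tpm.measure(audited_code)
print(tpm.unseal(blob))            # approved code gets the key back

tpm.measure(b"server build with hidden logging")
try:
    tpm.unseal(blob)
except PermissionError as err:
    print("refused:", err)         # any other code is refused
```

The sketch leaves out the part that makes the real thing trustworthy: the user must also be able to verify that they are talking to genuine hardware and not to a simulation like this one.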
It won't be easy. You'll have to keep old versions of your code around in case users haven't signed the new versions. The TPM-handling libraries are immature, though they get better every day. But it's possible, particularly since you only need to make it work with one particular model of TPM.
Good luck!