Practical homomorphic encryption over integers (2017) (arxiv.org)
97 points by hamilyon2 on July 23, 2018 | 53 comments



This is one of the current 'tech' trends that I'm following; the idea of homomorphic encryption is really cool. I feel like there are a lot of real-world applications for it, but I have failed to find them yet.


Cloud computing is the big one. You might want to rent some remote computer time without sending plaintext data to a third party. Homomorphic encryption lets you upload encrypted data and instructions for a third party to operate on it without having to decrypt it first.

Edit: Try Greg Egan's book Permutation City.


If we had proper homomorphic encryption, we could use regular home user machines for a cloud, Airbnb-style. You can't do that today because the security risks are completely unmanageable. But that all changes with homomorphic encryption.

Today it's not practical, but we see advances like this every year, maybe in a decade it will be practical.


You need more than FHE for that; just because I cannot see what I am computing does not mean I will compute the answer correctly. There are ways this can be resolved but it pushes practicality even further back.


Is it not practical to just saddle the dataset and compute request with a canary payload that is easily computable by the requestor?

If one can spoof the canary payload effectively, one would have broken the FHE scheme, probabilistically, right?

Unless I'm thinking about this wrong, the FH part of FHE makes this a pretty solvable problem. Is this not already fundamental to any FHE scheme?


Maybe in some cases that is possible, but it is also possible that your function requires as much time to verify as the FHE evaluator would spend on (honestly) computing the result. In other words it would make outsourcing pointless; this is briefly discussed in the related work section of this paper:

https://eprint.iacr.org/2014/202.pdf

You also have to be careful to ensure that the canary cannot be identified in the plaintext; otherwise the evaluator can homomorphically identify the canary (i.e. it can compute the canary values honestly and cheat everywhere else).


Not sure what you mean? Correct evaluation (modulo some negligible error) is part of the usual FHE definition.


You can't evaluate the correctness/trust the result without performing the computation yourself or receiving a proof of correctness, e.g. a SNARK.


Perhaps it’s referring to human malice rather than mathematical error: that we’d want some sort of certification that the operation performed off-site was actually the one requested?


Enclaves (e.g. SGX) get you there without the need for fancy math. And you get to maintain normal computation (i.e. x86).

Enclaves have the downside of being a bit of a pain to use. But hell, FHE isn’t any easier.


It is somewhat doubtful, though, that SGX achieves what it was supposed to achieve. While you get full-RAM encryption & integrity, the trouble is that CPUs tend to only work-as-defined for a narrow range of environmental parameters. In all likelihood, a malicious cloud provider can take a CPU out of spec, at which point pretty much all security guarantees go out of the window...


My understanding was that if you take the CPU out of spec (e.g. voltage) while in an enclave, you’ll get MCEs and the CPU either nukes your enclave or forcibly reboots (I don't recall which).

That said you can still shave it down and start FIBing if you have the $$$.


With SGX you don't have to trust the cloud provider, but you still have to trust Intel. Of course, you're probably trusting Intel anyway, so this is perhaps just a theoretical concern.


With homomorphic encryption, you have to trust Intel to build CPUs that calculate correctly. Except for FDIV-style bugs (which are pretty rare, and it's telling that FDIV, from the mid-90s, is the go-to example), they mostly managed to maintain that trust.

With SGX you have to trust Intel to build CPUs that isolate securely. Meltdown, Spectre (multiple variants), bugs in Intel TXT and ME, and that's only some of the headline issues from the last 4 years.


...and if there are bugs, with SGX they could expose your secrets. With homomorphic encryption they should just turn your results into detectable garbage on your end, I'd expect.


But if you test 5 + 3 = 8 with the current key, wouldn't you then be able to detect a similar pattern in the data you are processing?


I don't think the data processor would have the key. They'd receive the encrypted inputs and the operation, and return the result.


I have zero knowledge in the field so all my questions are quite naive.

But the encryption process is public, right? So you can encrypt just like any client?


I really want to sarcastically say "blockchain homomorphic encryption", but, well, it's not necessarily a terrible idea, honestly. One can imagine some combination of the primitives in which you could prove that you added here the same amount you subtracted from there, that neither total is below zero, but for which the number actually transferred is encrypted. The current use case for something like Bitcoin is often security-focused, but there's also simply a privacy concern that could be addressed there.


> One can imagine some combination of the primitives in which you could prove that you added here the same amount you subtracted from there, that neither total is below zero, but for which the number actually transferred is encrypted

You are exactly describing Mimblewimble.

https://github.com/ignopeverell/grin/blob/master/doc/intro.m...


There's already practical cryptography to do what you just described, called Confidential Transactions. It's not true homomorphic encryption, but it does securely hide the amount transferred on the blockchain while still guaranteeing that a legal transaction occurred (no money created or destroyed).


Hey, we're working on blockchain "somewhat homomorphic encryption" (SHE) via multiparty computation. In particular, a privacy layer for Ethereum.


I implemented this for a rather large industry, though on top of Fabric and with a semi-trusted controller node. Was a lot of fun, and is being used in production. You can get quite far with SHE.


You should look into Enigma (blockchain company). They use this type of technology to enable private decentralized multi-party computation.


I think this is what Zcash already does.


Zcash uses zk-SNARKs for secret transactions. Which are awesome. But homomorphic encryption allows general-purpose computation.


These fully homomorphic encryption systems can do boolean circuits, which are no more powerful than the arithmetic circuits you get with zk-SNARKs (in fact, are an instance of arithmetic circuits). And with some zk-SNARKs, you get recursive composition, which lets you do arbitrarily many steps of a transition function expressed as a zk-SNARK, which is just as general purpose (albeit more involved) as applying a bunch of boolean circuits to some data.


The "Confidential Transactions" scheme for cryptocurrencies uses homomorphic encryption.

The idea is that you can prove that the encrypted input and output amounts in a transaction balance to zero, without having to actually reveal what any of those amounts are.

See https://people.xiph.org/~greg/confidential_values.txt


CTs don't use homomorphic encryption. They use "additively homomorphic commitments" which are, essentially, cryptographic proofs.
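
For intuition, here's a minimal toy sketch of that additive homomorphism in Python. Real CT uses Pedersen commitments over an elliptic curve plus range proofs; the mod-p group and the hand-picked generators g, h below are illustrative assumptions only.

    # Toy Pedersen-style commitment: C(v, r) = g^v * h^r (mod P).
    # Multiplying commitments adds the committed values, so a verifier
    # can check that inputs equal outputs without learning the amounts.
    import random

    P = 2**127 - 1        # a Mersenne prime; toy group, not production-safe
    g, h = 3, 5           # assumed generators; security needs log_g(h) unknown

    def commit(value, blinding):
        return (pow(g, value, P) * pow(h, blinding, P)) % P

    # One input of 10, two outputs of 7 and 3.
    r_in = random.randrange(P - 1)
    r1 = random.randrange(P - 1)
    r2 = (r_in - r1) % (P - 1)        # prover picks blindings that cancel

    c_in = commit(10, r_in)
    c_outs = (commit(7, r1) * commit(3, r2)) % P

    # The verifier sees only commitments, yet this check passes exactly
    # when the hidden amounts balance (10 == 7 + 3).
    print(c_in == c_outs)             # True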


There are certainly real-world applications, just not with the current level of overhead that known algorithms require.

The real-world use case is being able to upload data to a cloud provider and do queries and computation on it without the provider ever knowing what's inside. e.g. "How much did I spend last month?"


One of the side projects I have is FHE-CPU (https://github.com/CapacitorSet/FHE-CPU), an effort to make general-purpose homomorphic computation possible with a Moxie-like CPU, but it's a daunting effort to even get the basics working.


One of the big ones is internet searches. Also processing PII/HIPAA data.


How does this affect internet searches?


Potentially a search engine could provide results without knowing the actual query, although that would make it difficult to identify new trends or determine which results are/aren't being clicked, so it probably wouldn't be competitive as a general purpose search engine.


It would also be completely impractical, since the search engine would necessarily have to perform some sort of scan of its entire database to respond to every query (otherwise it would, in fact, learn something about the query).


Unless the database is also encrypted and provides the same homomorphism against reverse index queries.


In which case you are no longer talking about FHE. With FHE the server receives a ciphertext query and produces a ciphertext output, without ever seeing any plaintexts (not even the result of a reverse index query).


If the database was encrypted, you would need a different encrypted database for every user - again impractical.


One of the things another team at my organisation is working on is using it so that competitors can pool their data to do better analytics over the whole corpus without actually revealing their secrets to the other company - https://www.n1analytics.com/.


The "textbook" application is spam filtering for encrypted mail.


Homomorphic encryption is cool, but it's not the sort of thing most developers will want to use in their apps. Most cryptography failures (BEAST, CRIME, the Apple iMessage attack, etc.) that allow plaintext recovery are the result of chosen-ciphertext attacks.

You want authenticated encryption instead. https://tonyarcieri.com/all-the-crypto-code-youve-ever-writt...

You can sort of build something authenticated on top of a homomorphic cryptosystem, but it's kind of a hack: https://paragonie.com/blog/2017/12/assuring-ciphertext-integ...


It's not something most developers will want to use in their apps for application security, but it is absolutely something that folks may want to use when training machine learning models based on customer data. As I understand it, this is one of the big "pie in the sky" goals for homomorphic encryption—having an encrypted model trained on encrypted data that only the customer can use for inference.


What exactly is homomorphic encryption?


You encrypt some data and send it to Bob.

Bob does some computations on the encrypted data and sends you the (still-encrypted) results.

You decrypt the results to get the answer of your computation. Bob never learns what your data is or what the results are.

The term "homomorphic" roughly refers to the fact that encryption commutes with the computation. That is, if Bob is applying the function f, we have f(Encrypt(data)) = Encrypt(f(data)). The left side is what Bob actually computes; the right side is the encryption of the answer you want (which you can decrypt).

EDIT. To see how cool this is, think about this: I have two numbers, I encrypt them to form long strings of gibberish. Then I have Bob perform the "multiplication" function on the gibberish and send me the result, and I am able to decrypt that to get the result of multiplying the original two numbers. If that doesn't impress you, I send Bob my database of encrypted emails, then ask him to do a string lookup for "chocolate", he sends me back the set of matching emails without ever knowing what they say or what string I looked for.
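
To make the multiplication example concrete, here's a toy Python sketch using unpadded "textbook" RSA, which happens to be multiplicatively homomorphic. This illustrates the principle only; it is not the paper's scheme, and textbook RSA with a key this small is completely insecure.

    # Textbook RSA is multiplicatively homomorphic:
    # E(a) * E(b) mod n decrypts to a * b.
    n, e, d = 3233, 17, 2753      # toy key (n = 61 * 53); never use in practice

    def encrypt(m):
        return pow(m, e, n)

    def decrypt(c):
        return pow(c, d, n)

    a, b = 6, 7
    ca, cb = encrypt(a), encrypt(b)    # Bob only ever sees ca and cb
    cprod = (ca * cb) % n              # Bob multiplies the "gibberish"
    print(decrypt(cprod))              # -> 42, i.e. a * b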


Does it generalize? Can you in theory perform any computation in this way?


It does generalize, but not all the way. You can't perform while loops (or anything with unbounded runtime) or operate on anything with unbounded (secret) size, so it isn't Turing complete.

You can do anything expressible as a bounded-depth (non-cyclic) circuit of e.g. NAND gates. You can also do something like [a][+ b][+ c][+ d], where each [] is a new homomorphic operation, which gets around the bounded-size problem somewhat, but the attacker can obviously see how many chunks you're working on, just not the content, which opens up some traffic-analysis vulnerabilities.

All in all, it's a very useful tool but I don't really like the way people keep presenting it as a silver bullet, like "yay, once we get this working we won't have to care that the cloud is full of phantom trolleys armed with hammers!".


Even with partially homomorphic encryption and additional "privacy preserving protocols" you can carry out pretty general computation tasks such as machine learning. Have a look at this blog post using the Paillier cryptosystem for a federated linear regression - https://blog.n1analytics.com/distributed-machine-learning-an...
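
For a feel of how that works, here's a minimal toy Paillier sketch in Python. Paillier is additively homomorphic, so a server can sum encrypted values (per-party gradient contributions, or last month's expenses) without decrypting anything. The parameters below are deliberately tiny and purely illustrative.

    # Toy Paillier: E(a) * E(b) mod n^2 decrypts to a + b.
    import math, random

    p, q = 61, 53                  # toy primes; real keys are 2048+ bits
    n, n2 = p * q, (p * q) ** 2
    g = n + 1                      # standard generator choice
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)

    def encrypt(m):
        r = random.randrange(1, n)         # fresh randomness per ciphertext,
        while math.gcd(r, n) != 1:         # so E(5) != E(5) across calls
            r = random.randrange(1, n)
        return (pow(g, m, n2) * pow(r, n, n2)) % n2

    def decrypt(c):
        L = (pow(c, lam, n2) - 1) // n
        return (L * mu) % n

    expenses = [12, 30, 7]
    cts = [encrypt(x) for x in expenses]   # the server sees only these
    total_ct = 1
    for c in cts:
        total_ct = (total_ct * c) % n2     # homomorphic addition
    print(decrypt(total_ct))               # -> 49 == sum(expenses)

Note that encryption is randomized: encrypting the same value twice yields different ciphertexts, which is why someone who can encrypt (the key is public) still can't match "5 + 3 = 8"-style patterns in your data, as asked upthread.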


To the extent that computations are feasible, yes. There was a pretty big paper like a decade ago proposing a fully homomorphic system. It was pretty much impractical, running like a million times slower than native instructions. I assume without reading that this post's paper reduces that multiplier to hundreds of thousands.

edit: reading the abstract, it looks like they don't have a faster fully homomorphic system, just some better results in the partial homomorphic domains.


Actually the paper presents nothing; their security proofs make no sense at all and the rest of the paper is not much better.


Wow. Explain?


Theorem 1 fails to specify what it means to "attack" the cryptosystem; it seems to deal only with complete plaintext recovery. It is possible that an attack recovers only a single bit of the plaintext, something which is not handled by the proof.

Definition 1 is not really a definition; in particular it would not be useful in a proof or logical argument. Likewise with Definition 2.

The authors claim that chosen plaintext attacks are not relevant; then they claim in Theorem 3 that their system is secure against CPA. Over and over in this paper the authors refer to the need to be CPA secure when the plaintext has "insufficient entropy", so it is hard to understand why they would claim CPA security is irrelevant.

The vector version of their scheme appears to be a lattice problem, but the authors do not discuss lattice attacks that might be used against their scheme. The authors state that it is "clear" that the security of the vector version follows from the same arguments used for the integer version.

In the "FHE" section the authors do not actually construct an FHE scheme; instead they have constructed some kind of garbled circuit scheme that uses the encryption schemes proposed in the paper. No proof of security is given for that garbling scheme.

For what it's worth, this is more or less what I would have written if I had to review this for a conference and I would give this paper a "strong reject" score.


A homomorphism between two algebraic structures is a map that preserves the operations on their elements.

A set with a single operation (plus some extra properties) is called a group. The integers with addition (+) are an example of a group.

Suppose I have groups X and Y. Then a group homomorphism h is a mapping that preserves the + of the two groups, so

h(x +_X y) = h(x) +_Y h(y)

Where +_X is addition in X and +_Y is addition in Y.

Homomorphic encryption means that the function you use to do your encryption is a homomorphism.
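
As a concrete instance (my example, with toy numbers): h(x) = 2^x mod 101 maps the integers under + into the nonzero residues mod 101 under *, so here +_X is ordinary addition and +_Y is multiplication, and the homomorphism property is just the law of exponents.

    # h(x + y) == h(x) * h(y) (mod 101), since 2^(x+y) = 2^x * 2^y.
    p = 101

    def h(x):
        return pow(2, x, p)

    x, y = 15, 27
    print(h(x + y), (h(x) * h(y)) % p)    # both print the same value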


Encryption while keeping the "shape"/"form" of the data intact. This way you can have secure third-party computation, without that third-party knowing exactly what data it is operating on.



