Keeping data confidential with fully homomorphic encryption.
Dave Bittner: Hello everyone, and welcome to the CyberWire's Research Saturday. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down threats and vulnerabilities, solving some of the hard problems of protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us.
Rosario Cammarota: So, fully homomorphic encryption is an encryption technique, but unlike the type of encryption that we use right now, homomorphic encryption keeps data confidential while the data is in use.
Dave Bittner: That's Dr. Rosario Cammarota. He's a principal engineer at Intel Labs. The research we're discussing today is titled, "Confidential Computing: Advances in Federated Learning and Fully Homomorphic Encryption."
Rosario Cammarota: When a message is encrypted into ciphertext, which we will refer to as a "cryptogram," if the cryptogram is homomorphically encrypted, you can actually manipulate its content without decrypting it. And what homomorphic encryption adds to what we do right now is that it keeps data confidential while the data is in use, because you can compute on the content of the cryptogram without decryption.
Dave Bittner: So, give me an example of where this would apply – what's the use case for this?
Rosario Cammarota: Two of the main emerging areas that we are seeing today are data collaborations and intelligent automation that relies on data collaborations to automatically make more and more intelligent and personalized decisions based on patterns extracted from data. So, when collaboration happens across entities that do not trust each other – and these entities aim to collaborate more and more – then there is the problem of, can we share the data? How do we share the data? What data do we share? And part of the roadblocks in data sharing concern privacy, because much of the digital data out of which you would like to extract patterns includes sensitive and private data.
Dave Bittner: So, we're talking about, potentially, could that include things like medical information?
Rosario Cammarota: Absolutely. If you think, for example, of automation in the medical space, let's think of a tumor segmentation model that is served in the cloud. OK, what that helps to do is to increase the rate of scans that you can analyze. And that's very important because timeliness in that context may save lives. So, now the problem there is that if you are outsourcing scans to a service that is deployed on the cloud, you need to protect the privacy of these scans. And when we are talking about privacy, we definitely have the following two things: one is basically the association of the scan with the patient, and the other is the result of the analysis.
Dave Bittner: Well, let me ask you sort of a basic and perhaps a question that demonstrates my ignorance when it comes to the topic. So, we're talking about fully homomorphic encryption – is there partially homomorphic encryption?
Rosario Cammarota: Yes, actually, that's an excellent question.
Dave Bittner: Oh, good. (Laughs)
Rosario Cammarota: There are many flavors of it. There is partial homomorphic encryption, there is something else that is somewhat homomorphic encryption, and then there is fully homomorphic encryption.
Dave Bittner: OK.
Rosario Cammarota: Let me tell you very briefly about the difference between those. With partial homomorphic encryption, you can basically perform only one type of operation on cryptograms, so it's either additions or multiplications. With somewhat homomorphic encryption, you can perform both additions and multiplications, but only for functions up to a certain complexity. And in fact, when you have a crypto system that allows you to perform operations on cryptograms and it can perform both additions and multiplications, the first question that you ask is, is this fully homomorphic encryption? And the answer usually is, it's somewhat, because you can only handle up to a certain complexity. Fully homomorphic encryption extends, in the majority of the constructions that are known today, somewhat homomorphic encryption schemes with the ability to perform arbitrary computation of arbitrarily complex functions.
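As an illustration of the "one type of operation" point above (this is not part of the conversation), here is a minimal Python sketch of two classic partially homomorphic schemes: textbook RSA, which supports only multiplication on cryptograms, and Paillier, which supports only addition. The choice of these two schemes, the toy primes, and all function names are assumptions made for the example; the interview does not name any specific scheme, and parameters this small offer no security.

```python
import math
import secrets

# Toy primes: far too small to be secure, chosen only so the numbers stay readable.
P, Q = 1009, 1013
N = P * Q
N_SQUARED = N * N

# --- Multiplication only: textbook (unpadded) RSA ---
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse of E (Python 3.8+)

def rsa_encrypt(m):
    return pow(m, E, N)

def rsa_decrypt(c):
    return pow(c, D, N)

def rsa_multiply(c1, c2):
    """Multiply two cryptograms; the hidden plaintexts are multiplied modulo N."""
    return (c1 * c2) % N

# --- Addition only: Paillier with generator g = N + 1 ---
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def paillier_encrypt(m):
    # Fresh randomness r, coprime to N, is drawn for every encryption.
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N_SQUARED) * pow(r, N, N_SQUARED)) % N_SQUARED

def paillier_decrypt(c):
    x = pow(c, LAM, N_SQUARED)
    return (((x - 1) // N) * MU) % N

def paillier_add(c1, c2):
    """Multiply two cryptograms; the hidden plaintexts are added modulo N."""
    return (c1 * c2) % N_SQUARED

if __name__ == "__main__":
    product = rsa_multiply(rsa_encrypt(6), rsa_encrypt(7))
    print(rsa_decrypt(product))        # 42, computed without decrypting the inputs
    total = paillier_add(paillier_encrypt(20), paillier_encrypt(22))
    print(paillier_decrypt(total))     # 42, computed without decrypting the inputs
```

Neither scheme can perform the other's operation on cryptograms, which is what makes each one only partially homomorphic. Somewhat and fully homomorphic schemes support both operations and are built on different mathematics than the two shown here.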
Dave Bittner: Now, my understanding is that this is very computationally complex, correct?
Rosario Cammarota: Yes, it is.
Dave Bittner: And that's a barrier for adoption?
Rosario Cammarota: It is one of the barriers for adoption, yes. In any encryption technique, the encryption process is inherently inefficient. What that means is that there is an expansion of the original data size when you generate the cryptograms. In homomorphic encryption, the expansion can generate cryptograms a hundred to a thousand times larger. And if you think about handling this type of data on existing platforms, you already start to get an idea of how even doing simple computation on very large cryptograms can be more stressful with respect to computational resources, memory management, and communication between the host processor and the computational resources – basically memory transfer.
Dave Bittner: You know, I grew up – when I was a kid, I remember it was when the Rubik's Cube first came out. And everyone was fascinated with it, it was a big hit. And there were books that you could buy to help you solve – if you wanted to learn how to solve a Rubik's Cube, there were books that had step by step instructions. And, you know, in the early days, those books might take you a half an hour or so to solve a Rubik's Cube. These days, if you go on YouTube, you can see, you know, these kids today are solving Rubik's Cube in seconds. And I think a big part of that is that over time, the algorithms have gotten so much more efficient when it comes to being able to do that. Is that sort of thing happening with fully homomorphic encryption as well? Are researchers like you and the folks at Intel Labs, you know, clever humans who are banging away at this, are you coming up with more efficient ways to come at this problem?
Rosario Cammarota: Huh, so, that's very interesting. That's a very interesting question as well. Crypto systems usually are designed to protect the data for a certain amount of time. And so, homomorphic encryption, as a crypto system by itself, is being designed for the same purpose. And, so to speak, the complexity that is required to break the crypto systems is usually very high, even at the lowest level of compliance when you deploy crypto systems, such that in ten years, with the majority of – with all the resources that you have available right now, or more than ten years, you won't be able basically to break the crypto system.
Rosario Cammarota: Now, as far as homomorphic encryption is concerned, it has an additional property in terms of protection, because it's foundationally based on mathematics that would be resistant even against cryptanalysis with quantum algorithms, which is going to be the next big type of threat to current cryptography.
Dave Bittner: Hmm. What about on the hardware side of things? I mean, obviously, you know, Intel is a big innovator and manufacturer of processing hardware as well, and we've been seeing this trend over the past few years of having, you know, dedicated parts of chips that are designed to do difficult things in a very efficient way. Is this an area of research as well where we could see, you know, certain types of hardware that were dedicated to this task?
Rosario Cammarota: Yes. So, the main driver toward the specialization of hardware is very specific tasks. One example that comes to mind in modern days is basically specialized hardware for artificial intelligence. The point is to make sure that your hardware can run the task very, very specifically, keeping in mind that your task is processing certain data types.
Rosario Cammarota: In this case, when we go to cryptography, there are already instances of accelerators that are more suitable than general-purpose hardware to execute cryptography. And in fact, even within a processor, you may see that there are instructions and extensions that are dedicated to process cryptograms for the cryptography that is deployed nowadays. Now, similarly, for homomorphic encryption, being mindful that the cryptograms are a lot more complex, you would need some form of specialized hardware to reduce all the computational overhead that you mentioned earlier.
Dave Bittner: What about the larger world of research when it comes to these sorts of things? I'm thinking of, you know, establishing standards for this. Where are we in terms of standards bodies and being sure that, you know, these sorts of encryption methods can be used broadly?
Rosario Cammarota: Yes, so there has been a group of participants from universities and industry at homomorphicencryption.org that started basically to lay out the foundational work for the standardization in terms of security parameters. So, as we know, any crypto system is something that is parameterized by some secrets. And the length of the secret, so to speak, roughly indicates the resistance of the crypto system to algebraic attacks. Now, what happens is that for the mathematics that underlies crypto systems that allow you to compute on encrypted data, this group has been looking into the security of instantiations of the mathematical fields underneath this cryptography. And very recently, we started exporting basically this work and making it more visible to the global community by working with the international standards. It is very important, and I would say it's fundamental for the whole industry, to have standards about cryptography, as you correctly point out, and that basically includes best practices. What is the best selection of the parameters for certain use cases?
Rosario Cammarota: But one difference that makes homomorphic encryption unique is that, unlike traditional cryptography, in homomorphic encryption there is an entanglement between the application domain, the workload, and the cryptography itself that otherwise would not be connected together. And the reason for that is because you are computing on encrypted data. So the standard, in part, is application domain plus cryptography, together.
Dave Bittner: Help me understand, is there a concern that folks may be able to infer the data from the calculations they're doing on the data?
Rosario Cammarota: No, for two reasons. What you can infer during homomorphic encryption operations with traditional methods, basically to link data, is a ciphertext by itself. And the fact that you are using a homomorphic encryption system is an additional advantage, in that you don't need to store decryption keys on the system, which would otherwise be an additional target of attacks. So the only information that an attacker would gain by introducing – by monitoring the channel, so to speak, would be ciphertext. It can use that ciphertext, but it cannot look into it.
Rosario Cammarota: As for looking at the output of a computation, in homomorphic encryption systems the encryption procedure is inherently non-deterministic. And so what that means is that if you encrypt the same data twice and then you process this data, the output of the computation is different. It is encrypted, but is also different. So it is this property that prevents, so to speak, inferring the result of the operations and also inherently protects the intermediate data.
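To make the non-determinism described above concrete (again, not part of the conversation), the following minimal Python sketch uses textbook ElGamal, a randomized scheme that is homomorphic for multiplication only. ElGamal is an assumption chosen for brevity, not the scheme discussed in the interview; the prime, the base `G`, the function names, and the optional `r` argument (exposed only so the randomness can be pinned down for the demonstration) are likewise illustrative, and the parameters are not secure.

```python
import secrets

P = 2**127 - 1   # a Mersenne prime; toy-sized, not a secure parameter choice
G = 3            # public base, assumed for this example

def keygen():
    """Return (secret_key, public_key)."""
    x = secrets.randbelow(P - 2) + 1
    return x, pow(G, x, P)

def encrypt(public_key, m, r=None):
    """Encrypt m in [1, P-1]; fresh randomness r is drawn unless one is supplied."""
    if r is None:
        r = secrets.randbelow(P - 2) + 1
    return pow(G, r, P), (m * pow(public_key, r, P)) % P

def decrypt(secret_key, c):
    c1, c2 = c
    shared = pow(c1, secret_key, P)
    return (c2 * pow(shared, -1, P)) % P   # modular inverse (Python 3.8+)

def multiply(ca, cb):
    """Component-wise product of cryptograms; the hidden plaintexts are multiplied."""
    return (ca[0] * cb[0]) % P, (ca[1] * cb[1]) % P

if __name__ == "__main__":
    sk, pk = keygen()
    first = encrypt(pk, 6, r=2)
    second = encrypt(pk, 6, r=3)            # same message, different randomness
    print(first != second)                  # True: the cryptograms differ
    print(decrypt(sk, first), decrypt(sk, second))   # 6 6

    seven = encrypt(pk, 7)
    out_a, out_b = multiply(first, seven), multiply(second, seven)
    print(out_a != out_b)                   # True: the encrypted outputs differ too
    print(decrypt(sk, out_a), decrypt(sk, out_b))    # 42 42
```

An observer who sees only the cryptograms cannot tell that `first` and `second` hide the same value, and the encrypted results of the same computation also differ, even though both decrypt to the same answer.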
Dave Bittner: Wow. Well, as you look towards the future, I mean, as this technology makes its way down and becomes more practical for everyday use and there are broader applications, as we're able to make use of both the hardware and the developments that folks like you are working on, how do you see that affecting us in day-to-day lives? What are the advantages when it comes to privacy and security that folks are going to see as a result of this making its way out into general use?
Rosario Cammarota: Yeah, let me give you an example that clarifies things. So, currently, when we go around with our mobile devices and we enter an environment that is progressively smarter, one thing that happens, or that we should start seeing more and more, is that we are going to receive personalized information from that environment, either on our mobile phone or other gadgets that basically interact with the environment. The environment becomes a cyberphysical system, so to speak, in its intelligence, because there is all this machine learning.
Rosario Cammarota: Now, in order to provide you a personalized recommendation which is supposed to do good for you, the system needs to ingest some of the information that you are carrying with you, such as your location, if you are making a transaction, your credit card information, other aspects of the transaction, what you have purchased, why you should be looking into another shelf within the same store, because there is something that potentially is going to help you, where you should shop today, all these types of things. So, in order to perform that personalization, the system that is performing this type of computation needs to consume your data. With homomorphic encryption, it will be able to consume the data without actually seeing the data. So any unintended use of your data, potentially, cannot happen. And so you are receiving the personalization, but you are not giving up your data.
Dave Bittner: For you personally, it sounds like this stuff is a lot of fun. I mean, it seems like, you know, you and your team there at Intel Labs, this is the kind of – you know, it may be baffling for folks like me who are more mathematically challenged, but it does seem like, you know, these challenges – it is a lot of fun for you and your team, isn't it?
Rosario Cammarota: It is. It is. There are many challenges behind it. On the mathematical side, the research around homomorphic encryption is still progressing. And in fact, we do have several key players at universities worldwide who continue doing research to make homomorphic encryption systems more efficient from an algorithmic perspective while retaining the same level of security. That part is actually really hard, but at the same time is really challenging.
Rosario Cammarota: Now, let me give you the perspective of a person that also sits within the semiconductor industry. We talked about, you know, how processing these cryptograms is actually challenging primarily because of their size, but also because the operations that you do in order to manipulate the content of cryptograms are also more complex than just doing additions and multiplications on plaintext data, right? So, when you actually envision basically a computer architecture that can natively process these cryptograms, a lot of challenges emerge because of how different the cryptogram is from the native data types that we are used to seeing nowadays.
Rosario Cammarota: So there are a lot of challenges and a lot of excitement from the point of view of the technology. There is excitement in the ecosystem because applications of this technology can benefit humanity. And that's the part, since you asked personally, yes, it is fun, but the real goal is that, well, if we make it happen, humanity benefits from it. And that aspect is fulfilling. It's one of the missions, actually, that we at Intel Labs, as a research lab, have and pursue as we keep doing research.
Dave Bittner: Our thanks to Dr. Rosario Cammarota for joining us. The research is titled, "Confidential Computing: Advances in Federated Learning and Fully Homomorphic Encryption." We'll have a link in the show notes.
Dave Bittner: The CyberWire Research Saturday is proudly produced in Maryland out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams and technologies. Our amazing CyberWire team is Elliott Peltzman, Puru Prakash, Kelsea Bond, Tim Nodar, Joe Carrigan, Carole Theriault, Ben Yelin, Nick Veliky, Gina Johnson, Bennett Moe, Chris Russell, John Petrik, Jennifer Eiben, Rick Howard, Peter Kilpe, and I'm Dave Bittner. Thanks for listening.