After spending time at a London fintech accelerator last year, enterprise database startup ZeroDB scrapped its first business plan and mapped out a new one. By January this year it had a new name: NuCypher. It was no longer going to try to persuade enterprises to switch out their Oracle databases — but rather to sell them on a specialized encryption layer to enhance their ability to perform big data analytics by tapping into the cloud. Its slogan: body armor for big data.
Today it’s launching an open source version of its general release product here at TechCrunch Disrupt New York. At this point, the almost 1.5-year-old startup is also running a handful of pilots with major banks, says co-founder MacLane Wilkison.
“It’s a combination of cloud and big data,” he says of the underlying drivers which the team reckons are creating a need for the technology. “Now all of a sudden you’re working in computing environments that are distributed across hundreds or thousands of machines, and that could be spanning both some on-prem, some private and even public cloud. And that sort of scenario presents a lot of new and different security challenges.”
Instead of building an open source end-to-end encrypted database, NuCypher is selling a proxy re-encryption platform that lets corporates with large amounts of sensitive data stored in encrypted databases securely tap into the power of cloud computing. It’s an idea that might need a bit of explaining to appreciate, but one that’s grounded in a genuine need, at least based on what NuCypher’s early banking partners are telling it.
On the competitor front, Wilkison names HP-owned Voltage and Protegrity as the largest existing players in the space. However, he says both rely on tokenization of data, whereas NuCypher reckons proxy re-encryption technology offers greater security for certain types of data.
Unlike some other approaches to processing big data in the cloud, he emphasizes that NuCypher is not using tokenization to mask any data — arguing this is necessary for the target customers because certain types of data when masked with tokens can be vulnerable to statistical attacks.
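To make the statistical-attack concern concrete, here is a minimal Python sketch. It assumes a deterministic tokenization scheme (the same input always yields the same token) and a column whose value distribution an attacker can roughly guess. The key, the column contents and the function names are all hypothetical, and the article does not say which attacks Wilkison has in mind or how any named vendor's product behaves.

```python
import hashlib
import hmac
from collections import Counter

SECRET = b"tokenization-key"  # hypothetical key held by the tokenization service


def tokenize(value: str) -> str:
    """Deterministic token: the same input always maps to the same token."""
    return hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()[:12]


def guess_by_frequency(tokens, known_ranking):
    """Pair tokens with plaintexts purely by how often each token appears."""
    ranked = [tok for tok, _ in Counter(tokens).most_common()]
    return dict(zip(ranked, known_ranking))


# Hypothetical column with a skewed distribution the attacker can estimate.
column = ["approved"] * 70 + ["declined"] * 25 + ["flagged"] * 5
tokens = [tokenize(v) for v in column]

# The attacker never sees SECRET, only the tokens and a guess at the ranking.
guesses = guess_by_frequency(tokens, ["approved", "declined", "flagged"])
recovered = [guesses[t] for t in tokens]
```

Because the tokens preserve how often each value occurs, the attacker in this sketch recovers the column without touching the key. Randomized encryption does not leak frequencies in this way.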
While proxy re-encryption is an existing area of cryptography, applying it to big data is what’s novel here, according to Wilkison, who says the tech has mostly been used in academia thus far. “We’re the only people that applied it to big data platforms like Hadoop and Spark,” he says. “As far as I know we’re the only one using proxy re-encryption in business.”
So while the team’s early ideas focused mostly on looking at data archiving and encryption to enable banks to make use of cloud storage, he says the business was pulled onto its current rails after banks asked if they could apply the encryption tech the team had been building for data archiving to big data cloud processing.
Safe to say, this mini pivot is a familiar story for enterprise startups. After all, who knows the business needs better than the target customers?
“When we originally started the company, my co-founder and I had built an open-source database and then an encrypted database that allows you to operate on encrypted data without sharing encryption keys with the database server… What the banks were particularly interested in was taking some of what we had built for that and applying it to more compute-heavy types of workloads,” says Wilkison.
“After a period of talking to customers… we took some of what we had built for that and made it into a more generalized encryption layer for different platforms — specifically for the big data space. So Hadoop, Kafka and Spark.”
So what is proxy re-encryption — aka NuCypher’s “secret sauce,” as Wilkison puts it — and why is the technique useful for banks?
“Proxy re-encryption is a set of encryption algorithms that allow you to transform encrypted data. Specifically… it allows you to re-encrypt data — so you have data that’s encrypted under one set of keys, you can re-encrypt the data without de-encrypting it first, so that now it’s encrypted under a second, different set of keys,” is how Wilkison explains it.
He gives the example of a person who has some encrypted files stored in Dropbox. If they want to share the files with someone else that could be achieved by downloading them, decrypting them with their key and then re-encrypting them with the public key of the person they want to share with. But obviously — at scale — that’s a pretty network-intensive and cumbersome process.
Even more naively, this person could just share their private encryption key with the person they want to share the file with. But then they’re abandoning all control of their security.
Clearly neither scenario is ideal for NuCypher’s target customers — with their vast lakes of sensitive, highly regulated data. This is where NuCypher reckons proxy re-encryption can step in to offer an edge.
“What I can do with proxy re-encryption that’s much more elegant and secure than either of those alternatives is I can basically delegate access to my encrypted data to someone else’s public key,” he adds.
The platform creates a re-encryption token from the public key of the entity with whom its customer wants to share data. That token can then be uploaded to the cloud, where the third party can access it, in turn enabling them to decrypt and access the data.
Wilkison says re-encryption tokens can be created and used to delegate access to “as many people as I like.”
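The delegation flow described above can be sketched in a few lines of Python. This is a toy illustration, not NuCypher's implementation: it assumes a simplified ElGamal-style construction over a tiny prime-order group (trivially breakable parameters chosen only so the arithmetic is easy to follow), a SHA-256 keystream in place of a real cipher, and hypothetical function names throughout. What it shows is the shape of the idea: the owner derives a token from their own secret key and the recipient's public key, a proxy transforms the ciphertext header without ever seeing the plaintext, and the recipient decrypts with their own key.

```python
import hashlib
import secrets

# Toy parameters (assumption): P = 2Q + 1 is a small safe prime and G = 4
# generates the subgroup of prime order Q. Far too small for real use.
P, Q, G = 2039, 1019, 4


def keygen():
    """Return (secret, public) where public = G^secret mod P."""
    sk = secrets.randbelow(Q - 1) + 1
    return sk, pow(G, sk, P)


def _hash_to_scalar(*parts):
    """Hash integers to a nonzero exponent in [1, Q - 1]."""
    digest = hashlib.sha256("|".join(map(str, parts)).encode()).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


def _xor_stream(shared, data):
    """XOR data with a SHA-256-derived keystream (toy cipher, self-inverse)."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(f"{shared}|{i}".encode()).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)


def encrypt(pk_owner, message):
    """Encrypt under the owner's public key; returns (capsule, body)."""
    r = secrets.randbelow(Q - 1) + 1
    capsule = pow(G, r, P)          # g^r, stored next to the ciphertext
    shared = pow(pk_owner, r, P)    # g^(a*r), never stored
    return capsule, _xor_stream(shared, message)


def decrypt_owner(sk_owner, capsule, body):
    return _xor_stream(pow(capsule, sk_owner, P), body)


def make_token(sk_owner, pk_recipient):
    """Owner side: build a re-encryption token for one recipient."""
    x = secrets.randbelow(Q - 1) + 1
    big_x = pow(G, x, P)
    d = _hash_to_scalar(big_x, pk_recipient, pow(pk_recipient, x, P))
    return sk_owner * pow(d, -1, Q) % Q, big_x


def reencrypt(token, capsule):
    """Proxy side: transform the capsule; body and plaintext are untouched."""
    rk, big_x = token
    return pow(capsule, rk, P), big_x


def decrypt_recipient(sk_recipient, pk_recipient, recapsule, body):
    e_prime, big_x = recapsule
    d = _hash_to_scalar(big_x, pk_recipient, pow(big_x, sk_recipient, P))
    return _xor_stream(pow(e_prime, d, P), body)


if __name__ == "__main__":
    alice_sk, alice_pk = keygen()
    bob_sk, bob_pk = keygen()
    capsule, body = encrypt(alice_pk, b"account ledger, Q1")
    token = make_token(alice_sk, bob_pk)      # uses only Bob's public key
    recapsule = reencrypt(token, capsule)     # runs on the untrusted proxy
    print(decrypt_recipient(bob_sk, bob_pk, recapsule, body))
```

Note that the encrypted body is never rewritten: only the small capsule is transformed, so granting access to another party costs the owner one new token rather than a download, decrypt and re-upload of the data. In this simplified sketch a proxy colluding with a recipient could recover the owner's secret key, which is one reason production schemes add further safeguards.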
Ensuring compliance with regulations around the processing of sensitive data — data such as a bank or healthcare company might hold — is one key selling point for the platform.
He points to a regulation like HIPAA, which sets standards for protecting healthcare data, as one example where a lot of care is needed when handling data to ensure compliance. He also flags up the European Union’s incoming GDPR (General Data Protection Regulation), which ramps up penalties for violations of rules on processing citizens’ personal data, as another instance of data-centric laws creating data processing pain-points that NuCypher’s platform is setting out to fix.
Other target data-laden industries could include telecoms and insurance, though the team has kicked off by focusing on financial services, and the current pilot phase of the platform is with “major banks.”
Wilkison says there are essentially three main use-cases for the platform:
- “cloud enablement”: giving target customers a way to move their on-premise Hadoop big data workloads to the public cloud and make use of services like AWS, particularly for “burst or transient workloads.” “What we do there is give them a way to keep their encryption keys in their own data centers, under their control so they can use the cloud to store and process data but they don’t necessarily have to trust the cloud with their encryption keys,” he adds.
- “regulatory compliance”: currently NuCypher is working with customers in the U.S. and Europe needing to comply with regulations such as HIPAA, PCI, GDPR and PSD2.
- “secure sharing of sensitive encrypted data”: with multiple third parties, be it a customer, partner, supplier or even a regulator. On this he also notes one of the benefits is that the system segregates the data and the encryption keys, which means, for example, a regulator could not subpoena the cloud provider in order to get their hands on the decrypted data. “It’s very important, particularly in financial services, for customers to have that segmentation between the data and the keys,” he adds.
Another benefit he notes is that NuCypher’s proxy re-encryption technology enables it to give customers the ability to manage access controls without needing to provide full access to the data — which means it can remove any single point of failure (i.e. via an admin who has to have full access control to all of the data).
“With NuCypher a hacker would have to hack into each node individually in order to get all the data,” he adds.
Given the complexities of the technology, customer education is clearly one of the big challenges, with Wilkison saying this boils down to explaining how the approach differs from standard encryption.
And on that front, he says one selling point for the platform is that the proxy re-encryption tech works with NIST-standardized encryption algorithms, which means NuCypher customers don’t have to abandon the tried and tested encryption algorithms they’re comfortable using, such as AES-256, in order to make use of the tech.
“That was one of the pieces that we added that took a pretty significant amount of research to develop for us — to get proxy re-encryption to work with things like ECIES, which is a standard elliptic curve, NIST-certified,” he notes. “So we can go to a customer and say, everything that we’re doing on a crypto level is very standardized, very well understood by industry. So they’re not having to rely on newly rolled crypto.”
NuCypher’s platform exists as an SDK and an encryption library, so its business model is licensing the software — it’s not hosting any data itself, confirms Wilkison; customers can install the software on premise, such as within an existing Hadoop deployment, or directly in the cloud on the infrastructure they’re managing.
Funding-wise, the team has raised a $750,000 seed round to date, from Valley investors including Base Ventures, NewGen Capital and some angels. It also went through Y Combinator last summer. Wilkison says it will be looking to raise again in Q3 this year.
How big do they reckon this market is? Wilkison says he’s hoping the current six to seven pilot customers of NuCypher will turn into “high double digit” or maybe “low triple digits” in a year’s time. But with those target large enterprises typically spending vast amounts of money on securely storing the sensitive data they’re entrusted with, there’s also a very sizeable incentive for them to shift some of that compute load into the cloud. And, potentially, a lot of money at stake if NuCypher can convince them to buy in.