# Proper key management in the cloud with a Cloud Secure Module
Jurjen N.E. Bos | 22 Min To Read | 07 Jan, 2025 | #Cybersecurity
=>
https://blog.worldline.tech/2025/01/07/cloud-transformation.html Article on Worldline Tech Blog
## Introduction
Cryptography-dependent applications, such as payment applications, depend on HSMs (Hardware Security Modules) to securely store cryptographic keys and perform cryptographic operations. The security of these HSMs depends mainly on the **key management process**: the process by which our key custodians update the set of keys in the HSMs. When these applications are moved to the cloud, it is not sufficient to just move the HSMs to the cloud as well, because the key custodians can then no longer perform key management. The key management process of the cloud provider is insufficient to guarantee the security of the keys. In this article, I explain why this is the case, and I propose an approach to solve this problem.
Before explaining what problems are solved, and how these are solved, let’s have a good look at what HSMs do.
### What is an HSM?
An HSM (Hardware Security Module) is a “computer in a vault”. The cryptographic keys are stored in internal tamper-protected memory, and the computations with these keys are performed inside the HSM itself, so that keys never have to leave the device.
Performing cryptographic operations is not the main function of an HSM. Most modern microprocessors are quite good at doing these calculations themselves and even have specific instructions for doing fast encryptions. Instead, the main function of the HSM is to **keep the keys secret**. Keeping keys secret can be divided into:
* Physical protections, such as a strong construction and
tamper detection mechanisms that destroy the keys if you try to break it.
* Logical protections. There is no command to extract the HSM’s keys. No
combination of commands can be used to get information about the keys.
(This is not as easy as it sounds. For example, the often-used PKCS#11
standard for communication with HSMs is notorious for its security flaws.)
=>
https://blog.worldline.tech/images/post/cloud-transformation/adyton.png Adyton [image]
The picture above shows an “Adyton” HSM, a model that is used in many places in our company.
### Key management
The security of HSMs depends not only on their physical protection, but also on the surrounding key management procedures that maintain the secrecy of the cryptographic keys in the device. These key management procedures are executed by specialized **key custodians**, who are responsible for maintaining the availability and secrecy of cryptographic keys. They are the people who allow our company to take liability for the security of our keys, and therefore of our payment data.
Developers of secure applications do not have to worry about key management themselves, which has three advantages:
* The possibility of security mistakes is reduced.
* There is more security, because the keys cannot be accessed.
* The programmer does not have to take the responsibility for cryptographic
security.
This is the reason that compliance frameworks and many policies (including those of Worldline) forbid the use of keys in applications.
In summary:
The main function of an HSM is separation between **key management** and **using the keys.**
## History of HSMs in the payment industry
To see why the HSMs are such an important part of application security, let’s start with a bit of history. About half a century ago the first digital payment applications started. It began with the installation of “Automated Teller Machines” (ATMs) that allowed you to get cash without needing to go to a bank. Not much later, “Point of Sale” (PoS) terminals allowed you to pay for your purchase with your card and PIN.
The computers in those days did not have much memory (we expressed memory sizes in MB instead of GB). To reduce the number of cryptographic keys that needed to be stored, most cryptographic keys were calculated from other keys. These “derived keys” made it possible to perform all cryptographic operations, such as transaction verification and PIN verification, with only a handful of keys.
For example, in the Netherlands, there was a single Master PIN key for PIN transportation between Interpay and the banks. This key was used to derive different keys for each bank. This way, all parties needed a single key for secure PIN transport: each bank used their derived key, while Interpay only needed to store the Master PIN key. Other operations, such as generation of PINs, also used keys that were derived from other keys.
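The derivation idea can be sketched in a few lines of Python. This is a minimal illustration and not the scheme that was actually used: the sketch substitutes HMAC-SHA256 from the standard library for the derivation function of that era, and the function name `derive_bank_key` and the bank identifiers are hypothetical.

```python
import hashlib
import hmac
import secrets


def derive_bank_key(master_key: bytes, bank_id: str) -> bytes:
    """Derive a per-bank key from the master key and a bank identifier.

    Assumption: HMAC-SHA256 stands in for the derivation function here;
    the historical systems used other algorithms.
    """
    return hmac.new(master_key, bank_id.encode("utf-8"), hashlib.sha256).digest()


# The central party stores only the master key ...
master_pin_key = secrets.token_bytes(32)

# ... and recomputes a bank's key whenever it is needed, while each bank
# stores only its own derived key.
key_bank_a = derive_bank_key(master_pin_key, "BANK-A")
key_bank_b = derive_bank_key(master_pin_key, "BANK-B")
```

Derivation is deterministic: the same master key and identifier always give the same bank key, so the central party never needs to store the derived keys.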
This construction using master keys had one big disadvantage. The security of the master keys was the basis of the security of the entire system. Knowledge of such a key was enough to do a lot of damage. The designers of these systems realized that and decided to require that these keys were protected by HSMs.
The original list of security requirements from that era has (unfortunately!) been lost in time, but here are a few that I reconstructed from context:
* keys are always stored in an HSM (except for frequently changed session keys)
* HSMs are controlled by key custodians (both Interpay and the banks)
* all key operations are under dual control and use split knowledge (that is, no
key custodian ever sees the plaintext of a key)
* HSMs are connected directly to the server that handles the corresponding
payment applications
* there shall be no command to export master keys from the HSM
* payment applications do not contain any key material
* the server is situated in our own buildings (often on a separate floor within
the office building)
This was the application for which HSMs were designed. The design requirements for HSMs were documented and standardized at that time. The current ISO standard for HSM security (ISO 13491) is still based on these assumptions, even though it is updated regularly (in fact, I am one of the editors of this document).
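The split-knowledge requirement from the list above can be illustrated with a small sketch. It assumes the common approach of splitting a key into random components that are combined with XOR, so that any single component on its own is just random bytes. The function names are hypothetical, and a real key ceremony takes place inside the HSM rather than in application code.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("components must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, custodians: int = 2) -> list[bytes]:
    """Split a key into one component per custodian."""
    if custodians < 2:
        raise ValueError("split knowledge needs at least two custodians")
    # All but the last component are purely random.
    components = [secrets.token_bytes(len(key)) for _ in range(custodians - 1)]
    # The last component is chosen so that the XOR of all components is the key.
    last = reduce(xor_bytes, components, key)
    return components + [last]


def combine_components(components: list[bytes]) -> bytes:
    """Recombine the components; in practice this happens only inside the HSM."""
    return reduce(xor_bytes, components)


key = secrets.token_bytes(16)
components = split_key(key, custodians=3)
recovered = combine_components(components)
```

Each custodian holds one entry of `components`; only the combination of all of them yields the key, which is what makes dual control enforceable.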
### Reality check
Over the last few decades, the payment infrastructure has been updated significantly, and many details have changed. As a result, the original security assumptions are no longer true, and many requirements in the list above make no sense anymore.
Here’s a list of some changes in the way HSMs are used in the payment systems I have seen in my career at Worldline, in more or less chronological order.
```
| Change | Reason | Effect on security |
| ------------------------------------------- | ------------------- | --------------------------------------------------------------------------------------------- |
| HSMs are connected in a local network | Redundancy | Harder to detect wiretapping because there are so many wires |
| Servers moved to data centers | Efficiency | Less oversight on servers |
| HSMs moved with the servers | Efficiency | HSM-server cabling is no longer visible |
| HSMs connected via patch bay in data center | Data center | Connections can be changed without being noticed |
| Key management via secure link to HSM | Cost savings | Secure link becomes new attack target |
| Servers in virtual machines | Efficiency | Connections are determined by configuration files and can be modified without physical access |
| Key stored in encrypted database | Allow for more keys | Old keys can be used again by copying the database entries |
| Moving HSMs to the cloud | Efficiency | Key management no longer performed by key custodians |
```
For every change, the same reasoning was used:
* It is a small change, the security design is basically the same as it was.
* It is only a negligible amount of additional risk.
* We don’t want to redesign everything because that would be too costly.
After many changes, it becomes like the telephone game: many small changes can completely change the original intent.
=>
https://blog.worldline.tech/images/post/cloud-transformation/telephone_game.jpg Telephone game [image]
The final step in the table, where the HSMs are moved into the cloud, even **defeats the original purpose of HSMs**: key management is no longer explicitly separated from use of the keys. This is because the cloud provider does not support separation of key management from key use by default. In fact, they advertise how easy it is to have “automatic key management” for applications, allowing the application developers to set up the keys themselves.
The goal of this article is to make sure the next change will not add another reduction of security to this table.
## HSMs in the cloud
Cloud providers offer a feature called “Bring Your Own Key” (BYOK), which allows customers to store their existing encryption keys in the cloud provider’s HSMs. The obvious advantage of this is that new users of the cloud can start working without changing the keys. However convenient this sounds, there are two potential security risks here. First, the actual process of moving keys to the cloud is complicated and error prone. Second, the key is now stored in more locations than before, increasing the potential attack surface.
I have even heard cloud providers advertise this by claiming that under BYOK “keys are under your control”. This is misleading, as the opposite is true: once your keys are uploaded to the cloud, you can never claim exclusive control over them anymore. For companies like Worldline this is especially important, as we handle many keys owned by our clients.
The cloud provider may want to be able to move the keys from one HSM to another in order to move their services to another location (since that ability is one of the cornerstones of the cloud provider’s services). This means that their way of storing HSM keys is explicitly designed to make moving keys easier, which works against the design criteria of an HSM. Furthermore, cloud providers do not want to take the liability for the key and associated data.
### Current implementations of “HSM in the cloud”
Current applications using HSMs cannot be moved to the cloud just like that. When these applications are moved to the cloud, the way they work with HSMs has to be changed. Worldline made an overview of the different ways to use HSMs with cloud applications. Unfortunately, each of them has significant disadvantages:
* The standard functionality of the cloud provider’s HSMs can be used. This
often consists of not much more than a way to provide storage keys that you can
use to encrypt data yourself; this is just good enough to prevent theft of the
database contents if the attacker cannot access this HSM.
* Some cloud providers have specific payment HSMs with specific functionality.
To use these, you may have to provide keys to the cloud provider. To be
honest, I do not understand how PCI compliance requirements can be met if you
don’t manage your own keys, but apparently PCI does allow this.
Also, many of the payment protocols in use by Worldline are not supported.
* We could rent a rack in the cloud provider’s data center and put our own HSMs
in there. This would give us all the functionality we need, including
proper key management with our own methods. Unfortunately, this is extremely
expensive, and furthermore, it means that we need multiple racks in different
data centers if we want to meet our requirements for redundancy.
* Finally, and this is the solution that is now often used in our company, we
can just leave the HSMs where they are in our data center, and have the cloud
applications “dial into” our data centers using a secure connection to get the
necessary functionality. This is sometimes called the “hybrid” solution.
It fulfills the necessary security and key management requirements.
The main disadvantages of this method are that the
connection is relatively slow, and that we still have to maintain hardware,
defeating the purpose of using the cloud in the first place.
## Towards security in the cloud
The main idea of this article is to investigate this question from scratch, without getting distracted by the current solutions:
**how do we do key management in the cloud?**
As is clear from the discussion above, using HSMs in the cloud is not enough. As terrifying as it is, I can only conclude that the solution to our problem is to make something **in software** that should satisfy the same objectives as originally stated for HSMs, but adjusted for the world we currently live in.
I suggest the name “Cloud Security Module”, or CSM. This is not to be confused with a “virtual HSM”, where a physical HSM has different presences for different applications.
### High level requirements
Let’s start with a description of what a CSM should do, at the highest level possible. From the idea “what an HSM does, but then in the cloud”, we get the following four basic requirements:
* 1. Prevent keys from being read, intentionally or by accident, both by internal
and external parties.
* 2. Provide a means to separate the key management activities from
cryptographic operations.
* 3. Support key management procedures used by key custodians,
allowing the custodians to take their responsibilities.
The procedures in question must be comparable to current procedures.
* 4. Operate within a cloud environment, performing cryptographic operations and
protecting the keys.
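To make requirements 2 and 3 more concrete, here is a minimal sketch of what the separation could look like at the interface level. Everything in it is an assumption made for illustration: the class name `CloudSecurityModuleSketch`, the method names, the use of XOR components, and the choice of HMAC-SHA256 as the only cryptographic operation. Python cannot actually hide the key from code in the same process, so this shows only the shape of the interface, not a protection mechanism.

```python
import hashlib
import hmac


class CloudSecurityModuleSketch:
    """Toy model with separate entry points for key management and key use."""

    def __init__(self, custodians: set[str]):
        self._custodians = set(custodians)
        self._keys: dict[str, bytes] = {}

    # --- Key management: custodians only, under dual control ---
    def install_key(self, name: str, components: dict[str, bytes]) -> None:
        """Install a key from components, one per custodian (keyed by name)."""
        if len(components) < 2:
            raise PermissionError("dual control: at least two custodians required")
        if not set(components) <= self._custodians:
            raise PermissionError("component entered by an unknown custodian")
        lengths = {len(c) for c in components.values()}
        if len(lengths) != 1:
            raise ValueError("all components must have the same length")
        # XOR all components together; no custodian ever sees the result.
        key = bytes(lengths.pop())
        for component in components.values():
            key = bytes(a ^ b for a, b in zip(key, component))
        self._keys[name] = key

    # --- Key use: applications only; no method returns key bytes ---
    def mac(self, name: str, message: bytes) -> bytes:
        return hmac.new(self._keys[name], message, hashlib.sha256).digest()

    def verify(self, name: str, message: bytes, tag: bytes) -> bool:
        return hmac.compare_digest(self.mac(name, message), tag)


csm = CloudSecurityModuleSketch({"alice", "bob"})
csm.install_key(
    "pin-transport",
    {"alice": bytes.fromhex("00ff" * 8), "bob": bytes.fromhex("ff00" * 8)},
)
tag = csm.mac("pin-transport", b"example message")
```

The application side only ever calls `mac` and `verify` with a key name; loading or replacing a key requires at least two registered custodians.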
Before we go further, let’s specifically address the elephant in the room. The change from an HSM to a software CSM can be seen as just another change in the way we handle keys, and could be added to the list of changes in the table above. And that is a fair point: the security design is changed again. I am not going to deny that.
Storing keys in the cloud sounds like a great idea. The HSM provides physical protection against key manipulation and theft. The cloud provider’s data centers provide a very secure environment for the HSM. The cloud provider will have to protect this since their entire company depends on this.
But it is not all about physical security. Storing keys in the cloud directly means that you trust the cloud provider with the keys. Keeping keys secure is not only about the way you store them but also about the key management and the procedures around it: the logical security.
In order to get a balanced view on the security of this alternate solution, it is important to understand the risks of both solutions and compare them in a fair way.
Let’s compare protection of keys against leakage between HSMs and software.
```
| Risk | HSM | Software |
| ----------------------- | ---------------------- | ----------------------------------- |
| Verification | Hard to check hardware | Source could be audited¹ |
| Dependency² | Manufacturer | Cloud provider and operating system |
| Security design | Often proprietary | Visible in source code |
| Location of keys | Inside enclosure | Confidential computing³ |
| Government influence⁴ | Can be invisible | Hard to hide |
| Security audits | Not adjusted for cloud | Not yet developed |
```
Clarification of the entries in the table:
* 1. I assume here that the source code of the software is visible to
the user of the CSM.
I don’t think that a CSM based on secret source code is to be recommended.
* 2. “Dependency” means which party you are implicitly trusting with your keys.
As long as you don’t build your own hardware, there is always somebody
you’re depending on.
* 3. Confidential computing is a technology that encrypts the communication
channel between a processor and memory.
This technology is specifically created for running secure code in a cloud
environment.
It is elaborated further below.
* 4. I acknowledge here that many governments cannot resist the
temptation to try to get access to cryptographic key material.
The “legal access” methods attempt to maintain the general security but
allow a way to get access for governments; this tends to be not secure
because they backfire, and attackers abuse them to eavesdrop