White-Box Crypto Gains Traction

A niche obfuscation methodology is becoming more popular in cost-sensitive markets. But how good is it?


Ask any cryptography expert which is better, hardware- or software-based cryptography, and they’ll almost always choose the hardware. But as the IoE begins to take root in cost-sensitive markets with tight market windows, that won’t always be an option.

Plan B is software cryptography, which historically has been used at the application level in the form of anti-virus, anti-spyware, and similar software packages. But as more things are connected, simply slapping on an antivirus or anti-malware solution of one sort or another will no longer suffice.

IoE devices will have a wide gradation of technology, from ultra-simple to ultra-advanced, which means security will have to be extremely adaptable to stratify across them. In many cases, there won’t be enough available resources or longevity to justify black-box hardware cryptography. “Pretty good” will have to suffice, and in this case the best choice is white-box cryptography.

“There is one significant advantage to white-box cryptography,” says Aaron Lint, research director at Arxan, which makes anti-tamper software. “Even when an attacker is looking at the code and is painstakingly stepping through the code, instruction-by-instruction, or data element-by-data element, no matter how much they look they won’t find any semantically important information.”

As technology advances, various metrics of both hardware and software become more scalable, power-efficient and extensible – especially with the IoE as the target market. This translates into solutions having a wider applicability base as technology progresses, with the ability to cross-pollinate across platforms, systems and networks. Where hardware cryptography once was the solution, now software is an option, and vice versa.

One of the applications that benefits most from white-box cryptography is digital rights management. Media content is now a common application across the Internet. The theft of copyrighted material is a huge problem because the content and the systems that manage it are so varied. Therefore, it is easiest to encrypt the entire package rather than just the critical elements.

“With streaming media, for example, the decryption is done at the set-top box,” notes Chowdary Yanamadala, vice president of business development at Chaologix. “But that poses a problem because the set-top box is fully under the control of a legitimate user (or hacker, as the case may be). So how does one install a decryption mechanism, knowing full well that anybody can have access to the set-top box?”

The short answer is that this is one of the most common applications where white-box cryptography can be used effectively.

What it is
White-box cryptography is a way to protect the software implementations of cryptographic algorithms. Generally, the user has the ability to control the execution environment, such as a set-top box, says Lint. “In effect, white-box cryptography is a variant of obfuscation, with both the source code itself and the mathematics in play in the cryptography. So it is a very complicated, almost ‘unrolling’ of the algorithm into a somewhat less efficient but vastly harder case to decipher.”

For example, an application can provide the ability to decrypt AES cipher texts under a certain key without revealing the key itself. This happens when subscribers access secured digital content without needing to publish their own key over the Internet, typically on open devices such as PCs, tablets, and smartphones. But because such devices are “open,” they are extremely vulnerable to attacks. An attacker can gain complete control over both the software implementation and the execution platform. The goal of white-box cryptography is to create an application or program that is tamper-resistant and can be executed safely in an untrusted environment.

In contrast, a black-box approach only shows inputs and outputs of the crypto algorithm. With the black-box method, a hardware-based root of trust such as a trusted platform module is used to protect the keys that validate the code.

The white-box approach does not require the hardware. It removes the trusted platform module or other hardware element and places everything in software. This open environment permits the attacker to analyze such things as the application’s binary code and the corresponding memory pages during execution. In addition, the attacker can intercept system calls, alter the binary and its execution, and interrogate the device with such tools as IDA Pro, debuggers and emulators. And because white-box attackers have visibility into the binary and complete access to the software implementation of a cryptographic algorithm, they can alter the environment via access to the execution platform (memory, CPU and registers).

The only way to thwart such attacks is to protect the whole environment. Without this, attackers could extract secret keys from the binary or from memory, or simply intercept information that leads to disclosure during execution. White-box cryptography is designed to keep the secret keys from being discovered in such an environment. In some fashion it can be called a cousin to obfuscation. But because white-box cryptography has the ability to keep the attacker from recovering the key from the executable, it has a clear advantage. This is another reason it is used in applications where digital rights management is involved, because it is more cost-effective than typical token-based or hardware implementations. It also can be used for such applications as Europay, MasterCard and Visa (EMV) payments on NFC-enabled smartphones without a secure element.

From a cryptographic point of view, it can be viewed as a special-purpose code generator that turns a given cipher into a robust representation, where the operations on the secret key are combined with random data and code. That’s done in such a way that the random data cannot be distinguished from key information.

What it does
Implementing white-box schemes is generally done by creating a key-instantiated version that hides all the information related to the key. That creates a key-customized software version that allows access to the data without having to input the key.

With symmetric block ciphers (AES, DES), the general methodology is to use linear transforms and substitution boxes. In the white-box implementation, the cipher conceptually becomes one very large lookup table: any plaintext is input, and the output is the corresponding ciphertext in the table (see Figure 1). However, with 64-bit, 128-bit or larger plaintext values, such a lookup table is far too large for practical implementation.

Figure 1. Representation of lookup table-based WBC implementation. Source: DeveloperIQ
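To make the single-table idea concrete, here is a minimal Python sketch. It assumes a hypothetical 8-bit toy cipher (the S-box, the key value, and the second-round key derivation are all invented for illustration and are not a real algorithm). The table is built once with the key, after which encryption is a plain lookup. The size helper shows why the same trick cannot be applied directly to a 64-bit or 128-bit block.

```python
import random

# Hypothetical public S-box: a fixed random permutation of 0..255.
_rng = random.Random(0)
SBOX = list(range(256))
_rng.shuffle(SBOX)

KEY = 0x5A  # illustrative secret key, known only at table-build time


def toy_encrypt(key, m):
    """Toy 8-bit cipher: two rounds of key mixing followed by substitution."""
    x = SBOX[m ^ key]
    round2_key = (key * 7 + 3) & 0xFF  # made-up key schedule for the sketch
    return SBOX[x ^ round2_key]


def build_table(key):
    """Bake the key into one lookup table: plaintext byte -> ciphertext byte."""
    return bytes(toy_encrypt(key, m) for m in range(256))


def full_table_bytes(block_bits):
    """Storage needed for one table covering every block of the given width."""
    return (2 ** block_bits) * (block_bits // 8)


table = build_table(KEY)

# At run time only the table is needed; the key never appears.
ciphertext = table[0x41]

print(len(table))                # 256 bytes for an 8-bit block
print(full_table_bytes(64))      # 2**67 bytes for a 64-bit block
print(full_table_bytes(128))     # 2**132 bytes for a 128-bit block
```

For an 8-bit block the table is 256 bytes. For a 64-bit block it would need 2^67 bytes, which is why practical schemes split the computation into many small tables.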

To circumvent that, these large plaintext values are broken down into more manageable segments. Each one has its own lookup table, the output of which is encoded (see Figure 2).

Figure 2. A high-level overview of a fixed-key implementation of WBC. Source: Whiteboxcrypto

There are various schemes to accomplish that today, and with the renewed interest in white-box cryptography there is a renewed interest in improving it with new techniques.

White-box cryptography rewrites the key-instantiated version to hide key-related information. As shown in Figure 2, algorithms take the key and cipher text shown on the left side and create the series of smaller blocks shown on the right. Each box represents a series of lookup tables with random input/output bijective encodings that introduce a measure of ambiguity. That produces algorithms that appear to be a distribution of lookup tables with randomized values. Such a standard approach is relatively secure for most non-critical applications such as music or video, and for some data applications.
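The segmented, encoded tables can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's scheme: the two-round byte-wise toy cipher, the S-box, and the key bytes are hypothetical, and the two byte lanes are never mixed, which a real cipher would do. Each lane gets a pair of tables. A random bijection f is applied to the output of the first table and undone inside the second, so the value passed between them looks random while the end-to-end result is unchanged.

```python
import random

rng = random.Random(2016)  # fixed seed so the sketch is repeatable


def random_bijection():
    """Return a random permutation of 0..255 and its inverse."""
    perm = list(range(256))
    rng.shuffle(perm)
    inv = [0] * 256
    for i, p in enumerate(perm):
        inv[p] = i
    return perm, inv


SBOX, _ = random_bijection()  # hypothetical public S-box


def reference_encrypt(k1, k2, block):
    """Plain toy cipher: each byte goes through two keyed S-box rounds."""
    return [SBOX[SBOX[b ^ a] ^ c] for b, a, c in zip(block, k1, k2)]


def compile_lane(ka, kb):
    """Build two tables for one byte lane, joined by a secret encoding f."""
    f, f_inv = random_bijection()
    t1 = [f[SBOX[m ^ ka]] for m in range(256)]      # round 1, output encoded
    t2 = [SBOX[f_inv[x] ^ kb] for x in range(256)]  # decodes, then round 2
    return t1, t2


K1, K2 = [0x13, 0xC4], [0x7E, 0x29]  # illustrative round keys
LANES = [compile_lane(a, c) for a, c in zip(K1, K2)]


def whitebox_encrypt(block):
    """Run the table network; no key material is referenced here."""
    return [t2[t1[b]] for b, (t1, t2) in zip(block, LANES)]


block = [0x00, 0xFF]
print(whitebox_encrypt(block) == reference_encrypt(K1, K2, block))
```

Because f cancels against its inverse, the table network returns the same ciphertext as the reference cipher, while an observer reading the intermediate value sees only f applied to the real state.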

Implementation of white-box encryption of a key can be done using a compiler. There are many available so the actual methodology varies among them, but the general principle is the same. Let’s take a look at an example.

Assume a sound compiling function, C_E, that takes a key k ∈ K (the key space) and randomness r ∈ R, drawn from a random space, and creates a compiled program [E_k^r] (or [E_k], if the random r is implicit or nonexistent). Thus, [E_k^r] = C_E(k, r). So for any input m ∈ M, [E_k^r] will always return the correct encryption c = E(k, m). Because this assumes a sound compiler and deterministic encryption (encryption and decryption are inverse functions), such compilers can be used for decryption as well.
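The compiler interface can be sketched as follows. This is a toy under stated assumptions: the one-round 8-bit cipher E, its inverse D, and the function names compile_program and run are hypothetical, and the table split shown here illustrates soundness only, not security. The randomness r changes how the tables look, but never what the compiled program computes.

```python
import random


def permutation(rng):
    """Return a random bijection on 0..255 and its inverse."""
    p = list(range(256))
    rng.shuffle(p)
    inv = [0] * 256
    for i, v in enumerate(p):
        inv[v] = i
    return p, inv


SBOX, INV_SBOX = permutation(random.Random(0))  # hypothetical public S-box


def E(k, m):
    """Toy deterministic encryption of one byte."""
    return SBOX[m ^ k]


def D(k, c):
    """Inverse of E for the same key."""
    return INV_SBOX[c] ^ k


def compile_program(cipher, k, r):
    """Toy compiler C(k, r): bake k into tables shaped by the randomness r."""
    f, f_inv = permutation(random.Random(r))
    t1 = f                                          # key-independent scrambling
    t2 = [cipher(k, f_inv[x]) for x in range(256)]  # key baked in
    return t1, t2


def run(program, m):
    """Execute a compiled program on input m using table lookups only."""
    t1, t2 = program
    return t2[t1[m]]


k = 0x6D  # illustrative key
enc_a = compile_program(E, k, r=1)
enc_b = compile_program(E, k, r=2)
dec = compile_program(D, k, r=3)

sound = all(run(enc_a, m) == E(k, m) == run(enc_b, m) for m in range(256))
round_trip = all(run(dec, run(enc_a, m)) == m for m in range(256))
print(sound, round_trip, enc_a != enc_b)
```

Two compilations with different r produce different tables yet agree on every input, and the same compiler applied to D yields a working decryption program.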

If additional security is desired, such as key-independent functions, the encryption function E_K and its inverse E_K^−1 (the decryption function) can be replaced with the composition E’_K = G ◦ E_K ◦ F^−1 (respectively, E’_K^−1 = F ◦ E_K^−1 ◦ G^−1), restricted as follows.

Input encoding function F and output decoding function G^−1 (respectively, G and F^−1) should not be made available on the platform that computes E’_K (respectively, E’_K^−1), so that the white-box implementation cannot be used to compute the standard function E_K (respectively, E_K^−1). Although the resulting implementation is not standard, such an approach is reasonable for many digital rights management applications and adds some additional security. [See reference 1]
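A small Python sketch shows the effect of these external encodings. It assumes a hypothetical one-round 8-bit cipher and randomly chosen bijections F and G. The names trusted_encode and trusted_decode stand for whatever off-device party holds F and G^−1, which is an assumption of this sketch. The device receives only the table for G ◦ E_K ◦ F^−1.

```python
import random


def permutation(rng):
    """Return a random bijection on 0..255 and its inverse."""
    p = list(range(256))
    rng.shuffle(p)
    inv = [0] * 256
    for i, v in enumerate(p):
        inv[v] = i
    return p, inv


rng = random.Random(4)          # fixed seed so the sketch is repeatable
SBOX, _ = permutation(rng)      # hypothetical public S-box
F, F_INV = permutation(rng)     # secret input encoding
G, G_INV = permutation(rng)     # secret output encoding
K = 0x91                        # illustrative key


def E(m):
    """Standard (unencoded) toy encryption under the fixed key K."""
    return SBOX[m ^ K]


# Shipped to the device: E' = G o E o F^-1, as a single table.
E_PRIME = [G[E(F_INV[x])] for x in range(256)]


def trusted_encode(m):
    """Held off-device: applies F before the device sees the input."""
    return F[m]


def trusted_decode(y):
    """Held off-device: applies G^-1 to the device's output."""
    return G_INV[y]


standard = [E(m) for m in range(256)]
consistent = all(
    trusted_decode(E_PRIME[trusted_encode(m)]) == E(m) for m in range(256)
)
print(consistent, E_PRIME != standard)
```

With F and G^−1 the composition reproduces E exactly. Without them, the device's table is a different function from E, so copying it does not give an attacker a standard encryptor.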

How white boxes are hacked
Like any other cryptography methodology, white-box cryptography is subject to hacking. Some say it is easier to hack than black-box cryptography, but proponents of white-box cryptography say that with sufficient resources it can be as strong. It’s difficult to compare them because white-box cryptography uses a different model.

Below are some of the methods used to compromise white-box cryptography. All of them except code lifting focus on key recovery.

Code lifting. The most common weakness that white-box cryptography reveals is code lifting (also called cloning). “If the entire code can be acquired, say from that set-top box, and executed on a laptop, for example, there is no need to try and pull any keys,” says Yanamadala. “It simply is executed on the device that it is cloned to.” This has been called the man-at-the-end attack because the compromise comes at the end of the stream, rather than at some point along the stream (man-in-the-middle attack).

Code lifting does not try to extract the key itself. It addresses the entire application as if it were just one big key. The security library’s decryption functionality is called, directly, to exploit its functionality.
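A toy Python sketch of code lifting follows. Everything in it is hypothetical: the one-round 8-bit cipher, the provision_device helper, and the key value. The point is only that the attacker copies the decryption table as an opaque blob and uses it elsewhere without ever learning the key.

```python
import random


def provision_device(key, seed=0):
    """Hypothetical vendor step: bake the key into 256-byte lookup tables."""
    rng = random.Random(seed)
    sbox = list(range(256))
    rng.shuffle(sbox)
    inv = [0] * 256
    for i, v in enumerate(sbox):
        inv[v] = i
    encrypt_table = bytes(sbox[m ^ key] for m in range(256))
    decrypt_table = bytes(inv[c] ^ key for c in range(256))
    return encrypt_table, decrypt_table


# The content provider encrypts; the set-top box holds only the decrypt table.
provider_table, device_table = provision_device(key=0x42)
ciphertext = bytes(provider_table[b] for b in b"paid content")

# The attacker copies the table off the device and runs it on a laptop.
lifted_table = bytes(device_table)
recovered = bytes(lifted_table[c] for c in ciphertext)
print(recovered)
```

This is why white-box implementations are usually paired with measures that bind the code to a device, such as the external encodings described earlier.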

Control flow manipulation attacks. As white-box cryptography implementations are executed, it is possible to manipulate the control flow as it runs. Doing this can provide clues about the key. At the end of the execution, operations can be skipped, and noting how this affects the output of the data can be used to hack a system. This is similar to fault injection attacks on cryptographic algorithms in hardware.

Data manipulation attacks. There is some evidence that data manipulation at intermediate stages, within the algorithm, could produce some information about the key. The attacker tries to modify some of the intermediate data near the end of the algorithm. This is usually done just before the last round starts. The data is collected and compared to the various results. Some results will be correct, others incorrect, so analysis of them can reveal some information related to the key. This is also similar to fault injection methods in hardware.
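The following Python sketch illustrates the idea on a toy final round. All of it is assumed for illustration: the S-box, the two-byte round structure, and the key bytes. The fault is modeled as an unknown XOR difference in one state byte, which is an assumption about how a corrupted encoded value would decode. The attacker sees only correct and faulty outputs, then keeps the key guesses for which both output bytes imply the same fault.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable
SBOX = list(range(256))
rng.shuffle(SBOX)
INV = [0] * 256
for i, v in enumerate(SBOX):
    INV[v] = i

K0, K1 = 0x3C, 0xA5  # illustrative last-round key bytes, hidden from the attacker


def last_round(a, b, fault=0):
    """Toy final round; `fault` models an unknown corruption of state byte b."""
    b ^= fault
    return SBOX[a ^ b] ^ K0, SBOX[b] ^ K1


def surviving_keys(pairs):
    """Keep key guesses for which both output bytes imply the same fault."""
    keep = []
    for k0 in range(256):
        for k1 in range(256):
            if all(
                (INV[c0 ^ k0] ^ INV[f0 ^ k0]) == (INV[c1 ^ k1] ^ INV[f1 ^ k1])
                for (c0, c1), (f0, f1) in pairs
            ):
                keep.append((k0, k1))
    return keep


# Collect (correct output, faulty output) pairs for a few random states.
pairs = []
for _ in range(6):
    a, b = rng.randrange(256), rng.randrange(256)
    e = rng.randrange(1, 256)  # fault difference, unknown to the attacker
    pairs.append((last_round(a, b), last_round(a, b, fault=e)))

candidates = surviving_keys(pairs)
print((K0, K1) in candidates, len(candidates))
```

The true key always survives the filter because the same fault feeds both S-boxes. Each additional pair of correct and faulty outputs removes most of the remaining wrong guesses.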

Statistical analysis attacks. Statistical attacks also are based on observing intermediate data. As the white-box cryptography implementation is executed, running a statistical analysis on it may allow the attacker to retrieve secret key information. For this, an attacker typically observes the data during the first or last rounds of the cryptographic algorithm. This data is collected, along with results from unprotected data, and the relationship between the two is analyzed as the algorithm computes. This is similar to the side-channel type of attacks against hardware cryptographic algorithms.
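A much-simplified Python sketch of this kind of analysis is shown below. It assumes a hypothetical first round in which the attacker can read the low nibble of the S-box output after a secret 4-bit encoding. The S-box, key, and encoding are invented for the example. For each key guess, the attacker predicts the unencoded nibble and scores how consistently that prediction explains the observed values.

```python
import random
from collections import Counter, defaultdict

rng = random.Random(11)  # fixed seed so the sketch is repeatable
SBOX = list(range(256))
rng.shuffle(SBOX)

SECRET_KEY = 0xB7            # illustrative key, unknown to the attacker
NIBBLE_ENC = list(range(16))
rng.shuffle(NIBBLE_ENC)      # secret 4-bit output encoding


def observed_value(m):
    """What the attacker reads from memory for plaintext byte m."""
    return NIBBLE_ENC[SBOX[m ^ SECRET_KEY] & 0x0F]


# One observation per chosen plaintext byte.
traces = [(m, observed_value(m)) for m in range(256)]


def score(key_guess):
    """Count observations explained by the best predicted->observed mapping."""
    buckets = defaultdict(Counter)
    for m, obs in traces:
        predicted = SBOX[m ^ key_guess] & 0x0F
        buckets[predicted][obs] += 1
    return sum(c.most_common(1)[0][1] for c in buckets.values())


best_guess = max(range(256), key=score)
print(hex(best_guess), score(best_guess))
```

For the correct key, every predicted nibble maps to exactly one observed value, so all 256 observations agree. Wrong guesses scatter the observations across buckets and score far lower. Small encodings of this kind hide the values themselves but not their statistical relationship to the key.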

White-box cryptography and the IoE
The IoE is going to have massive amounts of data flowing from local networks that contain thermostats and pacemakers to global devices such as trains and jets, to infrastructures and more. Security requirements will vary widely.

However, many if not most of the IoE devices will be of low complexity. A thermostat, for example, will have the same control in it that it has now, except with the additional network interface and control and reporting functionality. From an economic perspective, thermostats are a mature consumer product (even the programmable ones are mature), so adding an expensive cryptography chip isn’t going to be received with open arms by their manufacturers.

Obfuscating the code sent in over-the-air software, or doing something similar for the code in the thermostat, is a very good way to make an attack too expensive and time-consuming for a hacker to justify. In this case, it is just a matter of some additional programming rather than adding a chip, tying it in with both software and hardware, and reflowing the board. If the security risk is acceptable, then white-box cryptography is certainly an option.

The main concerns with white-box cryptography are its performance and size penalties, and its security. The performance issues limit the applicability of white-box cryptography for high-throughput or constrained use cases such as mobile systems. This can be addressed using special-purpose white-box implementations that can exploit hardware accelerators.

But that becomes expensive. The real advantage to white-box cryptography is that it can be done inexpensively. Its real benefit is where lightweight security suffices, which mostly means the low-value targets that will exist in the IoE, or situations where security at the end points cannot be guaranteed, such as multimedia. In cases where security protects things such as personal data, financial information, or mission-critical data, it is best to use the hardware approach.

In the end, white-box cryptography is a valuable tool for certain cases. And as with other technologies, it will continue to evolve.

Reference 1: Marc Joye. On White-Box Cryptography. Thomson R&D France Technology Group, Corporate Research, Security Laboratory, 1 Avenue de Belle Fontaine, 35576 Cesson-Sévigné Cedex, France.
