Data Center Security Issues Widen

The number and breadth of hardware targets are increasing, but older attack vectors are not going away. Hackers are becoming more sophisticated, and they have a big advantage.


The total amount of data worldwide is expected to swell to about 200 zettabytes next year, much of it stored in massive data centers scattered across the globe that are increasingly vulnerable to attacks of all sorts.

The stakes for securing data have been rising steadily as the value of that data increases, making it far more attractive to hackers. This is evident in the scope of the attack targets — today that can include software, hardware, and the networks and interconnects through which data moves — as well as the rising security budgets for governments and companies. But as the number of attacks and the sophistication of the attackers continues to increase, along with the growing complexity of devices used to process and store all of this data, the need to secure systems has taken on a new sense of urgency.

A recent report by Stuart Madnick, professor of information technology at MIT, found that between 2022 and 2023 more than 2.6 billion personal records were breached, up from 1.1 billion in 2021 and 1.5 billion in 2022, and that 80% of data breaches involved data stored in the cloud. Moreover, Madnick found that 98% of organizations have a relationship with a vendor that was breached in the last two years. On U.S. government networks, there were 14 million known exploited vulnerabilities remediated and 900 million malicious DNS requests blocked, according to the Cybersecurity & Infrastructure Security Agency.

Put simply, the number of new attack vectors is growing, but the old ones aren’t going away. This has led to significant efforts to beef up security on all fronts.

“Data center chips are, just by the nature of the location and the use model, generally quite physically secure,” said Lee Harrison, director of automotive IC solutions at Siemens EDA. “They’re typically locked up in a data center with quite strict access, compared to some other areas. If you take into account IoT or automotive, where people can physically get their hands on the chips and physically tamper with them, data center devices are actually quite secure from a physical perspective.”

It’s not just the CPUs, GPUs, and NPUs, or the memories connected to them, that are undergoing changes. One large processor vendor, for example, has been ratcheting up security for power management chips that communicate directly with various processing elements.

“There is communication between the power management and the GPU, or the main CPU, for how much current or power is demanded, and what the voltage variation should be,” said Davood Yazdani, senior director of product marketing at Infineon. “That sets the voltage and current and temperature. If you get control of that communication, you can easily drop the voltage and cause a shutdown, or you could damage the power management or even the main CPU. Especially in the server applications, we are seeing requests for certain security features to be implemented in the power management devices, or any CPU or GPU that has a digital interface, because that’s an area that might get compromised.”
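One simple defensive idea along these lines is to sanity-check any set-point that arrives over that digital interface before it is applied, so a compromised link cannot push a rail outside its safe operating envelope. The C sketch below illustrates the principle only; the rail names, limits, and slew check are hypothetical, and real power-management firmware would layer authentication and telemetry on top of such bounds checks.

```c
/* Illustrative sketch only -- hypothetical rail names and limits, not any
 * vendor's firmware. Rejects voltage requests that fall outside a safe
 * envelope or that jump too far in a single step. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define VCORE_MIN_MV 650    /* hypothetical lower bound for the core rail */
#define VCORE_MAX_MV 1100   /* hypothetical upper bound for the core rail */
#define MAX_STEP_MV  50     /* hypothetical per-command slew limit */

static bool apply_vcore_request(uint32_t requested_mv, uint32_t current_mv)
{
    /* Reject anything outside the rail's safe envelope. */
    if (requested_mv < VCORE_MIN_MV || requested_mv > VCORE_MAX_MV)
        return false;

    /* Reject abrupt steps that could brown out or stress the CPU,
     * even if the absolute value is in range. */
    uint32_t step = (requested_mv > current_mv) ? requested_mv - current_mv
                                                : current_mv - requested_mv;
    if (step > MAX_STEP_MV)
        return false;

    printf("applying %u mV\n", (unsigned)requested_mv);  /* stand-in for the real register write */
    return true;
}

int main(void)
{
    printf("%d\n", apply_vcore_request(900, 880));  /* accepted */
    printf("%d\n", apply_vcore_request(400, 880));  /* rejected: below the envelope */
    return 0;
}
```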

All of this adds to the cost and the complexity of systems. But it doesn’t guarantee data will remain secure. With enough attempts and planning, even old-school approaches sometimes work. Consider the SolarWinds attack in 2020, for example, in which a compromised software update opened a back door into the networks of some 18,000 customers, including many U.S. government agencies. That attack served as a reminder that all avenues of attack need to be considered.

Of particular concern are attacks that take advantage of increasing digitalization and the flood of data being processed and stored. That has driven rising complexity in both hardware and software, in addition to many more interactions between workers (particularly with hybrid office/home work schedules). Coupled with higher-value data being collected, processed, and stored online, there is a growing incentive for hackers of all types to begin probing for weaknesses at every level. That includes stealing data protected by state-of-the-art encryption, which cannot be broken today, but which very likely will be at some future date, as computers become more powerful and hackers collaborate more closely.

“More and more parties started to use data centers to store their data because they wanted to have fast team connections — fast connections from home to SharePoint,” said Gijs Willemse, senior director of product management for Rambus Security. “Companies don’t maintain local networks or storage anymore. They really moved to data centers, and with that, the most valuable data moved to data centers. That’s the reason why we saw the bar being raised, and we’re at a point where governments also start to protect and store data in those gigantic centers, just because it’s more efficient.”

Much of that data is still connected, however, and in many cases what it’s connected to — operational technology and industrial control systems, for example — is not state-of-the-art. “They were designed for zero access, but it’s like they were designed in a bubble because they didn’t need internet access or the ability to push logs out,” said Jim Montgomery, principal solutions architect at txOne Networks. “They weren’t moving logs to the cloud or allowing remote access. But over the last few years, they’ve had to adjust everything to accommodate for situations like COVID, where they weren’t allowing anybody into the environments. That accommodation typically added some sort of remote access, and they basically created their own monster. You can combat those things with segmentation and IDS (intrusion detection systems) or IPS (intrusion prevention systems), but once you create that access into the environment, you’re creating an avenue for attack.”

Jason Oberg, founder and CTO of Cycuity, noted that security in general is often an afterthought, one made even less of a priority thanks to the fragmented nature of the component market. “Whether it’s a chiplet architecture or you’re licensing IP and building a chip, because everything is so fragmented, there are so many different stakeholders,” he said. “At different stages of that process people say, ‘Here are the assumptions I made on security. Here’s what we validated. Here’s what we expect you to do, Mr. consumer of my chiplet, or IP, or chip, or whatever it may be.’ That provides the transparency so people know what to focus on. Unfortunately, right now a lot of the industry is like, ‘Hey, here’s a chip, here’s the user manual,’ or ‘Here’s an IP, here’s the user guide. Go build your system off of it.’ There’s not a lot of good clarity in terms of what’s been done for security.”

What’s changed?
In the past, many attacks focused on software because it was remotely accessible. But using software as an entry point to the hardware allows hackers to gain control of the entire system, and potentially systems of systems, which in turn makes it more difficult for companies to regain control of their servers. A good example of this is the rowhammer attack, in which rows of DRAM cells are activated repeatedly and rapidly enough to disturb adjacent rows and flip bits in them, providing an opening for a hacker to alter memory contents and change access privileges. Increased density, which is needed to store more data, has been essential for improving performance, but it also has exacerbated this weakness.

“Channel leakage is one area of concern, but rowhammer attacks are equally concerning,” said Lang Lin, principal product manager at Ansys. “They’re getting the benefit of coupling between a row and its neighbor. It’s leaking out the data from the neighbor. Imagine the neighbor stores some data in one row. When you stress this row, you probably can see some pattern versus when the stored data is zero. When stressing the system, there’s some pattern you can monitor from the response of the system. That’s one of the ways to attack the data.”
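For readers unfamiliar with the mechanics, the sketch below shows the kind of access pattern a rowhammer-style attack relies on. It is an illustration only: it assumes an x86 machine, it picks two buffer offsets arbitrarily rather than reverse-engineering the DRAM address mapping as a real attack would, and whether any bits actually flip depends entirely on the memory device.

```c
/* Illustration of a rowhammer-style access pattern, not a working exploit.
 * Two addresses assumed to sit in different rows of the same DRAM bank are
 * read repeatedly, with cache-line flushes so each access reaches DRAM. */
#include <stdint.h>
#include <stdlib.h>
#include <x86intrin.h>   /* _mm_clflush (x86-only) */

static void hammer(volatile uint8_t *row_a, volatile uint8_t *row_b, long iterations)
{
    for (long i = 0; i < iterations; i++) {
        (void)*row_a;                       /* activate row A */
        (void)*row_b;                       /* activate row B */
        _mm_clflush((const void *)row_a);   /* force the next read out to DRAM */
        _mm_clflush((const void *)row_b);
    }
}

int main(void)
{
    /* Hypothetical: two offsets in a large buffer standing in for addresses
     * that map to neighboring rows. Real attacks must find such pairs by
     * reverse-engineering the physical address mapping. */
    uint8_t *buf = malloc(1 << 22);
    if (buf == NULL)
        return 1;
    hammer(buf, buf + (1 << 20), 1000000);
    free(buf);
    return 0;
}
```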

Decomposing SoCs into chiplets adds multiple attack surfaces. “Data center processors and SoCs are being built as disaggregated designs with chiplets,” said Arif Khan, senior product marketing group director for design IP at Cadence. “The security implications of chiplet-based manufacturing with provenance need to be factored in. However, this is a one-time scenario as opposed to a dynamic run-time situation in data centers.”

This is just one more concern to add to an ever-expanding list. “Protection for stored data, and data being transmitted, has been covered by various security mechanisms since the early days of networking and internet development,” said Khan. “The rise of cloud computing and data centers now brings the need for ‘confidential computing’ to protect data as it is being processed. Confidential computing relies on trusted execution environments (TEEs). The TEE is a secure computing area that can execute code with a higher security level and access sensitive data.”

In general, data in motion is almost always more vulnerable than data at rest. This is particularly true for data being transferred to a CPU, a GPU, or back to the memory array, and at the point of encryption/decryption. The good news is that, while this all happens in very short time windows, unusual activity can raise an alert based on a predictable and measurable shift in the power profile.

“The power consumption of the system is dramatically different,” said Lin. “The delay of the operation is different. It is all data-dependent. And these things can be actually measured in real-time by the aggressor. Let’s say I’m just dialing into the data center. I send a message with all zeros, measure the delay. Then I send all ones and measure the delay. The delay is different. Imagine that all hackers have a way to actually characterize this so-called spy channel, meaning that you’re not directly probing the data for ones or zeros, but you’re looking at the physical behaviors associated with this. This is a side-channel attack.”
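Lin’s zeros-versus-ones example is easiest to see in software. The hedged sketch below contrasts a comparison routine whose execution time depends on the data with a constant-time version; neither is taken from any particular product, but the pattern is the same one timing side-channel attackers exploit and defenders try to remove.

```c
/* Illustration of data-dependent timing, the raw material of a timing
 * side channel. The early-exit compare returns as soon as a byte differs,
 * so its run time reveals how many leading bytes of a guess are correct.
 * The constant-time version touches every byte regardless of the data. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Leaky: execution time depends on where the first mismatch occurs. */
static int compare_leaky(const uint8_t *secret, const uint8_t *guess, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (secret[i] != guess[i])
            return 0;
    }
    return 1;
}

/* Constant-time: accumulates differences over the full length, so timing
 * no longer depends on the data being compared. */
static int compare_constant_time(const uint8_t *secret, const uint8_t *guess, size_t len)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= (uint8_t)(secret[i] ^ guess[i]);
    return diff == 0;
}

int main(void)
{
    const uint8_t secret[4] = {1, 2, 3, 4};
    const uint8_t guess[4]  = {1, 2, 0, 0};
    printf("%d %d\n", compare_leaky(secret, guess, 4),
           compare_constant_time(secret, guess, 4));
    return 0;
}
```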

Identifying these patterns isn’t free. It requires additional resources, such as dummy circuits for obfuscation, as well as the power needed to identify any unusual activity. That, in turn, adds to the initial price tag, the complexity of the initial design, and the overall power budget.

“There are costs to secure a system,” Lin said. “In order to make my motion not leak the data, you probably have got to add some noise. You’ve got to add a lot of circuits around it so that some dummy circuit is always operating. Then, the way you measure the difference between sending different data is overwhelmed by the noise. This is one way to mask the leakage.”

It’s not the only way, however. “I don’t know a lot of companies who are selling a lot of parts in this space that are really doing a whole lot of that,” said Mike Borza, principal security technologist at Synopsys. “It’s a little bit like the honeypot approach to internet firewalls, just putting an open or seemingly open server out on the internet and seeing what it attracts. You see if you can get some people to show up at your doorstep and start trying to take it over. Most chips that are specified have a difficult time shoe-horning the functionality they do need in the chip without going to the trouble of supplying a lot of extra functionality.”

Dummy circuits are not the only line of defense. Lin pointed to various other approaches that can be implemented at the design level in order to prevent outside forces from reading a chip’s activity.

“Most of the [defense tactics] are at the gate level, then patched up to the system level, where you might have, let’s say, one chip with countermeasures at the logical level while the others are not. Would that actually compromise the whole system’s security? People would like to know that, and there can also be system-level countermeasures for, say, side-channel leakage,” he said. “It could be electromagnetic leakage from the server chip, so maybe they will put an EM shield on top of this chip so nothing can be measured. It is similar for power. You probably can put in some sensor that says, ‘If I see the data showing some pattern, let’s do something to stop it.’ That’s at the system level. You don’t do anything at the gate level. You’re only monitoring the whole system. You can do that at different levels.”

One key for maximizing security is the idea of separation. With many different clients having access to a data center, keeping their information from getting crossed over is of paramount importance.

“If you run an application for Customer One, you don’t trust the other application that you’re running for Customer Two,” said Rambus’ Willemse. “You need to separate those applications, separate the data. In today’s environments, that’s the base requirement that has been pushed by the users of data centers, and that drives the security of the SoC. That starts with the root of trust, which is the very first chip identity within the chip that needs to be protected from the actual applications running on the chip. But also, the keys that are derived from that are the keys that are generated during the lifetime of the chip or during the real-time processing.”

Others agree. “It’s critically important that chip vendors, software platforms, OEMs, and CSPs can deploy and access standardized Root of Trust services,” said Chowdary Yanamadala, technology strategist at Arm. “Security is complicated. You need the whole value chain to work together.”
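The root-of-trust principle Willemse and Yanamadala describe — keys that applications actually use are derived from a root identity that is itself never exposed — can be sketched in software. The example below is a minimal illustration assuming OpenSSL is available, using its one-shot HMAC() as a simple key-derivation step; in a real device this derivation happens inside an isolated hardware block, and the root key is never visible to application code the way it is here.

```c
/* Minimal sketch of deriving a per-session key from a device root key,
 * assuming OpenSSL (link with -lcrypto). Illustrative only: a real root
 * of trust keeps the root key inside dedicated hardware and typically
 * follows a standardized KDF rather than this single HMAC step. */
#include <stdio.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>

int main(void)
{
    /* Hypothetical device-unique root key, normally fused or generated
     * on-chip and never readable by software. Zeroed here as a placeholder. */
    const unsigned char root_key[32] = {0};

    /* Per-session context: a label plus a fresh nonce agreed with the peer
     * (both values are made up for this example). */
    const unsigned char context[] = "session-key-v1:nonce=d4c3b2a1";

    unsigned char session_key[EVP_MAX_MD_SIZE];
    unsigned int session_key_len = 0;

    /* HMAC-SHA256(root_key, context) as a one-block key-derivation step. */
    if (HMAC(EVP_sha256(), root_key, sizeof(root_key),
             context, sizeof(context) - 1,
             session_key, &session_key_len) == NULL)
        return 1;

    for (unsigned int i = 0; i < session_key_len; i++)
        printf("%02x", session_key[i]);
    printf("\n");
    return 0;
}
```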

Security experts have been touting the need for hardware security for nearly two decades, but until recently it was considered an add-on for many designs. Concerns finally reached a critical stage at the end of the 2010s, as more electronics were added into vehicles and medical devices, and more companies began using cloud-based services. This was accompanied by increased penalties for breaches in some regions, along with stringent reporting requirements and fines for late disclosure.

“Five years ago, Arm noticed an opportunity to proactively improve the quality of chip security,” Yanamadala said. “IoT was in its early stages, and each chip vendor had varied and fragmented approaches to security. They also rarely approached an independent evaluation lab to check the robustness of their security implementation. With devices increasing their connectivity and data becoming more valuable, hackers were paying close attention, and governments were considering what action to take to protect consumers. That’s why in 2019 we launched PSA Certified, to rally the ecosystem to be proactive with security best practices and democratize device security through independent security evaluation.”

Stacks and stacks of vulnerabilities
This is harder than it sounds. Every level of the networking protocol stack, every interconnect, and every access point is a potential vulnerability. Assuming all of that can be bundled up tightly enough, the first step is making sure that the original key, and the keys that are added for sessions, are generated and managed by an independent entity on the chip, and that they are not impacted by any of the other processes running on the chip. This effectively is an in-system air gap, and it’s a necessary starting point for any security.


Fig. 1: NIST’s approach to security. Source: NIST

“The second part is an adversary that is coming from the outside and tries to come in, or tries to monitor interfaces, or basically anything that’s not related to the chip itself,” Willemse explained. “And that comes with additional requirements for secure communication. Originally, that was between data centers. There was communication security from rack to rack. Nowadays, securing communication chip-to-chip must be done as well, so even PCI interfaces are protected with security so that data transferred over these interfaces is also secure.”

Willemse noted that Rambus has several layers of security it can physically build into a chip in order to guarantee as much separation as possible. “That’s a dedicated hardware module, which is separated from the main CPU and has its own sub-system that cannot be accessed directly. It runs its own proprietary firmware. Users can develop their own firmware inside, but it’s not exposed to the system. That’s the core of the security of the system. If you talk about physical protection, on top of the fact that there’s a dedicated module on the chip, we also add protection against side-channel attacks.”

Siemens’ Harrison said some of the components added to chips specifically for security have become so efficient at monitoring that they have little impact on performance.

“What we have essentially is a collection of core functional monitors,” he said. “These are small pieces of IP that typically are inserted in the design during the overall initial design phase. They’re relatively unobtrusive. They don’t impact performance, because they’re passive monitors that are hanging off functional elements in the design. They may take up a little bit of area, and a little bit of power, so if you’re looking for tradeoffs, they’re in the areas where if you were to put thousands of these monitors on the chip, you’re going to start to see a significant impact on area and power. But typically what our users are doing is being very targeted on where they put these monitors to make sure that they catch all the critical areas in the design.”

Security is not static
As technology evolves, so do the threats. Currently, people working in data center security have an eye on artificial intelligence and the ways it may be used to crack into the facilities. But even more potent threats are on the horizon.

“If AI is used on a large scale to attack a system, it will probably attack those security mechanisms that are generally designed by the chip makers themselves, or bought from companies like us, because there’s no specification for how to do the security. There is no specification for how you protect your windows or your doors,” said Willemse. “There is a lock with a key, but if you can just push the door open with the lock still in place, it’s not secure. There are certain zero-day vulnerabilities that were discovered a couple of years ago, such as Heartbleed. Those could be revealed more easily with AI, because you can throw a lot of data and attack scenarios at a device at the same time.”

But for all the power that an AI-driven attack can have behind it, that could pale as the full potential of quantum computing is realized. Developers already are taking this into consideration, trying to build security into products that could still be in use by the time quantum-powered attacks are feasible.

“If you don’t protect your chip now, and it’s still in use five, six, or seven years down the line, all of a sudden [quantum computing] is there. You need to be sure that you’re ready for it today,” said Willemse. “Another concern is that quantum computing is expected to break key negotiation. So if you capture data today, including the key negotiation, a quantum computer will be able to retrospectively extract the key and decrypt the data. Data that is communicated now, for which the key is exchanged with current protocols, can retrospectively be decrypted unless there is quantum-safe protection.”

With all these concerns, the initial conception of a chip has become ever more important. Security must be factored in from the ground up, and EDA is a vital part of policing and monitoring data.

“Chip companies have asked us to do something at the design stage, so they don’t have to wait for bad silicon to find the loopholes,” said Ansys’ Lin. “We need to fix the security as soon as possible. It’s a big trend, and the semiconductor chip companies have started to ask us about solutions. Many of them have approached us saying they need something to help their engineers, so the EDA industry is in an important position. We probably need five years to have mature technology to do the simulation for security. Another challenge is that the security problem is so broad. It’s not like power, which is measured in watts. In security, there is supply chain security, side-channel leakage, fault injection, and counterfeiting, among other things, and in order to have EDA tools that handle all of these different aspects, I think it will take more than five years.”

Conclusion
Security is an ever-evolving concern in chip design, manufacturing, and implementation, particularly as data center use continues to expand. Some very solid measures already are in place and are constantly being advanced, but the AI boom, while a huge opportunity for the industry, is also a danger to those seeking to keep gigantic volumes of sensitive data tucked safely away. Quantum computing will push this war of attrition between security experts and attackers to another level of complexity.

Attackers will continue to grow in sophistication. Breaches will occur. But as Siemens’ Harrison put it, “It’s not just about staying one step ahead of the attackers, but also about what you can learn from them. The challenge is not about trying to stop every attack, but about making attacks harder to carry out. There’s always the next rotation, and at some point an attacker will be able to breach the system. That’s where a number of users are employing monitoring technology, and talking to us about the fact that it’s not so much the protection, but if there is an attack, whether we can provide all of the forensic data to analyze what led up to it.”

Further Reading
Security Becoming Core Part Of Chip Design — Finally
Dealing with cyberthreats is becoming an integral part of chip and system design, and far more expensive and complex.
Increased Automotive Data Use Raises Privacy, Security Concerns
More valuable data is being generated by vehicles and collected by a variety of companies. Who actually owns that data isn’t clear.


