System Bits: Nov. 1

Avoiding cloud malice; quantum-processing platform; machine-learning training.

There is a lurking malice in cloud hosting services
A team of researchers from the Georgia Institute of Technology, Indiana University Bloomington, and the University of California Santa Barbara has found, as part of a study of 20 major cloud hosting services, that as many as 10 percent of the repositories those services host had been compromised, with several hundred of the ‘buckets’ actively providing malware.

This map shows locations where the impacts of bad repositories (Bars) occur. (Source: Xiaojing Liao, Georgia Tech)

However, the team pointed out that such bad content could be challenging to find because it can be rapidly assembled from stored components that individually may not appear to be malicious.

To identify the bad content, the researchers created a scanning tool called BarFinder that looks for features unique to the bad repositories, called ‘Bars,’ including certain types of redirection schemes and gatekeeper elements designed to protect the malware from scanners.

Raheem Beyah, a professor in Georgia Tech’s School of Electrical and Computer Engineering, said, “Bad actors have migrated to the cloud along with everybody else. The bad guys are using the cloud to deliver malware and other nefarious things while remaining undetected. The resources they use are compromised in a variety of ways, from traditional exploits to simply taking advantage of poor configurations.”

Beyah and graduate student Xiaojing Liao found that the bad actors could hide their activities by keeping components of their malware in separate repositories that by themselves didn’t trigger traditional scanners. Only when they were needed to launch an attack were the different parts of this malware assembled.

Interestingly, some exploits appear to be benign until they are assembled in a certain way, they pointed out. So when the components are scanned piecemeal, only part of the malware is seen, and the part seen may not be malicious. Malicious actors take advantage of how difficult it can be to scan so much storage, while cloud hosting services may not have the resources to do the deep scans that may be necessary to find the Bars, and their monitoring of repositories may be limited by service-level agreements.
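
The piecemeal-scanning problem can be illustrated with a toy sketch. This is not how BarFinder or any real scanner works; the signature string, the fragments, and the scan function are all hypothetical, chosen only to show why fragments that look harmless on their own can match once they are joined.

```python
# Toy illustration only. SIGNATURE and the fragment contents are made up
# for this example and do not come from the study.
SIGNATURE = "eval(unescape("  # hypothetical pattern a scanner looks for


def scan(text: str) -> bool:
    """Return True if the hypothetical signature appears in the text."""
    return SIGNATURE in text


# Hypothetical pieces of one script, stored in three separate repositories.
fragments = ["eval(un", "escape(", "payload))"]

# Scanning each repository on its own: no single fragment matches.
piecemeal = [scan(fragment) for fragment in fragments]

# Scanning the content as it would look when assembled for an attack.
assembled = scan("".join(fragments))

print("piecemeal results:", piecemeal)
print("assembled result:", assembled)
```

In this sketch every per-fragment scan comes back clean, while the scan of the joined text does not, which is the gap the researchers describe.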

Overall, the researchers said they scanned more than 140,000 sites hosted on 20 cloud hosting services and found about 700 active repositories for malicious content. In total, about 10 percent of the cloud repositories the team studied had been compromised in some way. The researchers notified the cloud hosting companies of their findings before publication of the study.

To protect cloud-based repositories from these attacks, the researchers recommend the usual defenses, including patching of systems and proper configuration settings. 

Delicate balance between coherence and control
Building a quantum computer that can make calculations not even imaginable with today’s conventional technology is an arduous effort, according to researchers from the Martinis Group at UC Santa Barbara, Google, and Tulane University in New Orleans. The team, which is exploring the exciting but still somewhat counter-intuitive world of quantum computing, has demonstrated a relatively simple yet complete quantum processing platform by integrating the control of three superconducting qubits.

Members of the John Martinis quantum computing group (l to r) : Charles Neill, Pedram Roushan, Anthony Megrant and John Martinis (Source: UCSB)

The researchers believe they are probing the edge of their capability. There have been quite a few efforts to build and study individual parts of a quantum processor, but this project involves putting them all together in a basic building block that can be fully controlled and potentially scaled up into a functional quantum computer.

But before a fully practicable quantum computer can be made, the researchers have to understand the various, sometimes unpredictable and spontaneous circumstances that arise as they pursue greater control and sophistication of their system.

This means dealing with particles (qubits) that are interacting with one another, and interacting with external fields, which leads to very complicated physics. To help solve this particular many-body problem, the fully controllable quantum processing system had to be built from a single qubit up, in order to give the researchers opportunities to more clearly understand the states, behaviors and interactions that can occur.

By engineering the pulse sequences used to manipulate the spins of the photons in their system, the researchers said they created an artificial magnetic field affecting their closed loop of three qubits, causing the photons to interact strongly with not only each other, but also with the pseudo-magnetic field.

However, with more control comes the potential for more decoherence. The more the researchers strove for greater programmability and the ability to influence and read the qubits, the more open their system was likely to be to error and loss of information.

To combat the potential for error while increasing their level of control, the team had to reconsider both the architecture of their circuit and the material that was being used in it. Instead of their traditional single-level, planar layout, the researchers redesigned the circuit to allow control lines to cross over others via a self-supporting metallic bridge. The dielectric (the insulating material between the conducting control wires) was itself found to be a major source of errors, so a more precisely fabricated and less defective substrate was brought in to minimize the likelihood of decoherence.

The team is working to increase speed, which is essential for the kind of performance they want to see in a fully operational quantum computer. Slow speeds reduce control errors but make the system more vulnerable to coherence limits and defects imposed by the materials. Fast speeds avoid the influence of defects in the material but reduce the amount of control the operators have over the system, they said. With this platform, however, scaling up will be a reality of the not-too-distant future, they said. If they can control the systems very precisely — maybe at the level of 30 qubits or so — they can get to the level of doing computations that no conventional computer can do.

The basis for machine-learning system decisions
According to researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), the best-performing systems in artificial-intelligence research have come courtesy of neural networks, which look for patterns in training data that yield useful predictions or classifications. A neural net can be trained, for example, to recognize certain objects in digital images or to infer the topics of texts. The downside: neural nets are black boxes. After training, a network may be very good at classifying data, but even its creators will have no idea why. With visual data, it’s sometimes possible to automate experiments that determine which visual features a neural net is responding to. But text-processing systems tend to be more opaque. Now, the MIT team has devised a new way to train neural networks so that they provide not only predictions and classifications but also rationales for their decisions.

In any domain where the cost of making the wrong prediction is very high, you need to justify why you did it, they pointed out. As for a broader aspect to this work, Tommi Jaakkola, an MIT professor of electrical engineering and computer science, said, “You may not want to just verify that the model is making the prediction in the right way; you might also want to exert some influence in terms of the types of predictions that it should make. How does a layperson communicate with a complex model that’s trained with algorithms that they know nothing about? They might be able to tell you about the rationale for a particular prediction. In that sense it opens up a different way of communicating with the model.”

Neural networks are so called because they mimic, in approximation, the structure of the brain. They are composed of a large number of processing nodes that, like individual neurons, are capable of only very simple computations but are connected to each other in dense networks, the team explained.

In the deep learning process, training data is fed to a network’s input nodes, which modify it and feed it to other nodes, which modify it and feed it to still other nodes, and so on. The values stored in the network’s output nodes are then correlated with the classification category that the network is trying to learn, such as the objects in an image or the topic of an essay.

Over the course of the network’s training, the operations performed by the individual nodes are continuously modified to yield consistently good results across the whole set of training examples. By the end of the process, the computer scientists who programmed the network often have no idea what the nodes’ settings are. Even if they do, it can be very hard to translate that low-level information back into an intelligible description of the system’s decision-making process.
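
The training process described above can be sketched in a few lines of Python. This is a minimal, generic illustration and not the MIT team’s system: the network size, the starting weights, the learning rate, and the toy task (learning a logical OR) are all assumptions made for the example, and the update rule is ordinary backpropagation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Feed the inputs to the hidden nodes, then feed those to one output node."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def mean_squared_error(data, w_hidden, w_out):
    errors = [(forward(x, w_hidden, w_out)[1] - y) ** 2 for x, y in data]
    return sum(errors) / len(errors)


def train(data, w_hidden, w_out, rate=0.5, epochs=2000):
    """Nudge every weight after each example (plain backpropagation)."""
    for _ in range(epochs):
        for x, y in data:
            hidden, out = forward(x, w_hidden, w_out)
            d_out = (out - y) * out * (1.0 - out)
            for j, h in enumerate(hidden):
                # Compute the hidden node's error before changing w_out[j].
                d_hidden = d_out * w_out[j] * h * (1.0 - h)
                w_out[j] -= rate * d_out * h
                for i, xi in enumerate(x):
                    w_hidden[j][i] -= rate * d_hidden * xi


# Toy task (an assumption for this sketch): output 1 if either input is 1.
# The third input is a constant 1.0 that acts as a bias term.
data = [([0.0, 0.0, 1.0], 0.0), ([0.0, 1.0, 1.0], 1.0),
        ([1.0, 0.0, 1.0], 1.0), ([1.0, 1.0, 1.0], 1.0)]

# Arbitrary starting weights: two hidden nodes, one output node.
w_hidden = [[0.5, -0.4, 0.1], [-0.3, 0.6, -0.2]]
w_out = [0.2, -0.1]

before = mean_squared_error(data, w_hidden, w_out)
train(data, w_hidden, w_out)
after = mean_squared_error(data, w_hidden, w_out)

print("error before training:", before)
print("error after training:", after)
print("learned weights:", w_hidden, w_out)
```

Even in a network this small, the learned weights printed at the end are just a list of numbers; nothing in them explains in plain terms why a given input produces a given output, which is the opacity the researchers are trying to address.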

In this new work, the MIT researchers specifically address neural nets trained on textual data. In unpublished work, they said they’ve applied the approach to thousands of pathology reports on breast biopsies, where it has learned to extract text explaining the bases for the pathologists’ diagnoses. They’re also using it to analyze mammograms, where the system’s first module extracts sections of images rather than segments of text.