Power/Performance Bits: July 18

Ad hoc cache hierarchies; spintronic switch; low-power temperature sensor.


Ad hoc “cache hierarchies”
Researchers at MIT and Carnegie Mellon University designed a system that reallocates cache access on the fly, to create new “cache hierarchies” tailored to the needs of particular programs.

Dubbed Jenga, the system distinguishes between the physical locations of the separate memory banks that make up the shared cache. For each core, Jenga knows how long it would take to retrieve information from any on-chip memory bank.
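To make the idea of a per-core latency map concrete, here is a minimal sketch. It assumes a 6 × 6 mesh with one core and one cache bank per tile, and the cycle counts are invented for illustration; none of the names or numbers come from the researchers.

```python
# Hypothetical model: 36 tiles in a 6x6 mesh, one core and one bank per tile.
MESH_DIM = 6             # assumed layout, for illustration only
BANK_ACCESS_CYCLES = 9   # assumed fixed cost of reading a bank
HOP_CYCLES = 2           # assumed network cost per hop, each way


def tile_xy(tile):
    """Map a tile index (0..35) to (x, y) mesh coordinates."""
    return tile % MESH_DIM, tile // MESH_DIM


def access_latency(core, bank):
    """Estimated round-trip latency, in cycles, from a core to a bank."""
    cx, cy = tile_xy(core)
    bx, by = tile_xy(bank)
    hops = abs(cx - bx) + abs(cy - by)  # Manhattan distance on the mesh
    return BANK_ACCESS_CYCLES + 2 * HOP_CYCLES * hops


def banks_by_latency(core):
    """All banks, ordered from cheapest to most expensive for this core."""
    banks = range(MESH_DIM * MESH_DIM)
    return sorted(banks, key=lambda b: (access_latency(core, b), b))
```

Calling `banks_by_latency(0)` lists the corner core's own bank first and the opposite corner last, which is the kind of ordering a placement policy could start from.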

The researchers tested their system on a simulation of a chip with 36 cores. They found that, compared to its best-performing predecessors, the system increased processing speed by 20% to 30% while reducing energy consumption by 30% to 85%.

“What you would like is to take these distributed physical memory resources and build application-specific hierarchies that maximize the performance for your particular application,” said Daniel Sanchez, an assistant professor in the Department of Electrical Engineering and Computer Science at MIT.

“And that depends on many things in the application. What’s the size of the data it accesses? Does it have hierarchical reuse, so that it would benefit from a hierarchy of progressively larger memories? Or is it scanning through a data structure, so we’d be better off having a single but very large level? How often does it access data? How much would its performance suffer if we just let data drop to main memory? There are all these different tradeoffs.”

A 36-tile Jenga system that’s running four applications. Jenga gives each application a custom virtual cache hierarchy. (Source: Po-An Tsai, Nathan Beckmann, Daniel Sanchez)

Jenga evaluates the tradeoff between latency and space for two layers of cache simultaneously, which turns the two-dimensional latency-space curve into a three-dimensional surface. Fortunately, that surface turns out to be fairly smooth: It may undulate, but it usually won’t have sudden, narrow spikes and dips.

That means that sampling points on the surface will give a pretty good sense of what the surface as a whole looks like. The researchers developed a sampling algorithm tailored to the problem of cache allocation, which systematically increases the distances between sampled points.

Once it has deduced the shape of the surface, Jenga finds the path across it that minimizes latency. Then it extracts the component of that path contributed by the first level of cache, which is a 2-D curve. At that point, Jenga performs space allocation, which can be updated every 100 milliseconds.
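The sampling idea can be illustrated with a simplified sketch. It is not Jenga's algorithm: the miss curve, the latency model, and every constant below are assumptions, and the search is a brute-force minimum over sampled points rather than the path-finding step described above. It shows only how sample points with growing gaps can cover a two-level size space cheaply.

```python
from itertools import product

MEM_LATENCY = 200.0  # assumed main-memory latency, in cycles
WORKING_SET = 8.0    # assumed working-set size, in banks


def geometric_samples(max_size, first_step=1, growth=2):
    """Sample sizes from 0 to max_size with gaps that grow by `growth`."""
    samples, size, step = [0], 0, first_step
    while size + step <= max_size:
        size += step
        samples.append(size)
        step *= growth
    if samples[-1] != max_size:
        samples.append(max_size)  # always include the full budget
    return samples


def miss_ratio(size):
    """Toy miss curve: 1.0 with no cache, falling as the cache grows."""
    return WORKING_SET / (WORKING_SET + size)


def hit_latency(size):
    """Toy hit latency: a larger virtual cache reaches farther banks."""
    return 0.0 if size == 0 else 9.0 + 1.5 * size ** 0.5


def avg_latency(s1, s2):
    """Average access latency for level sizes s1 and s2 (in banks)."""
    if s2 == 0:  # single-level hierarchy (or none at all)
        return hit_latency(s1) + miss_ratio(s1) * MEM_LATENCY
    total = s1 + s2
    return (hit_latency(s1)
            + miss_ratio(s1) * hit_latency(total)
            + miss_ratio(total) * MEM_LATENCY)


def best_allocation(budget):
    """Pick the sampled (s1, s2) pair within budget with lowest latency."""
    sizes = geometric_samples(budget)
    feasible = [(a, b) for a, b in product(sizes, sizes) if a + b <= budget]
    return min(feasible, key=lambda pair: avg_latency(*pair))
```

With a 36-bank budget, `geometric_samples(36)` evaluates seven sizes per level instead of 37, which is reasonable only because the surface is assumed to be smooth.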

Jenga also features a data-placement procedure for DRAM cache to reduce bottlenecks when multiple cores are retrieving data. After Jenga has come up with a set of cache assignments, cores don’t simply dump all their data into the nearest available memory bank. Instead, Jenga parcels out the data a little at a time, then estimates the effect on bandwidth consumption and latency. Thus, even within the 100-millisecond intervals between chip-wide cache re-allocations, Jenga adjusts the priorities that each core gives to the memory banks allocated to it.
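The incremental placement described above might look roughly like the following sketch. The cost model (base latency plus a linear penalty per chunk already placed in a bank) and all names are hypothetical stand-ins for whatever bandwidth and latency estimates the real system uses.

```python
def place_incrementally(chunks, bank_latency, bank_capacity,
                        congestion_penalty=0.5):
    """Assign data one chunk at a time to the currently cheapest bank.

    Estimated cost of a bank = its base latency plus a penalty for every
    chunk already placed there (an assumed stand-in for contention).
    """
    placed = {bank: 0 for bank in bank_latency}
    for _ in range(chunks):
        candidates = [b for b in placed if placed[b] < bank_capacity[b]]
        if not candidates:
            break  # every bank is full; remaining chunks stay unplaced
        best = min(
            candidates,
            key=lambda b: (bank_latency[b] + congestion_penalty * placed[b], b),
        )
        placed[best] += 1  # re-estimated cost applies to the next chunk
    return placed


# Illustrative inputs: three banks at increasing distance from the core.
latency = {"near": 10, "mid": 12, "far": 20}
capacity = {"near": 8, "mid": 8, "far": 8}
result = place_incrementally(10, latency, capacity)
```

In this toy run the nearest bank still receives most of the data, but some chunks spill to the middle bank once the estimated contention makes it the cheaper choice, rather than everything being dumped into the nearest bank.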

Spintronic switch
Researchers at Chalmers University of Technology demonstrated a graphene-based spin field-effect transistor operating at room temperature, a step towards integrating spintronic logic and memory devices.

Spintronic memory devices use the intrinsic properties of electron spin to store information. For future devices, researchers are searching for ways to integrate both information processing and storage in one device unit.

Graphene is capable of conveying electrons with coordinated spin over longer distances and preserving the spin for a longer time than any other known material at room temperature.

“Graphene is an excellent medium for spin transport at room temperature, due to its low atomic mass. However, an unsolved challenge was to control the spin current at ambient temperature,” explains Saroj Dash, Associate Professor at Chalmers.

Graphene has been shown to transport spin over long distances, and combining it with another layered material in which spin lasts much less time can produce a spin field-effect-transistor-like device.

“By combining graphene, where spin lasts for nanoseconds, with molybdenum disulfide, where spin only lasts for picoseconds, you can control where the spin can go by using a gate voltage; essentially, you can create a spin switch. Importantly, we show in this research a particular materials mix which enables this spin switch to work at room temperature,” said Dash.

A schematic of the graphene-MoS2 heterostructure, which allows spin injection into graphene from the ferromagnetic source, diffusive spin transport in the graphene-MoS2 heterostructure channel, spin manipulation by a gate voltage, and detection of the spin signal by the ferromagnetic drain. (Source: Spin [email protected])

Dash is a little hesitant to call the device a spin transistor. “When researchers proposed future spin transistors, they often imagined something based on semiconductor technology and so-called coherent manipulation of electron spin. What we have done works in a completely different way, but performs a similar switching task.”

He does note, however, that “it points to a multifunctional component that can handle both data storage and processor work – in a single unit.”

Next, the team will work on optimizations to increase the effective gain and transistor action.

Low-power temperature sensor
Electrical engineers at the University of California San Diego developed a temperature sensor that runs on only 113 picowatts of power, 628 times less than state-of-the-art temperature sensors, which consume tens of nanowatts. The team envisions the sensor as part of wearable and IoT systems that require only a tiny battery or scavenged power.

To reduce power, the researchers focused on the current source and the conversion of temperature to a digital readout.

Researchers built an ultra-low power current source using gate leakage transistors, in which tiny levels of current leak through the gate. “Many researchers are trying to get rid of leakage current, but we are exploiting it to build an ultra-low power current source,” said Hui Wang, an electrical engineering Ph.D. student at UC San Diego.

Using these current sources, researchers developed a less power-hungry way to digitize temperature. This process normally requires passing current through a resistor whose resistance changes with temperature, measuring the resulting voltage, and then converting that voltage to its corresponding temperature using an analog-to-digital converter.

Instead, the team developed an innovative system to digitize temperature directly and save power. Their system consists of two ultra-low power current sources: one that charges a capacitor in a fixed amount of time regardless of temperature, and one that charges at a rate that varies with temperature — slower at lower temperatures, faster at higher temperatures.

As the temperature changes, the system adapts so that the temperature-dependent current source charges in the same amount of time as the fixed current source. A built-in digital feedback loop equalizes the charging times by reconnecting the temperature-dependent current source to a capacitor of a different size; the size of this capacitor is directly proportional to the actual temperature. For example, when the temperature falls, the temperature-dependent current source will charge more slowly, and the feedback loop compensates by switching to a smaller capacitor, which dictates a particular digital readout.
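The feedback loop can be mimicked in a few lines of simulation. This is a behavioral sketch under stated assumptions, not the UC San Diego circuit: the threshold voltage, current levels, capacitor sizes, and the linear temperature dependence of the current are all hypothetical values chosen so that the settled code tracks temperature.

```python
V_THRESHOLD = 0.5   # volts; assumed voltage both capacitors charge to
I_REF = 1e-12       # amps; assumed temperature-independent current
C_REF = 1e-12       # farads; assumed fixed reference capacitor
C_UNIT = 10e-15     # farads; assumed smallest selectable capacitor step


def temp_current(temp_c):
    """Hypothetical current that rises linearly with temperature."""
    return I_REF * (1.0 + 0.01 * (temp_c - 20.0))


def charge_time(capacitance, current):
    """Time for a constant current to charge a capacitor to V_THRESHOLD."""
    return capacitance * V_THRESHOLD / current


def settle(temp_c, code=100, max_steps=1000):
    """Step the capacitor code until both charging times roughly match.

    The returned code is the digital readout: the largest capacitor
    (in units of C_UNIT) that still charges within the reference time.
    """
    t_ref = charge_time(C_REF, I_REF)
    i_temp = temp_current(temp_c)
    for _ in range(max_steps):
        if charge_time(code * C_UNIT, i_temp) > t_ref:
            code -= 1   # charging too slowly: switch to a smaller capacitor
        elif charge_time((code + 1) * C_UNIT, i_temp) <= t_ref:
            code += 1   # room to spare: switch to a larger capacitor
        else:
            break       # charging times are matched to within one step
    return code
```

With these made-up constants, a colder input such as `settle(-20)` settles on a smaller code than `settle(40)`, matching the behavior described above: lower temperature, slower charging, smaller capacitor.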

The temperature sensor is integrated into a small chip measuring 0.15 × 0.15 millimeters. (Source: David Baillot/UC San Diego Jacobs School of Engineering)

The temperature sensor is integrated into a small chip measuring 0.15 × 0.15 millimeters. It operates at temperatures ranging from minus 20°C to 40°C. Its performance is fairly comparable to that of the state of the art even at near-zero power, researchers said. One tradeoff is that the sensor delivers approximately one temperature update per second, which is slightly slower than existing temperature sensors. However, this response time is sufficient for devices that operate in the human body, homes and other environments where temperatures do not fluctuate rapidly, researchers said.

Moving forward, the team is working to improve the accuracy of the temperature sensor. The team is also optimizing the design so that it can be successfully integrated into commercial devices.