Optimizing Hardware Capacity, Utilizing Automatic Differentiation to Efficiently Compute Derivatives in Parallel Programming Models


A technical paper titled "Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation" was published by researchers at MIT (CSAIL), Argonne National Lab, and TU Munich. The paper was a Best Paper Finalist and a Best Student Paper winner at SuperComputing 2022. Find the technical paper here. Published November 2022. The work "demonstrates how Enzyme opti... » read more

Profile-Guided HW/SW Mechanism To Efficiently Reduce Branch Mispredictions In Data Center Applications (Best Paper Award)


A new technical paper titled "Whisper: Profile-Guided Branch Misprediction Elimination for Data Center Applications" was published by researchers at University of Michigan, ARM, University of California, Santa Cruz, and Texas A&M University. This work was awarded a best paper award at October's 2022 Institute of Electrical and Electronics Engineers (IEEE)/Association for Computing Machin... » read more

Memory and Energy-Efficient Batch Normalization Hardware


A new technical paper titled "LightNorm: Area and Energy-Efficient Batch Normalization Hardware for On-Device DNN Training" was published by researchers at DGIST (Daegu Gyeongbuk Institute of Science and Technology). The work was supported by Samsung Research Funding Incubation Center. Abstract: "When training early-stage deep neural networks (DNNs), generating intermediate features via con... » read more

Multi-Bit In-Memory Computing System for HDC using FeFETs, Achieving SW-Equivalent-Accuracies


A new technical paper titled "Achieving software-equivalent accuracy for hyperdimensional computing with ferroelectric-based in-memory computing" by researchers at University of Notre Dame, Fraunhofer Institute for Photonic Microsystems, University of California Irvine, and Technische Universität Dresden. "We present a multi-bit IMC system for HDC using ferroelectric field-effect transistor... » read more

Scalable Optical AI Accelerator Based on a Crossbar Architecture


A new technical paper titled "Scalable Coherent Optical Crossbar Architecture using PCM for AI Acceleration" was published by researchers at University of Washington. Abstract: "Optical computing has been recently proposed as a new compute paradigm to meet the demands of future AI/ML workloads in datacenters and supercomputers. However, proposed implementations so far suffer from lack of sc... » read more

Using BDA To to Predict SAQP Pitch Walk


A new technical paper titled "Bayesian dropout approximation in deep learning neural networks: analysis of self-aligned quadruple patterning" was published by researchers at IBM TJ Watson Research Center and Rensselaer Polytechnic Institute. Find the technical paper here. Published November 2022.  Open Access. Scott D. Halle, Derren N. Dunn, Allen H. Gabor, Max O. Bloomfield, and Mark Sh... » read more

L-FinFET Neuron For A Highly Scalable Capacitive Neural Network (KAIST)


A new technical paper titled "An Artificial Neuron with a Leaky Fin-Shaped Field-Effect Transistor for a Highly Scalable Capacitive Neural Network" was published by researchers at KAIST (Korea Advanced Institute of Science and Technology). “In commercialized flash memory, tunnelling oxide prevents the trapped charges from escaping for better memory ability. In our proposed FinFET neuron, t... » read more

Using Silicon Photonics To Reduce Latency On Edge Devices


A new technical paper titled "Delocalized photonic deep learning on the internet’s edge" was published by researchers at MIT and Nokia Corporation. “Every time you want to run a neural network, you have to run the program, and how fast you can run the program depends on how fast you can pipe the program in from memory. Our pipe is massive — it corresponds to sending a full feature-leng... » read more

Active Learning to Reduce Data Requirements For Defect Identification in Semiconductor Manufacturing


A new technical paper titled "Exploring Active Learning for Semiconductor Defect Segmentation" was published by researchers at Agency for Science, Technology and Research (A*STAR) in Singapore. "We identify two unique challenges when applying AL on semiconductor XRM scans: large domain shift and severe class-imbalance. To address these challenges, we propose to perform contrastive pretrainin... » read more

Memory-Computation Decoupling Execution To Achieve Ideal All-Bank PIM Performance


A new technical paper titled "Achieving the Performance of All-Bank In-DRAM PIM With Standard Memory Interface: Memory-Computation Decoupling" was published by researchers at Korea University. "This paper proposed the memory-computation decoupled PIM architecture to provide the performance comparable to the all-bank PIM while preserving the standard DRAM interface, i.e., DRAM commands, powe... » read more

← Older posts Newer posts →