Computer Architecture

Publication title Authors
Date of publication
Further resources
Shredder: Learning Noise Distributions to Protect Inference Privacy P. Ramrakhyani
H. Esmaeilzadeh
F. Mireshghallah
March 2020 Watch presentation
Fused: Closed-loop Performance and Energy Simulation of Embedded Systems S. T. Sliper
W. Wang
N. Nikoleris
A. S. Weddell
G. V. Merrett
January 2020  
Temporal Prefetching Without the Off-Chip Metadata H. Wu
K. Nathella
J. Pusdesris
D. Sunwoo
A. Jain
C. Lin
October 2019  
Directed Statistical Warming through Time Traveling N. Nikoleris
L. Eeckhout
E. Hagersten
T. E. Carlson
October 2019  
DynaSprint: Microarchitectural Sprints with Dynamic Utility and Thermal Management Z. Huang
J. Joao
A. Rico
A. Hilton
B. Lee
October 2019  
Reducing Data Movement and Energy in Multilevel Cache Hierarchies Without Losing Performance: Can you have it all? J. Wang
P. Ramrakhyani
W. Elsasser
L. K. John
September 2019  
Efficient Metadata Management for Irregular Data Prefetching H. Wu
K. Nathella
D. Sunwoo
A. Jain
C. Lin
June 2019  
Sampled Simulation of Task-Based Programs T. Grass
T. Carlson
A. Rico
G. Ceballos
E. Ayguadé
M. Casas
M. Moretó
February 2019  
BRB: Mitigating Branch Predictor Side-Channels I. Vougioukas
N. Nikoleris
A. Sandberg
S. Diestelhorst
G. V. Merrett
B. M. Al-Hashimi
February 2019 Read blog
Whitepaper: Introducing the new Armv8.1-M architecture T. Grocutt
February 2019 Read blog
Nucleus: Finding the Sharing Limit of Heterogeneous Cores I. Vougioukas
A. Sandberg
B. M. Al-Hashimi
G. V. Merrett
October 2017  
A Triple Core Lock-Step (TCLS) ARM® Cortex®-R5 Processor for Safety-Critical and Ultra-Reliable Applications X. Iturbe
B. Venu
E. Ozer
S. Das
October 2017  
The Arm Scalable Vector Extension N. Stephens
S. Biles
M. Boettcher
J. Eapen
M. Eyole
G. Gabrielli
M. Horsnell
G. Magklis
A. Martinez
M. Premillieu
A. Reid
A. Rico
P. Walker
April 2017 Read blog

Devices, Circuits, and Materials

Publication title Authors
Date of publication
Further resources
Enhanced 3D Implementation of an Arm® Cortex®-A X. Xu
M. Bhargava
S. Moore
S. Sinha
B. Cline
July 2020  
Buried Power Rails and Back-side Power Grids: Arm® CPU Power Delivery Network Design Beyond 5nm D. Prasad
S. S. T. Nibhanupudi
S. Das
O. Zografos
B. Chehab
S. Sarkar
R. Baert
A. Robinson
A. Gupta
A. Spessot
P. Debacker
D. Verkest
J. Kulkarni
B. Cline
S. Sinha
December 2019 Read blog
Error Correlation Prediction in Lockstep Processors for Safety-Critical Systems E. Ozer
B. Venu
X. Iturbe
S. Das
S. Lyberis
J. Biggs
P. Harrod
J. Penton
October 2018 Read blog
Standard Cell Library Design and Optimization Methodology for ASAP7 PDK X. Xu
N. Shah
A. Evans
S. Sinha
B. Cline
G. Yeric
July 2018 Read blog

HPC

Publication title Authors
Date of publication
Further resources
Asvie: A Timing-Agnostic SVE Optimization Methodology M. T. Cruz
D. Ruiz
R. Rusitoru
November 2019  
Cache Line Sharing and Communication in ECP Proxy Applications J. Randall
A. Rico
J. A. Joao
September 2019 Read blog
On the Benefits of Tasking with OpenMP A. Rico
I. S. Barrera
J. A. Joao
J. Randall
M. Casas
M. Moretó
September 2019 Read blog
On the Maturity of Parallel Applications for Asymmetric Multi-core Processors K. Chronaki
M. Moretó
M. Casas
A. Rico
R. M. Badia
E. Ayguadé
M. Valero
May 2019  

IoT

Publication title Authors
Date of publication
Further resources
IOTGAZE: IoT Security Enforcement via Wireless Context Analysis T. Gu
Z. Fang
A. Abhishek
H. Fu
P. Hu
P. Mohapatra
July 2020  
Applications of Computation-In-Memory Architectures based on Memristive Devices Said Hamdioui
Hoang Anh Du Nguyen
Mottaqiallah Taouil
Abu Sebastian
Manuel Le Gallo
Sandeep Pande
Siebren Schaafsma
Francky Catthoor
Shidhartha Das
Fernando García-Redondo
G. Karunaratne
Abbas Rahimi
Luca Benini
March 2020  
M0N0: A Performance-Regulated 0.8-to-38MHz DVFS ARM Cortex-M33 SIMD MCU with 10nW Sleep Power Pranay Prabhat
Benoît Labbé
Graham Knight
Anand Savanth
Jonas Svedas
Matthew J. Walker
Supreet Jeloka
Philex Ming-Yan Fan
Fernando García-Redondo
Thanusree Achuthan
James Myers
February 2020  
Analysis and Demand Forecasting of Residential Energy Consumption at Multiple Time Scales P. Amin
L. Cherkasova
R. Aitken
V. Kache
November 2019  
A Time-Domain Current-Mode MAC Engine for Analogue Neural Networks in Flexible Electronics Matt Douthwaite
Fernando García-Redondo
Pantelis Georgiou
Shidhartha Das
October 2019  
A 0.98-nW/kHz 33-kHz Fully Integrated Subthreshold-Region Operation RC Oscillator With Forward-Body-Biasing P. Fan
A. Savanth
B. Labbé
P. Prabhat
J. Myers
September 2019 Read blog
A 10.8pJ/bit Pulse-Position Inductive Transceiver for Low-Energy Wireless 3D Integration B. Fletcher
S. Das
T. Mak
September 2019  
A Low-Energy Inductive Transceiver using Spike-Latency Encoding for Wireless 3D Integration B. Fletcher
S. Das
T. Mak
July 2019  
A 65nm switched source line sub-threshold ROM using data encoding, with 0.3V Vmin and 47fJ/b access energy Supreet Jeloka
Pranay Prabhat
Graham Knight
James Myers
July 2019  
Automating Energy Demand Modeling and Forecasting Using Smart Meter Data P. Amin
L. Cherkasova
R. Aitken
V. Kache
July 2019  
Ternary Hybrid Neural-Tree Networks for Highly Constrained IoT Applications D. Gope
G. Dasika
M. Mattina
March 2019  
Integrated Reciprocal Conversion with Selective Direct Operation for Energy Harvesting Systems A. Savanth
A. Weddell
J. Myers
D. Flynn
B. M. Al-Hashimi
September 2017 Read blog

Machine Learning

Publication title Authors
Date of publication
Further resources
Mango: A Python Library for Parallel Hyperparameter Tuning S. Sandha
M. Aggarwal
I. Fedorov
M. Srivastava
May 2020  
Benchmarking TinyML Systems: Challenges and Direction C. Banbury
V. Reddi
M. Lam
W. Fu
A. Fazel
J. Holleman
X. Huang
R. Hurtado
D. Kanter
A. Lokhmotov
D. Patterson
D. Pau
J. Seo
J. Sieracki
U. Thakker
M. Verhelst
P. Yadav
March 2020 Workshop
Compressing Language Models using Doped Kronecker Products U. Thakker
P. Whatmough
M. Mattina
J. Beu
March 2020 Workshop
Systolic Tensor Array: An Efficient Structured-Sparse GEMM Accelerator for Mobile CNN Inference Z-G. Liu
P. Whatmough
M. Mattina
March 2020  
Aggressive Compression of MobileNets Using Hybrid Ternary Layers D. Gope
J. Beu
U. Thakker
M. Mattina
February 2020  
Improving Accuracy of Neural Networks Compressed using Fixed Structures via Doping U. Thakker
P. Whatmough
M. Mattina
J. Beu
February 2020 Summit
Compressing RNNs for IoT devices by 15-38x using Kronecker Products U. Thakker
J. Beu
D. Gope
C. Zhou
I. Fedorov
G. Dasika
M. Mattina
January 2020  
Run-Time Efficient RNN Compression for Inference on Edge Devices U. Thakker
J. Beu
D. Gope
G. Dasika
M. Mattina
January 2020  
SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers I. Fedorov
R. P. Adams
M. Mattina
P. Whatmough
December 2019 Read blog
RNN Compression using Hybrid Matrix Decomposition U. Thakker
J. Beu
D. Gope
C. Zhou
I. Fedorov
G. Dasika
M. Mattina
December 2019 Workshop
Skipping RNN State Updates without Retraining the Original Model J. Tao
U. Thakker
G. Dasika
J. Beu
November 2019 Read blog
A Static Analysis-based Cross-Architecture Performance Prediction Using Machine Learning N. Ardalani
U. Thakker
A. Albarghouthi
K. Sankaralingam
June 2019 Workshop
2019 Evolutionary Algorithms Review A. Sloss
S. Gustafson
June 2019 Read book
Learning low-precision neural networks without Straight-Through Estimator (STE) Z-G. Liu
M. Mattina
May 2019 Read blog
RNN Compression using Hybrid Matrix Decomposition U. Thakker
J. Beu
D. Gope
G. Dasika
M. Mattina
March 2019 Summit
Measuring scheduling efficiency of RNNs for NLP applications U. Thakker
G. Dasika
J. Beu
M. Mattina
March 2019 Workshop
SCALE-Sim: Systolic CNN Accelerator Simulator A. Samajdar
Y. Zhu
P. Whatmough
M. Mattina
T. Krishna
February 2019  
FixyNN: Efficient Hardware for Mobile Computer Vision via Transfer Learning P. Whatmough
C. Zhou
P. Hansen
S. Kolala Venkataramanaiah
J-S. Seo
M. Mattina
February 2019 Read blog
Energy Efficient Hardware for On-Device CNN Inference via Transfer Learning P. Whatmough
C. Zhou
P. Hansen
M. Mattina
February 2019  
Efficient and Robust Machine Learning for Real-World Systems F. Pernkopf
W. Roth
M. Zoehrer
L. Pfeifenberger
G. Schindler
H. Froening
S. Tschiatschek
R. Peharz
M. Mattina
Z. Ghahramani
December 2018  
DNN Engine: A 28-nm Timing-Error Tolerant Sparse Deep Neural Network Processor for IoT Applications P. Whatmough
S. K. Lee
D. Brooks
G. Wei
September 2018  
Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision Y. Zhu
A. Samajdar
M. Mattina
P. Whatmough
March 2018 Read blog
Mobile Machine Learning Hardware at Arm: A Systems-on-Chip (SoC) Perspective Y. Zhu
M. Mattina
P. Whatmough
February 2018  

Security

Publication title Authors
Date of publication
Further resources
ISA Semantics for ARMv8-A, RISC-V, and CHERI-MIPS A. Armstrong
T. Bauereiss
B. Campbell
A. Reid
K. E. Gray
R. M. Norton
P. Mundkur
M. Wassell
J. French
C. Pulte
S. Flur
I. Stark
N. Krishnaswami
P. Sewell
January 2019  
BRB: Mitigating Branch Predictor Side-Channels I. Vougioukas
N. Nikoleris
A. Sandberg
S. Diestelhorst
B. M. Al-Hashimi
G. V. Merrett
November 2018 Read blog
The semantics of transactions and weak memory in x86, Power, ARM, and C++ N. Chong
T. Sorensen
J. Wickerson
June 2018 Read blog

Software and Services

Publication title Authors
Date of publication
Further resources
Challenges and Opportunities for Efficient Serverless Computing at the Edge P. K. Gadepalli
G. Peach
L. Cherkasova
R. Aitken
G. Parmer
October 2019  
Breaking Band: A Breakdown of High-performance Communication R. Zambre
M. Grodowitz
A. Chandramowlishwaran
P. Shamis
August 2019  
Stretch: Balancing QoS and Throughput for Colocated Server Workloads on SMT Cores A. Margaritov
S. Gupta
R. Gonzalez-Alberquilla
B. Grot
February 2019 Read blog
Open-Source Shared Memory implementation of the HPCG benchmark: analysis, improvements and evaluation on Cavium ThunderX2 D. Ruiz
F. Mantovani
M. Casas
F. Spiga
J. Labarta
July 2018  
Persistency for Synchronization-Free Regions V. Gogte
S. Diestelhorst
W. Wang
S. Narayanasamy
P. M. Chen
T. F. Wenisch
June 2018 Read blog
SynchroTrace: Synchronization-aware Architecture-agnostic Traces for Light-Weight Multicore Simulation of CMP and HPC Workloads K. Sangaiah
M. Lui
R. Jagtap
S. Diestelhorst
S. Nilakantan
A. More
B. Taskin
M. Hempstead
March 2018 Read blog
Crossing the Architectural Barrier: Evaluating Representative Regions of Parallel HPC Applications A. Ferrerón
R. Jagtap
S. Bischoff
R. Ruşitoru
March 2018 Read blog
Integrating DRAM Power-Down Modes in gem5 and Quantifying their Impact R. Jagtap
M. Jung
W. Elsasser
C. Weis
A. Hansson
N. Wehn
March 2018 Read blog