Publications
2025
Wonsuk Jang and Thierry Tambe. BlockDialect: Block-wise Fine-grained Mixed Format for Energy-Efficient LLM Inference. To appear at the International Conference on Machine Learning (ICML), 2025.
Peijing Li, Matthew Hung, Yiming Tan, Konstantin Hoßfeld, Jake Jiajun Cheng, Shuhan Liu, Lixian Yan, Xinxin Wang, H-S. Philip Wong, and Thierry Tambe. GainSight: Application-Guided Profiling for Composing Heterogeneous On-Chip Memories in AI Hardware Accelerators. arXiv:2504.14866, 2025.
Yasmine Omri, Parth Shroff, and Thierry Tambe. Token Sequence Compression for Efficient Multimodal Computing. 2nd edition of the Efficient Large Vision Models Workshop (eLVM), 2025.
2024
Sai Qian Zhang*, Thierry Tambe*, David Brooks, and Gu-Yeon Wei. CAMEL: Co-Designing AI Models and eDRAMs for Efficient On-Device Learning. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), Edinburgh, United Kingdom, 2024.
Maico Cassel Dos Santos, Tianyu Jia, Joseph Zuckerman, Martin Cochet, Davide Giri, Erik Jens Loscalzo, Karthik Swaminathan, Thierry Tambe, Jeff Jun Zhang, Alper Buyuktosunoglu, Kuan-Lin Chiu, Giuseppe Di Guglielmo, Paolo Mantovani, Luca Piccolboni, Gabriele Tombesi, David Trilla, John-David Wellman, En-Yu Yang, Aporva Amarnath, Ying Jing, Bakshree Mishra, Joshua Park, Vignesh Suresh, Sarita Adve, Pradip Bose, David Brooks, Luca P. Carloni, Kenneth L. Shepard, and Gu-Yeon Wei. A 12nm Linux-SMP-Capable RISC-V SoC with 14 Accelerator Types, Distributed Hardware Power Management and Flexible NoC-Based Data Orchestration. 2024 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2024.
Sai Qian Zhang, Thierry Tambe, Gu-Yeon Wei, and David Brooks. JointNF: Enhancing DNN Performance through Adaptive N:M Pruning across both Weight and Activation. In Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED). Association for Computing Machinery, New York, NY, USA, 1–6, 2024.
Bo-Yuan Huang, Steven Lyubomirsky, Yi Li, Mike He, Gus Henry Smith, Thierry Tambe, Akash Gaonkar, Vishal Canumalla, Andrew Cheung, Gu-Yeon Wei, Aarti Gupta, Zachary Tatlock, and Sharad Malik. Application-level Validation of Accelerator Designs Using a Formal Software/Hardware Interface. ACM Trans. Des. Autom. Electron. Syst. (TODAES) 29, 2, Article 35, 25 pages, 2024.
2023
Yu-Shun Hsiao, Siva Hari, Balakumar Sundaralingam, Jason Yik, Thierry Tambe, Charbel Sakr, Stephen Keckler, and Vijay Janapa Reddi. VaPr: Variable-Precision Tensors to Accelerate Robot Motion Planning. 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 6304-6309, 2023.
Thierry Tambe, Jeff Zhang, Coleman Hooper, Tianyu Jia, Paul N. Whatmough, Joseph Zuckerman, Maico Cassel Dos Santos, Erik Jens Loscalzo, Davide Giri, Kenneth Shepard, Luca Carloni, Alexander Rush, David Brooks, and Gu-Yeon Wei. A 12nm 18.1TFLOPs/W Sparse Transformer Processor with Entropy-Based Early Exit, Mixed-Precision Predication and Fine-Grained Power Management. 2023 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023.
Thierry Tambe, En-Yu Yang, Glenn G. Ko, Yuji Chai, Coleman Hooper, Marco Donato, Paul N. Whatmough, Alexander M. Rush, David Brooks, and Gu-Yeon Wei. A 16-nm SoC for Noise-Robust Speech and NLP Edge AI Inference With Bayesian Sound Source Separation and Attention-Based DNNs. IEEE Journal of Solid-State Circuits (JSSC), vol. 58, no. 2, pp. 569-581, Feb. 2023.
2022
Abdulrahman Mahmoud, Thierry Tambe, Tarek Aloui, David Brooks, and Gu-Yeon Wei. GoldenEye: A Platform for Evaluating Emerging Numerical Data Formats in DNN Accelerators. 2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Baltimore, MD, USA, 2022.
Cheng Tan, Thierry Tambe, Jeff Zhang, Bo Fang, Tong Geng, Gu-Yeon Wei, David Brooks, Antonino Tumeo, Ganesh Gopalakrishnan, and Ang Li. ASAP: automatic synthesis of area-efficient and precision-aware CGRAs. In Proceedings of the 36th ACM International Conference on Supercomputing (ICS). Association for Computing Machinery, New York, NY, USA, 2022.
2021
Thierry Tambe, Coleman Hooper, Lillian Pentecost, Tianyu Jia, En-Yu Yang, Marco Donato, Victor Sanh, Paul Whatmough, Alexander M. Rush, David Brooks, and Gu-Yeon Wei. EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference. In MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). Association for Computing Machinery, New York, NY, USA, 2021.
Sabrina M. Neuman, Brian Plancher, Thomas Bourgeat, Thierry Tambe, Srinivas Devadas, and Vijay Janapa Reddi. Robomorphic computing: a design methodology for domain-specific accelerators parameterized by robot morphology. In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Association for Computing Machinery, New York, NY, USA, 2021.
Thierry Tambe, En-Yu Yang, Glenn G. Ko, Yuji Chai, Coleman Hooper, Marco Donato, Paul N. Whatmough, Alexander M. Rush, David Brooks, and Gu-Yeon Wei. A 25mm² SoC for IoT Devices with 18ms Noise-Robust Speech-to-Text Latency via Bayesian Speech Denoising and Attention-Based Sequence-to-Sequence DNN Speech Recognition in 16nm FinFET. 2021 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2021.
2020
Glenn Ko, Yuji Chai, Marco Donato, Paul N. Whatmough, Thierry Tambe, Rob A. Rutenbar, Gu-Yeon Wei, and David Brooks. A Scalable Bayesian Inference Accelerator for Unsupervised Learning. 2020 IEEE Hot Chips 32 Symposium (Hot Chips), Palo Alto, CA, USA, 2020.
Thierry Tambe, En-Yu Yang, Zishen Wan, Yuntian Deng, Vijay Janapa Reddi, Alexander Rush, David Brooks, and Gu-Yeon Wei. Algorithm-Hardware Co-Design of Adaptive Floating-Point Encodings for Resilient Deep Learning Inference. 2020 57th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 2020.
Glenn G. Ko, Yuji Chai, Marco Donato, Paul N. Whatmough, Thierry Tambe, Rob A. Rutenbar, David Brooks, and Gu-Yeon Wei. A 3mm² Programmable Bayesian Inference Accelerator for Unsupervised Machine Perception using Parallel Gibbs Sampling in 16nm. 2020 IEEE Symposium on VLSI Circuits (VLSI), Honolulu, HI, USA, 2020.
2019
Udit Gupta, Brandon Reagen, Lillian Pentecost, Marco Donato, Thierry Tambe, Alexander M. Rush, Gu-Yeon Wei, and David Brooks. MASR: A Modular Accelerator for Sparse RNNs. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT). IEEE Press, 1–14, 2019.