Archived Schedules
Fall 2017
-
: Organization
led by Adrian
Selecting papers for the semester.
-
: Probabilistic Graphical Models
led by Skand
Accelerating Markov Random Field Inference Using Molecular Optical Gibbs Sampling Units. Siyang Wang, Xiangyu Zhang, Yuxuan Li, Ramin Bashizade, Song Yang, Chris Dwyer, and Alvin R. Lebeck. ISCA 2016.
High throughput Bayesian computing machine with reconfigurable hardware. Mingjie Lin, Ilia Lebedev, and John Wawrzynek. FPGA 2010.
-
: Datacenters
led by Ben
Special MICRO 2017 preview edition!
Scale-Out Acceleration for Machine Learning. Jongse Park, Hardik Sharma, Divya Mahajan, Joon Kyung Kim, and Hadi Esmaeilzadeh. MICRO 2017.
DeftNN: Addressing Bottlenecks for DNN Execution on GPUs via Synapse Vector Elimination and Near-compute Data Fission. Parker Hill, Animesh Jain, Mason Hill, Babak Zamirai, Chang-Hong Hsu, Michael Laurenzano, Scott Mahlke, Lingjia Tang, and Jason Mars. MICRO 2017.
And Amazon DSSTNE, if we can find something useful to read about it.
-
: Sparsity
led by Mark
SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks. Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, and William J. Dally. ISCA 2017.
Scalpel: Customizing DNN Pruning to the Underlying Hardware Parallelism. Jiecao Yu, Andrew Lukefahr, David Palframan, Ganesh Dasika, Reetuparna Das, and Scott Mahlke. ISCA 2017.
Sigma Delta Quantized Networks. Peter O'Connor and Max Welling. ICLR 2017.
-
: FPGAs
led by Shreesha (and Ritchie)
Fused-layer CNN accelerators. Manoj Alwani, Han Chen, Michael Ferdman, and Peter Milder. MICRO 2016.
Accelerating persistent neural networks at datacenter scale. Eric Chung, Jeremy Fowers, Kalin Ovtcharov, Michael Papamichael, Adrian Caulfield, Todd Massengil, Ming Liu, Daniel Lo, Shlomi Alkalay, Michael Haselman, Christian Boehn, Oren Firestein, Alessandro Forin, Kang Su Gatlin, Mahdi Ghandi, Stephen Heil, Kyle Holohan, Tamas Juhasz, Ratna Kumar Kovvuri, Sitaram Lanka, Friedel van Megen, Dima Mukhortov, Prerak Patel, Steve Reinhardt, Adam Sapek, Raja Seera, Balaji Sridharan, Lisa Woods, Phillip Yi-Xiao, Ritchie Zhao, Doug Burger. HotChips 2017 (slides).
-
: Numerical Tricks
led by Ritchie
CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices. Caiwen Ding, Siyu Liao, Yanzhi Wang, Zhe Li, Ning Liu, Youwei Zhuo, Chao Wang, Xuehai Qian, Yu Bai, Geng Yuan, Xiaolong Ma, Yipeng Zhang, Jian Tang, Qinru Qiu, Xue Lin, and Bo Yuan. MICRO 2017.
An OpenCL Deep Learning Accelerator on Arria 10. Utku Aydonat, Shane O'Connell, Davor Capalija, Andrew C. Ling, and Gordon R. Chiu. arXiv.
-
: Learning
led by Chris & Skand
Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent. Christopher De Sa, Matthew Feldman, Christopher Ré, and Kunle Olukotun. ISCA 2017.
Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation. Benjamin Scellier and Yoshua Bengio. arXiv.
Spring 2017
-
: DNNs on FPGAs
led by Ritchie, Shreesha, and Yuan
DNN-on-Catapult talk: Toward Accelerating Deep Learning at Scale Using Specialized Hardware in the Datacenter. Kalin Ovtcharov, Olatunji Ruwase, Joo-Young Kim, Jeremy Fowers, Karin Strauss, and Eric S. Chung. In HotChips 2015. (And an accompanying whitepaper.)
DNNWeaver: From High-Level Deep Neural Models to FPGAs. Hardik Sharma, Jongse Park, Divya Mahajan, Emmanuel Amaro, Joon Kyung Kim, Chenkai Shao, Asit Mishra, and Hadi Esmaeilzadeh. In MICRO 2016.
-
: ISAs for neural accelerators
led by Shreesha
Cambricon: An Instruction Set Architecture for Neural Networks. Shaoli Liu, Zidong Du, Jinhua Tao, Dong Han, Tao Luo, Yuan Xie, Yunji Chen, and Tianshi Chen. In ISCA 2016.
A Case for Neuromorphic ISAs. Atif Hashmi, Andrew Nere, James Jamal Thomas, and Mikko Lipasti. In ASPLOS 2011.
-
: Hardware for Halide-like languages
led by Berkin and Ritchie
Rigel: Flexible Multi-Rate Image Processing Hardware. James Hegarty, Ross Daly, Zachary DeVito, Jonathan Ragan-Kelley, Mark Horowitz, and Pat Hanrahan. In SIGGRAPH 2016.
Secondarily, "Halide-HLS": Programming Heterogeneous Systems from an Image Processing DSL. Jing Pu, Steven Bell, Xuan Yang, Jeff Setter, Stephen Richardson, Jonathan Ragan-Kelley, and Mark Horowitz. On arXiv.
-
: SqueezeNet
led by Mark and Ben
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, and Kurt Keutzer.
As background (we read this last semester): Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. Song Han, Huizi Mao, and William J. Dally.
-
: Eyeriss
led by Ritchie
Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks. Yu-Hsin Chen, Tushar Krishna, Joel S. Emer, and Vivienne Sze. In ISSCC 2016.
-
: Generative Adversarial Nets (GANs) and TPU
led by Skand and Ritchie
In-Datacenter Performance Analysis of a Tensor Processing Unit. Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, John Hu, Robert Hundt, Dan Hurt, Julian Ibarz, Aaron Jaffey, Alek Jaworski, Alexander Kaplan, Harshit Khaitan, Andy Koch, Naveen Kumar, Steve Lacy, James Laudon, James Law, Diemthu Le, Chris Leary, Zhuyuan Liu, Kyle Lucke, Alan Lundin, Gordon MacKean, Adriana Maggiore, Maire Mahony, Kieran Miller, Rahul Nagarajan, Ravi Narayanaswami, Ray Ni, Kathy Nix, Thomas Norrie, Mark Omernick, Narayana Penukonda, Andy Phelps, Jonathan Ross, Matt Ross, Amir Salek, Emad Samadiani, Chris Severn, Gregory Sizikov, Matthew Snelham, Jed Souter, Dan Steinberg, Andy Swing, Mercedes Tan, Gregory Thorson, Bo Tian, Horia Toma, Erick Tuttle, Vijay Vasudevan, Richard Walter, Walter Wang, Eric Wilcox, Doe Hyun Yoon. Preprint, to appear in ISCA 2017.
NIPS tutorial by Ian Goodfellow:
Short description in KDnuggets.
Step-by-step introduction on Medium. (Skip to "How DCGANs Work.")
Papers:
Ritchie recommends reading the blog posts and glancing at the paper results.
-
: Analog and ReRAM
led by Mark and Berkin
Memristive Boltzmann Machine: A Hardware Accelerator for Combinatorial Optimization and Deep Learning. Mahdi Nazm Bojnordi and Engin Ipek. In HPCA 2016.
Comparing Stochastic and Deterministic Computing. Rajit Manohar. In CAL, March 2015.
Spring 2018
-
: Organization
led by Adrian
Selecting papers for the semester.
-
: Systems & Datacenters
led by Mark
Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective. Kim Hazelwood, Sarah Bird, David Brooks, Soumith Chintala, Utku Diril, Dmytro Dzhulgakov, Mohamed Fawzy, Bill Jia, Yangqing Jia, Aditya Kalro, James Law, Kevin Lee, Jason Lu, Pieter Noordhuis, Misha Smelyanskiy, Liang Xiong, Xiaodong Wang. HPCA 2018.
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training. Yujun Lin, Song Han, Huizi Mao, Yu Wang, William J. Dally. ICLR 2018.
-
: Precision
led by Ritchie
Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks. Hardik Sharma, Jongse Park, Naveen Suda, Liangzhen Lai, Benson Chau, Joon Kyung Kim, Vikas Chandra, Hadi Esmaeilzadeh. Unpublished.
Towards Accurate Binary Convolutional Neural Network. Xiaofan Lin, Cong Zhao, Wei Pan. NIPS 2017.
-
: PIM
led by Skand
Primary: Neurostream: Scalable and Energy Efficient Deep Learning with Smart Memory Cubes. Erfan Azarkhish, Davide Rossi, Igor Loi, Luca Benini. IEEE TPDS, 2018.
Secondary: TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory. Mingyu Gao, Jing Pu, Xuan Yang, Mark Horowitz, and Christos Kozyrakis. ASPLOS 2017.
Backup: Neurocube: a programmable digital neuromorphic architecture with high-density 3D memory. Duckhwan Kim, Jaeha Kung, Sek Chai, Sudhakar Yalamanchili, Saibal Mukhopadhyay. ISCA 2016.
-
: Large Minibatch Training
led by Ben
Primary: Train longer, generalize better: closing the generalization gap in large batch training of neural networks. Elad Hoffer, Itay Hubara, Daniel Soudry. NIPS 2017.
Secondary: Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He. FAIR Technical Report.
Bonus: ImageNet Training in Minutes. Yang You, Zhao Zhang, Cho-Jui Hsieh, James Demmel, Kurt Keutzer. Unpublished.
-
: Mobile
led by Phil & Skand
Glimpse: A Programmable Early-Discard Camera Architecture for Continuous Mobile Vision. Saman Naderiparizi, Pengyu Zhang, Matthai Philipose, Bodhi Priyantha, Jie Liu, and Deepak Ganesan. MobiSys 2017.
TVM: End-to-End Optimization Stack for Deep Learning. Tianqi Chen, Thierry Moreau, Ziheng Jiang, Haichen Shen, Eddie Yan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy. Unpublished.
Mobile Machine Learning Hardware at ARM: An SoC Perspective. Yuhao Zhu, Matthew Mattina, Paul Whatmough. SysML 2018.
-
: ISCA Preview
led by Sachille
The Dark Side of DNN Pruning. Reza Yazdani, Marc Riera, Jose-Maria Arnau, and Antonio Gonzalez. In ISCA 2018.
SnaPEA: Predictive Early Activation for Reducing Computation in Deep Convolutional Neural Networks. Vahideh Akhlaghi, Amir Yazdanbakhsh, Kambiz Samadi, Hadi Esmaeilzadeh, and Rajesh K. Gupta. In ISCA 2018.
Backup: An Energy-Efficient Neural Network Accelerator based on Outlier-Aware Low Precision Computation. Eunhyeok Park, Dongyoung Kim, and Sungjoo Yoo.
Fall 2019
-
: Organization
led by Sachille
Selecting papers for the semester.
-
: Architecture performance
led by Sachille
Discerning the dominant out-of-order performance advantage: is it speculation or dynamism?, Daniel S. McFarlin, Charles Tucker and Craig Zilles, ASPLOS 2013.
-
: CGRAs
led by Peitian
Plasticine: A reconfigurable architecture for parallel patterns, Raghu Prabhakar, Yaqi Zhang, David Koeplinger, Matt Feldman, Tian Zhao, Stefan Hadjis, Ardavan Pedram, Christos Kozyrakis and Kunle Olukotun, ISCA 2017.
-
: GPUs
led by Tuan
CORF: Coalescing Operand Register File for GPUs, Hodjat Asghari Esfeden, Farzad Khorasani, Hyeran Jeon, Daniel Wong and Nael Abu-Ghazaleh, ASPLOS 2019.
-
: Accelerators
led by Helena
Q100: The Architecture and Design of a Database Processing Unit, Lisa Wu, Andrea Lottarini, Timothy K. Paine, Martha A. Kim and Kenneth A. Ross, ASPLOS 2014.
-
: Fall break
led by
-
: Cloud
led by Yi
TPShare: A Time-Space Sharing Scheduling Abstraction for Shared Cloud via Vertical Labels, Yuzhao Wang, Lele Li, You Wu, Junqing Yu, Zhibin Yu and Xuehai Qian, ISCA 2019.
-
: Verification
led by Debjit
Automated, compositional and iterative deadlock detection, Sagar Chaki, Edmund Clarke, Joël Ouaknine and Natasha Sharygina, MEMOCODE 2004. Alternative link
-
: Soft control and power
led by Steven
An integrated design and fabrication strategy for entirely soft, autonomous robots, Michael Wehner, Ryan L. Truby, Daniel J. Fitzgerald, Bobak Mosadegh, George M. Whitesides, Jennifer A. Lewis & Robert J. Wood, Nature 2016.
-
: Machine learning
led by Neil
Eager Pruning: Algorithm and Architecture Support for Fast Training of Deep Neural Networks, Jiaqi Zhang, Xiangru Chen, Mingcong Song and Tao Li, ISCA 2019.
-
: No Meeting
led by
-
: No Meeting
led by
-
: DSLs
led by Nitish & Sachille
Domain-specific Languages and Code Synthesis Using Haskell, Andy Gill, ACM-Queue 2014.
-
: TBD
led by
Spring 2019
-
: Organization
led by Sachille
Selecting papers for the semester.
-
: Microarchitecture
led by Nitish
Dynamically Specialized Datapaths for Energy Efficient Computing. Venkatraman Govindaraju, Chen-Han Ho, Karthikeyan Sankaralingam. 2011 HPCA.
Triggered Instructions: A Control Paradigm for Spatially-Programmed Architectures. Angshuman Parashar, Michael Pellauer, Michael Adler, Bushra Ahsan, Neal Crago, Daniel Lustig, Vladimir Pavlov, Antonia Zhai, Mohit Gambhir, Aamer Jaleel, Randy Allmon, Rachid Rayess, Stephen Maresh, Joel Emer. 2013 ISCA.
-
: Languages & Compiler
led by Sachille & Nitish
Primary: Spatial: A Language and Compiler for Application Accelerators. David Koeplinger, Matthew Feldman, Raghu Prabhakar, Yaqi Zhang, Stefan Hadjis, Ruben Fiszel, Tian Zhao, Luigi Nardi, Ardavan Pedram, Christos Kozyrakis, Kunle Olukotun. 2018 PLDI.
Secondary: Rethinking the Memory Hierarchy for Modern Languages. Po-An Tsai, Yee Ling Gan, Daniel Sanchez. 2018 MICRO.
-
: Processing-in-memory
led by Kailin
DRISA: A DRAM-based Reconfigurable In-Situ Accelerator. Shuangchen Li, Dimin Niu, Krishna T. Malladi, Hongzhong Zheng, Bob Brennan, Yuan Xie. 2017 MICRO.
-
: Datacenters and Machine Learning
led by Neeraj
-
: Machine Learning Accelerators
led by Nitish
-
: Memory
led by Yi
Adaptive Scheduling for Systems with Asymmetric Memory Hierarchies. Po-An Tsai, Changping Chen, Daniel Sanchez. 2018 MICRO.
-
: Processing-in-memory
led by Khalid
Compute Caches. Shaizeen Aga, Supreet Jeloka, Arun Subramaniyan, Satish Narayanasamy, David Blaauw, Reetuparna Das. 2017 HPCA.
Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks. Charles Eckert, Xiaowei Wang, Jingcheng Wang, Arun Subramaniyan, Ravi Iyer, Dennis Sylvester, David Blaauw, Reetuparna Das. 2018 ISCA.
-
: ASPLOS Preview
led by Sachille
Tangram: Optimized Coarse-Grained Dataflow for Scalable NN Accelerators. Mingyu Gao, Xuan Yang, Jing Pu, Mark Horowitz, Christos Kozyrakis. 2019 ASPLOS.
-
: Spring Break
led by
No meeting
-
: ASPLOS Preview
led by Cunxi
Just-In-Time Compilation for Verilog: A New Technique for Improving the FPGA Programming Experience. Eric Schkufza, Michael Wei, Christopher J. Rossbach. 2019 ASPLOS.
-
: ASPLOS
led by
No meeting
-
: Machine Learning Systems
led by Ritchie
Rethinking floating point for deep learning. Jeff Johnson. 2018 NeurIPS.
-
: DRISA V2
led by Helena
SCOPE: A Stochastic Computing Engine for DRAM-Based In-Situ Accelerator. Shuangchen Li, Alvin Oliver Glova, Xing Hu, Peng Gu, Dimin Niu, Krishna T. Malladi, Hongzhong Zheng, Bob Brennan, Yuan Xie. 2018 MICRO.
-
: Virtual memory
led by Sungbo
HawkEye: Efficient Fine-grained OS Support for Huge Pages. Ashish Panwar, Sorav Bansal, K. Gopinath. 2019 ASPLOS.