Machine Learning Hardware Reading Group
We meet every other Tuesday in Rhodes 471E at 11:15am. For announcements, subscribe to the mailing list by sending a message to firstname.lastname@example.org. Papers and comments are stored on the COECIS internal GitHub instance.
The schedule for this semester (Spring 2017):
DNNs on FPGAs
led by Ritchie, Shreesha, and Yuan
DNN-on-Catapult talk: Toward Accelerating Deep Learning at Scale Using Specialized Hardware in the Datacenter. Kalin Ovtcharov, Olatunji Ruwase, Joo-Young Kim, Jeremy Fowers, Karin Strauss, and Eric S. Chung. In Hot Chips 2015. (And an accompanying whitepaper.)
DNNWeaver: From High-Level Deep Neural Models to FPGAs. Hardik Sharma, Jongse Park, Divya Mahajan, Emmanuel Amaro, Joon Kyung Kim, Chenkai Shao, Asit Mishra, and Hadi Esmaeilzadeh. In MICRO 2016.
ISAs for neural accelerators
led by Shreesha
Cambricon: An Instruction Set Architecture for Neural Networks. Shaoli Liu, Zidong Du, Jinhua Tao, Dong Han, Tao Luo, Yuan Xie, Yunji Chen, and Tianshi Chen. In ISCA 2016.
A Case for Neuromorphic ISAs. Atif Hashmi, Andrew Nere, James Jamal Thomas, and Mikko Lipasti. In ASPLOS 2011.
Hardware for Halide-like languages
led by Berkin and Ritchie
Rigel: Flexible Multi-Rate Image Processing Hardware. James Hegarty, Ross Daly, Zachary DeVito, Jonathan Ragan-Kelley, Mark Horowitz, and Pat Hanrahan. In SIGGRAPH 2016.
As a secondary reading, "Halide-HLS": Programming Heterogeneous Systems from an Image Processing DSL. Jing Pu, Steven Bell, Xuan Yang, Jeff Setter, Stephen Richardson, Jonathan Ragan-Kelley, and Mark Horowitz. On arXiv.
Compact and compressed DNNs
led by Mark and Ben
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, and Kurt Keutzer. On arXiv.
As background (we read this last semester): Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. Song Han, Huizi Mao, and William J. Dally.
Energy-efficient CNN accelerators
led by Ritchie
Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks. Yu-Hsin Chen, Tushar Krishna, Joel S. Emer, and Vivienne Sze. In ISSCC 2016.
Generative Adversarial Nets (GANs) and the TPU
led by Skand and Ritchie
In-Datacenter Performance Analysis of a Tensor Processing Unit. Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, John Hu, Robert Hundt, Dan Hurt, Julian Ibarz, Aaron Jaffey, Alek Jaworski, Alexander Kaplan, Harshit Khaitan, Andy Koch, Naveen Kumar, Steve Lacy, James Laudon, James Law, Diemthu Le, Chris Leary, Zhuyuan Liu, Kyle Lucke, Alan Lundin, Gordon MacKean, Adriana Maggiore, Maire Mahony, Kieran Miller, Rahul Nagarajan, Ravi Narayanaswami, Ray Ni, Kathy Nix, Thomas Norrie, Mark Omernick, Narayana Penukonda, Andy Phelps, Jonathan Ross, Matt Ross, Amir Salek, Emad Samadiani, Chris Severn, Gregory Sizikov, Matthew Snelham, Jed Souter, Dan Steinberg, Andy Swing, Mercedes Tan, Gregory Thorson, Bo Tian, Horia Toma, Erick Tuttle, Vijay Vasudevan, Richard Walter, Walter Wang, Eric Wilcox, Doe Hyun Yoon. Preprint, to appear in ISCA 2017.
NIPS 2016 tutorial on GANs by Ian Goodfellow.
Short description in KDnuggets.
Step-by-step introduction on Medium. (Skip to "How DCGANs Work.")
Ritchie recommends reading the blog posts and glancing at the paper results.
Analog and ReRAM
led by Mark and Berkin
Memristive Boltzmann Machine: A Hardware Accelerator for Combinatorial Optimization and Deep Learning. Mahdi Nazm Bojnordi and Engin Ipek. In HPCA 2016.
Comparing Stochastic and Deterministic Computing. Rajit Manohar. In IEEE Computer Architecture Letters (CAL), March 2015.
You can subscribe to a calendar for this schedule. And you can edit the schedule on GitHub.