Machine Learning Hardware Reading Group
For announcements, subscribe to the mailing list by sending a message to firstname.lastname@example.org. Papers and comments are stored on the COECIS internal GitHub instance.
You can subscribe to a calendar for this schedule and edit the schedule on GitHub.
Here’s the schedule for the current semester, Spring 2018. We meet every other Tuesday in Rhodes 471E at 11:15am. You can also see archived semesters.
led by Adrian
Selecting papers for the semester.
Systems & Datacenters
led by Mark
Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective. Kim Hazelwood, Sarah Bird, David Brooks, Soumith Chintala, Utku Diril, Dmytro Dzhulgakov, Mohamed Fawzy, Bill Jia, Yangqing Jia, Aditya Kalro, James Law, Kevin Lee, Jason Lu, Pieter Noordhuis, Misha Smelyanskiy, Liang Xiong, Xiaodong Wang. HPCA 2018.
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training. Yujun Lin, Song Han, Huizi Mao, Yu Wang, William J. Dally. ICLR 2018.
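As background for the Deep Gradient Compression discussion: the core mechanism is top-k gradient sparsification with local residual accumulation, so each worker transmits only the largest-magnitude gradient entries and carries the rest forward. A minimal sketch (an illustration, not the paper's full method, which also adds momentum correction and clipping):

```python
import numpy as np

def topk_sparsify(grad, ratio=0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient
    entries; return the sparse update to transmit and the residual
    to accumulate locally before the next selection."""
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    # indices of the k largest-magnitude entries (unordered partial sort)
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    sparse = np.zeros_like(flat)
    sparse[idx] = flat[idx]
    residual = flat - sparse  # carried over to the next step
    return sparse.reshape(grad.shape), residual.reshape(grad.shape)

rng = np.random.default_rng(0)
g = rng.normal(size=(4, 5))
sparse, residual = topk_sparsify(g, ratio=0.1)
print(np.count_nonzero(sparse))  # only 2 of 20 entries are sent
```

Because `sparse + residual` reconstructs the original gradient exactly, no information is lost over time; it is merely delayed, which is why aggressive ratios (the paper reports up to 600x compression) can still converge.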
led by Ritchie
Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks. Hardik Sharma, Jongse Park, Naveen Suda, Liangzhen Lai, Benson Chau, Joon Kyung Kim, Vikas Chandra, Hadi Esmaeilzadeh. Unpublished.
Towards Accurate Binary Convolutional Neural Network. Xiaofan Lin, Cong Zhao, Wei Pan. NIPS 2017.
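For context on the binary-network paper: the standard one-bit weight approximation (which this paper generalizes to multiple binary bases) replaces a real-valued weight tensor with a scalar times its sign, with the scalar chosen to minimize the L2 approximation error. A minimal sketch of the single-base case:

```python
import numpy as np

def binarize(w):
    """One-bit weight approximation: w ~= alpha * sign(w),
    where alpha = mean(|w|) minimizes the L2 error for a
    single binary base."""
    alpha = np.mean(np.abs(w))
    return alpha, np.sign(w)

w = np.array([0.5, -1.2, 0.3, -0.8])
alpha, b = binarize(w)
print(alpha, b)  # alpha ~= 0.7, b = [1, -1, 1, -1]
```

Storing `b` costs one bit per weight, and the dot products in a convolution reduce to XNOR/popcount operations; the paper's contribution is closing the accuracy gap by using a weighted sum of several such binary bases.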
led by Skand
Primary: Neurostream: Scalable and Energy Efficient Deep Learning with Smart Memory Cubes. Erfan Azarkhish, Davide Rossi, Igor Loi, Luca Benini. IEEE TPDS, 2018.
Secondary: TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory. Mingyu Gao, Jing Pu, Xuan Yang, Mark Horowitz, and Christos Kozyrakis. ASPLOS 2017.
Backup: Neurocube: A Programmable Digital Neuromorphic Architecture with High-Density 3D Memory. Duckhwan Kim, Jaeha Kung, Sek Chai, Sudhakar Yalamanchili, Saibal Mukhopadhyay. ISCA 2016.
Large Minibatch Training
led by Ben
Primary: Train longer, generalize better: closing the generalization gap in large batch training of neural networks. Elad Hoffer, Itay Hubara, Daniel Soudry. NIPS 2017.
Secondary: Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He. FAIR Technical Report.
Bonus: ImageNet Training in Minutes. Yang You, Zhao Zhang, Cho-Jui Hsieh, James Demmel, Kurt Keutzer. Unpublished.
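A recurring recipe across these large-minibatch papers is the linear scaling rule with gradual warmup: scale the learning rate proportionally with the batch size, but ramp up to the scaled rate over the first few epochs to avoid early instability. A minimal sketch (the function name and step granularity are illustrative, not from any one paper):

```python
def scaled_lr(base_lr, base_batch, batch, step, warmup_steps):
    """Linear scaling rule with linear warmup: the target rate is
    base_lr * (batch / base_batch); during warmup, ramp from
    base_lr to the target."""
    target = base_lr * batch / base_batch
    if step < warmup_steps:
        return base_lr + (target - base_lr) * step / warmup_steps
    return target

print(scaled_lr(0.1, 256, 8192, step=0, warmup_steps=100))    # 0.1
print(scaled_lr(0.1, 256, 8192, step=100, warmup_steps=100))  # 3.2
```

This is how Goyal et al. keep the effective per-example update comparable when moving from a batch of 256 to 8192, which is the setting the "ImageNet in 1 Hour" and "ImageNet Training in Minutes" results build on.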
led by Phil & Skand
Glimpse: A Programmable Early-Discard Camera Architecture for Continuous Mobile Vision. Saman Naderiparizi, Pengyu Zhang, Matthai Philipose, Bodhi Priyantha, Jie Liu, and Deepak Ganesan. MobiSys 2017.
TVM: End-to-End Optimization Stack for Deep Learning. Tianqi Chen, Thierry Moreau, Ziheng Jiang, Haichen Shen, Eddie Yan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy. Unpublished.
Mobile Machine Learning Hardware at ARM: An SoC Perspective. Yuhao Zhu, Matthew Mattina, Paul Whatmough. SysML 2018.
Structure & Metalearning
led by Sachille
Dynamic Optimization of Neural Network Structures Using Probabilistic Modeling. Shinichi Shirakawa, Yasushi Iwata, Youhei Akimoto. AAAI 2018.
Flexible Deep Neural Network Processing. Hokchhay Tann, Soheil Hashemi, Sherief Reda. Unpublished.
Online Deep Learning: Learning Deep Neural Networks on the Fly. Doyen Sahoo, Quang Pham, Jing Lu, Steven C.H. Hoi. Unpublished.