ICML 2023
Papers
ICML 2023 is underway. I'm tracking papers mainly on distillation, quantization, and hardware-aware deep learning.
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
Jinqi Xiao, Miao Yin, Yu Gong, Xiao Zang, Jian Ren, Bo Yuan

DIVISION: Memory Efficient Training via Dual Activation Precision
Guanchu Wang, Zirui Liu, Zhimeng Jiang, Ninghao Liu, Na Zou, Xia Hu

Fast Private Kernel Density Estimation via Locality Sensitive Quantization
Tal Wagner, Yonatan Naamad, Nina Mishra

NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning
Tianxin Wei, Zeming Guo, Yifan Chen, Jingrui He

On the Impact of Knowledge Distillation for Model Interpretability
Hyeongrok Han, Siwon Kim, Hyun-Soo Choi, Sungroh Yoon

Quantized Distributed Training of Large Models with Convergence Guarantees
Ilia Markov, Adrian Vladu, Qi Guo, Dan Alistarh

Random Teachers are Good Teachers
Felix Sarnthein, Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann

Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks
Minyoung Huh, Brian Cheung, Pulkit Agrawal, Phillip Isola

The case for 4-bit precision: k-bit Inference Scaling Laws
Tim Dettmers, Luke Zettlemoyer

Fast Inference from Transformers via Speculative Decoding
Yaniv Leviathan, Matan Kalman, Yossi Matias

Scaling Vision Transformers to 22 Billion Parameters
Mostafa Dehghani et al.

BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models
Taebum Kim, Hyoungjoo Kim, Gyeong-In Yu, Byung-Gon Chun

Brainformers: Trading Simplicity for Efficiency
Yanqi Zhou et al.

Cramming: Training a Language Model on a Single GPU in One Day
Jonas Geiping, Tom Goldstein

LookupFFN: Making Transformers Compute-lite for CPU inference
Zhanpeng Zeng, Michael Davies, Pranav Pulijala, Karthikeyan Sankaralingam, Vikas Singh

Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch
Xunyi Zhao, Théotime Le Hellard, Lionel Eyraud, Julia Gusak, Olivier Beaumont

SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge
Mahdi Nikdan, Tommaso Pegolotti, Eugenia Iofinova, Eldar Kurtic, Dan Alistarh

Understanding INT4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases
Xiaoxia Wu, Cheng Li, Reza Yazdani Aminabadi, Zhewei Yao, Yuxiong He

Are Large Kernels Better Teachers than Transformers for ConvNets?
Tianjin Huang, Lu Yin, Zhenyu Zhang, Li Shen, Meng Fang, Mykola Pechenizkiy, Zhangyang Wang, Shiwei Liu
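Of the papers above, the idea behind Fast Inference from Transformers via Speculative Decoding (Leviathan et al.) is simple enough to sketch: a cheap draft model proposes a few tokens ahead, and the expensive target model verifies them in one sweep, accepting the agreeing prefix. The toy below is only an illustration under greedy decoding (so "verification" reduces to an argmax match); `draft_next` and `target_next` are hypothetical stand-in functions over a 10-token vocabulary, not real language models.

```python
# Toy sketch of speculative decoding, assuming greedy decoding.
# draft_next / target_next are made-up stand-ins for a small draft
# model and a large target model over tokens 0..9.

def draft_next(seq):
    # cheap draft model: predicts (last token + 1) mod 10
    return (seq[-1] + 1) % 10

def target_next(seq):
    # expensive target model: same rule, except after token 5 it emits 0
    return 0 if seq[-1] == 5 else (seq[-1] + 1) % 10

def speculative_step(seq, k=4):
    """Draft k tokens ahead, then verify them against the target.
    Accept the longest agreeing prefix; on the first mismatch, take
    the target's token instead, so each step gains at least 1 token."""
    proposal = list(seq)
    for _ in range(k):
        proposal.append(draft_next(proposal))
    accepted = list(seq)
    for i in range(len(seq), len(proposal)):
        t = target_next(proposal[:i])
        if t == proposal[i]:
            accepted.append(t)   # draft agreed: token accepted for free
        else:
            accepted.append(t)   # disagreement: fall back to target token
            break
    return accepted

seq = [3]
while len(seq) < 8:
    seq = speculative_step(seq)
print(seq)  # [3, 4, 5, 0, 1, 2, 3, 4]
```

In the real method the target model scores all k draft positions in a single forward pass (that is where the speedup comes from), and verification is a probabilistic accept/reject rule rather than an exact match.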
Workshops
- Knowledge and Logical Reasoning in the Era of Data-driven Learning
- Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities
- Challenges of Deploying Generative AI
- Efficient Systems for Foundation Models
- Neural Compression
Additional references worth a look
[1] ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
[2] Backprop with Approximate Activations for Memory-efficient Network Training
[3] AC-GC: Lossy Activation Compression with Guaranteed Convergence
[4] More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity
[5] A ConvNet for the 2020s
[7] Dead Pixel Test Using Effective Receptive Field
[8] Demystify Transformers & Convolutions in Modern Image Deep Networks
[9] Fast-ParC: Position Aware Global Kernel for ConvNets and ViTs
[10] Neural Tangent Kernel: Convergence and Generalization in Neural Networks