ICML 2023

Papers

ICML 2023 is underway.
I am tracking papers mainly on distillation, quantization, and HW-aware deep learning.

  1. COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
    Jinqi Xiao, Miao Yin, Yu Gong, Xiao Zang, Jian Ren, Bo Yuan

  2. DIVISION: Memory Efficient Training via Dual Activation Precision
    Guanchu Wang, Zirui Liu, Zhimeng Jiang, Ninghao Liu, Na Zou, Xia Hu

  3. Fast Private Kernel Density Estimation via Locality Sensitive Quantization
    Tal Wagner, Yonatan Naamad, Nina Mishra

  4. NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning
    Tianxin Wei, Zeming Guo, Yifan Chen, Jingrui He

  5. On the Impact of Knowledge Distillation for Model Interpretability
    Hyeongrok Han, Siwon Kim, Hyun-Soo Choi, Sungroh Yoon

  6. Quantized Distributed Training of Large Models with Convergence Guarantees
    Ilia Markov, Adrian Vladu, Qi Guo, Dan Alistarh

  7. Random Teachers are Good Teachers
    Felix Sarnthein, Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann

  8. Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks
    Minyoung Huh, Brian Cheung, Pulkit Agrawal, Phillip Isola

  9. The case for 4-bit precision: k-bit Inference Scaling Laws
    Tim Dettmers, Luke Zettlemoyer

  10. Fast Inference from Transformers via Speculative Decoding
    Yaniv Leviathan, Matan Kalman, Yossi Matias

  11. Scaling Vision Transformers to 22 Billion Parameters
    Mostafa Dehghani et al.

  12. BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models
    Taebum Kim, Hyoungjoo Kim, Gyeong-In Yu, Byung-Gon Chun

  13. Brainformers: Trading Simplicity for Efficiency
    Yanqi Zhou et al.

  14. Cramming: Training a Language Model on a Single GPU in One Day
    Jonas Geiping, Tom Goldstein

  15. LookupFFN: Making Transformers Compute-lite for CPU inference
    Zhanpeng Zeng, Michael Davies, Pranav Pulijala, Karthikeyan Sankaralingam, Vikas Singh

  16. Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch
    Xunyi Zhao, Théotime Le Hellard, Lionel Eyraud-Dubois, Julia Gusak, Olivier Beaumont

  17. SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge
    Mahdi Nikdan, Tommaso Pegolotti, Eugenia Iofinova, Eldar Kurtic, Dan Alistarh

  18. Understanding INT4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases
    Xiaoxia Wu, Cheng Li, Reza Yazdani Aminabadi, Zhewei Yao, Yuxiong He

  19. Are Large Kernels Better Teachers than Transformers for ConvNets?
    Tianjin Huang, Lu Yin, Zhenyu Zhang, Li Shen, Meng Fang, Mykola Pechenizkiy, Zhangyang Wang, Shiwei Liu

Workshops

  1. Knowledge and Logical Reasoning in the Era of Data-driven Learning
  2. Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities
  3. Challenges of Deploying Generative AI
  4. Efficient Systems for Foundation Models
  5. Neural Compression

Additional references worth looking into

[1] ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
[2] Backprop with Approximate Activations for Memory-efficient Network Training
[3] AC-GC: Lossy Activation Compression with Guaranteed Convergence
[4] More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity
[5] A ConvNet for the 2020s
[6] Dead Pixel Test Using Effective Receptive Field
[7] Demystify Transformers & Convolutions in Modern Image Deep Networks
[8] Fast-ParC: Position Aware Global Kernel for ConvNets and ViTs
[9] Neural Tangent Kernel: Convergence and Generalization in Neural Networks