
High-Performance AI Systems

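Last updated: 2022-04-11
  • Course category: Intelligent IoT Technology and Application Talent Cultivation Program, RISC-V Course Development Program
  • Course description:
  • Course chapters:
    Chapter contents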
     

    1. Overview of the Course: Why and How to Develop High Performance AI Systems with Hardware-Software Co-Design
    2. Introduction to High Performance AI Systems
    3. Performance Analysis for Deep Learning Systems
    4. Performance Analysis for Deep Learning Systems - Profiling Deep Software Stacks with SOFA
    5. Discussion: Research Papers 

    References:
    1. Amdahl’s Law in the Datacenter Era: A Market for Fair Processor Allocation 
    This paper won the HPCA 2018 Best Paper Award. It explores the future of HPC; although only loosely related to AI, it is still well worth reading.

    2. Towards Pervasive and User Satisfactory CNN across GPU Microarchitecture
    This paper won the HPCA 2017 best paper award and examines CNN performance across GPUs.

    3. Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers
    Published at SC 2018, this paper discusses how to perform large-scale k-means computation.

    4. Anatomy Of High-Performance Deep Learning Convolutions On SIMD Architectures
    Published at SC 2018, this paper dissects deep learning computation on SIMD architectures.

    5. CosmoFlow: Using Deep Learning to Learn the Universe at Scale
    Published at SC 2018, this paper explores how to use deep learning to study cosmology.

    6. Accuracy vs. Efficiency: Achieving Both through FPGA-Implementation Aware Neural Architecture Search
    A Best Paper Award candidate at DAC 2019, this paper explores how to automatically search for and optimize neural networks on FPGAs.

    7. RAPIDNN: In-Memory Deep Neural Network Acceleration Framework
    Explores how to accelerate deep neural networks from the perspective of memory design.

    8. A Survey of Model Compression and Acceleration for Deep Neural Networks
    Explores how to accelerate deep neural networks through compression.

    9. A Survey of FPGA-Based Neural Network Inference Accelerator
    Explores how to design FPGA-based inference accelerators for deep neural networks.

    10. Towards Federated Learning at Scale: System Design
    In this paper Google explains how the large-scale distributed deep learning system it introduced operates.

    11. SCALE-Sim: Systolic CNN Accelerator Simulator
    Discusses how to design a tool that can simulate systolic arrays.

    12. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
    A classic paper, and one of the earlier ones to propose accelerating deep neural networks through compression.

    13. Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
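    This paper broadly surveys parallel and distributed approaches to deep learning; working through it carefully will greatly strengthen your skills.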

    14. Performance Comparison of the Digital Neuromorphic Hardware SpiNNaker and the Neural Network Simulation Software NEST for a Full-Scale Cortical Microcircuit Model
    AI is more than DNNs; if you are interested in Spiking Neural Networks, this paper is worth studying.

    15. Horovod: fast and easy distributed deep learning in TensorFlow
    This paper explores how to optimize distributed deep learning built with TensorFlow.

    16. In-Datacenter Performance Analysis of a Tensor Processing Unit
    Published by Google at ISCA 2017, this paper discusses the architecture and performance of the TPU.

    17. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
    TensorFlow's debut paper, describing its design philosophy for large-scale machine learning on heterogeneous distributed systems.

    18. Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters
    Proposes a software architecture that reduces communication on large GPU clusters.


Course attachments
