Heterogeneous system architecture: a new compute platform infrastructure

Heterogeneous Systems Architecture - a new compute platform infrastructure presents a next-generation hardware platform, and associated software, that allows processors of different types to work efficiently and cooperatively in shared memory from a single source program. HSA also defines a virtual...

Bibliographic Details
Other Authors: Hwu, Wen-mei (author), Hwu, Wen-mei W. (editor)
Format: Electronic book
Language: English
Published: Amsterdam, [Netherlands] : Morgan Kaufmann, 2016.
Edition: First edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009629688606719
Table of Contents:
  • Front Cover
  • Heterogeneous System Architecture: A New Compute Platform Infrastructure
  • Copyright
  • Contents
  • Foreword
  • Preface
  • About the Contributing Authors
  • Chapter 1: Introduction
  • Chapter 2: HSA Overview
  • 2.1 A Short History of GPU Computing: The Problems That Are Solved by HSA
  • 2.2 The Pillars of HSA
  • 2.2.1 HSA Memory Model
  • 2.2.2 HSA Queuing Model
  • 2.2.3 HSAIL Virtual ISA
  • 2.2.4 HSA Context Switching
  • 2.3 The HSA Specifications
  • 2.3.1 HSA Platform System Architecture Specification
  • 2.3.2 HSA Runtime Specification
  • 2.3.3 HSA Programmer's Reference Manual, a.k.a. "HSAIL Spec"
  • 2.4 HSA Software
  • 2.5 The HSA Foundation
  • 2.6 Summary
  • Chapter 3: HSAIL - Virtual Parallel ISA
  • 3.1 Introduction
  • 3.2 Sample Compilation Flow
  • 3.3 HSAIL Execution Model
  • 3.4 A Tour of the HSAIL Instruction Set
  • 3.4.1 Atomic Operations
  • 3.4.2 Registers
  • 3.4.3 Segments
  • 3.4.4 Wavefronts and Lanes
  • 3.5 HSAIL Machine Models and Profiles
  • 3.6 HSAIL Compilation Flow
  • 3.7 HSAIL Compilation Tools
  • 3.7.1 Compiler Frameworks
  • 3.7.2 CL Offline Compilation (CLOC)
  • 3.7.3 HSAIL Assembler/Disassembler
  • 3.7.4 ISA and Machine Code Assembler/Disassembler
  • 3.8 Conclusion
  • Chapter 4: HSA Runtime
  • 4.1 Introduction
  • 4.2 The HSA Core Runtime API
  • 4.2.1 Runtime Initialization and Shutdown
  • 4.2.2 Runtime Notifications
  • 4.2.3 System and HSA Agent Information
  • 4.2.4 Signals
  • 4.2.5 Queues
  • 4.2.6 Architected Queuing Language
  • 4.2.7 Memory
  • 4.2.8 Code Objects and Executables
  • 4.3 HSA Runtime Extensions
  • 4.3.1 HSAIL Finalization
  • 4.3.2 Images and Samplers
  • 4.4 Conclusion
  • References
  • Chapter 5: HSA Memory Model
  • 5.1 Introduction
  • 5.2 HSA Memory Structure
  • 5.2.1 Segments
  • 5.2.2 Flat Addressing
  • 5.2.3 Shared Virtual Addressing
  • 5.2.4 Ownership
  • 5.2.5 Image Memory
  • 5.3 HSA Memory Consistency Basics
  • 5.3.1 Background: Sequential Consistency
  • 5.3.2 Background: Conflicts and Races
  • 5.3.3 The HSA Memory Model for a Single Memory Scope
  • 5.3.3.1 HSA synchronization operations
  • 5.3.3.2 Transitive synchronization through different addresses
  • 5.3.3.3 Finding a race
  • 5.3.4 HSA Memory Model Using Memory Scopes
  • 5.3.4.1 Scope motivation
  • 5.3.4.2 HSA scopes
  • 5.3.4.3 Using smaller scopes
  • Scope inclusion
  • Scope transitivity
  • 5.3.5 Memory Segments
  • 5.3.6 Putting It All Together: HSA Race Freedom
  • 5.3.6.1 Simplified definition of HSA race freedom
  • 5.3.6.2 General definition of HSA race freedom
  • 5.3.7 Additional Observations and Considerations
  • 5.4 Advanced Consistency in the HSA Memory Model
  • 5.4.1 Relaxed Atomics
  • 5.4.2 Ownership and Scope Bounding
  • 5.5 Conclusions
  • References
  • Chapter 6: HSA Queuing Model
  • 6.1 Introduction
  • 6.2 User Mode Queues
  • 6.3 Architected Queuing Language
  • 6.3.1 Packet Types
  • 6.3.2 Building Packets
  • 6.4 Packet Submission and Scheduling
  • 6.5 Conclusions
  • References
  • Chapter 7: Compiler Technology
  • 7.1 Introduction
  • 7.2 A Brief Introduction to C++ AMP
  • 7.2.1 C++ AMP array_view
  • 7.2.2 C++ AMP parallel_for_each, or Kernel Invocation
  • 7.2.2.1 Lambdas or functors as kernels
  • 7.2.2.2 Captured variables as kernel arguments
  • 7.2.2.3 The restrict(amp) modifier
  • 7.3 HSA as a Compiler Target
  • 7.4 Mapping Key C++ AMP Constructs to HSA
  • 7.5 C++ AMP Compilation Flow
  • 7.6 Compiled C++ AMP Code
  • 7.7 Compiler Support for Tiling in C++ AMP
  • 7.7.1 Dividing Compute Domain
  • 7.7.2 Specifying Address Space and Barriers
  • 7.8 Memory Segment Annotation
  • 7.9 Towards Generic C++ for HSA
  • 7.10 Compiler Support for Platform Atomics
  • 7.10.1 One Simple Example of Platform Atomics
  • 7.11 Compiler Support for New/Delete Operators
  • 7.11.1 Implementing New/Delete Operators with Platform Atomics
  • 7.11.2 Promoting New/Delete Returned Address to Global Memory Segment
  • 7.11.3 Improve New/Delete Operators Based on Wait API/Signal HSAIL Instruction
  • 7.12 Conclusion
  • References
  • Chapter 8: Application Use Cases
  • Platform Atomics
  • 8.1 Introduction
  • 8.2 Atomics in HSA
  • 8.3 Task Queue System
  • 8.3.1 Static Execution
  • 8.3.2 Dynamic Execution
  • 8.3.3 HSA Task Queue System
  • 8.3.3.1 A legacy task queue system on GPU
  • 8.3.3.2 A simpler, more intuitive implementation with HSA features
  • 8.3.4 Evaluation
  • 8.3.4.1 An experiment with synthetic input data
  • 8.3.4.2 A real-world application experiment: histogram computation
  • 8.4 Breadth-First Search
  • 8.4.1 Legacy Implementation
  • 8.4.2 HSA Implementation
  • 8.4.3 Evaluation
  • 8.5 Data Layout Conversion
  • 8.5.1 In-place SoA-ASTA Conversion with PTTWAC Algorithm
  • 8.5.2 An HSA Implementation of PTTWAC
  • 8.5.3 Evaluation
  • 8.6 Conclusions
  • Acknowledgment
  • References
  • Chapter 9: HSA Simulators
  • 9.1 Simulating HSA in Multi2Sim
  • 9.1.1 Introduction
  • 9.1.2 Multi2Sim - HSA
  • 9.1.3 HSAIL Host HSA
  • 9.1.3.1 Program entry
  • 9.1.3.2 HSA runtime interception
  • 9.1.3.3 Basic I/O support
  • 9.1.4 HSA Runtime
  • 9.1.5 Emulator Design
  • 9.1.5.1 Emulator hierarchy
  • 9.1.5.2 Memory systems
  • 9.1.6 Logging and Debugging
  • 9.1.7 Multi2Sim - HSA Road Map
  • 9.1.8 Installation and Support
  • 9.2 Emulating HSA with HSAemu
  • 9.2.1 Introduction
  • 9.2.2 Modeled HSA Components
  • 9.2.3 Design of HSAemu
  • 9.2.4 Multithreaded HSA GPU Emulator
  • 9.2.4.1 HSA agent and packet processor
  • 9.2.4.2 Code cache
  • 9.2.4.3 HSA kernel agent and work scheduling
  • 9.2.4.4 Compute unit
  • 9.2.4.5 Soft-MMU and soft-TLB
  • 9.2.5 Profiling, Debugging and Performance Models
  • 9.3 Soft HSA Simulator
  • 9.3.1 Introduction
  • 9.3.2 High-Level Design
  • 9.3.3 Building and Testing the Simulator
  • 9.3.4 Debugging with the LLVM HSA Simulator
  • References
  • Index
  • Back Cover