Heterogeneous System Architecture: A New Compute Platform Infrastructure

Heterogeneous Systems Architecture - a new compute platform infrastructure presents a next-generation hardware platform, and associated software, that allows processors of different types to work efficiently and cooperatively in shared memory from a single source program. HSA also defines a virtual...

Bibliographic Details
Other Authors:	Hwu, Wen-mei (author); Hwu, Wen-mei W. (editor)
Format:	E-book
Language:	English
Published:	Amsterdam, [Netherlands] : Morgan Kaufmann 2016.
Edition:	First edition
Subjects:	Heterogeneous computing. Computer architecture.
View at Universitat Ramon Llull Library:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009629688606719

Table of Contents:

Front Cover
Heterogeneous System Architecture: A New Compute Platform Infrastructure
Copyright
Contents
Foreword
Preface
About the Contributing Authors
Chapter 1: Introduction
Chapter 2: HSA Overview
2.1 A Short History of GPU Computing: The Problems That Are Solved by HSA
2.2 The Pillars of HSA
2.2.1 HSA Memory Model
2.2.2 HSA Queuing Model
2.2.3 HSAIL Virtual ISA
2.2.4 HSA Context Switching
2.3 The HSA Specifications
2.3.1 HSA Platform System Architecture Specification
2.3.2 HSA Runtime Specification
2.3.3 HSA Programmer's Reference Manual - a.k.a. "HSAIL Spec"
2.4 HSA Software
2.5 The HSA Foundation
2.6 Summary
Chapter 3: HSAIL - Virtual Parallel ISA
3.1 Introduction
3.2 Sample Compilation Flow
3.3 HSAIL Execution Model
3.4 A Tour of the HSAIL Instruction Set
3.4.1 Atomic Operations
3.4.2 Registers
3.4.3 Segments
3.4.4 Wavefronts and Lanes
3.5 HSAIL Machine Models and Profiles
3.6 HSAIL Compilation Flow
3.7 HSAIL Compilation Tools
3.7.1 Compiler Frameworks
3.7.2 CL Offline Compilation (CLOC)
3.7.3 HSAIL Assembler/Disassembler
3.7.4 ISA and Machine Code Assembler/Disassembler
3.8 Conclusion
Chapter 4: HSA Runtime
4.1 Introduction
4.2 The HSA Core Runtime API
4.2.1 Runtime Initialization and Shutdown
4.2.2 Runtime Notifications
4.2.3 System and HSA Agent Information
4.2.4 Signals
4.2.5 Queues
4.2.6 Architected Queuing Language
4.2.7 Memory
4.2.8 Code Objects and Executables
4.3 HSA Runtime Extensions
4.3.1 HSAIL Finalization
4.3.2 Images and Samplers
4.4 Conclusion
References
Chapter 5: HSA Memory Model
5.1 Introduction
5.2 HSA Memory Structure
5.2.1 Segments
5.2.2 Flat Addressing
5.2.3 Shared Virtual Addressing
5.2.4 Ownership
5.2.5 Image Memory
5.3 HSA Memory Consistency Basics
5.3.1 Background: Sequential Consistency
5.3.2 Background: Conflicts and Races
5.3.3 The HSA Memory Model for a Single Memory Scope
5.3.3.1 HSA synchronization operations
5.3.3.2 Transitive synchronization through different addresses
5.3.3.3 Finding a race
5.3.4 HSA Memory Model Using Memory Scopes
5.3.4.1 Scope motivation
5.3.4.2 HSA scopes
5.3.4.3 Using smaller scopes
Scope inclusion
Scope transitivity
5.3.5 Memory Segments
5.3.6 Putting It All Together: HSA Race Freedom
5.3.6.1 Simplified definition of HSA race freedom
5.3.6.2 General definition of HSA race freedom
5.3.7 Additional Observations and Considerations
5.4 Advanced Consistency in the HSA Memory Model
5.4.1 Relaxed Atomics
5.4.2 Ownership and Scope Bounding
5.5 Conclusions
References
Chapter 6: HSA Queuing Model
6.1 Introduction
6.2 User Mode Queues
6.3 Architected Queuing Language
6.3.1 Packet Types
6.3.2 Building Packets
6.4 Packet Submission and Scheduling
6.5 Conclusions
References
Chapter 7: Compiler Technology
7.1 Introduction
7.2 A Brief Introduction to C++ AMP
7.2.1 C++ AMP array_view
7.2.2 C++ AMP parallel_for_each, or Kernel Invocation
7.2.2.1 Lambdas or functors as kernels
7.2.2.2 Captured variables as kernel arguments
7.2.2.3 The restrict(amp) modifier
7.3 HSA as a Compiler Target
7.4 Mapping Key C++ AMP Constructs to HSA
7.5 C++ AMP Compilation Flow
7.6 Compiled C++ AMP Code
7.7 Compiler Support for Tiling in C++ AMP
7.7.1 Dividing Compute Domain
7.7.2 Specifying Address Space and Barriers
7.8 Memory Segment Annotation
7.9 Towards Generic C++ for HSA
7.10 Compiler Support for Platform Atomics
7.10.1 One Simple Example of Platform Atomics
7.11 Compiler Support for New/Delete Operators
7.11.1 Implementing New/Delete Operators with Platform Atomics
7.11.2 Promoting New/Delete Returned Address to Global Memory Segment
7.11.3 Improve New/Delete Operators Based on Wait API/Signal HSAIL Instruction
7.12 Conclusion
References
Chapter 8: Application Use Cases
Platform Atomics
8.1 Introduction
8.2 Atomics in HSA
8.3 Task Queue System
8.3.1 Static Execution
8.3.2 Dynamic Execution
8.3.3 HSA Task Queue System
8.3.3.1 A legacy task queue system on GPU
8.3.3.2 A simpler, more intuitive implementation with HSA features
8.3.4 Evaluation
8.3.4.1 An experiment with synthetic input data
8.3.4.2 A real-world application experiment: histogram computation
8.4 Breadth-First Search
8.4.1 Legacy Implementation
8.4.2 HSA Implementation
8.4.3 Evaluation
8.5 Data Layout Conversion
8.5.1 In-place SoA-ASTA Conversion with PTTWAC Algorithm
8.5.2 An HSA Implementation of PTTWAC
8.5.3 Evaluation
8.6 Conclusions
Acknowledgment
References
Chapter 9: HSA Simulators
9.1 Simulating HSA in Multi2Sim
9.1.1 Introduction
9.1.2 Multi2Sim - HSA
9.1.3 HSAIL Host HSA
9.1.3.1 Program entry
9.1.3.2 HSA runtime interception
9.1.3.3 Basic I/O support
9.1.4 HSA Runtime
9.1.5 Emulator Design
9.1.5.1 Emulator hierarchy
9.1.5.2 Memory systems
9.1.6 Logging and Debugging
9.1.7 Multi2Sim - HSA Road Map
9.1.8 Installation and Support
9.2 Emulating HSA with HSAemu
9.2.1 Introduction
9.2.2 Modeled HSA Components
9.2.3 Design of HSAemu
9.2.4 Multithreaded HSA GPU Emulator
9.2.4.1 HSA agent and packet processor
9.2.4.2 Code cache
9.2.4.3 HSA kernel agent and work scheduling
9.2.4.4 Compute unit
9.2.4.5 Soft-MMU and soft-TLB
9.2.5 Profiling, Debugging and Performance Models
9.3 Soft HSA Simulator
9.3.1 Introduction
9.3.2 High-Level Design
9.3.3 Building and Testing the Simulator
9.3.4 Debugging with the LLVM HSA Simulator
References
Index
Back Cover