Algorithms and parallel computing
"There is a software gap between the hardware potential and the performance that can be attained using today's software parallel program development tools. The tools need manual intervention by the programmer to parallelize the code. Programming a parallel computer requires closely studyin...
Main author: | |
---|---|
Format: | Electronic book |
Language: | English |
Published: | Hoboken, N.J. : Wiley, 2011. |
Edition: | 1st edition |
Series: | Wiley series on parallel and distributed computing ; 82. |
Subjects: | |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009628991006719 |
Table of Contents:
- Algorithms and Parallel Computing; Contents; Preface; List of Acronyms
- Chapter 1: Introduction; 1.1 INTRODUCTION; 1.2 TOWARD AUTOMATING PARALLEL PROGRAMMING; 1.3 ALGORITHMS; 1.4 PARALLEL COMPUTING DESIGN CONSIDERATIONS; 1.5 PARALLEL ALGORITHMS AND PARALLEL ARCHITECTURES; 1.6 RELATING PARALLEL ALGORITHM AND PARALLEL ARCHITECTURE; 1.7 IMPLEMENTATION OF ALGORITHMS: A TWO-SIDED PROBLEM; 1.8 MEASURING BENEFITS OF PARALLEL COMPUTING; 1.9 AMDAHL'S LAW FOR MULTIPROCESSOR SYSTEMS; 1.10 GUSTAFSON-BARSIS'S LAW; 1.11 APPLICATIONS OF PARALLEL COMPUTING
- Chapter 2: Enhancing Uniprocessor Performance; 2.1 INTRODUCTION; 2.2 INCREASING PROCESSOR CLOCK FREQUENCY; 2.3 PARALLELIZING ALU STRUCTURE; 2.4 USING MEMORY HIERARCHY; 2.5 PIPELINING; 2.6 VERY LONG INSTRUCTION WORD (VLIW) PROCESSORS; 2.7 INSTRUCTION-LEVEL PARALLELISM (ILP) AND SUPERSCALAR PROCESSORS; 2.8 MULTITHREADED PROCESSOR
- Chapter 3: Parallel Computers; 3.1 INTRODUCTION; 3.2 PARALLEL COMPUTING; 3.3 SHARED-MEMORY MULTIPROCESSORS (UNIFORM MEMORY ACCESS [UMA]); 3.4 DISTRIBUTED-MEMORY MULTIPROCESSOR (NONUNIFORM MEMORY ACCESS [NUMA]); 3.5 SIMD PROCESSORS; 3.6 SYSTOLIC PROCESSORS; 3.7 CLUSTER COMPUTING; 3.8 GRID (CLOUD) COMPUTING; 3.9 MULTICORE SYSTEMS; 3.10 SM; 3.11 COMMUNICATION BETWEEN PARALLEL PROCESSORS; 3.12 SUMMARY OF PARALLEL ARCHITECTURES
- Chapter 4: Shared-Memory Multiprocessors; 4.1 INTRODUCTION; 4.2 CACHE COHERENCE AND MEMORY CONSISTENCY; 4.3 SYNCHRONIZATION AND MUTUAL EXCLUSION
- Chapter 5: Interconnection Networks; 5.1 INTRODUCTION; 5.2 CLASSIFICATION OF INTERCONNECTION NETWORKS BY LOGICAL TOPOLOGIES; 5.3 INTERCONNECTION NETWORK SWITCH ARCHITECTURE
- Chapter 6: Concurrency Platforms; 6.1 INTRODUCTION; 6.2 CONCURRENCY PLATFORMS; 6.3 CILK++; 6.4 OpenMP; 6.5 COMPUTE UNIFIED DEVICE ARCHITECTURE (CUDA)
- Chapter 7: Ad Hoc Techniques for Parallel Algorithms; 7.1 INTRODUCTION; 7.2 DEFINING ALGORITHM VARIABLES; 7.3 INDEPENDENT LOOP SCHEDULING; 7.4 DEPENDENT LOOPS; 7.5 LOOP SPREADING FOR SIMPLE DEPENDENT LOOPS; 7.6 LOOP UNROLLING; 7.7 PROBLEM PARTITIONING; 7.8 DIVIDE-AND-CONQUER (RECURSIVE PARTITIONING) STRATEGIES; 7.9 PIPELINING
- Chapter 8: Nonserial-Parallel Algorithms; 8.1 INTRODUCTION; 8.2 COMPARING DAG AND DCG ALGORITHMS; 8.3 PARALLELIZING NSPA ALGORITHMS REPRESENTED BY A DAG; 8.4 FORMAL TECHNIQUE FOR ANALYZING NSPAs; 8.5 DETECTING CYCLES IN THE ALGORITHM; 8.6 EXTRACTING SERIAL AND PARALLEL ALGORITHM PERFORMANCE PARAMETERS; 8.7 USEFUL THEOREMS; 8.8 PERFORMANCE OF SERIAL AND PARALLEL ALGORITHMS ON PARALLEL COMPUTERS
- Chapter 9: z-Transform Analysis; 9.1 INTRODUCTION; 9.2 DEFINITION OF z-TRANSFORM; 9.3 THE 1-D FIR DIGITAL FILTER ALGORITHM; 9.4 SOFTWARE AND HARDWARE IMPLEMENTATIONS OF THE z-TRANSFORM; 9.5 DESIGN 1: USING HORNER'S RULE FOR BROADCAST INPUT AND PIPELINED OUTPUT; 9.6 DESIGN 2: PIPELINED INPUT AND BROADCAST OUTPUT; 9.7 DESIGN 3: PIPELINED INPUT AND OUTPUT
- Chapter 10: Dependence Graph Analysis; 10.1 INTRODUCTION; 10.2 THE 1-D FIR DIGITAL FILTER ALGORITHM