High performance parallelism pearls. Volume two : multicore and many-core programming approaches

High Performance Parallelism Pearls Volume 2 offers another set of examples that demonstrate how to leverage parallelism. Similar to Volume 1, the techniques included here explain how to use processors and coprocessors with the same programming - illustrating the most effective ways to combine Xeon...

Bibliographic Details
Other Authors:	Reinders, James, author (author), Jeffers, Jim, author (contributor), Amstutz, Jefferson, contributor
Format:	E-book
Language:	English
Published:	Amsterdam, [Netherlands] : Morgan Kaufmann 2015.
Edition:	First edition
Subjects:	Parallel programming (Computer science) > Data processing. Coprocessors. Computer programming.
View at Universitat Ramon Llull Library:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009629585806719

Table of Contents:

Front Cover; High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches; Copyright; Contents; Contributors; Acknowledgments; Foreword; Making a bet on many-core; 2013 Stampede-Intel Many-Core System - A First; HPC journey and revelation; Stampede users discover: It's parallel programming; This book is timely and important; Preface; Inspired by 61 cores: A new era in programming; Chapter 1: Introduction; Applications and techniques; SIMD and vectorization; OpenMP and nested parallelism; Latency optimizations; Python; Streams; Ray tracing; Tuning prefetching
MPI shared memory; Using every last core; OpenCL vs. OpenMP; Power analysis for nodes and clusters; The future of many-core; Downloads; For more information; Chapter 2: Numerical Weather Prediction Optimization; Numerical weather prediction: Background and motivation; WSM6 in the NIM; Shared-memory parallelism and controlling horizontal vector length; Array alignment; Loop restructuring; Compile-time constants for loop and array bounds; Performance improvements; Summary; For more information; Chapter 3: WRF Goddard Microphysics Scheme Optimization; The motivation and background
WRF Goddard microphysics scheme; Goddard microphysics scheme; Benchmark setup; Code optimization; Removal of the vertical dimension from temporary variables for a reduced memory footprint; Collapse i- and j-loops into smaller cells for smaller footprint per thread; Addition of vector alignment directives; Summary of the code optimizations; Analysis using an instruction Mix report; VTune performance metrics; Performance effects of the optimization of Goddard microphysics scheme on the WRF; Summary; Acknowledgments; For more information; Chapter 4: Pairwise DNA Sequence Alignment Optimization
Pairwise sequence alignment; Parallelization on a single coprocessor; Multi-threading using OpenMP; Vectorization using SIMD intrinsics; Parallelization across multiple coprocessors using MPI; Performance results; Summary; For more information; Chapter 5: Accelerated Structural Bioinformatics for Drug Discovery; Parallelism enables proteome-scale structural bioinformatics; Overview of eFindSite; Benchmarking dataset; Code profiling; Porting eFindSite for coprocessor offload; Parallel version for a multicore processor; Task-level scheduling for processor and coprocessor; Case study; Summary
For more information; Chapter 6: Amber PME Molecular Dynamics Optimization; Theory of MD; Acceleration of neighbor list building using the coprocessor; Acceleration of direct space sum using the coprocessor; Additional optimizations in coprocessor code; Removing locks whenever possible; Exclusion list optimization; Reduce data transfer and computation in offload code; Modification of load balance algorithm; PME direct space sum and neighbor list work; PME reciprocal space sum work; Bonded force work; Compiler optimization flags; Results; Conclusions; For more information
Chapter 7: Low-Latency Solutions for Financial Services Applications
