Embedded Computing for High Performance: Design Exploration and Customization Using High-level Compilation and Synthesis Tools
Embedded Computing for High Performance: Design Exploration and Customization Using High-level Compilation and Synthesis Tools provides a set of real-life example implementations that migrate traditional desktop systems to embedded systems. Working with popular hardware, including Xilinx and ARM, th...
Other Authors:
Format: Electronic book
Language: English
Published: Cambridge, Massachusetts: Morgan Kaufmann, 2017
Edition: Second edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630068406719
Table of Contents:
- Front Cover
- Embedded Computing for High Performance: Efficient Mapping of Computations Using Customization, Code Transformations and Compilation
- Copyright
- Dedication
- Contents
- About the Authors
- Preface
- Acknowledgments
- Abbreviations
- Chapter 1: Introduction
- 1.1. Overview
- 1.2. Embedded Systems in Society and Industry
- 1.3. Embedded Computing Trends
- 1.4. Embedded Systems: Prototyping and Production
- 1.5. About LARA: An Aspect-Oriented Approach
- 1.6. Objectives and Target Audience
- 1.7. Complementary Bibliography
- 1.8. Dependences in Terms of Knowledge
- 1.9. Examples and Benchmarks
- 1.10. Book Organization
- 1.11. Intended Use
- 1.12. Summary
- References
- Chapter 2: High-performance embedded computing
- 2.1. Introduction
- 2.2. Target Architectures
- 2.2.1. Hardware Accelerators as Coprocessors
- 2.2.2. Multiprocessor and Multicore Architectures
- 2.2.3. Heterogeneous Multiprocessor/Multicore Architectures
- 2.2.4. OpenCL Platform Model
- 2.3. Core-Based Architectural Enhancements
- 2.3.1. Single Instruction, Multiple Data Units
- 2.3.2. Fused Multiply-Add Units
- 2.3.3. Multithreading Support
- 2.4. Common Hardware Accelerators
- 2.4.1. GPU Accelerators
- 2.4.2. Reconfigurable Hardware Accelerators
- 2.4.3. SoCs With Reconfigurable Hardware
- 2.5. Performance
- 2.5.1. Amdahl's Law
- 2.5.2. The Roofline Model
- 2.5.3. Worst-Case Execution Time Analysis
- 2.6. Power and Energy Consumption
- 2.6.1. Dynamic Power Management
- 2.6.2. Dynamic Voltage and Frequency Scaling
- 2.6.3. Dark Silicon
- 2.7. Comparing Results
- 2.8. Summary
- 2.9. Further Reading
- References
- Chapter 3: Controlling the design and development cycle
- 3.1. Introduction
- 3.2. Specifications in MATLAB and C: Prototyping and Development
- 3.2.1. Abstraction Levels
- 3.2.2. Dealing With Different Concerns
- 3.2.3. Dealing With Generic Code
- 3.2.4. Dealing With Multiple Targets
- 3.3. Translation, Compilation, and Synthesis Design Flows
- 3.4. Hardware/Software Partitioning
- 3.4.1. Static Partitioning
- 3.4.2. Dynamic Partitioning
- 3.5. LARA: A Language for Specifying Strategies
- 3.5.1. Select and Apply
- 3.5.2. Insert Action
- 3.5.3. Exec and Def Actions
- 3.5.4. Invoking Aspects
- 3.5.5. Executing External Tools
- 3.5.6. Compilation and Synthesis Strategies in LARA
- 3.6. Summary
- 3.7. Further Reading
- References
- Chapter 4: Source code analysis and instrumentation
- 4.1. Introduction
- 4.2. Analysis and Metrics
- 4.3. Static Source Code Analysis
- 4.3.1. Data Dependences
- 4.3.2. Code Metrics
- 4.4. Dynamic Analysis: The Need for Instrumentation
- 4.4.1. Information From Profiling
- 4.4.2. Profiling Example
- 4.5. Custom Profiling Examples
- 4.5.1. Finding Hotspots
- 4.5.2. Loop Metrics
- 4.5.3. Dynamic Call Graphs
- 4.5.4. Branch Frequencies
- 4.5.5. Heap Memory
- 4.6. Summary
- 4.7. Further Reading
- References
- Chapter 5: Source code transformations and optimizations
- 5.1. Introduction
- 5.2. Basic Transformations
- 5.3. Data Type Conversions
- 5.4. Code Reordering
- 5.5. Data Reuse
- 5.6. Loop-Based Transformations
- 5.6.1. Loop Alignment
- 5.6.2. Loop Coalescing
- 5.6.3. Loop Flattening
- 5.6.4. Loop Fusion and Loop Fission
- 5.6.5. Loop Interchange and Loop Permutation (Loop Reordering)
- 5.6.6. Loop Peeling
- 5.6.7. Loop Shifting
- 5.6.8. Loop Skewing
- 5.6.9. Loop Splitting
- 5.6.10. Loop Stripmining
- 5.6.11. Loop Tiling (Loop Blocking)
- 5.6.12. Loop Unrolling
- 5.6.13. Unroll and Jam
- 5.6.14. Loop Unswitching
- 5.6.15. Loop Versioning
- 5.6.16. Software Pipelining
- 5.6.17. Evaluator-Executor Transformation
- 5.6.18. Loop Perforation
- 5.6.19. Other Loop Transformations
- 5.6.20. Overview
- 5.7. Function-Based Transformations
- 5.7.1. Function Inlining/Outlining
- 5.7.2. Partial Evaluation and Code Specialization
- 5.7.3. Function Approximation
- 5.8. Data Structure-Based Transformations
- 5.8.1. Scalar Expansion, Array Contraction, and Array Scalarization
- 5.8.2. Scalar and Array Renaming
- 5.8.3. Arrays and Records
- 5.8.4. Reducing the Number of Dimensions of Arrays
- 5.8.5. From Arrays to Pointers and Array Recovery
- 5.8.6. Array Padding
- 5.8.7. Representation of Matrices and Graphs
- 5.8.8. Object Inlining
- 5.8.9. Data Layout Transformations
- 5.8.10. Data Replication and Data Distribution
- 5.9. From Recursion to Iterations
- 5.10. From Nonstreaming to Streaming
- 5.11. Data and Computation Partitioning
- 5.11.1. Data Partitioning
- 5.11.2. Partitioning Computations
- 5.11.3. Computation Offloading
- 5.12. LARA Strategies
- 5.13. Summary
- 5.14. Further Reading
- References
- Chapter 6: Code retargeting for CPU-based platforms
- 6.1. Introduction
- 6.2. Retargeting Mechanisms
- 6.3. Parallelism and Compiler Options
- 6.3.1. Parallel Execution Opportunities
- 6.3.2. Compiler Options
- 6.3.3. Compiler Phase Selection and Ordering
- 6.4. Loop Vectorization
- 6.5. Shared Memory (Multicore)
- 6.6. Distributed Memory (Multiprocessor)
- 6.7. Cache-Based Program Optimizations
- 6.8. LARA Strategies
- 6.8.1. Capturing Heuristics to Control Code Transformations
- 6.8.2. Parallelizing Code With OpenMP
- 6.8.3. Monitoring an MPI Application
- 6.9. Summary
- 6.10. Further Reading
- References
- Chapter 7: Targeting heterogeneous computing platforms
- 7.1. Introduction
- 7.2. Roofline Model Revisited
- 7.3. Workload Distribution
- 7.4. Graphics Processing Units
- 7.5. High-level Synthesis
- 7.6. LARA Strategies
- 7.7. Summary
- 7.8. Further Reading
- References
- Chapter 8: Additional topics
- 8.1. Introduction
- 8.2. Design Space Exploration
- 8.2.1. Single-Objective Optimization and Single/Multiple Criteria
- 8.2.2. Multiobjective Optimization, Pareto Optimal Solutions
- 8.2.3. DSE Automation
- 8.3. Hardware/Software Codesign
- 8.4. Runtime Adaptability
- 8.4.1. Tuning Application Parameters
- 8.4.2. Adaptive Algorithms
- 8.4.3. Resource Adaptivity
- 8.5. Automatic Tuning (Autotuning)
- 8.5.1. Search Space
- 8.5.2. Static and Dynamic Autotuning
- 8.5.3. Models for Autotuning
- 8.5.4. Autotuning Without Dynamic Compilation
- 8.5.5. Autotuning With Dynamic Compilation
- 8.6. Using LARA for Exploration of Code Transformation Strategies
- 8.7. Summary
- 8.8. Further Reading
- References
- Glossary
- Index
- Back Cover.