Fundamentals of Analytics Engineering: An Introduction to Building End-to-End Analytics Solutions

Gain a holistic understanding of the analytics engineering lifecycle by integrating principles from both data analysis and engineering.

Key Features:
  • Discover how analytics engineering aligns with your organization's data strategy
  • Access insights shared by a team of seven industry experts
  • Tackle...

Bibliographic Details
Other Authors: Wilde, Dumky De, author
Format: E-book
Language: English
Published: Birmingham, England : Packt Publishing [2024]
Edition: First edition
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009810644306719
Table of Contents:
  • Cover
  • Title Page
  • Copyright and Credits
  • Dedications
  • Foreword
  • Contributors
  • Table of Contents
  • Preface
  • Prologue
  • Part 1: Introduction to Analytics Engineering
  • Chapter 1: What Is Analytics Engineering?
  • Introducing analytics engineering
  • Defining analytics engineering
  • Why do we need analytics engineering?
  • A supermarket analogy
  • The shift from ETL to ELT
  • The difference between analytics engineers, data analysts, and data engineers
  • Summary
  • Chapter 2: The Modern Data Stack
  • Understanding a Modern Data Stack
  • Explaining three key differentiators versus legacy stacks
  • Lowering technical barriers with a SQL-first approach
  • Improving infrastructure efficiency with cloud-native systems
  • Simplifying implementation and maintenance with managed and modular solutions
  • Discussing the advantages and disadvantages of the MDS
  • Summary
  • Part 2: Building Data Pipelines
  • Chapter 3: Data Ingestion
  • Digging into the problem of moving data between two systems
  • The source of all problems
  • Understanding the eight essential steps of a data ingestion pipeline
  • Trigger
  • Connection
  • State management
  • Data extraction
  • Transformations
  • Validation and data quality
  • Loading
  • Archiving and retention
  • Managing the quality and scalability of data ingestion pipelines - the three key topics
  • Scalability and resilience
  • Monitoring, logging, and alerting
  • Governance
  • Working with data ingestion - an example pipeline
  • Summary
  • Chapter 4: Data Warehousing
  • Uncovering the evolution of data warehousing
  • The problem with transactional databases
  • The history of data warehouses
  • Moving to the cloud
  • Benefits of cloud versus on-premises data warehouses
  • Cloud data warehouse users - no one-size fits all
  • Building blocks of a cloud data warehouse
  • Compute
  • Knowing the market leaders in cloud data warehousing
  • Amazon Redshift
  • Google BigQuery
  • Snowflake
  • Databricks
  • Use case - choosing the right cloud data warehouse
  • Managed versus self-hosted data warehouses
  • Summary
  • Chapter 5: Data Modeling
  • The importance of data models
  • Completeness
  • Enforcement of business rules
  • Minimizing redundancy
  • Data reusability
  • Stability and flexibility
  • Elegance
  • Communication
  • Integration
  • Potential trade-offs
  • The elephant in the room - performance
  • Designing your data model
  • Data modeling techniques
  • Bill Inmon and relational modeling
  • Ralph Kimball and dimensional modeling
  • Daniel Linstedt and Data Vault
  • Comparison of the different data models
  • Choosing a data model
  • Summary
  • Chapter 6: Transforming Data
  • Transforming data - the foundation of analytics work
  • A key step in the data value chain
  • Challenges in transforming data
  • Design choices
  • Where to apply transformations
  • Specify your data model
  • Layering transformations
  • Data transformation best practices
  • Readability and reusability first, optimization second
  • Modularity
  • Other best practices
  • An example of writing modular code
  • Tools that facilitate data transformations
  • Types of transformation tools
  • Considerations
  • Summary
  • Chapter 7: Serving Data
  • Exposing data using dashboarding and BI tools
  • Dashboards
  • Spreadsheets
  • Programming environments
  • Low-code tools
  • Reverse ETL
  • Valuable
  • Usable
  • Sensible
  • Serving data - four key topics
  • Self-serving analytics and report factories
  • Interactive and static reports
  • Actionable and vanity metrics
  • Reusability and bespoke processes
  • Summary
  • Part 3: Hands-On Guide to Building a Data Platform
  • Chapter 8: Hands-On Analytics Engineering
  • Technical requirements
  • Understanding the Stroopwafelshop use case
  • Business objectives, metrics, and KPIs
  • Looking at the data
  • The thing about spreadsheets
  • What about BI tools?
  • The tooling
  • Preparing Google Cloud
  • ELT using Airbyte Cloud
  • Loading the Stroopwafelshop data using Airbyte Cloud
  • Modeling data using dbt Cloud
  • The shortcomings of conventional analytics
  • The role of dbt in analytics engineering
  • Setting up dbt Cloud
  • Data marts
  • Additional dbt features
  • Visualizing data with Tableau
  • Why Tableau?
  • Selecting the KPIs
  • First visualization
  • Creating measures
  • Creating the store growth dashboard
  • What's next?
  • Summary
  • Part 4: DataOps
  • Chapter 9: Data Quality and Observability
  • Understanding the problem of data quality at the source, in transformations, and in data governance
  • Data quality issues in source systems
  • Data quality issues in data infrastructure and data pipelines
  • How data governance impacts data quality
  • Finding solutions to data quality issues - observability, data catalogs, and semantic layers
  • Using observability to improve your data quality
  • The benefits of data catalogs for data quality
  • Improving data quality with a semantic layer
  • Summary
  • Chapter 10: Writing Code in a Team
  • Identifying the responsibilities of team members
  • Tracking tasks and issues
  • Tools for issue and task tracking
  • Clear task definition
  • Categorization and tagging
  • Managing versions with version control
  • Working with Git
  • Git branching
  • Development workflow for analytics engineers
  • Working with coding standards
  • PEP8
  • ANSI
  • Linters
  • Pre-commit hooks
  • Reviewing code
  • Pull requests - The four eyes principle
  • Continuous integration/continuous deployment
  • Documenting code
  • Documenting code in dbt
  • Code comments
  • READMEs
  • Documentation on getting started
  • Conceptual documentation
  • Working with containers
  • Refactoring and technical debt
  • Summary
  • Chapter 11: Automating Workflows
  • Introducing DataOps
  • Orchestrating data pipelines
  • Designing an automated workflow - considerations
  • dbt Cloud
  • Airflow
  • Continuous integration
  • Integration
  • Continuous
  • Handling integration issues
  • Automating testing with a CI pipeline
  • Continuous deployment
  • The CD pipeline
  • Slim CI/CD
  • Configuring CI/CD in dbt Cloud
  • Continuous delivery
  • Continuous delivery versus continuous deployment
  • Summary
  • Part 5: Data Strategy
  • Chapter 12: Driving Business Adoption
  • Defining analytics translation
  • The analytics value chain
  • Scoping analytics use cases
  • Identifying stakeholders
  • Ideating analytics use cases
  • Prioritizing use cases
  • Ensuring business adoption
  • Working incrementally
  • Gathering feedback
  • Knowing when to stop developing
  • Communicating your results
  • Documenting business logic
  • Summary
  • Chapter 13: Data Governance
  • Understanding data governance
  • The objective of data governance
  • Applying data governance in analytics engineering
  • Defining data ownership
  • Data quality and integrity
  • Managing data assets
  • Training, enablement, and best practices
  • Data definitions
  • Addressing critical areas for seamless data governance
  • Resistance to change and adoption
  • Engaging stakeholders and fostering collaboration
  • Establishing a data governance roadmap
  • Summary
  • Chapter 14: Epilogue
  • Reviewing the fundamental insights - what you've learned so far
  • Making your career future-proof - how to take it further
  • Tip #1 - keep learning and developing your skills
  • Tip #2 - network and engage with the community
  • Tip #3 - showcase your work and build a portfolio
  • Closing remarks
  • Index
  • Other Books You May Enjoy