Data Cleaning with Power BI The Definitive Guide to Transforming Dirty Data into Actionable Insights

Microsoft Power BI offers a range of powerful data cleaning and preparation options through tools such as DAX, Power Query, and the M language. However, despite its user-friendly interface, mastering it can be challenging. Whether you're a seasoned analyst or a novice exploring the potential of...

Full description

Bibliographic Details
Other Authors: Frazer, Gus, (author) (author)
Format: eBook
Language:Inglés
Published: Birmingham, England : Packt Publishing [2024]
Edition:First edition
Subjects:
See on Biblioteca Universitat Ramon Llull:https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009805122906719
Table of Contents:
  • Cover
  • Title Page
  • Copyright and Credits
  • Dedication
  • Contributors
  • Table of Contents
  • Preface
  • Part 1 - Introduction and Fundamentals
  • Chapter 1: Introduction to Power BI Data Cleaning
  • Technical requirements
  • Cleaning your data in Power BI
  • Understanding Power Query
  • Understanding DAX
  • Where do we begin with data?
  • Summary
  • Questions
  • Chapter 2: Understanding Data Quality and Why Data Cleaning is Important
  • What is data quality?
  • Where do data quality issues come from?
  • The role of data cleaning in improving data quality
  • Data integrity and accuracy
  • Decision-making and business outcomes
  • Data ownership and accountability
  • A holistic view of the data ecosystem
  • Early detection of issues
  • Continuous improvement and learning
  • Empowerment and collaboration
  • Best practices for data quality overall
  • Establishing data quality standards
  • Summary
  • Questions
  • Chapter 3: Data Cleaning Fundamentals and Principles
  • Defining data cleaning
  • Who's responsible for cleaning data?
  • Building a process for cleaning data
  • Data assessment
  • Data profiling
  • Data validation
  • Data cleaning strategies
  • Data transformations
  • Data quality assurance
  • Documentation
  • Understanding quality over quantity in data cleaning
  • Summary
  • Questions
  • Chapter 4: The Most Common Data Cleaning Operations
  • Technical requirements
  • Removing duplicates
  • Removing missing data
  • Splitting columns
  • Merging columns
  • Replacing values
  • Creating calculated columns versus measures
  • Calculated columns
  • Measures
  • Calculation group
  • Considerations
  • Summary
  • Questions
  • Part 2 - Data Import and Query Editor
  • Chapter 5: Importing Data into Power BI
  • Technical requirements
  • Understanding data completeness
  • Understanding data accuracy
  • Understanding data consistency.
  • Assessing data relevance
  • Assessing data formatting
  • Assessing data normalization, denormalization, and star schemas
  • Dimension modeling and star schema
  • Denormalized data in dimension tables
  • Summary
  • Questions
  • Chapter 6: Cleaning Data with Query Editor
  • Technical requirements
  • Understanding the Query Editor interface
  • Data cleaning techniques and functions
  • Adding columns
  • Data type conversions
  • Date/time
  • Rounding
  • Pivot/unpivot columns
  • Merge queries
  • Using Query Editor versus DAX for transformation
  • Power Query Editor
  • Data Analysis Expressions (DAX)
  • Workflow
  • Summary
  • Questions
  • Further reading
  • Chapter 7: Transforming Data with the M Language
  • Technical requirements
  • Understanding the M language
  • Structure of M
  • Common use cases of M
  • Filtering and sorting data with M
  • Transforming data with M
  • Working with data sources in M
  • Creating parameters and variables
  • Summary
  • Questions
  • Chapter 8: Using Data Profiling for Exploratory Data Analysis (EDA)
  • Understanding EDA
  • Exploring data profiling features in Power BI
  • Reviewing column quality, distribution, and profile
  • Column distribution
  • Column quality
  • Column profile
  • Turning data profiles into high-quality data
  • Recommended actions on column distribution
  • Value distribution
  • Summary
  • Questions
  • Part 3 - Advanced Data Cleaning and Optimizations
  • Chapter 9: Advanced Data Cleaning Techniques
  • Technical requirements
  • Using Power Query Editor from within Dataflow Gen1 - fuzzy matching and fill down
  • Fuzzy matching
  • Fill down
  • Best practices for using fuzzy matching and fill down
  • Using R and Python scripts
  • Benefits of using R or Python scripts
  • Getting started with using R or Python scripts in Power BI
  • Using ML to clean data
  • Data cleaning with anomaly detection.
  • Data preparation with AutoML
  • Data enhancement with AI Insights
  • Summary
  • Questions
  • Chapter 10: Creating Custom Functions in Power Query
  • Planning for your custom function
  • Defining the problem
  • Identifying parameters
  • Setting clear objectives
  • Using parameters
  • Types of parameters
  • Defining parameters
  • Best practices for using parameters
  • Creating custom functions
  • Defining the function structure
  • Writing M code
  • Testing and debugging
  • Documentation
  • Summary
  • Questions
  • Chapter 11: M Query Optimization
  • Technical requirements
  • Creating custom functions
  • Filtering and reducing data
  • Using native M functions
  • Optimizing memory usage
  • Parallel query execution
  • Using Table.Buffer and Table.Split
  • Summary
  • Questions
  • Further reading
  • Chapter 12: Data Modeling and Managing Relationships
  • Understanding the basics of data modeling
  • Importing versus DirectQuery
  • Dimensional modeling
  • Snowflake schema
  • Intermediate tables
  • Calendars and date tables
  • Role-playing dimensions
  • Aggregating tables
  • Incremental refreshes
  • Using bidirectional cross-filtering
  • What is bidirectional cross-filtering?
  • Best practices for bidirectional cross-filtering
  • Understanding what's the right cardinality
  • Understanding cardinality
  • Why cardinality matters
  • Choosing the right cardinality
  • Handling large and complex datasets
  • Understanding big data
  • Challenges of working with big data in Power BI
  • Best practices for handling big data
  • Avoiding circular references
  • Understanding circular references
  • Best practices for avoiding circular references
  • Summary
  • Questions
  • Further reading
  • Part 4 - Paginated Reports, Automations, and OpenAI
  • Chapter 13: Preparing Data for Paginated Reporting
  • Technical requirements
  • Understanding the importance of paginated reports.
  • Connecting to data sources within Power BI Report Builder
  • Data preparation
  • Query
  • Fields
  • Options
  • Filters
  • Parameters
  • Creating a dataset example
  • Using filters and parameters
  • Using row groups/column groups
  • Organizing and structuring data
  • Enhancing readability and presentation
  • Summary
  • Questions
  • Chapter 14: Automating Data Cleaning Tasks with Power Automate
  • Technical requirements
  • Handling triggers for automation
  • Automating notifications
  • Automating refreshing of data
  • Creating snapshots (temporary tables) of cleaned data
  • Best practices with Power Automate
  • Summary
  • Questions
  • Further reading
  • Chapter 15: Making Life Easier with OpenAI
  • Optimizing efficiency with OpenAI, ChatGPT, and DAX
  • Using OpenAI for M queries
  • Using Microsoft Copilot
  • Tackling challenges with AI
  • Summary
  • Questions
  • Further reading
  • Putting it together
  • Assessments
  • Chapter 1 - Introduction to Power BI Data Cleaning
  • Chapter 2 - Understanding Data Quality and Why Data Cleaning is Important
  • Chapter 3 - Data Cleaning Fundamentals and Principles
  • Chapter 4 - The Most Common Data Cleaning Operations
  • Chapter 5 - Importing Data into Power BI
  • Chapter 6 - Cleaning Data with Query Editor
  • Chapter 7 - Transforming Data with the M Language
  • Chapter 8 - Using Data Profiling for Exploratory Data Analysis (EDA)
  • Chapter 9 - Advanced Data Cleaning Techniques
  • Chapter 10 - Creating Custom Functions in Power Query
  • Chapter 11 - M Query Optimization
  • Chapter 12 - Data Modeling and Managing Relationships
  • Chapter 13 - Preparing Data for Paginated Reporting
  • Chapter 14 - Automating Data Cleaning Tasks with Power Automate
  • Chapter 15 - Making Life Easier with OpenAI
  • Index
  • About Packt
  • Other Books You May Enjoy.