Data Cleaning with Power BI The Definitive Guide to Transforming Dirty Data into Actionable Insights
Microsoft Power BI offers a range of powerful data cleaning and preparation options through tools such as DAX, Power Query, and the M language. However, despite its user-friendly interface, mastering it can be challenging. Whether you're a seasoned analyst or a novice exploring the potential of...
Other Authors: | |
---|---|
Format: | eBook |
Language: | Inglés |
Published: |
Birmingham, England :
Packt Publishing
[2024]
|
Edition: | First edition |
Subjects: | |
See on Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009805122906719 |
Table of Contents:
- Cover
- Title Page
- Copyright and Credits
- Dedication
- Contributors
- Table of Contents
- Preface
- Part 1 - Introduction and Fundamentals
- Chapter 1: Introduction to Power BI Data Cleaning
- Technical requirements
- Cleaning your data in Power BI
- Understanding Power Query
- Understanding DAX
- Where do we begin with data?
- Summary
- Questions
- Chapter 2: Understanding Data Quality and Why Data Cleaning is Important
- What is data quality?
- Where do data quality issues come from?
- The role of data cleaning in improving data quality
- Data integrity and accuracy
- Decision-making and business outcomes
- Data ownership and accountability
- A holistic view of the data ecosystem
- Early detection of issues
- Continuous improvement and learning
- Empowerment and collaboration
- Best practices for data quality overall
- Establishing data quality standards
- Summary
- Questions
- Chapter 3: Data Cleaning Fundamentals and Principles
- Defining data cleaning
- Who's responsible for cleaning data?
- Building a process for cleaning data
- Data assessment
- Data profiling
- Data validation
- Data cleaning strategies
- Data transformations
- Data quality assurance
- Documentation
- Understanding quality over quantity in data cleaning
- Summary
- Questions
- Chapter 4: The Most Common Data Cleaning Operations
- Technical requirements
- Removing duplicates
- Removing missing data
- Splitting columns
- Merging columns
- Replacing values
- Creating calculated columns versus measures
- Calculated columns
- Measures
- Calculation group
- Considerations
- Summary
- Questions
- Part 2 - Data Import and Query Editor
- Chapter 5: Importing Data into Power BI
- Technical requirements
- Understanding data completeness
- Understanding data accuracy
- Understanding data consistency.
- Assessing data relevance
- Assessing data formatting
- Assessing data normalization, denormalization, and star schemas
- Dimension modeling and star schema
- Denormalized data in dimension tables
- Summary
- Questions
- Chapter 6: Cleaning Data with Query Editor
- Technical requirements
- Understanding the Query Editor interface
- Data cleaning techniques and functions
- Adding columns
- Data type conversions
- Date/time
- Rounding
- Pivot/unpivot columns
- Merge queries
- Using Query Editor versus DAX for transformation
- Power Query Editor
- Data Analysis Expressions (DAX)
- Workflow
- Summary
- Questions
- Further reading
- Chapter 7: Transforming Data with the M Language
- Technical requirements
- Understanding the M language
- Structure of M
- Common use cases of M
- Filtering and sorting data with M
- Transforming data with M
- Working with data sources in M
- Creating parameters and variables
- Summary
- Questions
- Chapter 8: Using Data Profiling for Exploratory Data Analysis (EDA)
- Understanding EDA
- Exploring data profiling features in Power BI
- Reviewing column quality, distribution, and profile
- Column distribution
- Column quality
- Column profile
- Turning data profiles into high-quality data
- Recommended actions on column distribution
- Value distribution
- Summary
- Questions
- Part 3 - Advanced Data Cleaning and Optimizations
- Chapter 9: Advanced Data Cleaning Techniques
- Technical requirements
- Using Power Query Editor from within Dataflow Gen1 - fuzzy matching and fill down
- Fuzzy matching
- Fill down
- Best practices for using fuzzy matching and fill down
- Using R and Python scripts
- Benefits of using R or Python scripts
- Getting started with using R or Python scripts in Power BI
- Using ML to clean data
- Data cleaning with anomaly detection.
- Data preparation with AutoML
- Data enhancement with AI Insights
- Summary
- Questions
- Chapter 10: Creating Custom Functions in Power Query
- Planning for your custom function
- Defining the problem
- Identifying parameters
- Setting clear objectives
- Using parameters
- Types of parameters
- Defining parameters
- Best practices for using parameters
- Creating custom functions
- Defining the function structure
- Writing M code
- Testing and debugging
- Documentation
- Summary
- Questions
- Chapter 11: M Query Optimization
- Technical requirements
- Creating custom functions
- Filtering and reducing data
- Using native M functions
- Optimizing memory usage
- Parallel query execution
- Using Table.Buffer and Table.Split
- Summary
- Questions
- Further reading
- Chapter 12: Data Modeling and Managing Relationships
- Understanding the basics of data modeling
- Importing versus DirectQuery
- Dimensional modeling
- Snowflake schema
- Intermediate tables
- Calendars and date tables
- Role-playing dimensions
- Aggregating tables
- Incremental refreshes
- Using bidirectional cross-filtering
- What is bidirectional cross-filtering?
- Best practices for bidirectional cross-filtering
- Understanding what's the right cardinality
- Understanding cardinality
- Why cardinality matters
- Choosing the right cardinality
- Handling large and complex datasets
- Understanding big data
- Challenges of working with big data in Power BI
- Best practices for handling big data
- Avoiding circular references
- Understanding circular references
- Best practices for avoiding circular references
- Summary
- Questions
- Further reading
- Part 4 - Paginated Reports, Automations, and OpenAI
- Chapter 13: Preparing Data for Paginated Reporting
- Technical requirements
- Understanding the importance of paginated reports.
- Connecting to data sources within Power BI Report Builder
- Data preparation
- Query
- Fields
- Options
- Filters
- Parameters
- Creating a dataset example
- Using filters and parameters
- Using row groups/column groups
- Organizing and structuring data
- Enhancing readability and presentation
- Summary
- Questions
- Chapter 14: Automating Data Cleaning Tasks with Power Automate
- Technical requirements
- Handling triggers for automation
- Automating notifications
- Automating refreshing of data
- Creating snapshots (temporary tables) of cleaned data
- Best practices with Power Automate
- Summary
- Questions
- Further reading
- Chapter 15: Making Life Easier with OpenAI
- Optimizing efficiency with OpenAI, ChatGPT, and DAX
- Using OpenAI for M queries
- Using Microsoft Copilot
- Tackling challenges with AI
- Summary
- Questions
- Further reading
- Putting it together
- Assessments
- Chapter 1 - Introduction to Power BI Data Cleaning
- Chapter 2 - Understanding Data Quality and Why Data Cleaning is Important
- Chapter 3 - Data Cleaning Fundamentals and Principles
- Chapter 4 - The Most Common Data Cleaning Operations
- Chapter 5 - Importing Data into Power BI
- Chapter 6 - Cleaning Data with Query Editor
- Chapter 7 - Transforming Data with the M Language
- Chapter 8 - Using Data Profiling for Exploratory Data Analysis (EDA)
- Chapter 9 - Advanced Data Cleaning Techniques
- Chapter 10 - Creating Custom Functions in Power Query
- Chapter 11 - M Query Optimization
- Chapter 12 - Data Modeling and Managing Relationships
- Chapter 13 - Preparing Data for Paginated Reporting
- Chapter 14 - Automating Data Cleaning Tasks with Power Automate
- Chapter 15 - Making Life Easier with OpenAI
- Index
- About Packt
- Other Books You May Enjoy.