Extending Power BI with Python and R: Perform Advanced Analysis Using the Power of Analytical Languages
The latest edition of this book delves deep into advanced analytics, focusing on enhancing Python and R proficiency within Power BI. New chapters cover optimizing Python and R settings, utilizing Intel's Math Kernel Library (MKL) for performance boosts, and addressing integration challenges. Te...
Other Authors: | , |
---|---|
Format: | eBook |
Language: | English |
Published: | Birmingham, England : Packt Publishing Ltd [2024] |
Edition: | Second edition |
Series: | Expert insight |
See on Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009809015406719 |
Table of Contents:
- Cover
- Copyright
- Contributors
- Table of Contents
- Preface
- Chapter 1: Where and How to Use R and Python Scripts in Power BI
- Technical requirements
- Injecting R or Python scripts into Power BI
- Data loading
- Data transformation
- Data visualization
- Using R and Python to interact with your data
- Python and R compatibility across Power BI products
- Summary
- Test your knowledge
- Chapter 2: Configuring R with Power BI
- Technical requirements
- The available R engines
- The CRAN R distribution
- The Microsoft R Open distribution and MRAN
- Multi-threading in MRO
- Choosing an R engine to install
- The R engines used by Power BI
- Installing the suggested R engines
- The R engine for data transformation
- The R engine for R script visuals on the Power BI service
- What to do when the Power BI service upgrades the R engine
- Installing an IDE for R development
- Installing RStudio
- Installing RTools
- Linking Intel's MKL to R
- Configuring Power BI Desktop to work with R
- Debugging an R script visual
- Configuring the Power BI service to work with R
- Installing the on-premises data gateway in personal mode
- Sharing reports that use R scripts in the Power BI service
- R script visuals limitations
- Summary
- Test your knowledge
- Chapter 3: Configuring Python with Power BI
- Technical requirements
- The available Python engines
- Choosing a Python engine to install
- The Python engines used by Power BI
- Installing the suggested Python engines
- The Python engine for data transformation
- Creating an environment for data transformations using pip
- Creating an optimized environment for data transformations using conda
- Creating an environment for Python script visuals on the Power BI service
- What to do when the Power BI service upgrades the Python engine
- Installing an IDE for Python development
- Configuring Python with RStudio
- Configuring Python with Visual Studio Code
- Working with the Python Interactive window in Visual Studio Code
- Configuring Power BI Desktop to work with Python
- Configuring the Power BI service to work with Python
- Sharing reports that use Python scripts in the Power BI service
- Limitations of Python visuals
- Summary
- Test your knowledge
- Chapter 4: Solving Common Issues When Using Python and R in Power BI
- Technical requirements
- Avoiding the ADO.NET error when running a Python script in Power BI
- The real cause of the problem
- A practical solution to the problem
- Avoiding the Formula.Firewall error
- Incompatible privacy levels
- Indirect access to a data source
- The easy way
- Combining queries and/or transformations
- Encapsulating queries into functions
- Using multiple datasets in Python and R script steps
- Applying a full join with Merge
- Using arguments of the Python.Execute function
- Dealing with dates/times in Python and R script steps
- Summary
- Test your knowledge
- Chapter 5: Importing Unhandled Data Objects
- Technical requirements
- Importing RDS files in R
- A brief introduction to Tidyverse
- Creating a serialized R object
- Configuring the environment and installing Tidyverse
- Creating the RDS files
- Using an RDS file in Power BI
- Importing an RDS file into the Power Query Editor
- Importing an RDS file in an R script visual
- Importing PKL files in Python
- A very short introduction to the PyData world
- Creating a serialized Python object
- Configuring the environment and installing the PyData packages
- Creating the PKL files
- Using a PKL file in Power BI
- Importing a PKL file into the Power Query Editor
- Importing a PKL file in a Python script visual
- Summary
- References
- Test your knowledge
- Chapter 6: Using Regular Expressions in Power BI
- Technical requirements
- A brief introduction to regexes
- The basics of regexes
- Literal characters
- Special characters in regex
- The ^ and $ anchors
- OR operators
- Negated character classes
- Shorthand character classes
- Quantifiers
- The dot
- Greedy and lazy matches
- Checking the validity of email addresses
- Checking the validity of dates
- Validating data using regex in Power BI
- Using regex in Power BI to validate emails with Python
- Using regex in Power BI to validate emails with R
- Using regex in Power BI to validate dates with Python
- Using regex in Power BI to validate dates with R
- Loading complex log files using regex in Power BI
- Apache access logs
- Importing Apache access logs in Power BI with Python
- Importing Apache access logs in Power BI with R
- Extracting values from text using regex in Power BI
- One regex to rule them all
- Using regex in Power BI to extract values with Python
- Using regex in Power BI to extract values with R
- Summary
- References
- Test your knowledge
- Chapter 7: Anonymizing and Pseudonymizing Your Data in Power BI
- Technical requirements
- De-identifying data
- De-identification techniques
- Information removal
- Data masking
- Data swapping
- Generalization
- Data perturbation
- Tokenization
- Hashing
- Encryption
- Understanding pseudonymization
- What is anonymization?
- Anonymizing data in Power BI
- Anonymizing data using Python
- Anonymizing data using R
- Pseudonymizing data in Power BI
- Pseudonymizing data using Python
- Pseudonymizing data using R
- Summary
- References
- Test your knowledge
- Chapter 8: Logging Data from Power BI to External Sources
- Technical requirements
- Logging to CSV files
- Logging to CSV files with Python
- Using the pandas module
- Logging emails to CSV files in Power BI with Python
- Logging to CSV files with R
- Using Tidyverse functions
- Logging dates to CSV files in Power BI with R
- Logging to Excel files
- Logging to Excel files with Python
- Using the pandas module
- Logging emails and dates to Excel files in Power BI with Python
- Logging to Excel files with R
- Using the readxl and openxlsx packages
- Logging emails and dates to Excel in Power BI with R
- Logging to (Azure) SQL Server
- Installing SQL Server Express
- Creating an Azure SQL Database
- Logging to an (Azure) SQL Server with Python
- Using the pyodbc module
- Logging emails and dates to an Azure SQL Database in Power BI with Python
- Logging to an (Azure) SQL Server with R
- Using the DBI and odbc packages
- Logging emails and dates to an Azure SQL Database in Power BI with R
- Managing credentials in the code
- Creating environment variables
- Using environment variables in Python
- Using environment variables in R
- Summary
- References
- Test your knowledge
- Chapter 9: Loading Large Datasets Beyond the Available RAM in Power BI
- Technical requirements
- A typical analytic scenario using large datasets
- Importing large datasets with Python
- Installing Dask on your laptop
- Creating a Dask DataFrame
- Extracting information from a Dask DataFrame
- Importing a large dataset in Power BI with Python
- Importing large datasets with R
- Introducing Apache Arrow
- Installing arrow on your laptop
- Creating and extracting information from an Arrow Dataset object
- Importing a large dataset in Power BI with R
- Summary
- References
- Test your knowledge
- Chapter 10: Boosting Data Loading Speed in Power BI with Parquet Format
- Technical requirements
- From CSV to the Parquet file format
- Limitations of using Parquet files natively in Power BI
- Using Parquet files with Python
- Analyzing Parquet data with Dask
- Analyzing Parquet data with PyArrow
- Performance differences between Dask and PyArrow
- Using Parquet files with R
- Analyzing Parquet data with Arrow for R
- Using the Parquet format to speed up a Power BI report
- Transforming historical data in Parquet
- Appending new data to and analyzing the Parquet dataset
- Analyzing Parquet data in Power BI with Python
- Analyzing Parquet data in Power BI with R
- Summary
- References
- Test your knowledge
- Chapter 11: Calling External APIs to Enrich Your Data
- Technical requirements
- What is a web service?
- Registering for Bing Maps web services
- Geocoding addresses using Python
- Using an explicit GET request
- Using an explicit GET request in parallel
- Using the Geocoder library in parallel
- Geocoding addresses using R
- Using an explicit GET request
- Using an explicit GET request in parallel
- Using the tidygeocoder package in parallel
- Accessing web services using Power BI
- Geocoding addresses in Power BI with Python
- Geocoding addresses in Power BI with R
- Summary
- References
- Test your knowledge
- Chapter 12: Calculating Columns Using Complex Algorithms: Distances
- Technical requirements
- What is a distance?
- The distance between two geographic locations
- Some theory first
- Spherical trigonometry
- The law of Cosines distance
- The law of Haversines distance
- Vincenty's distance
- What kind of distance to use and when
- Implementing distances using Python
- Calculating distances with Python
- Calculating distances in Power BI with Python
- Implementing distances using R
- Calculating distances with R
- Calculating distances in Power BI with R
- The distance between two strings
- Some theory first
- The Hamming distance