Extending Power BI with Python and R: Perform Advanced Analysis Using the Power of Analytical Languages

The latest edition of this book delves deep into advanced analytics, focusing on enhancing Python and R proficiency within Power BI. New chapters cover optimizing Python and R settings, utilizing Intel's Math Kernel Library (MKL) for performance boosts, and addressing integration challenges. Te...

Bibliographic Details
Other Authors: Zavarella, Luca, author; Talwar, Rajat, author
Format: eBook
Language: English
Published: Birmingham, England : Packt Publishing Ltd [2024]
Edition: Second edition
Series: Expert insight.
Subjects:
See on Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009809015406719
Table of Contents:
  • Cover
  • Copyright
  • Contributors
  • Table of Contents
  • Preface
  • Chapter 1: Where and How to Use R and Python Scripts in Power BI
  • Technical requirements
  • Injecting R or Python scripts into Power BI
  • Data loading
  • Data transformation
  • Data visualization
  • Using R and Python to interact with your data
  • Python and R compatibility across Power BI products
  • Summary
  • Test your knowledge
  • Chapter 2: Configuring R with Power BI
  • Technical requirements
  • The available R engines
  • The CRAN R distribution
  • The Microsoft R Open distribution and MRAN
  • Multi-threading in MRO
  • Choosing an R engine to install
  • The R engines used by Power BI
  • Installing the suggested R engines
  • The R engine for data transformation
  • The R engine for R script visuals on the Power BI service
  • What to do when the Power BI service upgrades the R engine
  • Installing an IDE for R development
  • Installing RStudio
  • Installing RTools
  • Linking Intel's MKL to R
  • Configuring Power BI Desktop to work with R
  • Debugging an R script visual
  • Configuring the Power BI service to work with R
  • Installing the on-premises data gateway in personal mode
  • Sharing reports that use R scripts in the Power BI service
  • R script visuals limitations
  • Summary
  • Test your knowledge
  • Chapter 3: Configuring Python with Power BI
  • Technical requirements
  • The available Python engines
  • Choosing a Python engine to install
  • The Python engines used by Power BI
  • Installing the suggested Python engines
  • The Python engine for data transformation
  • Creating an environment for data transformations using pip
  • Creating an optimized environment for data transformations using conda
  • Creating an environment for Python script visuals on the Power BI service
  • What to do when the Power BI service upgrades the Python engine
  • Installing an IDE for Python development
  • Configuring Python with RStudio
  • Configuring Python with Visual Studio Code
  • Working with the Python Interactive window in Visual Studio Code
  • Configuring Power BI Desktop to work with Python
  • Configuring the Power BI service to work with Python
  • Sharing reports that use Python scripts in the Power BI service
  • Limitations of Python visuals
  • Summary
  • Test your knowledge
  • Chapter 4: Solving Common Issues When Using Python and R in Power BI
  • Technical requirements
  • Avoiding the ADO.NET error when running a Python script in Power BI
  • The real cause of the problem
  • A practical solution to the problem
  • Avoiding the Formula.Firewall error
  • Incompatible privacy levels
  • Indirect access to a data source
  • The easy way
  • Combining queries and/or transformations
  • Encapsulating queries into functions
  • Using multiple datasets in Python and R script steps
  • Applying a full join with Merge
  • Using arguments of the Python.Execute function
  • Dealing with dates/times in Python and R script steps
  • Summary
  • Test your knowledge
  • Chapter 5: Importing Unhandled Data Objects
  • Technical requirements
  • Importing RDS files in R
  • A brief introduction to Tidyverse
  • Creating a serialized R object
  • Configuring the environment and installing Tidyverse
  • Creating the RDS files
  • Using an RDS file in Power BI
  • Importing an RDS file into the Power Query Editor
  • Importing an RDS file in an R script visual
  • Importing PKL files in Python
  • A very short introduction to the PyData world
  • Creating a serialized Python object
  • Configuring the environment and installing the PyData packages
  • Creating the PKL files
  • Using a PKL file in Power BI
  • Importing a PKL file into the Power Query Editor
  • Importing a PKL file in a Python script visual
  • Summary
  • References
  • Test your knowledge
  • Chapter 6: Using Regular Expressions in Power BI
  • Technical requirements
  • A brief introduction to regexes
  • The basics of regexes
  • Literal characters
  • Special characters in regex
  • The ^ and $ anchors
  • OR operators
  • Negated character classes
  • Shorthand character classes
  • Quantifiers
  • The dot
  • Greedy and lazy matches
  • Checking the validity of email addresses
  • Checking the validity of dates
  • Validating data using regex in Power BI
  • Using regex in Power BI to validate emails with Python
  • Using regex in Power BI to validate emails with R
  • Using regex in Power BI to validate dates with Python
  • Using regex in Power BI to validate dates with R
  • Loading complex log files using regex in Power BI
  • Apache access logs
  • Importing Apache access logs in Power BI with Python
  • Importing Apache access logs in Power BI with R
  • Extracting values from text using regex in Power BI
  • One regex to rule them all
  • Using regex in Power BI to extract values with Python
  • Using regex in Power BI to extract values with R
  • Summary
  • References
  • Test your knowledge
  • Chapter 7: Anonymizing and Pseudonymizing Your Data in Power BI
  • Technical requirements
  • De-identifying data
  • De-identification techniques
  • Information removal
  • Data masking
  • Data swapping
  • Generalization
  • Data perturbation
  • Tokenization
  • Hashing
  • Encryption
  • Understanding pseudonymization
  • What is anonymization?
  • Anonymizing data in Power BI
  • Anonymizing data using Python
  • Anonymizing data using R
  • Pseudonymizing data in Power BI
  • Pseudonymizing data using Python
  • Pseudonymizing data using R
  • Summary
  • References
  • Test your knowledge
  • Chapter 8: Logging Data from Power BI to External Sources
  • Technical requirements
  • Logging to CSV files
  • Logging to CSV files with Python
  • Using the pandas module
  • Logging emails to CSV files in Power BI with Python
  • Logging to CSV files with R
  • Using Tidyverse functions
  • Logging dates to CSV files in Power BI with R
  • Logging to Excel files
  • Logging to Excel files with Python
  • Using the pandas module
  • Logging emails and dates to Excel files in Power BI with Python
  • Logging to Excel files with R
  • Using the readxl and openxlsx packages
  • Logging emails and dates to Excel in Power BI with R
  • Logging to (Azure) SQL Server
  • Installing SQL Server Express
  • Creating an Azure SQL Database
  • Logging to an (Azure) SQL Server with Python
  • Using the pyodbc module
  • Logging emails and dates to an Azure SQL Database in Power BI with Python
  • Logging to an (Azure) SQL Server with R
  • Using the DBI and odbc packages
  • Logging emails and dates to an Azure SQL Database in Power BI with R
  • Managing credentials in the code
  • Creating environment variables
  • Using environment variables in Python
  • Using environment variables in R
  • Summary
  • References
  • Test your knowledge
  • Chapter 9: Loading Large Datasets Beyond the Available RAM in Power BI
  • Technical requirements
  • A typical analytic scenario using large datasets
  • Importing large datasets with Python
  • Installing Dask on your laptop
  • Creating a Dask DataFrame
  • Extracting information from a Dask DataFrame
  • Importing a large dataset in Power BI with Python
  • Importing large datasets with R
  • Introducing Apache Arrow
  • Installing arrow on your laptop
  • Creating and extracting information from an Arrow Dataset object
  • Importing a large dataset in Power BI with R
  • Summary
  • References
  • Test your knowledge
  • Chapter 10: Boosting Data Loading Speed in Power BI with Parquet Format
  • Technical requirements
  • From CSV to the Parquet file format
  • Limitations of using Parquet files natively in Power BI
  • Using Parquet files with Python
  • Analyzing Parquet data with Dask
  • Analyzing Parquet data with PyArrow
  • Performance differences between Dask and PyArrow
  • Using Parquet files with R
  • Analyzing Parquet data with Arrow for R
  • Using the Parquet format to speed up a Power BI report
  • Transforming historical data in Parquet
  • Appending new data to and analyzing the Parquet dataset
  • Analyzing Parquet data in Power BI with Python
  • Analyzing Parquet data in Power BI with R
  • Summary
  • References
  • Test your knowledge
  • Chapter 11: Calling External APIs to Enrich Your Data
  • Technical requirements
  • What is a web service?
  • Registering for Bing Maps web services
  • Geocoding addresses using Python
  • Using an explicit GET request
  • Using an explicit GET request in parallel
  • Using the Geocoder library in parallel
  • Geocoding addresses using R
  • Using an explicit GET request
  • Using an explicit GET request in parallel
  • Using the tidygeocoder package in parallel
  • Accessing web services using Power BI
  • Geocoding addresses in Power BI with Python
  • Geocoding addresses in Power BI with R
  • Summary
  • References
  • Test your knowledge
  • Chapter 12: Calculating Columns Using Complex Algorithms: Distances
  • Technical requirements
  • What is a distance?
  • The distance between two geographic locations
  • Some theory first
  • Spherical trigonometry
  • The law of Cosines distance
  • The law of Haversines distance
  • Vincenty's distance
  • What kind of distance to use and when
  • Implementing distances using Python
  • Calculating distances with Python
  • Calculating distances in Power BI with Python
  • Implementing distances using R
  • Calculating distances with R
  • Calculating distances in Power BI with R
  • The distance between two strings
  • Some theory first
  • The Hamming distance