Privacy-Preserving Machine Learning: A Use-Case-Driven Approach to Building and Protecting ML Pipelines from Privacy and Security Threats

Privacy regulations are evolving each year and compliance with privacy regulations is mandatory for every enterprise. Machine learning engineers are required to not only analyze large amounts of data to gain crucial insights, but also comply with privacy regulations to protect sensitive data. This m...

Bibliographic Details
Main Author: Aravilli, Srinivas Rao
Other Authors: Hamilton, Sam
Format: eBook
Language: English
Published: Birmingham : Packt Publishing, Limited 2023.
Edition: 1st ed.
See on Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009825857606719
Table of Contents:
  • Cover
  • Title Page
  • Copyright and Credits
  • Dedication
  • Foreword
  • Contributors
  • Table of Contents
  • Preface
  • Part 1: Introduction to Data Privacy and Machine Learning
  • Chapter 1: Introduction to Data Privacy, Privacy Breaches, and Threat Modeling
  • What do privacy and data privacy mean?
  • Privacy regulations
  • Privacy by Design and a case study
  • Example - Privacy by Design in a social media platform
  • Privacy breaches
  • Equifax privacy breach
  • Clearview AI Privacy breach
  • Privacy threat modeling
  • Privacy threat modeling - definition
  • The importance of privacy threat modeling
  • Privacy threat modeling's alignment to Privacy by Design principles
  • Steps in privacy threat modeling
  • Privacy threat modeling frameworks
  • The LINDDUN framework
  • Step 1 - modeling the system
  • Step 2 - eliciting and documenting threats
  • Step 3 - mitigating threats
  • The need for privacy-preserving ML
  • Case study - privacy-preserving ML in financial institutions
  • Summary
  • Chapter 2: Machine Learning Phases and Privacy Threats/Attacks in Each Phase
  • ML types
  • Supervised ML
  • Unsupervised ML
  • Reinforced ML
  • Overview of ML phases
  • The main phases of ML
  • Privacy threats/attacks in ML phases
  • Collaborative roles in ML projects
  • Privacy threats/attacks in ML
  • Membership inference attack
  • Model extraction attack
  • Reconstruction attacks - model inversion attacks
  • Model inversion attacks in neural networks
  • Summary
  • Part 2: Use Cases of Privacy-Preserving Machine Learning and a Deep Dive into Differential Privacy
  • Chapter 3: Overview of Privacy-Preserving Data Analysis and an Introduction to Differential Privacy
  • Privacy in data analysis
  • The need for privacy in data analysis
  • Privacy-preserving techniques
  • Data anonymization and algorithms for data anonymization
  • Data aggregation
  • Privacy-enhancing technologies
  • Differential privacy
  • Federated learning
  • Secure multi-party computation (SMC)
  • Homomorphic encryption
  • Anonymization
  • De-identification
  • Differential privacy
  • Summary
  • Chapter 4: Overview of Differential Privacy Algorithms and Applications of Differential Privacy
  • Differential privacy algorithms
  • Laplace distribution
  • Gaussian distribution
  • Comparison of noise-adding algorithms to apply differential privacy
  • Generating aggregates using differential privacy
  • Sensitivity
  • Queries that use differential privacy
  • Clipping
  • Overview of real-life applications of differential privacy
  • Differential privacy usage at Uber
  • Differential privacy usage at Apple
  • Differential privacy usage in the US Census
  • Differential privacy at Google
  • Summary
  • Chapter 5: Developing Applications with Differential Privacy Using Open Source Frameworks
  • Open source frameworks to implement differential privacy
  • Introduction to the PyDP framework and its key features
  • Examples and demonstrations of PyDP in action
  • Developing a sample banking application with PyDP to showcase differential privacy techniques
  • Protecting against membership inference attacks
  • Applying differential privacy to large datasets
  • Use case - generating differentially private aggregates on a large dataset
  • PipelineDP high-level architecture
  • Tumult Analytics
  • Machine learning using differential privacy
  • Synthetic Dataset Generation: Introducing Fraudulent Transactions
  • Develop a classification model using scikit-learn
  • High-level implementation of the SGD algorithm
  • Applying differential privacy options using machine learning
  • Generating gradients using differential privacy
  • Clustering using differential privacy
  • Deep learning using differential privacy
  • Fraud detection model using PyTorch
  • Fraud detection model with differential privacy using the Opacus framework
  • Differential privacy machine learning frameworks
  • Limitations of differential privacy and strategies to overcome them
  • Summary
  • Part 3: Hands-On Federated Learning
  • Chapter 6: Federated Learning and Implementing FL Using Open Source Frameworks
  • Federated learning
  • Preserving privacy
  • FL definition
  • Characteristics of FL
  • FL algorithms
  • FedSGD
  • FedAvg
  • Fed Adaptative Optimization
  • The steps involved in implementing FL
  • Open source frameworks to implement FL
  • TensorFlow Federated
  • Flower
  • An end-to-end use case of implementing fraud detection using FL
  • Developing an FL model for fraud detection using the Flower framework
  • FL with differential privacy
  • Approach one
  • Approach two
  • A sample application using FL-DP
  • Summary
  • Chapter 7: Federated Learning Benchmarks, Start-Ups, and the Next Opportunity
  • FL benchmarks
  • The importance of FL benchmarks
  • FL datasets
  • Frameworks for FL benchmarks
  • Selecting an FL framework for a project
  • A comparison of FedScale, FATE, Flower, and TensorFlow Federated
  • State-of-the-art research in FL
  • Communication-efficient FL
  • Privacy-preserving FL
  • Federated Meta-Learning
  • Adaptive FL
  • Federated reinforcement learning
  • Key company products related to FL
  • Summary
  • Part 4: Homomorphic Encryption, SMC, Confidential Computing, and LLMs
  • Chapter 8: Homomorphic Encryption and Secure Multiparty Computation
  • Encryption, anonymization, and de-identification
  • Data anonymization
  • De-identification
  • Exploring Homomorphic encryption
  • Ring-based
  • Lattice-based
  • Elliptic curve-based
  • Exploring the mathematics behind HE
  • Encryption
  • Homomorphism
  • Types of HE
  • Fully Homomorphic Encryption (FHE)
  • Somewhat Homomorphic Encryption (SHE)
  • Partially Homomorphic Encryption (PHE)
  • Paillier scheme
  • Pyfhel
  • SEAL Python
  • TenSEAL
  • phe
  • Implementing HE
  • Implementing PHE
  • Implementing HE using the TenSEAL library
  • Comparison of HE frameworks
  • Pyfhel
  • TenSEAL
  • PALISADE
  • PySEAL
  • TFHE
  • Machine learning with HE
  • Encrypted evaluation of ML models and inference
  • Limitations of HE
  • Secure Multiparty Computation
  • Basic principles of SMC
  • Applications of SMC
  • Techniques used for SMC
  • Implementing SMC - high-level steps
  • Python frameworks that can be used to implement SMC
  • Implementing Private Set Intersection (PSI) SMC - case study
  • Zero-knowledge proofs
  • Basic concepts
  • Types of ZKPs
  • Applications of ZKPs
  • Summary
  • Chapter 9: Confidential Computing - What, Why, and the Current State
  • Privacy/security attacks on data in memory
  • Data at rest
  • Data in motion
  • Data in memory
  • Confidential computation
  • What is confidential computing?
  • Benefits of confidential computing
  • Trusted execution environments - attestation of source code and how it helps protect against insider threat attacks
  • Industry standards for ML in TEEs
  • Confidential Computing Consortium
  • High-level comparison of Intel SGX, AWS Nitro Enclaves, Google Asylo, Azure enclaves, and Anjuna
  • Pros and cons of TEEs
  • Summary
  • Chapter 10: Preserving Privacy in Large Language Models
  • Key concepts/terms used in LLMs
  • Prompt example using ChatGPT (closed source LLM)
  • Prompt example using open source LLMs
  • Comparison of open source LLMs and closed source LLMs
  • AI standards and terminology of attacks
  • NIST
  • OWASP Top 10 for LLM applications
  • Privacy attacks on LLMs
  • Membership inference attacks against generative models
  • Extracting training data attack from generative models
  • Prompt injection attacks
  • Privacy-preserving technologies for LLMs
  • Text attacks on ML models and LLMs
  • Private transformers - training LLMs using differential privacy
  • SOTA - Privacy-preserving technologies for LLMs
  • Summary
  • Index
  • About Packt
  • Other Books You May Enjoy