Experimentation for Engineers: From A/B testing to Bayesian optimization

Experimentation for Engineers: From A/B testing to Bayesian optimization is a toolbox of techniques for evaluating new features and fine-tuning parameters. You'll start with a deep dive into methods like A/B testing, and then graduate to advanced techniques used to measure performance in industry…


Bibliographic Details
Author: Sweet, David
Format: eBook
Language: English
Published: Shelter Island, NY: Manning Publications Co., [2023]
Edition: [First edition]
See on Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009724209506719
Table of Contents:
  • Intro
  • inside front cover
  • Experimentation for Engineers
  • Copyright
  • dedication
  • contents
  • front matter
  • preface
  • acknowledgments
  • about this book
  • Who should read this book
  • How this book is organized: A road map
  • About the code
  • liveBook discussion forum
  • about the author
  • about the cover illustration
  • 1 Optimizing systems by experiment
  • 1.1 Examples of engineering workflows
  • 1.1.1 Machine learning engineer's workflow
  • 1.1.2 Quantitative trader's workflow
  • 1.1.3 Software engineer's workflow
  • 1.2 Measuring by experiment
  • 1.2.1 Experimental methods
  • 1.2.2 Practical problems and pitfalls
  • 1.3 Why are experiments necessary?
  • 1.3.1 Domain knowledge
  • 1.3.2 Offline model quality
  • 1.3.3 Simulation
  • Summary
  • 2 A/B testing: Evaluating a modification to your system
  • 2.1 Take an ad hoc measurement
  • 2.1.1 Simulate the trading system
  • 2.1.2 Compare execution costs
  • 2.2 Take a precise measurement
  • 2.2.1 Mitigate measurement variation with replication
  • 2.3 Run an A/B test
  • 2.3.1 Analyze your measurements
  • 2.3.2 Design the A/B test
  • 2.3.3 Measure and analyze
  • 2.3.4 Recap of A/B test stages
  • Summary
  • 3 Multi-armed bandits: Maximizing business metrics while experimenting
  • 3.1 Epsilon-greedy: Account for the impact of evaluation on business metrics
  • 3.1.1 A/B testing as a baseline
  • 3.1.2 The epsilon-greedy algorithm
  • 3.1.3 Deciding when to stop
  • 3.2 Evaluating multiple system changes simultaneously
  • 3.3 Thompson sampling: A more efficient MAB algorithm
  • 3.3.1 Estimate the probability that an arm is the best
  • 3.3.2 Randomized probability matching
  • 3.3.3 The complete algorithm
  • Summary
  • 4 Response surface methodology: Optimizing continuous parameters
  • 4.1 Optimize a single continuous parameter
  • 4.1.1 Design: Choose parameter values to measure
  • 4.1.2 Take the measurements
  • 4.1.3 Analyze I: Interpolate between measurements
  • 4.1.4 Analyze II: Optimize the business metric
  • 4.1.5 Validate the optimal parameter value
  • 4.2 Optimizing two or more continuous parameters
  • 4.2.1 Design the two-parameter experiment
  • 4.2.2 Measure, analyze, and validate the 2D experiment
  • Summary
  • 5 Contextual bandits: Making targeted decisions
  • 5.1 Model a business metric offline to make decisions online
  • 5.1.1 Model the business-metric outcome of a decision
  • 5.1.2 Add the decision-making component
  • 5.1.3 Run and evaluate the greedy recommender
  • 5.2 Explore actions with epsilon-greedy
  • 5.2.1 Missing counterfactuals degrade predictions
  • 5.2.2 Explore with epsilon-greedy to collect counterfactuals
  • 5.3 Explore parameters with Thompson sampling
  • 5.3.1 Create an ensemble of prediction models
  • 5.3.2 Randomized probability matching
  • 5.4 Validate the contextual bandit
  • Summary
  • 6 Bayesian optimization: Automating experimental optimization
  • 6.1 Optimizing a single compiler parameter, a visual explanation
  • 6.1.1 Simulate the compiler
  • 6.1.2 Run the initial experiment
  • 6.1.3 Analyze: Model the response surface
  • 6.1.4 Design: Select the parameter value to measure next
  • 6.1.5 Design: Balance exploration with exploitation
  • 6.2 Model the response surface with Gaussian process regression
  • 6.2.1 Estimate the expected CPU time
  • 6.2.2 Estimate uncertainty with GPR
  • 6.3 Optimize over an acquisition function
  • 6.3.1 Minimize the acquisition function
  • 6.4 Optimize all seven compiler parameters
  • 6.4.1 Random search
  • 6.4.2 A complete Bayesian optimization
  • Summary
  • 7 Managing business metrics
  • 7.1 Focus on the business
  • 7.1.1 Don't evaluate a model
  • 7.1.2 Evaluate the product
  • 7.2 Define business metrics
  • 7.2.1 Be specific to your business
  • 7.2.2 Update business metrics periodically
  • 7.2.3 Business metric timescales
  • 7.3 Trade off multiple business metrics
  • 7.3.1 Reduce negative side effects
  • 7.3.2 Evaluate with multiple metrics
  • Summary
  • 8 Practical considerations
  • 8.1 Violations of statistical assumptions
  • 8.1.1 Violation of the iid assumption
  • 8.1.2 Nonstationarity
  • 8.2 Don't stop early
  • 8.3 Control family-wise error
  • 8.3.1 Cherry-picking increases the false-positive rate
  • 8.3.2 Control false positives with the Bonferroni correction
  • 8.4 Be aware of common biases
  • 8.4.1 Confounder bias
  • 8.4.2 Small-sample bias
  • 8.4.3 Optimism bias
  • 8.4.4 Experimenter bias
  • 8.5 Replicate to validate results
  • 8.5.1 Validate complex experiments
  • 8.5.2 Monitor changes with a reverse A/B test
  • 8.5.3 Measure quarterly changes with holdouts
  • 8.6 Wrapping up
  • Summary
  • Appendix A Linear regression and the normal equations
  • A.1 Univariate linear regression
  • A.2 Multivariate linear regression
  • Appendix B One factor at a time
  • Appendix C Gaussian process regression
  • index
  • inside back cover