Data Science for Decision Makers: Enhance Your Leadership Skills with Data Science and AI Expertise

Bridge the gap between business and data science by learning how to interpret machine learning and AI models, manage data teams, and achieve impactful results. Key Features: Master the concepts of statistics and ML to interpret models and guide decisions. Identify valuable AI use cases and manage data...

Bibliographic Details
Other Authors:	Howells, Jon (author)
Format:	E-book
Language:	English
Published:	Birmingham, England : Packt Publishing [2024]
Edition:	First edition
Subjects:	Decision making > Data processing. Big data. Database management.
View in the Universitat Ramon Llull Library:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009841738506719

Table of Contents:

Cover
Title Page
Copyright and Credits
Contributors
Table of Contents
Preface
Part 1: Understanding Data Science and Its Foundations
Introducing Data Science
Data science, AI, and ML - what's the difference?
The mathematical and statistical underpinnings of data science
Statistics and data science
What is statistics?
Descriptive and inferential statistics
Sampling strategies
Probability
Probability distribution
Conditional probability
Describing our samples
Measures of central tendency
Measures of dispersion
Degrees of freedom
Correlation, causation, and covariance
The shape of data
Probability distributions
Discrete probability distributions
Continuous probability distributions
Summary
Characterizing and Collecting Data
What are the key criteria to consider when evaluating datasets?
Data quantity
Data velocity
Data variety
Data quality
First-, second-, and third-party data
First-party data - the treasure trove within
Second-party data - building bridges through collaboration
Third-party data - broadening horizons with external expertise
Structured, unstructured, and semi-structured data
Structured data
Unstructured data
Semi-structured data
Methods for collecting data
Storing and processing data
Cloud, on-premises, and hybrid solutions - navigating the data storage and analysis landscape
Cloud computing - scalable services in the cloud
On-premises - maintaining control within your walls
Hybrid - the best of both worlds?
Data processing
Summary
Exploratory Data Analysis
Getting started with Google Colab
What is Google Colab?
A step-by-step guide to setting up Google Colab
Understanding the data you have
EDA techniques and tools
Descriptive statistics
Data visualization
Histograms
Density curves
Boxplots
Heatmaps
Dimensionality reduction
Correlation analysis
Outlier detection
Summary
The Significance of Significance
The idea of testing hypotheses
What is a hypothesis?
How does hypothesis testing work?
Formulating null and alternative hypotheses
Determining the significance level
Understanding errors
Getting to grips with p-values
Significance tests for a population proportion - making informed decisions about proportions
The z-test - comparing a sample proportion to a population proportion
Z-test example made easy
Significance tests for a population average (mean)
Writing hypotheses for a significance test about a mean
Conditions for a t-test about a mean
When to use z or t statistics in significance tests
Example - calculating the t-statistic for a test about a mean
Using a table to estimate the p-value from the t-statistic
Comparing the p-value from the t-statistic to the significance level
One-tailed and two-tailed tests
Walking through a case study
Summary
Understanding Regression
How can I benefit from understanding regression?
Introduction to trend lines
Fitting a trend line to data
Estimating the line of best fit
Calculating the equations of the lines of best fit
Interpreting the slope of a regression line
Interpreting the intercept of a regression line
Understanding residuals
Evaluating the goodness of fit in least-squares regression
Summary
Part 2: Machine Learning - Concepts, Applications, and Pitfalls
Introducing Machine Learning
From statistics to machine learning
What is machine learning?
How does machine learning relate to statistics?
Why is machine learning important?
Customer personalization and segmentation
Fraud detection and security
Supply chain and inventory optimization
Predictive maintenance
Healthcare diagnostics and treatment
The different types of machine learning
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Transfer learning
Popular machine learning algorithms
Linear regression
Logistic regression
Decision trees
Random forests
Support vector machines
k-nearest neighbors
Neural networks
The machine learning process
Training a supervised machine learning model
Validation of a supervised machine learning model
Testing a supervised machine learning model
Evaluating machine learning models
Risks and limitations of machine learning
Overfitting and underfitting
Bias and variance
Balanced dataset
Models are approximations of reality
Machine learning on unstructured data
Natural language processing (NLP)
Computer vision
Deep learning and artificial intelligence
Artificial intelligence
Deep learning
Summary
Supervised Machine Learning
Defining supervised learning
Applications of supervised learning
The two types of supervised learning
Key factors in supervised learning
Steps within supervised learning
Data preparation - laying the foundation
Algorithm selection - choosing the right tool
Model training - learning from data
Model evaluation - assessing performance
Prediction and deployment - putting the model to work
Characteristics of regression and classification algorithms
Regression algorithms
Classification algorithms
Key considerations in supervised learning
Evaluation metrics
Applications of supervised learning
Consumer goods
Retail
Manufacturing
Summary
Unsupervised Machine Learning
Defining UL
Practical examples of UL
Steps in UL
Step 1 - Data collection
Step 2 - Data preprocessing
Step 3 - Choosing the right model
Step 4 - Training the model
Step 5 - Interpretation and evaluation
In summary
Clustering - unveiling hidden patterns in your data
What is clustering?
How does clustering work?
k-means clustering
Practical applications of clustering
Evaluation metrics for clustering
In summary
Association rule learning
What is association rule learning?
The Apriori algorithm - a practical example
Evaluation metrics
In summary
Applications of UL
Market segmentation
Anomaly detection
Feature extraction
Summary
Interpreting and Evaluating Machine Learning Models
How do I know whether this model will be accurate?
Evaluating on test (holdout) data
Understanding evaluation metrics
Evaluating regression models
R-squared
Root mean squared error
Mean absolute error
When and how to use each metric
Practical evaluation strategies
Summarizing the evaluation of regression models
Evaluating classification models
Classification model evaluation metrics
Precision, recall, and F1-Score
Recall
F1-score
Methods for explaining machine learning models
Making sense of regression models - the power of coefficients
Decoding classification models - unveiling feature importance
Beyond specific models - universal insights using SHAP values
Summary
Common Pitfalls in Machine Learning
Understanding the complexity
Dirty data, damaged models - how data quantity and quality impact ML
The importance of adequate training data
Dealing with poor data quality
Conclusion
Overcoming overfitting and underfitting
Navigating training-serving skew and model drift
Ensuring fairness
Mastering overfitting and underfitting for optimal model performance
Overfitting - when your model is too specific
Underfitting - when your model is too simplistic
Spotting the problem
Conclusion
Training-serving skew and model drift
Training-serving skew
Model drift
Key takeaways
Bias and fairness
Understanding bias
Understanding fairness
Mitigating bias and ensuring fairness
Key takeaways
Summary
Part 3: Leading Successful Data Science Projects and Teams
The Structure of a Data Science Project
The various types of data science projects
Data products
Reports and analytics
Research and methodology
The stages of a data product
Identifying use cases
Evaluating use cases
Planning the data product
Developing a data product
Data preparation and exploratory analysis
Model design and development
Evaluation and testing
Deploying and monitoring a data product
General best practices for data product development
Evaluating impact
Predictive maintenance in manufacturing
Fraud detection in banking
Customer churn prediction in telecom
Demand forecasting in retail
Personalized recommendations in e-commerce
Predictive maintenance in energy
Workforce optimization in quick service restaurants
Chatbot-assisted customer support
Summary
The Data Science Team
Assembling your data science team - key roles and considerations
Data scientists
Machine learning engineers
Data engineers
MLOps engineers
Analytics engineers
Software engineers (full stack, frontend, backend)
Product managers
Business analysts
Data storytellers/visualization experts
Considerations when assembling your team
Data science teams within larger organizations
The hub and spoke model
What is the hub and spoke model?
Practical applications of the hub and spoke model
Building a hub and spoke model
The art of recruitment
