Cracking the Data Science Interview: Unlock Insider Tips from Industry Experts to Master the Data Science Field

Rise above the competition and excel in your next interview with this one-stop guide to Python, SQL, version control, statistics, machine learning, and much more. Key Features: Acquire highly sought-after skills of the trade, including Python, SQL, statistics, and machine learning. Gain the confidence...

Bibliographic Details
Other Authors: Gonzalez, Leondra R. (author); Stubberfield, Aaren (author); Baltes, Angela (writer of foreword)
Format: eBook
Language: English
Published: Birmingham, England : Packt Publishing [2024]
Edition: First edition
Subjects:
See on Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009805127406719
Table of Contents:
  • Cover
  • Copyright
  • Foreword
  • Contributors
  • Table of Contents
  • Preface
  • Part 1: Breaking into the Data Science Field
  • Chapter 1: Exploring Today's Modern Data Science Landscape
  • What is data science?
  • Exploring the data science process
  • Data collection
  • Data exploration
  • Data modeling
  • Model evaluation
  • Model deployment and monitoring
  • Dissecting the flavors of data science
  • Data engineer
  • Dashboarding and visual specialist
  • ML specialist
  • Domain expert
  • Reviewing career paths in data science
  • The traditionalist
  • Domain expert
  • Off-the-beaten path-er
  • Tackling the experience bottleneck
  • Academic experience
  • Work experience
  • Understanding expected skills and competencies
  • Hard (technical) skills
  • Soft (communication) skills
  • Exploring the evolution of data science
  • New models
  • New environments
  • New computing
  • New applications
  • Summary
  • References
  • Chapter 2: Finding a Job in Data Science
  • Searching for your first data science job
  • Preparing for the road ahead
  • Finding job boards
  • Beginning to build a standout portfolio
  • Applying for jobs
  • Constructing the Golden Resume
  • The perfect resume myth
  • Understanding automated resume screening
  • Crafting an effective resume
  • Formatting and organization
  • Using the correct terminology
  • Prepping for landing the interview
  • Moore's Law
  • Research, research, research
  • Branding
  • References
  • Part 2: Manipulating and Managing Data
  • Chapter 3: Programming with Python
  • Using variables, data types, and data structures
  • Indexing in Python
  • Using string operations
  • Initializing a string
  • String indexing
  • Using Python control statements, loops, and list comprehensions
  • Conditional statements such as if, elif, and else
  • Loop statements such as for and while
  • List comprehension
  • Using user-defined functions
  • Breaking down the user-defined function syntax
  • Doing "stuff" with user-defined functions
  • Getting familiar with lambda functions
  • Creating good functions
  • Handling files in Python
  • Opening files with pandas
  • Wrangling data with pandas
  • Handling missing data
  • Selecting data
  • Sorting data
  • Merging data
  • Aggregation with groupby()
  • Summary
  • References
  • Chapter 4: Visualizing Data and Data Storytelling
  • Understanding data visualization
  • Bar charts
  • Line charts
  • Scatter plots
  • Histograms
  • Density plots
  • Quantile-quantile plots (Q-Q plots)
  • Box plots
  • Pie charts
  • Surveying tools of the trade
  • Power BI
  • Tableau
  • Shiny
  • ggplot2 (R)
  • Matplotlib (Python)
  • Seaborn (Python)
  • Developing dashboards, reports, and KPIs
  • Developing charts and graphs
  • Bar chart - Matplotlib
  • Bar chart - Seaborn
  • Scatter plot - Matplotlib
  • Scatter plot - Seaborn
  • Histogram plot - Matplotlib
  • Histogram plot - Seaborn
  • Applying scenario-based storytelling
  • Summary
  • Chapter 5: Querying Databases with SQL
  • Introducing relational databases
  • Mastering SQL basics
  • The SELECT statement
  • The WHERE clause
  • The ORDER BY clause
  • Aggregating data with GROUP BY and HAVING
  • The GROUP BY statement
  • The HAVING clause
  • Creating fields with CASE WHEN
  • Analyzing subqueries and CTEs
  • Subqueries in the SELECT clause
  • Subqueries in the FROM clause
  • Subqueries in the WHERE clause
  • Subqueries in the HAVING clause
  • Distinguishing common table expressions (CTEs) from subqueries
  • Merging tables with joins
  • Inner joins
  • Left and right join
  • Full outer join
  • Multi-table joins
  • Calculating window functions
  • OVER, ORDER BY, PARTITION, and SET
  • LAG and LEAD
  • ROW_NUMBER
  • RANK and DENSE_RANK
  • Using date functions
  • Approaching complex queries
  • Summary
  • Chapter 6: Scripting with Shell and Bash Commands in Linux
  • Introducing operating systems
  • Navigating system directories
  • Introducing basic command-line prompts
  • Understanding directory types
  • Filing and directory manipulation
  • Scripting with Bash
  • Introducing control statements
  • Creating functions
  • Processing data and pipelines
  • Using pipes
  • Using cron
  • Summary
  • Chapter 7: Using Git for Version Control
  • Introducing repositories (repos)
  • Creating a repo
  • Cloning an existing remote repository
  • Creating a local repository from scratch
  • Linking local and remote repositories
  • Detailing the Git workflow for data scientists
  • Using Git tags for data science
  • Understanding Git tags
  • Using tagging as a data scientist
  • Understanding common operations
  • Summary
  • Part 3: Exploring Artificial Intelligence
  • Chapter 8: Mining Data with Probability and Statistics
  • Describing data with descriptive statistics
  • Measuring central tendency
  • Measuring variability
  • Introducing populations and samples
  • Defining populations and samples
  • Representing samples
  • Reducing the sampling error
  • Understanding the Central Limit Theorem (CLT)
  • The CLT
  • Demonstrating the assumption of normality
  • Shaping data with sampling distributions
  • Probability distributions
  • Uniform distribution
  • Normal and Student's t-distributions
  • The binomial distribution
  • The Poisson distribution
  • Exponential distribution
  • Geometric distribution
  • The Weibull distribution
  • Testing hypotheses
  • Understanding one-sample t-tests
  • Understanding two-sample t-tests
  • Understanding paired sample t-tests
  • Understanding ANOVA and MANOVA
  • Chi-squared test
  • A/B tests
  • Understanding Type I and Type II errors
  • Type I error (false positive)
  • Type II error (false negative)
  • Striking a balance
  • Summary
  • References
  • Chapter 9: Understanding Feature Engineering and Preparing Data for Modeling
  • Chapter 10: Mastering Machine Learning Concepts
  • Introducing the machine learning workflow
  • Problem statement
  • Model selection
  • Model tuning
  • Model predictions
  • Getting started with supervised machine learning
  • Regression versus classification
  • Linear regression - regression
  • Logistic regression
  • k-nearest neighbors (k-NN)
  • Random forest
  • Extreme Gradient Boosting (XGBoost)
  • Getting started with unsupervised machine learning
  • K-means
  • Density-based spatial clustering of applications with noise (DBSCAN)
  • Other clustering algorithms
  • Evaluating clusters
  • Summarizing other notable machine learning models
  • Understanding the bias-variance trade-off
  • Tuning with hyperparameters
  • Grid search
  • Random search
  • Bayesian optimization
  • Summary
  • Chapter 11: Building Networks with Deep Learning
  • Introducing neural networks and deep learning
  • Weighing in on weights and biases
  • Introduction to weights
  • Introduction to biases
  • Activating neurons with activation functions
  • Common activation functions
  • Choosing the right activation function
  • Unraveling backpropagation
  • Gradient descent
  • What is backpropagation?
  • Loss functions
  • Gradient descent steps
  • The vanishing gradient problem
  • Using optimizers
  • Optimization algorithms
  • Network tuning
  • Understanding embeddings
  • Word embeddings
  • Training embeddings
  • Listing common network architectures
  • Common networks
  • Tools and packages
  • Introducing GenAI and LLMs
  • Unveiling language models
  • Transformers and self-attention
  • Transfer Learning
  • GPT in action
  • Summary
  • Chapter 12: Implementing Machine Learning Solutions with MLOps
  • Introducing MLOps
  • A model pipeline overview
  • Understanding data ingestion
  • Learning the basics of data storage
  • Reviewing model development
  • Packaging for model deployment
  • Identifying requirements
  • Virtual environments
  • Tools and approaches for environment management
  • Deploying a model with containers
  • Using Docker
  • Validating and monitoring the model
  • Validating the model deployment
  • Model monitoring
  • Thinking about governance
  • Using Azure ML for MLOps
  • Summary
  • Part 4: Getting the Job
  • Chapter 13: Mastering the Interview Rounds
  • Mastering early interactions with the recruiter
  • Mastering the different interview stages
  • The hiring manager stage
  • The technical interview
  • Coding questions, step by step
  • The panel stage
  • Summary
  • References
  • Chapter 14: Negotiating Compensation
  • Understanding the compensation landscape
  • Negotiating the offer
  • Negotiation considerations
  • Responding to the offer
  • Maximum negotiable compensation and situational value
  • Summary
  • Final words
  • Index
  • Other Books You May Enjoy