Doing data science
Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia Universit...
Other Authors: | , |
---|---|
Format: | E-book |
Language: | English |
Published: | Sebastopol, CA: O'Reilly, [2014] |
Subjects: | |
View at Universitat Ramon Llull Library: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009623426006719 |
Table of Contents:
- Copyright
- Table of Contents
- Preface
- Motivation
- Origins of the Class
- Origins of the Book
- What to Expect from This Book
- How This Book Is Organized
- How to Read This Book
- How Code Is Used in This Book
- Who This Book Is For
- Prerequisites
- Supplemental Reading
- About the Contributors
- Conventions Used in This Book
- Using Code Examples
- Safari® Books Online
- How to Contact Us
- Acknowledgments
- Chapter 1. Introduction: What Is Data Science?
- Big Data and Data Science Hype
- Getting Past the Hype
- Why Now?
- Datafication
- A Data Scientist's Role in This Process
- Thought Experiment: How Would You Simulate Chaos?
- Case Study: RealDirect
- How Does RealDirect Make Money?
- Exercise: RealDirect Data Strategy
- Chapter 3. Algorithms
- Machine Learning Algorithms
- Three Basic Algorithms
- Linear Regression
- k-Nearest Neighbors (k-NN)
- k-means
- Exercise: Basic Machine Learning Algorithms
- Solutions
- Summing It All Up
- Thought Experiment: Automated Statistician
- Chapter 4. Spam Filters, Naive Bayes, and Wrangling
- Thought Experiment: Learning by Example
- The Current Landscape (with a Little History)
- Data Science Jobs
- A Data Science Profile
- Thought Experiment: Meta-Definition
- OK, So What Is a Data Scientist, Really?
- In Academia
- In Industry
- Chapter 2. Statistical Inference, Exploratory Data Analysis, and the Data Science Process
- Statistical Thinking in the Age of Big Data
- Statistical Inference
- Populations and Samples
- Populations and Samples of Big Data
- Big Data Can Mean Big Assumptions
- Modeling
- Exploratory Data Analysis
- Philosophy of Exploratory Data Analysis
- Exercise: EDA
- The Data Science Process
- Why Won't Linear Regression Work for Filtering Spam?
- How About k-nearest Neighbors?
- Naive Bayes
- Bayes Law
- A Spam Filter for Individual Words
- A Spam Filter That Combines Words: Naive Bayes
- Fancy It Up: Laplace Smoothing
- Comparing Naive Bayes to k-NN
- Sample Code in bash
- Scraping the Web: APIs and Other Tools
- Jake's Exercise: Naive Bayes for Article Classification
- Sample R Code for Dealing with the NYT API
- Chapter 5. Logistic Regression
- Thought Experiments
- Classifiers
- Runtime
- You
- Interpretability
- Scalability
- M6D Logistic Regression Case Study