Doing data science

Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia Universit...

Bibliographic Details
Other Authors: O'Neil, Cathy, author; Schutt, Rachel, 1976- author
Format: E-book
Language: English
Published: Sebastopol, CA: O'Reilly, [2014]
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009623426006719
Table of Contents:
  • Copyright
  • Table of Contents
  • Preface
  • Motivation
  • Origins of the Class
  • Origins of the Book
  • What to Expect from This Book
  • How This Book Is Organized
  • How to Read This Book
  • How Code Is Used in This Book
  • Who This Book Is For
  • Prerequisites
  • Supplemental Reading
  • About the Contributors
  • Conventions Used in This Book
  • Using Code Examples
  • Safari® Books Online
  • How to Contact Us
  • Acknowledgments
  • Chapter 1. Introduction: What Is Data Science?
  • Big Data and Data Science Hype
  • Getting Past the Hype
  • Why Now?
  • Datafication
  • A Data Scientist's Role in This Process
  • Thought Experiment: How Would You Simulate Chaos?
  • Case Study: RealDirect
  • How Does RealDirect Make Money?
  • Exercise: RealDirect Data Strategy
  • Chapter 3. Algorithms
  • Machine Learning Algorithms
  • Three Basic Algorithms
  • Linear Regression
  • k-Nearest Neighbors (k-NN)
  • k-means
  • Exercise: Basic Machine Learning Algorithms
  • Solutions
  • Summing It All Up
  • Thought Experiment: Automated Statistician
  • Chapter 4. Spam Filters, Naive Bayes, and Wrangling
  • Thought Experiment: Learning by Example
  • The Current Landscape (with a Little History)
  • Data Science Jobs
  • A Data Science Profile
  • Thought Experiment: Meta-Definition
  • OK, So What Is a Data Scientist, Really?
  • In Academia
  • In Industry
  • Chapter 2. Statistical Inference, Exploratory Data Analysis, and the Data Science Process
  • Statistical Thinking in the Age of Big Data
  • Statistical Inference
  • Populations and Samples
  • Populations and Samples of Big Data
  • Big Data Can Mean Big Assumptions
  • Modeling
  • Exploratory Data Analysis
  • Philosophy of Exploratory Data Analysis
  • Exercise: EDA
  • The Data Science Process
  • Why Won't Linear Regression Work for Filtering Spam?
  • How About k-nearest Neighbors?
  • Naive Bayes
  • Bayes Law
  • A Spam Filter for Individual Words
  • A Spam Filter That Combines Words: Naive Bayes
  • Fancy It Up: Laplace Smoothing
  • Comparing Naive Bayes to k-NN
  • Sample Code in bash
  • Scraping the Web: APIs and Other Tools
  • Jake's Exercise: Naive Bayes for Article Classification
  • Sample R Code for Dealing with the NYT API
  • Chapter 5. Logistic Regression
  • Thought Experiments
  • Classifiers
  • Runtime
  • You
  • Interpretability
  • Scalability
  • M6D Logistic Regression Case Study