Public policy analytics: code and context for data science in government
Public Policy Analytics: Code & Context for Data Science in Government teaches readers how to address complex public policy problems with data and analytics using reproducible methods in R. Each of the eight chapters provides a detailed case study, showing readers: how to develop exploratory ind...
Other Authors: | |
---|---|
Format: | E-book |
Language: | English |
Published: | Boca Raton : CRC Press, 2021. |
Edition: | 1st ed |
Series: | Chapman & Hall/CRC data science series. |
Subjects: | |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009669528806719 |
Table of Contents:
- Preface
- Introduction
- Indicators for Transit Oriented Development
  - 1.1 Why Start With Indicators?
  - 1.1.1 Mapping & scale bias in areal aggregate data
  - 1.2 Setup
  - 1.2.1 Downloading & wrangling Census data
  - 1.2.2 Wrangling transit open data
  - 1.2.3 Relating tracts & subway stops in space
  - 1.3 Developing TOD Indicators
  - 1.3.1 TOD indicator maps
  - 1.3.2 TOD indicator tables
  - 1.3.3 TOD indicator plots
  - 1.4 Capturing three submarkets of interest
  - 1.5 Conclusion: Are Philadelphians willing to pay for TOD?
  - 1.6 Assignment - Study TOD in your city
- Expanding the Urban Growth Boundary
  - 2.1 Introduction - Lancaster development
  - 2.1.1 The bid-rent model
  - 2.1.2 Setup Lancaster data
  - 2.2 Identifying areas inside & outside of the Urban Growth Area
  - 2.2.1 Associate each inside/outside buffer with its respective town
  - 2.2.2 Building density by town & by inside/outside the UGA
  - 2.2.3 Visualize buildings inside & outside the UGA
  - 2.3 Return to Lancaster's Bid Rent
  - 2.4 Conclusion - On boundaries
  - 2.5 Assignment - Boundaries in your community
- Intro to geospatial machine learning, Part 1
  - 3.1 Machine learning as a Planning tool
  - 3.1.1 Accuracy & generalizability
  - 3.1.2 The machine learning process
  - 3.1.3 The hedonic model
  - 3.2 Data wrangling - Home price & crime data
  - 3.2.1 Feature Engineering - Measuring exposure to crime
  - 3.2.2 Exploratory analysis: Correlation
  - 3.3 Introduction to Ordinary Least Squares Regression
  - 3.3.1 Our first regression model
  - 3.3.2 More feature engineering & collinearity
  - 3.4 Cross-validation & return to goodness of fit
  - 3.4.1 Accuracy - Mean Absolute Error
  - 3.4.2 Generalizability - Cross-validation
  - 3.5 Conclusion - Our first model
  - 3.6 Assignment - Predict house prices
- Intro to geospatial machine learning, Part 2
  - 4.1 On the spatial process of home prices
  - 4.1.1 Setup & Data Wrangling
  - 4.2 Do prices & errors cluster? The Spatial Lag
  - 4.2.1 Do model errors cluster? - Moran's I
  - 4.3 Accounting for neighborhood
  - 4.3.1 Accuracy of the neighborhood model
  - 4.3.2 Spatial autocorrelation in the neighborhood model
  - 4.3.3 Generalizability of the neighborhood model
  - 4.4 Conclusion - Features at multiple scales
- Geospatial risk modeling - Predictive Policing
  - 5.1 New predictive policing tools
  - 5.1.1 Generalizability in geospatial risk models
  - 5.1.2 From Broken Windows Theory to Broken Windows Policing
  - 5.1.3 Setup
  - 5.2 Data wrangling: Creating the fishnet
  - 5.2.1 Data wrangling: Joining burglaries to the fishnet
  - 5.2.2 Wrangling risk factors
  - 5.3 Feature engineering - Count of risk factors by grid cell
  - 5.3.1 Feature engineering - Nearest neighbor features
  - 5.3.2 Feature Engineering - Measure distance to one point
  - 5.3.3 Feature Engineering - Create the final-net
  - 5.4 Exploring the spatial process of burglary
  - 5.4.1 Correlation tests
  - 5.5 Poisson Regression
  - 5.5.1 Cross-validated Poisson Regression
  - 5.5.2 Accuracy & Generalizability
  - 5.5.3 Generalizability by neighborhood context
  - 5.5.4 Does this model allocate better than traditional crime hotspots?
  - 5.6 Conclusion - Bias but useful?
  - 5.7 Assignment - Predict risk
- People-based ML models
  - 6.1 Bounce to work
  - 6.2 Exploratory analysis
  - 6.3 Logistic regression
  - 6.3.1 Training/Testing sets
  - 6.3.2 Estimate a churn model
  - 6.4 Goodness of Fit
  - 6.4.1 ROC Curves
  - 6.5 Cross-validation
  - 6.6 Generating costs and benefits
  - 6.6.1 Optimizing the cost/benefit relationship
  - 6.7 Conclusion - churn
  - 6.8 Assignment - Target a subsidy
- People-Based ML Models: Algorithmic Fairness
  - 7.1 Introduction
  - 7.1.1 The spectre of disparate impact
  - 7.1.2 Modeling judicial outcomes
  - 7.1.3 Accuracy and generalizability in recidivism algorithms
  - 7.2 Data and exploratory analysis
  - 7.3 Estimate two recidivism models
  - 7.3.1 Accuracy & Generalizability
  - 7.4 What about the threshold?
  - 7.5 Optimizing 'equitable' thresholds
  - 7.6 Assignment - Memo to the Mayor
- Predicting rideshare demand
  - 8.1 Introduction - ride share
  - 8.2 Data Wrangling - ride share
  - 8.2.1 Lubridate
  - 8.2.2 Weather data
  - 8.2.3 Subset a study area using neighborhoods
  - 8.2.4 Create the final space/time panel
  - 8.2.5 Split training and test
  - 8.2.6 What about distance features?
  - 8.3 Exploratory Analysis - ride share
  - 8.3.1 Trip-Count serial autocorrelation
  - 8.3.2 Trip-Count spatial autocorrelation
  - 8.3.3 Space/time correlation?
  - 8.3.4 Weather
  - 8.4 Modeling and validation using purrr::map
  - 8.4.1 A short primer on nested tibbles
  - 8.4.2 Estimate a ride share forecast
  - 8.4.3 Validate test set by time
  - 8.4.4 Validate test set by space
  - 8.5 Conclusion - Dispatch
  - 8.6 Assignment - Predict bike share trips
- Conclusion - Algorithmic Governance
- Index