Python Real-World Projects: Craft Your Python Portfolio with Deployable Applications

Bibliographic Details
Other Authors:	Lott, Steven F., author
Format:	E-book
Language:	English
Published:	Birmingham, England : Packt Publishing Ltd [2023]
Edition:	First edition
Subjects:	Computer programming. Python (Computer program language)
View at Biblioteca Universitat Ramon Llull:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009767135306719

Table of Contents:

Intro
Title Page
Copyright and Credits
Contributors
Table of Contents
Preface
A note on skills required
Chapter 1: Project Zero: A Template for Other Projects
On quality
More Reading on Quality
Suggested project sprints
Inception
Elaboration, part 1: define done
Elaboration, part 2: define components and tests
Construction
Transition
List of deliverables
Development tool installation
Project 0 - Hello World with test cases
Description
Approach
Deliverables
The pyproject.toml project file
The docs directory
The tests/features/hello_world.feature file
The tests/steps/hw_cli.py module
The tests/environment.py file
The tests/test_hw.py unit tests
The src/tox.ini file
The src/hello_world.py file
Definition of done
Summary
Extras
Static analysis - mypy, flake8
CLI features
Logging
Cookiecutter
Chapter 2: Overview of the Projects
General data acquisition
Acquisition via Extract
Inspection
Clean, validate, standardize, and persist
Summarize and analyze
Statistical modeling
Data contracts
Summary
Chapter 3: Project 1.1: Data Acquisition Base Application
Description
User experience
About the source data
About the output data
Architectural approach
Class design
Design principles
Functional design
Deliverables
Acceptance tests
Additional acceptance scenarios
Unit tests
Unit testing the model
Unit testing the PairBuilder class hierarchy
Unit testing the remaining components
Summary
Extras
Logging enhancements
Configuration extensions
Data subsets
Another example data source
Chapter 4: Data Acquisition Features: Web APIs and Scraping
Project 1.2: Acquire data from a web service
Description
The Kaggle API
About the source data
Approach
Making API requests
Downloading a ZIP archive
Getting the data set list
Rate limiting
The main() function
Deliverables
Unit tests for the RestAccess class
Acceptance tests
The feature file
Injecting a mock for the requests package
Creating a mock service
Behave fixture
Kaggle access module and refactored main application
Project 1.3: Scrape data from a web page
Description
About the source data
Approach
Making an HTML request with urllib.request
HTML scraping and Beautiful Soup
Deliverables
Unit test for the html_extract module
Acceptance tests
HTML extract module and refactored main application
Summary
Extras
Locate more JSON-format data
Other data sets to extract
Handling schema variations
CLI enhancements
Logging
Chapter 5: Data Acquisition Features: SQL Database
Project 1.4: A local SQL database
Description
Database design
Data loading
Approach
SQL Data Definitions
SQL Data Manipulations
SQL Execution
Loading the SERIES table
Loading the SERIES_VALUE table
Deliverables
Project 1.5: Acquire data from a SQL extract
Description
The Object-Relational Mapping (ORM) problem
About the source data
Approach
Extract from a SQL DB
SQL-related processing distinct from CSV processing
Deliverables
Mock database connection and cursor objects for testing
Unit test for a new acquisition module
Acceptance tests using a SQLite database
The feature file
The sqlite fixture
The step definitions
The Database extract module, and refactoring
Summary
Extras
Consider using another database
Consider using a NoSQL database
Consider using SQLAlchemy to define an ORM layer
Chapter 6: Project 2.1: Data Inspection Notebook
Description
About the source data
Approach
Notebook test cases for the functions
Common code in a separate module
Deliverables
Notebook .ipynb file
Cells and functions to analyze data
Cells with Markdown to explain things
Cells with test cases
Executing a notebook's test suite
Summary
Extras
Use pandas to examine data
Chapter 7: Data Inspection Features
Project 2.2: Validating cardinal domains - measures, counts, and durations
Description
Approach
Dealing with currency and related values
Dealing with intervals or durations
Extract notebook functions
Deliverables
Inspection module
Unit test cases for the module
Project 2.3: Validating text and codes - nominal data and ordinal numbers
Description
Dates and times
Time values, local time, and UTC time
Approach
Nominal data
Extend the data inspection module
Deliverables
Revised inspection module
Unit test cases
Project 2.4: Finding reference domains
Description
Approach
Collect and compare keys
Summarize keys counts
Deliverables
Revised inspection module
Unit test cases
Revised notebook to use the refactored inspection model
Summary
Extras
Markdown cells with dates and data source information
Presentation materials
JupyterBook or Quarto for even more sophisticated output
Chapter 8: Project 2.5: Schema and Metadata
Description
Approach
Define Pydantic classes and emit the JSON Schema
Define expected data domains in JSON Schema notation
Use JSON Schema to validate intermediate files
Deliverables
Schema acceptance tests
Extended acceptance testing
Summary
Extras
Revise all previous chapter models to use Pydantic
Use the ORM layer
Chapter 9: Project 3.1: Data Cleaning Base Application
Description
User experience
Source data
Result data
Conversions and processing
Error reports
Approach
Model module refactoring
Pydantic V2 validation
Validation function design
Incremental design
CLI application
Redirecting stdout
Deliverables
Acceptance tests
Unit tests for the model features
Application to clean data and create an NDJSON interim file
Summary
Extras
Create an output file with rejected samples
Chapter 10: Data Cleaning Features
Project 3.2: Validate and convert source fields
Description
Approach
Deliverables
Unit tests for validation functions
Project 3.3: Validate text fields (and numeric coded fields)
Description
Approach
Deliverables
Unit tests for validation functions
Project 3.4: Validate references among separate data sources
Description
Approach
Deliverables
Unit tests for data gathering and validation
Project 3.5: Standardize data to common codes and ranges
Description
Approach
Deliverables
Unit tests for standardizing functions
Acceptance test
Project 3.6: Integration to create an acquisition pipeline
Description
Multiple extractions
Approach
Consider packages to help create a pipeline
Deliverables
Acceptance test
Summary
Extras
Hypothesis testing
Rejecting bad data via filtering (instead of logging)
Disjoint subentities
Create a fan-out cleaning pipeline
Chapter 11: Project 3.7: Interim Data Persistence
Description
Overall approach
Designing idempotent operations
Deliverables
Unit test
Acceptance test
Cleaned up re-runnable application design
Summary
Extras
Using a SQL database
Persistence with NoSQL databases
Chapter 12: Project 3.8: Integrated Data Acquisition Web Service
Description
The data series resources
Creating data for download
Overall approach
OpenAPI 3 specification
RESTful API to be queried from a notebook
A POST request starts processing
The GET request for processing status
The GET request for the results
Security considerations
Deliverables
Acceptance test cases
RESTful API app
Unit test cases
Summary
Extras
Add filtering criteria to the POST request
Split the OpenAPI specification into two parts to use REF for the output schema
Use Celery instead of concurrent.futures
Call external processing directly instead of running a subprocess
Chapter 13: Project 4.1: Visual Analysis Techniques
Description
Overall approach
General notebook organization
Python modules for summarizing
PyPlot graphics
Data frequency histograms
X-Y scatter plot
Iteration and evolution
Deliverables
Unit test
Acceptance test
Summary
Extras
Use Seaborn for plotting
Adjust color palettes to emphasize key points about the data
Chapter 14: Project 4.2: Creating Reports
Description
Slide decks and presentations
Reports
Overall approach
Preparing slides
Preparing a report
Creating technical diagrams
Deliverables
Summary
Extras
Written reports with UML diagrams
Chapter 15: Project 5.1: Modeling Base Application
Description
Approach
Designing a summary app
Describing the distribution
Use cleaned data model
Rethink the data inspection functions
Create new results model
Deliverables
Acceptance testing
Unit testing
Application secondary feature
Summary
Extras
Measures of shape
Creating PDF reports
Serving the HTML report from the data API
Chapter 16: Project 5.2: Simple Multivariate Statistics
Description
Correlation coefficient
Linear regression
Diagrams
Approach
Statistical computations
Analysis diagrams
Including diagrams in the final document
Deliverables
