Pandas in action
"Of all the introductory pandas books I've read--and I did read a few--this is the best, by a mile." Erico Lendzian, idibu.com
Take the next steps in your data science career! This friendly and hands-on guide shows you how to start mastering Pandas with skills you already know from spreadsheet so...
Other Authors: | |
---|---|
Format: | eBook |
Language: | English |
Published: | Shelter Island, New York: Manning Publications Company, [2021] |
Edition: | [First edition] |
Subjects: | |
See on Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009635335006719 |
Table of Contents:
- Intro
- Pandas in Action
- Dedication
- Copyright
- contents
- front matter
- preface
- acknowledgments
- about this book
- Who should read this book
- How this book is organized: A road map
- About the code
- liveBook discussion forum
- Other online resources
- about the author
- about the cover illustration
- Part 1. Core pandas
- 1 Introducing pandas
- 1.1 Data in the 21st century
- 1.2 Introducing pandas
- 1.2.1 Pandas vs. graphical spreadsheet applications
- 1.2.2 Pandas vs. its competitors
- 1.3 A tour of pandas
- 1.3.1 Importing a data set
- 1.3.2 Manipulating a DataFrame
- 1.3.3 Counting values in a Series
- 1.3.4 Filtering a column by one or more criteria
- 1.3.5 Grouping data
- Summary
- 2 The Series object
- 2.1 Overview of a Series
- 2.1.1 Classes and instances
- 2.1.2 Populating the Series with values
- 2.1.3 Customizing the Series index
- 2.1.4 Creating a Series with missing values
- 2.2 Creating a Series from Python objects
- 2.3 Series attributes
- 2.4 Retrieving the first and last rows
- 2.5 Mathematical operations
- 2.5.1 Statistical operations
- 2.5.2 Arithmetic operations
- 2.5.3 Broadcasting
- 2.6 Passing the Series to Python's built-in functions
- 2.7 Coding challenge
- 2.7.1 Problems
- 2.7.2 Solutions
- Summary
- 3 Series methods
- 3.1 Importing a data set with the read_csv function
- 3.2 Sorting a Series
- 3.2.1 Sorting by values with the sort_values method
- 3.2.2 Sorting by index with the sort_index method
- 3.2.3 Retrieving the smallest and largest values with the nsmallest and nlargest methods
- 3.3 Overwriting a Series with the inplace parameter
- 3.4 Counting values with the value_counts method
- 3.5 Invoking a function on every Series value with the apply method
- 3.6 Coding challenge
- 3.6.1 Problems
- 3.6.2 Solutions
- Summary
- 4 The DataFrame object
- 4.1 Overview of a DataFrame
- 4.1.1 Creating a DataFrame from a dictionary
- 4.1.2 Creating a DataFrame from a NumPy ndarray
- 4.2 Similarities between Series and DataFrames
- 4.2.1 Importing a DataFrame with the read_csv function
- 4.2.2 Shared and exclusive attributes of Series and DataFrames
- 4.2.3 Shared methods of Series and DataFrames
- 4.3 Sorting a DataFrame
- 4.3.1 Sorting by a single column
- 4.3.2 Sorting by multiple columns
- 4.4 Sorting by index
- 4.4.1 Sorting by row index
- 4.4.2 Sorting by column index
- 4.5 Setting a new index
- 4.6 Selecting columns and rows from a DataFrame
- 4.6.1 Selecting a single column from a DataFrame
- 4.6.2 Selecting multiple columns from a DataFrame
- 4.7 Selecting rows from a DataFrame
- 4.7.1 Extracting rows by index label
- 4.7.2 Extracting rows by index position
- 4.7.3 Extracting values from specific columns
- 4.8 Extracting values from Series
- 4.9 Renaming columns or rows
- 4.10 Resetting an index
- 4.11 Coding challenge
- 4.11.1 Problems
- 4.11.2 Solutions
- Summary
- 5 Filtering a DataFrame
- 5.1 Optimizing a data set for memory use
- 5.1.1 Converting data types with the astype method
- 5.2 Filtering by a single condition
- 5.3 Filtering by multiple conditions
- 5.3.1 The AND condition
- 5.3.2 The OR condition
- 5.3.3 Inversion with ~
- 5.3.4 Methods for Booleans
- 5.4 Filtering by condition
- 5.4.1 The isin method
- 5.4.2 The between method
- 5.4.3 The isnull and notnull methods
- 5.4.4 Dealing with null values
- 5.5 Dealing with duplicates
- 5.5.1 The duplicated method
- 5.5.2 The drop_duplicates method
- 5.6 Coding challenge
- 5.6.1 Problems
- 5.6.2 Solutions
- Summary
- Part 2. Applied pandas
- 6 Working with text data
- 6.1 Letter casing and whitespace
- 6.2 String slicing
- 6.3 String slicing and character replacement
- 6.4 Boolean methods
- 6.5 Splitting strings
- 6.6 Coding challenge
- 6.6.1 Problems
- 6.6.2 Solutions
- 6.7 A note on regular expressions
- Summary
- 7 MultiIndex DataFrames
- 7.1 The MultiIndex object
- 7.2 MultiIndex DataFrames
- 7.3 Sorting a MultiIndex
- 7.4 Selecting with a MultiIndex
- 7.4.1 Extracting one or more columns
- 7.4.2 Extracting one or more rows with loc
- 7.4.3 Extracting one or more rows with iloc
- 7.5 Cross-sections
- 7.6 Manipulating the Index
- 7.6.1 Resetting the index
- 7.6.2 Setting the index
- 7.7 Coding challenge
- 7.7.1 Problems
- 7.7.2 Solutions
- Summary
- 8 Reshaping and pivoting
- 8.1 Wide vs. narrow data
- 8.2 Creating a pivot table from a DataFrame
- 8.2.1 The pivot_table method
- 8.2.2 Additional options for pivot tables
- 8.3 Stacking and unstacking index levels
- 8.4 Melting a data set
- 8.5 Exploding a list of values
- 8.6 Coding challenge
- 8.6.1 Problems
- 8.6.2 Solutions
- Summary
- 9 The GroupBy object
- 9.1 Creating a GroupBy object from scratch
- 9.2 Creating a GroupBy object from a data set
- 9.3 Attributes and methods of a GroupBy object
- 9.4 Aggregate operations
- 9.5 Applying a custom operation to all groups
- 9.6 Grouping by multiple columns
- 9.7 Coding challenge
- 9.7.1 Problems
- 9.7.2 Solutions
- Summary
- 10 Merging, joining, and concatenating
- 10.1 Introducing the data sets
- 10.2 Concatenating data sets
- 10.3 Missing values in concatenated DataFrames
- 10.4 Left joins
- 10.5 Inner joins
- 10.6 Outer joins
- 10.7 Merging on index labels
- 10.8 Coding challenge
- 10.8.1 Problems
- 10.8.2 Solutions
- Summary
- 11 Working with dates and times
- 11.1 Introducing the Timestamp object
- 11.1.1 How Python works with datetimes
- 11.1.2 How pandas works with datetimes
- 11.2 Storing multiple timestamps in a DatetimeIndex
- 11.3 Converting column or index values to datetimes
- 11.4 Using the DatetimeProperties object
- 11.5 Adding and subtracting durations of time
- 11.6 Date offsets
- 11.7 The Timedelta object
- 11.8 Coding challenge
- 11.8.1 Problems
- 11.8.2 Solutions
- Summary
- 12 Imports and exports
- 12.1 Reading from and writing to JSON files
- 12.1.1 Loading a JSON file into a DataFrame
- 12.1.2 Exporting a DataFrame to a JSON file
- 12.2 Reading from and writing to CSV files
- 12.3 Reading from and writing to Excel workbooks
- 12.3.1 Installing the xlrd and openpyxl libraries in an Anaconda environment
- 12.3.2 Importing Excel workbooks
- 12.3.3 Exporting Excel workbooks
- 12.4 Coding challenge
- 12.4.1 Problems
- 12.4.2 Solutions
- Summary
- 13 Configuring pandas
- 13.1 Getting and setting pandas options
- 13.2 Precision
- 13.3 Maximum column width
- 13.4 Chop threshold
- 13.5 Option context
- Summary
- 14 Visualization
- 14.1 Installing matplotlib
- 14.2 Line charts
- 14.3 Bar graphs
- 14.4 Pie charts
- Summary
- Appendix A. Installation and setup
- A.1 The Anaconda distribution
- A.2 The macOS setup process
- A.2.1 Installing Anaconda in macOS
- A.2.2 Launching Terminal
- A.2.3 Common Terminal commands
- A.3 The Windows setup process
- A.3.1 Installing Anaconda in Windows
- A.3.2 Launching Anaconda Prompt
- A.3.3 Common Anaconda Prompt commands
- A.4 Creating a new Anaconda environment
- A.5 Anaconda Navigator
- A.6 The basics of Jupyter Notebook
- Appendix B. Python crash course
- B.1 Simple data types
- B.1.1 Numbers
- B.1.2 Strings
- B.1.3 Booleans
- B.1.4 The None object
- B.2 Operators
- B.2.1 Mathematical operators
- B.2.2 Equality and inequality operators
- B.3 Variables
- B.4 Functions
- B.4.1 Arguments and return values
- B.4.2 Custom functions
- B.5 Modules
- B.6 Classes and objects
- B.7 Attributes and methods
- B.8 String methods
- B.9 Lists
- B.9.1 List iteration
- B.9.2 List comprehension
- B.9.3 Converting a string to a list and vice versa
- B.10 Tuples
- B.11 Dictionaries
- B.11.1 Dictionary iteration
- B.12 Sets
- Appendix C. NumPy crash course
- C.1 Dimensions
- C.2 The ndarray object
- C.2.1 Generating a numeric range with the arange method
- C.2.2 Attributes on a ndarray object
- C.2.3 The reshape method
- C.2.4 The randint function
- C.2.5 The randn function
- C.3 The nan object
- Appendix D. Generating fake data with Faker
- D.1 Installing Faker
- D.2 Getting started with Faker
- D.3 Populating a DataFrame with fake values
- Appendix E. Regular expressions
- E.1 Introduction to Python's re module
- E.2 Metacharacters
- E.3 Advanced search patterns
- E.4 Regular expressions and pandas
- index