Data analysis and applications 1: clustering and regression, modeling-estimating, forecasting and data mining
"This series of books collects a diverse array of work that provides the reader with theoretical and applied information on data analysis methods, models, and techniques, along with appropriate applications. Volume 1 begins with an introductory chapter by Gilbert Saporta, a leading expert in th...
Other authors: | , |
---|---|
Format: | E-book |
Language: | English |
Published: | London, England ; Hoboken, New Jersey : ISTE, 2019. |
Edition: | 1st edition |
Series: | Innovation, entrepreneurship and management series. Big data, artificial intelligence and data analysis set ; 2. THEi Wiley ebooks. |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630637206719 |
Table of Contents:
- Cover
- Half-Title Page
- Title Page
- Copyright Page
- Contents
- Preface
- Introduction: 50 Years of Data Analysis: From Exploratory Data Analysis to Predictive Modeling and Machine Learning
- I.1. The revolt against mathematical statistics
- I.2. EDA and unsupervised methods for dimension reduction
- I.2.1. The time of syntheses
- I.2.2. The time of clusterwise methods
- I.2.3. Extensions to new types of data
- I.2.4. Nonlinear data analysis
- I.2.5. The time of sparse methods
- I.3. Predictive modeling
- I.3.1. Paradigms and paradoxes
- I.3.2. From statistical learning theory to empirical validation
- I.3.3. Challenges
- I.4. Conclusion
- I.5. References
- PART 1: Clustering and Regression
- 1. Cluster Validation by Measurement of Clustering Characteristics Relevant to the User
- 1.1. Introduction
- 1.2. General notation
- 1.3. Aspects of cluster validity
- 1.3.1. Small within-cluster dissimilarities
- 1.3.2. Between-cluster separation
- 1.3.3. Representation of objects by centroids
- 1.3.4. Representation of dissimilarity structure by clustering
- 1.3.5. Small within-cluster gaps
- 1.3.6. Density modes and valleys
- 1.3.7. Uniform within-cluster density
- 1.3.8. Entropy
- 1.3.9. Parsimony
- 1.3.10. Similarity to homogeneous distributional shapes
- 1.3.11. Stability
- 1.3.12. Further Aspects
- 1.4. Aggregation of indexes
- 1.5. Random clusterings for calibrating indexes
- 1.5.1. Stupid K-centroids clustering
- 1.5.2. Stupid nearest neighbors clustering
- 1.5.3. Calibration
- 1.6. Examples
- 1.6.1. Artificial data set
- 1.6.2. Tetragonula bees data
- 1.7. Conclusion
- 1.8. Acknowledgment
- 1.9. References
- 2. Histogram-Based Clustering of Sensor Network Data
- 2.1. Introduction
- 2.2. Time series data stream clustering
- 2.2.1. Local clustering of histogram data
- 2.2.2. Online proximity matrix updating
- 2.2.3. Off-line partitioning through the dynamic clustering algorithm for dissimilarity tables
- 2.3. Results on real data
- 2.4. Conclusions
- 2.5. References
- 3. The Flexible Beta Regression Model
- 3.1. Introduction
- 3.2. The FB distribution
- 3.2.1. The beta distribution
- 3.2.2. The FB distribution
- 3.2.3. Reparameterization of the FB
- 3.3. The FB regression model
- 3.4. Bayesian inference
- 3.5. Illustrative application
- 3.6. Conclusion
- 3.7. References
- 4. S-weighted Instrumental Variables
- 4.1. Summarizing the previous relevant results
- 4.2. The notations, framework, conditions and main tool
- 4.3. S-weighted estimator and its consistency
- 4.4. S-weighted instrumental variables and their consistency
- 4.5. Patterns of results of simulations
- 4.5.1. Generating the data
- 4.5.2. Reporting the results
- 4.6. Acknowledgment
- 4.7. References
- PART 2: Models and Modeling
- 5. Grouping Property and Decomposition of Explained Variance in Linear Regression
- 5.1. Introduction
- 5.2. CAR scores
- 5.2.1. Definition and estimators
- 5.2.2. Historical criticism of the CAR scores
- 5.3. Variance decomposition methods and SVD
- 5.4. Grouping property of variance decomposition methods
- 5.4.1. Analysis of grouping property for CAR scores
- 5.4.2. Demonstration with two predictors
- 5.4.3. Analysis of grouping property using SVD
- 5.4.4. Application to the diabetes data set
- 5.5. Conclusions
- 5.6. References
- 6. On GARCH Models with Temporary Structural Changes
- 6.1. Introduction
- 6.2. The model
- 6.2.1. Trend model
- 6.2.2. Intervention GARCH model
- 6.3. Identification
- 6.4. Simulation
- 6.4.1. Simulation on trend model
- 6.4.2. Simulation on intervention trend model
- 6.5. Application
- 6.6. Concluding remarks
- 6.7. References
- 7. A Note on the Linear Approximation of TAR Models
- 7.1. Introduction
- 7.2. Linear representations and linear approximations of nonlinear models
- 7.3. Linear approximation of the TAR model
- 7.4. References
- 8. An Approximation of Social Well-Being Evaluation Using Structural Equation Modeling
- 8.1. Introduction
- 8.2. Wellness
- 8.3. Social welfare
- 8.4. Methodology
- 8.5. Results
- 8.6. Discussion
- 8.7. Conclusions
- 8.8. References
- 9. An SEM Approach to Modeling Housing Values
- 9.1. Introduction
- 9.2. Data
- 9.3. Analysis
- 9.4. Conclusions
- 9.5. References
- 10. Evaluation of Stopping Criteria for Ranks in Solving Linear Systems
- 10.1. Introduction
- 10.2. Methods
- 10.2.1. Preliminaries
- 10.2.2. Iterative methods
- 10.3. Formulation of linear systems
- 10.4. Stopping criteria
- 10.5. Numerical experimentation of stopping criteria
- 10.5.1. Convergence of stopping criterion
- 10.5.2. Quantiles
- 10.5.3. Kendall correlation coefficient as stopping criterion
- 10.6. Conclusions
- 10.7. Acknowledgments
- 10.8. References
- 11. Estimation of a Two-Variable Second-Degree Polynomial via Sampling
- 11.1. Introduction
- 11.2. Proposed method
- 11.2.1. First restriction
- 11.2.2. Second restriction
- 11.2.3. Third restriction
- 11.2.4. Fourth restriction
- 11.2.5. Fifth restriction
- 11.2.6. Coefficient estimates
- 11.3. Experimental approaches
- 11.3.1. Experiment A
- 11.3.2. Experiment B
- 11.4. Conclusions
- 11.5. References
- PART 3: Estimators, Forecasting and Data Mining
- 12. Displaying Empirical Distributions of Conditional Quantile Estimates: An Application of Symbolic Data Analysis to the Cost Allocation Problem in Agriculture
- 12.1. Conceptual framework and methodological aspects of cost allocation
- 12.2. The empirical model of specific production cost estimates
- 12.3. The conditional quantile estimation
- 12.4. Symbolic analyses of the empirical distributions of specific costs
- 12.5. The visualization and the analysis of econometric results
- 12.6. Conclusion
- 12.7. Acknowledgments
- 12.8. References
- 13. Frost Prediction in Apple Orchards Based upon Time Series Models
- 13.1. Introduction
- 13.2. Weather database
- 13.3. ARIMA forecast model
- 13.3.1. Stationarity and differencing
- 13.3.2. Non-seasonal ARIMA models
- 13.4. Model building
- 13.4.1. ARIMA and LR models
- 13.4.2. Binary classification of the frost data
- 13.4.3. Training and test set
- 13.5. Evaluation
- 13.6. ARIMA model selection
- 13.7. Conclusions
- 13.8. Acknowledgments
- 13.9. References
- 14. Efficiency Evaluation of Multiple-Choice Questions and Exams
- 14.1. Introduction
- 14.2. Exam efficiency evaluation
- 14.2.1. Efficiency measures and efficiency weighted grades
- 14.2.2. Iterative execution
- 14.2.3. Postprocessing
- 14.3. Real-life experiments and results
- 14.4. Conclusions
- 14.5. References
- 15. Methods of Modeling and Estimation in Mortality
- 15.1. Introduction
- 15.2. The appearance of life tables
- 15.3. On the law of mortality
- 15.4. Mortality and health
- 15.5. An advanced health state function form
- 15.6. Epilogue
- 15.7. References
- 16. An Application of Data Mining Methods to the Analysis of Bank Customer Profitability and Buying Behavior
- 16.1. Introduction
- 16.2. Data set
- 16.3. Short-term forecasting of customer profitability
- 16.4. Churn prediction
- 16.5. Next-product-to-buy
- 16.6. Conclusions and future research
- 16.7. References
- List of Authors
- Index
- Other titles from ISTE in Innovation, Entrepreneurship and Management
- EULA